This shows you the differences between two versions of the page.
assignments:assignment3 [2013/10/04 20:51] asa |
assignments:assignment3 [2013/10/04 21:06] asa |
||
---|---|---|---|
Line 14: | Line 14: | ||
When using this SVM formulation it may be useful to add a constant to the | When using this SVM formulation it may be useful to add a constant to the | ||
kernel matrix. Explain why this can be beneficial. | kernel matrix. Explain why this can be beneficial. | ||
+ | |||
+ | |||
+ | ===== Part 3: Using the SVM ===== | ||
+ | |||
+ | Download the dataset associated with this assignment from the homework | ||
+ | page of the course. | ||
+ | In this assignment we will explore the dependence of classifier accuracy on | ||
+ | the kernel, kernel parameters, kernel normalization, and the SVM ridge parameter. | ||
+ | The use of the SVM class is discussed in the PyML [[http://pyml.sourceforge.net/tutorial.html#svms|tutorial]]. | ||
+ | |||
+ | By default a dataset is instantiated with a linear kernel attached to it. | ||
+ | To use a different kernel you need to attach a new kernel to the dataset: | ||
+ | <code python> | ||
+ | >>> from PyML import ker | ||
+ | >>> data.attachKernel(ker.Gaussian(gamma = 0.1)) | ||
+ | </code> | ||
+ | or | ||
+ | <code python> | ||
+ | >>> from PyML import ker | ||
+ | >>> data.attachKernel(ker.Polynomial(degree = 3)) | ||
+ | </code> | ||
+ | In this question we will consider both the Gaussian and polynomial kernels: | ||
+ | $$ | ||
+ | K_{gaus}(\mathbf{x}, \mathbf{x}') = \exp(-\gamma || \mathbf{x} - \mathbf{x}' ||^2) | ||
+ | $$ | ||
+ | $$ | ||
+ | K_{poly}(\mathbf{x}, \mathbf{x}') = (1 + \mathbf{x}^T \mathbf{x}') ^{p} | ||
+ | $$ | ||
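The two kernel formulas above can be checked numerically with a small NumPy sketch. This is not PyML code: the function names `gaussian_kernel` and `polynomial_kernel` and the toy matrix `X` are made up for illustration, and the parameter `degree` plays the role of $p$.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=0.1):
    """K(x, x') = exp(-gamma * ||x - x'||^2) for every pair of rows of X and Y."""
    # Squared distances via ||x||^2 + ||x'||^2 - 2 x^T x'
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # clip tiny negatives caused by round-off
    return np.exp(-gamma * sq_dist)

def polynomial_kernel(X, Y, degree=3):
    """K(x, x') = (1 + x^T x')^degree for every pair of rows of X and Y."""
    return (1.0 + X @ Y.T) ** degree

# Toy data (illustrative only): three examples with two features each
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K_gaus = gaussian_kernel(X, X, gamma=0.1)
K_poly = polynomial_kernel(X, X, degree=3)
```

Both matrices are symmetric, and the Gaussian kernel always has ones on its diagonal because the distance from an example to itself is zero.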
+ | Plot the accuracy of the classifier, measured using the success rate and the area under the ROC curve, | ||
+ | as a function of both the ridge parameter of the classifier and the free parameter | ||
+ | of the kernel function ($\gamma$ for the Gaussian kernel, the degree $p$ for the polynomial kernel). | ||
+ | Show a couple of representative cross sections of this plot for a given value | ||
+ | of the ridge parameter, and for a given value of the kernel parameter. | ||
+ | Comment on the results. When exploring the values of a continuous | ||
+ | classifier/kernel parameter it is | ||
+ | useful to use values that are distributed on an exponential grid, | ||
+ | i.e. something like 0.01, 0.1, 1, 10, 100 (note that the degree of the | ||
+ | polynomial kernel is not such a parameter). | ||
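One possible way to build such an exponential grid is `numpy.logspace`. The sketch below is only a suggestion: the variable names (`ridge_values`, `gamma_values`, `degrees`, `grid`) and the chosen ranges are assumptions, not values prescribed by the assignment.

```python
import numpy as np

# Five values spaced evenly in the exponent: 10^-2, 10^-1, ..., 10^2
ridge_values = np.logspace(-2, 2, num=5)
gamma_values = np.logspace(-2, 2, num=5)

# The polynomial degree is discrete, so it gets a plain integer range instead
degrees = [1, 2, 3, 4]

# Every (ridge, gamma) pair to evaluate for the Gaussian kernel
grid = [(ridge, gamma) for ridge in ridge_values for gamma in gamma_values]
```

Fixing one coordinate of `grid` and varying the other gives the cross sections asked for above.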
+ | |||
+ | The data for this question comes from a database called SCOP (structural | ||
+ | classification of proteins), which classifies proteins into classes | ||
+ | according to their structure. The task is a two-class classification | ||
+ | problem: distinguishing a particular class of proteins from a selection of | ||
+ | examples sampled from the rest of the SCOP database. | ||
+ | I chose to represent the proteins in | ||
+ | terms of their motif composition. A sequence motif is a | ||
+ | pattern of nucleotides/amino acids that is conserved in evolution. | ||
+ | Motifs are usually associated with regions of the protein that are | ||
+ | important for its function, and are therefore useful in predicting protein | ||
+ | function. | ||
+ | A given protein will typically contain only a handful of motifs, and | ||
+ | so the data is very sparse. It is also very high dimensional, since | ||
+ | the number of conserved patterns in the space of all proteins is | ||
+ | large. | ||
+ | More information about motifs and their usefulness in classifying | ||
+ | proteins can be found in the following paper: | ||
+ | |||
+ | * A. Ben-Hur and D. Brutlag. Protein sequence motifs: Highly predictive features of protein function. In: Feature extraction, foundations and applications. I. Guyon, S. Gunn, M. Nikravesh, and L. Zadeh (eds.) Springer Verlag, 2006. | ||
+ | |||
+ | For this type of sparse dataset it is useful to normalize the input features before | ||
+ | training and testing your classifier. | ||
+ | One way to do so is to divide each input example by its norm. This is | ||
+ | accomplished in PyML by: | ||
+ | <code python> | ||
+ | data.normalize() | ||
+ | </code> | ||
+ | Compare the results under this normalization with what you obtain | ||
+ | without normalization. | ||
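To see what dividing each example by its norm does, here is a minimal NumPy sketch. It is independent of PyML and does not claim to reproduce how `data.normalize()` is implemented; the helper `normalize_rows`, the toy matrix `X`, and the choice to leave all-zero rows untouched are assumptions made for illustration.

```python
import numpy as np

def normalize_rows(X):
    """Divide each row (example) of X by its Euclidean norm."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # assumption: leave all-zero examples unchanged
    return X / norms

# Toy sparse-looking data (illustrative only)
X = np.array([[3.0, 4.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
X_unit = normalize_rows(X)

# After normalization the linear kernel is the cosine similarity
K_cosine = X_unit @ X_unit.T
```

For unit-norm examples, $||\mathbf{x} - \mathbf{x}'||^2 = 2 - 2\,\mathbf{x}^T \mathbf{x}'$, so the Gaussian kernel then depends only on the angle between the two examples.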
+ | |||
+ | You can visualize the whole kernel matrix associated with the data using the following commands: | ||
+ | <code python> | ||
+ | >>> from PyML import ker | ||
+ | >>> ker.showKernel(data) | ||
+ | </code> | ||
+ | Explain the structure that you are seeing in the plot (it is more | ||
+ | interesting when the data is normalized). | ||
+ |