This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision Next revision Both sides next revision | ||
assignments:assignment3 [2015/10/02 12:08] asa [Submission] |
assignments:assignment3 [2015/10/02 12:42] asa |
||
---|---|---|---|
Line 1: | Line 1: | ||
========= Assignment 3: Support Vector Machines ============ | ========= Assignment 3: Support Vector Machines ============ | ||
- | Due: October 20th at 6pm | + | Due: October 16th at 11pm |
===== Part 1: SVM with no bias term ===== | ===== Part 1: SVM with no bias term ===== | ||
Line 7: | Line 7: | ||
Formulate a soft-margin SVM without the bias term, i.e. one where the discriminant function is equal to $\mathbf{w}^{T} \mathbf{x}$. | Formulate a soft-margin SVM without the bias term, i.e. one where the discriminant function is equal to $\mathbf{w}^{T} \mathbf{x}$. | ||
Derive the saddle point conditions, KKT conditions and the dual. | Derive the saddle point conditions, KKT conditions and the dual. | ||
- | Compare it to the standard SVM formulation. | + | Compare it to the standard SVM formulation that was derived in class. |
- | As we discussed in class, SMO-type algorithms for the dual optimize the smallest number of variables at a time, which is two variables. | + | In class we discussed SMO-type algorithms for optimizing the dual SVM. At each step SMO optimizes two variables at a time, which is the smallest number possible. |
- | Is this still the case for the formulation you have derived? | + | Is this still the case for the formulation you have derived? In other words, is two the smallest number of variables that can be optimized at a time? |
Hint: consider the difference in the constraints. | Hint: consider the difference in the constraints. | ||
Line 45: | Line 45: | ||
d1scta_,a.1.1.2 31417:1.0 32645:1.0 39208:1.0 42164:1.0 .... | d1scta_,a.1.1.2 31417:1.0 32645:1.0 39208:1.0 42164:1.0 .... | ||
</code> | </code> | ||
- | The first column is the ID of the protein, the second is the class it belongs to (the values for the class variable are ''a.1.1.2'', which is the given class of proteins, and ''rest'' which is the negative class representing the rest of the database), and the rest of the elements are pairs of the form ''feature_id:value'' - an id of a feature and the value associated with it. | + | The first column is the ID of the protein, the second is the class it belongs to (the values for the class variable are ''a.1.1.2'', which is the given class of proteins, and ''rest'' which is the negative class representing the rest of the database); the remainder consists of elements of the form ''feature_id:value''which provide an id of a feature and the value associated with it. |
This is an extension of the format used by LibSVM, that scikit-learn can read. | This is an extension of the format used by LibSVM, that scikit-learn can read. | ||
- | See a discussion [[http://scikit-learn.org/stable/datasets/#datasets-in-svmlight-libsvm-format | here]]. | + | See a discussion of this format and how to read it [[http://scikit-learn.org/stable/datasets/#datasets-in-svmlight-libsvm-format | here]]. |
We note that the data is very high dimensional since | We note that the data is very high dimensional since | ||
Line 64: | Line 64: | ||
K_{gauss}(\mathbf{x}, \mathbf{x'}) = \exp(-\gamma || \mathbf{x} - \mathbf{x}' ||^2) | K_{gauss}(\mathbf{x}, \mathbf{x'}) = \exp(-\gamma || \mathbf{x} - \mathbf{x}' ||^2) | ||
$$ | $$ | ||
+ | and | ||
$$ | $$ | ||
- | K_{poly}(\mathbf{x}, \mathbf{x'}) = (\mathbf{x}^T \mathbf{x}' + 1) ^{p} | + | K_{poly}(\mathbf{x}, \mathbf{x'}) = (\mathbf{x}^T \mathbf{x}' + 1) ^{p}. |
$$ | $$ | ||
Line 108: | Line 109: | ||
Grading sheet for assignment 2 | Grading sheet for assignment 2 | ||
- | Part 1: 45 points. | + | Part 1: 40 points. |
(10 points): Primal SVM formulation is correct | (10 points): Primal SVM formulation is correct | ||
(10 points): Lagrangian found correctly | (10 points): Lagrangian found correctly | ||
Line 115: | Line 116: | ||
( 5 points): Discussion of the implication of the form of the dual for SMO-like algorithms | ( 5 points): Discussion of the implication of the form of the dual for SMO-like algorithms | ||
- | Part 2: 15 points. | + | Part 2: 10 points. |
Part 3: 40 points. | Part 3: 40 points. |