This shows you the differences between two versions of the page.
assignments:assignment5 [2015/10/29 14:50] asa |
assignments:assignment5 [2015/10/30 12:53] asa |
||
---|---|---|---|
Line 26: | Line 26: | ||
Run the L1-SVM on the datasets mentioned above. | Run the L1-SVM on the datasets mentioned above. | ||
- | In scikit-learn use ''LinearSVC(penalty='l1', loss='hinge')'' to create one. | + | In scikit-learn use ''LinearSVC(penalty='l1', dual=False)'' to create one. |
- | How many features have non-zero weight vector coefficients? Compare the accuracy of a regular L2 SVM trained on those features with an L2 SVM trained on all the features computed using 5-fold cross-validation. | + | How many features have non-zero weight vector coefficients? (Note that you can obtain the weight vector of a trained SVM by looking at its ''coef_'' attribute.) |
+ | Compare the accuracy of an L1 SVM to an SVM that uses RFE to select relevant features. | ||
- | L1-SVMs often lead to solutions that are too sparse. As a workaround, implement the following strategy: | + | Compare the accuracy of a regular L2 SVM trained on those features with an L2 SVM trained on all the features computed using 5-fold cross-validation. |
+ | |||
+ | It has been argued in the literature that L1-SVMs often lead to solutions that are too sparse. As a workaround, implement the following strategy: | ||
* Create $k$ sub-samples of the data in which you randomly choose 80% of the examples. | * Create $k$ sub-samples of the data in which you randomly choose 80% of the examples. | ||
- | * For each sub-sample train an L1-SVM | + | * For each sub-sample train an L1-SVM. |
+ | * For each feature compute a score that is the average weight vector | ||
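The steps in the new revision can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the assignment's reference solution: the data is synthetic (''make_classification'' stands in for the assignment's datasets), the helper names ''nonzero_features'' and ''subsample_scores'' are hypothetical, and the values of ''k'' and ''C'' are arbitrary. Because the last bullet is cut off in this diff, the score is assumed here to be the average absolute weight of each feature across the sub-samples.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC


def nonzero_features(X, y, C=1.0):
    """Train an L1-SVM and return indices of features with non-zero weight."""
    clf = LinearSVC(penalty="l1", dual=False, C=C, max_iter=5000)
    clf.fit(X, y)
    # coef_ has shape (1, n_features) for a binary problem
    return np.flatnonzero(clf.coef_.ravel())


def subsample_scores(X, y, k=10, frac=0.8, C=1.0, seed=0):
    """Average |weight| per feature over k random sub-samples (assumed score)."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    n_keep = int(frac * n_samples)
    total = np.zeros(X.shape[1])
    for _ in range(k):
        # choose 80% of the examples without replacement
        idx = rng.choice(n_samples, size=n_keep, replace=False)
        clf = LinearSVC(penalty="l1", dual=False, C=C, max_iter=5000)
        clf.fit(X[idx], y[idx])
        total += np.abs(clf.coef_.ravel())
    return total / k


# Synthetic stand-in for the assignment's datasets (assumption).
X, y = make_classification(n_samples=200, n_features=30,
                           n_informative=5, random_state=0)

selected = nonzero_features(X, y)
print(len(selected), "features have non-zero weight")

# L2 SVM on all features vs. only the L1-selected ones, 5-fold CV.
# Selecting features on the full data before CV is optimistic;
# it is done here only to keep the sketch short.
acc_all = cross_val_score(LinearSVC(dual=False), X, y, cv=5).mean()
acc_sel = cross_val_score(LinearSVC(dual=False), X[:, selected], y, cv=5).mean()
print("accuracy, all features:", acc_all)
print("accuracy, selected features:", acc_sel)

scores = subsample_scores(X, y, k=10)
print("top 5 features by average |weight|:", np.argsort(scores)[::-1][:5])
```

A feature that receives a non-zero weight in only a few sub-samples ends up with a small but non-zero score, so ranking by this score gives a less sparse selection than a single L1-SVM fit.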