This shows you the differences between two versions of the page.
assignments:assignment5 [2015/10/31 09:07] asa |
assignments:assignment5 [2015/10/31 09:24] asa |
||
---|---|---|---|
Line 41: | Line 41: | ||
Compare the accuracy of an L1 SVM to an SVM that uses RFE to select relevant features. | Compare the accuracy of an L1 SVM to an SVM that uses RFE to select relevant features. | ||
- | Compare the accuracy of a regular L2 SVM trained on those features with an L2 SVM trained on all the features computed using 5-fold cross-validation. | + | Compare the accuracy of a regular L2 SVM trained on the features selected by the L1 SVM with the accuracy of an L2 SVM trained on all the features (compute accuracy using 5-fold cross-validation). |
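One way to set up this comparison is sketched below. It is a minimal illustration, not the reference solution: the synthetic data from `make_classification`, the values of `C`, and the choice of 20 features for RFE are all assumptions made so the example runs on its own. Feature selection is placed inside each pipeline so that it is refit on the training folds only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Synthetic stand-in for the assignment data (assumption).
X, y = make_classification(n_samples=100, n_features=200,
                           n_informative=10, random_state=0)


def l1_svm():
    # The L1 penalty in LinearSVC requires the primal formulation (dual=False).
    return LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000)


def l2_svm():
    return LinearSVC(penalty="l2", dual=False, C=1.0, max_iter=5000)


models = {
    "L1 SVM": l1_svm(),
    # RFE drops 10% of the remaining features per iteration until 20 are left.
    "RFE + L2 SVM": make_pipeline(
        RFE(l2_svm(), n_features_to_select=20, step=0.1), l2_svm()),
    # Keep the features with non-zero L1-SVM weight, then train an L2 SVM.
    "L1-selected + L2 SVM": make_pipeline(
        SelectFromModel(l1_svm()), l2_svm()),
    "L2 SVM, all features": l2_svm(),
}

# Mean accuracy over 5-fold (stratified) cross-validation.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```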
It has been argued in the literature that L1-SVMs often lead to solutions that are too sparse. As a workaround, implement the following strategy: | It has been argued in the literature that L1-SVMs often lead to solutions that are too sparse. As a workaround, implement the following strategy: | ||
Line 47: | Line 47: | ||
* Create $k$ sub-samples of the data in which you randomly choose 80% of the examples. | * Create $k$ sub-samples of the data in which you randomly choose 80% of the examples. | ||
* For each sub-sample train an L1-SVM. | * For each sub-sample train an L1-SVM. | ||
- | * For each feature compute a score that is the average weight vector | + | * For each feature compute a score that is the number of sub-samples for which the L1-SVM assigned that feature a non-zero weight. |
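The sub-sampling strategy in the list above can be sketched as follows. The function name `l1_selection_counts`, its default arguments, and the synthetic data are hypothetical choices for illustration; the sketch also assumes a binary problem in which every sub-sample contains both classes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC


def l1_selection_counts(X, y, k=20, fraction=0.8, C=0.1, seed=0):
    """Count, per feature, the sub-samples in which an L1-SVM gives it a non-zero weight."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    n_sub = int(round(fraction * n_samples))
    counts = np.zeros(n_features, dtype=int)
    for _ in range(k):
        # Draw 80% of the examples without replacement.
        idx = rng.choice(n_samples, size=n_sub, replace=False)
        svm = LinearSVC(penalty="l1", dual=False, C=C, max_iter=5000)
        svm.fit(X[idx], y[idx])
        # coef_ has shape (1, n_features) for a binary problem.
        counts += np.abs(svm.coef_.ravel()) > 1e-10
    return counts


# Synthetic stand-in for the assignment data (assumption).
X, y = make_classification(n_samples=100, n_features=200,
                           n_informative=10, random_state=0)
counts = l1_selection_counts(X, y, k=20)
# Features ranked from most to least frequently selected.
ranking = np.argsort(-counts)
print(counts[ranking[:10]])
```

Features can then be selected by thresholding `counts` or by taking the top entries of `ranking`.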
+ | ===== Part 3: Method comparison ===== | ||
+ | |||
+ | Compute the accuracy of a Linear L2 SVM as a function of the number of selected features on the leukemia and Arcene datasets for the following feature selection methods: | ||
+ | |||
+ | * The Golub score | ||
+ | * L1-SVM feature selection using subsamples | ||
+ | * RFE-SVM | ||
+ | |||
+ | Make sure that your evaluation provides an unbiased estimate of classifier performance. | ||
+ | Comment on the results. | ||
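A possible shape for this experiment is sketched below for two of the three methods. The Golub score is taken here to be the absolute difference of class means divided by the sum of class standard deviations, which is an assumption about the definition used in the course; the synthetic data and the grid of feature counts are also assumptions. Selection sits inside the pipeline, so each cross-validation fold selects features from its training part only, which is what keeps the accuracy estimate unbiased. The sub-sampled L1-SVM selector would need to be wrapped as a custom transformer to be used the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def golub_score(X, y):
    # Assumed definition: |mean_pos - mean_neg| / (std_pos + std_neg), per feature.
    classes = np.unique(y)
    neg, pos = X[y == classes[0]], X[y == classes[1]]
    num = np.abs(pos.mean(axis=0) - neg.mean(axis=0))
    den = pos.std(axis=0) + neg.std(axis=0)
    return num / (den + 1e-12)  # small constant guards against zero variance


def l2_svm():
    return LinearSVC(penalty="l2", dual=False, C=1.0, max_iter=5000)


# Synthetic stand-in for the leukemia / Arcene data (assumption).
X, y = make_classification(n_samples=100, n_features=100,
                           n_informative=10, random_state=0)

results = {"Golub": [], "RFE": []}
feature_counts = [5, 20, 50]
for k in feature_counts:
    selectors = {
        "Golub": SelectKBest(golub_score, k=k),
        "RFE": RFE(l2_svm(), n_features_to_select=k, step=0.2),
    }
    for name, selector in selectors.items():
        # Selector and classifier are refit together on every training fold.
        pipe = make_pipeline(selector, l2_svm())
        results[name].append(cross_val_score(pipe, X, y, cv=5).mean())

for name, accs in results.items():
    print(name, [f"{a:.3f}" for a in accs])
```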
+ | |||
+ | For the above experiment you do not need to select the optimal value for the SVM soft-margin constant. | ||
+ | Compare these results to results obtained using internal cross-validation for selecting | ||
+ | the soft margin constant $C$ over a grid of values. | ||
+ | |||
+ | In writing your code, use scikit-learn's ability to combine analysis steps using the [[http://scikit-learn.org/stable/modules/pipeline.html |Pipeline class]]. This will be particularly useful for performing model selection. | ||
- | Do your results change if you do model selection for the resulting classifier over a grid of values for the soft margin constant $C$? | ||
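As a rough sketch of how a pipeline helps with model selection, the example below wraps a selector and an L2 SVM in a `Pipeline`, searches over $C$ with `GridSearchCV` (the internal cross-validation), and estimates accuracy with an outer `cross_val_score`. The selector (`SelectKBest` with `f_classif`), the step names `select` and `svm`, the grid of `C` values, and the synthetic data are illustrative assumptions, not part of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Synthetic stand-in for the assignment data (assumption).
X, y = make_classification(n_samples=100, n_features=100,
                           n_informative=10, random_state=0)

pipe = Pipeline([
    # f_classif is a placeholder selector; swap in the method under study.
    ("select", SelectKBest(f_classif, k=20)),
    ("svm", LinearSVC(penalty="l2", dual=False, max_iter=5000)),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
param_grid = {"svm__C": [0.01, 0.1, 1.0, 10.0]}

# Inner loop: choose C by 3-fold cross-validation on each outer training set.
grid = GridSearchCV(pipe, param_grid, cv=3)

# Outer loop: 5-fold estimate of the accuracy of the whole procedure.
nested_scores = cross_val_score(grid, X, y, cv=5)
print(f"nested CV accuracy: {nested_scores.mean():.3f}")
```

Because the grid search is itself the estimator passed to `cross_val_score`, the value of $C$ is never tuned on the data used to report accuracy.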