assignments:assignment5 [CS545 fall 2016]


  
  
===== Data =====
  
In this assignment you will compare several feature selection methods on several datasets.
where $\mu_i^{(+)}$ is the average of feature $i$ in the positive examples, $\sigma_i^{(+)}$ is the standard deviation of feature $i$ in the positive examples, and $\mu_i^{(-)}, \sigma_i^{(-)}$ are defined analogously for the negative examples.
In order for your function to work with the scikit-learn filter framework it needs to have two parameters: ''golub(X, y)'', where X is the feature matrix and y is a vector of labels.  All scikit-learn filter methods return two values: a vector of scores and a vector of p-values.  For our purposes, we won't use p-values associated with the Golub scores, so just return the computed vector of scores twice (''return scores, scores'' if your vector of scores is stored in an array called scores).
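A minimal sketch of such a filter function, assuming the Golub score for feature $i$ is $|\mu_i^{(+)} - \mu_i^{(-)}| / (\sigma_i^{(+)} + \sigma_i^{(-)})$ and that positive examples are labeled 1 (adapt both the formula and the label encoding to your setting):

```python
import numpy as np

def golub(X, y):
    """Golub score filter, usable wherever scikit-learn expects a
    scoring function such as f_classif (e.g. in SelectKBest).

    Assumes positive examples have label 1; everything else is negative.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y != 1]
    # |mu(+) - mu(-)| / (sigma(+) + sigma(-)) per feature; a tiny
    # epsilon guards against zero-variance features.
    scores = np.abs(pos.mean(axis=0) - neg.mean(axis=0)) / (
        pos.std(axis=0) + neg.std(axis=0) + 1e-12)
    return scores, scores  # scores stand in for the unused p-values
```

It can then be plugged in as, e.g., ''SelectKBest(golub, k=10)''.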
  
===== Part 2:  Embedded methods:  L1 SVM =====
Run the L1-SVM on the datasets mentioned above.
In scikit-learn use ''LinearSVC(penalty='l1', dual=False)'' to create one.
How many features have non-zero weight vector coefficients?  (Note that you can obtain the weight vector of a trained SVM by looking at its ''coef_'' attribute.)
  
Compare the accuracy of the following approaches using cross-validation on the two datasets:

  * L1 SVM
  * L2 SVM trained on the features selected by the L1 SVM
  * L2 SVM trained on all the features
  * L2 SVM that uses RFE (with an L2 SVM) to select relevant features; use the class ''RFECV'', which automatically selects the number of features.
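One way to organize this comparison is shown below as a sketch; ''SelectFromModel'' and ''RFECV'' are standard scikit-learn classes, but the synthetic data and parameter choices are placeholders for the assignment's datasets:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV, SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Stand-in data; substitute the assignment's datasets here.
X, y = make_classification(n_samples=100, n_features=20,
                           n_informative=5, random_state=0)

l1 = LinearSVC(penalty='l1', dual=False, max_iter=5000)
l2 = LinearSVC(penalty='l2', dual=False, max_iter=5000)

models = {
    'L1 SVM': l1,
    'L2 SVM on L1-selected features': make_pipeline(SelectFromModel(l1), l2),
    'L2 SVM on all features': l2,
    'L2 SVM with RFECV': RFECV(l2),
}
# Mean cross-validated accuracy for each approach.
results = {name: cross_val_score(m, X, y, cv=5).mean()
           for name, m in models.items()}
for name, acc in results.items():
    print(f'{name}: {acc:.3f}')
```

Wrapping the selector and classifier in a pipeline keeps feature selection inside each cross-validation fold, avoiding selection bias.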
  
It has been argued in the literature that L1-SVMs often lead to solutions that are too sparse.  As a workaround, implement the following strategy:
  
  * Create $k$ sub-samples of the training data.  For each sub-sample randomly choose a subset consisting of 80% of the training examples.
  * For each sub-sample train an L1-SVM.
  * For each feature compute a score that is the number of sub-samples for which that feature yielded a non-zero weight vector coefficient.
  
In the next part of the assignment you will compare this approach to RFE and the Golub filter method that you implemented in part 1.
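The sub-sampling strategy above can be sketched as follows (the function name and default $k$ are illustrative, and it assumes each 80% sub-sample contains examples of both classes):

```python
import numpy as np
from sklearn.svm import LinearSVC

def subsample_l1_counts(X, y, k=20, frac=0.8, seed=0):
    """For each feature, count in how many of k random sub-samples
    (each frac of the training set) an L1-SVM gives it non-zero weight."""
    rng = np.random.RandomState(seed)
    n_samples, n_features = X.shape
    counts = np.zeros(n_features, dtype=int)
    for _ in range(k):
        idx = rng.choice(n_samples, size=int(frac * n_samples),
                         replace=False)
        svm = LinearSVC(penalty='l1', dual=False, max_iter=5000)
        svm.fit(X[idx], y[idx])
        # Count the feature if any class's weight for it is non-zero.
        counts += (np.abs(svm.coef_) > 1e-8).any(axis=0).astype(int)
    return counts
```

Features can then be ranked by ''counts''; higher values indicate selection that is stable across sub-samples.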
===== Part 3:  Method comparison =====
  
===== Submission =====
  
Submit the PDF of your report and Python code via Canvas.  Python code can be displayed in your report if it is succinct (not more than a page or two at the most) or submitted separately.  The LaTeX sample document shows how to display Python code in a LaTeX document.  Code needs to be there so we can make sure that you implemented the algorithms and data analysis methodology correctly.  Canvas allows you to submit multiple files for an assignment, so DO NOT submit an archive file (tar, zip, etc.).  Canvas will only allow you to submit PDFs (.pdf extension) or Python code (.py extension).
For this assignment there is a strict 8 page limit (not including references and code that is provided as an appendix).  We will take off points for reports that go over the page limit.
In addition to the code snippets that you include in your report, make sure you provide complete code from which we can see exactly how your results were generated.
Grading sheet for assignment 5
  
Part 1:  15 points.
(15 points):  Correct implementation of the Golub score.

Part 2:  35 points.
(15 points):  Comparison of L1 chosen features with use of all features.
(20 points):  Correct implementation of L1-SVM feature selection using sub-samples.
  
Part 3:  40 points.
(25 points):  Accuracy as a function of number of features and discussion of the results.
(15 points):  Same, with model selection.
  
Report structure, grammar and spelling:  10 points
assignments/assignment5.txt · Last modified: 2016/10/18 09:18 by asa