assignments:assignment5 [CS545 fall 2016]


Differences

This shows you the differences between two versions of the page.

assignments:assignment5 [2015/10/31 09:24] asa
assignments:assignment5 [2015/10/31 09:36] asa
Line 4: Line 4:
  
  
-==== Data ====
+===== Data =====
  
 In this assignment you will compare several feature selection methods on several datasets.
Line 26: Line 26:
 where $\mu_i^{(+)}$ is the average of feature $i$ in the positive examples, $\sigma_i^{(+)}$ is the standard deviation of feature $i$ in the positive examples, and $\mu_i^{(-)}, \sigma_i^{(-)}$ are defined analogously for the negative examples.
-In order for your function to work with the scikit-learn filter framework it needs to have two parameters: ''golub(X, y)'', where X is the feature matrix, and y is a vector of labels.  All scikit-learn filter methods return two values - a vector of scores, and a vector of p-values.  For our purposes, we won't use p-values associated with the Golub scores, so just return the computed vector of scores twice:  if your vector of scores is stored in an array called scores, have the return statement be:
-
-''return scores,scores''
-
+In order for your function to work with the scikit-learn filter framework it needs to have two parameters: ''golub(X, y)'', where X is the feature matrix, and y is a vector of labels.  All scikit-learn filter methods return two values - a vector of scores, and a vector of p-values.  For our purposes, we won't use p-values associated with the Golub scores, so just return the computed vector of scores twice (''return scores,scores'' if your vector of scores is stored in an array called scores)
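The ''golub(X, y)'' interface described above can be sketched as follows. This is only an illustration under stated assumptions, not the reference solution: the score formula itself is not shown in this part of the page, so the sketch assumes the common form $|\mu_i^{(+)} - \mu_i^{(-)}| / (\sigma_i^{(+)} + \sigma_i^{(-)})$. Treating the larger of the two label values as the positive class and assigning a score of 0 to features whose standard deviations sum to zero are also choices made here for illustration.

```python
import numpy as np


def golub(X, y):
    """Golub-style filter score for each feature (illustrative sketch).

    Assumed formula (not taken from the assignment text):
        |mu_pos - mu_neg| / (sigma_pos + sigma_neg)
    Returns the score vector twice so that the result has the
    (scores, p-values) shape expected from a scikit-learn filter.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("golub expects exactly two classes")

    # Assumption: the larger label value marks the positive class.
    neg = X[y == classes[0]]
    pos = X[y == classes[1]]

    mean_gap = np.abs(pos.mean(axis=0) - neg.mean(axis=0))
    sigma_sum = pos.std(axis=0) + neg.std(axis=0)

    # Assumption: features with zero total spread get a score of 0
    # instead of triggering a division by zero.
    scores = np.zeros(X.shape[1])
    np.divide(mean_gap, sigma_sum, out=scores, where=sigma_sum > 0)

    return scores, scores
```

A function with this signature and return value can be passed as the ''score_func'' argument of scikit-learn's ''SelectKBest'', for example ''SelectKBest(golub, k=10)''; the value of ''k'' here is arbitrary.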
  
 ===== Part 2:  Embedded methods:  L1 SVM =====
Line 88: Line 84:
 Grading sheet for assignment 3
  
-Part 1:  40 points.
-(10 points):  Primal SVM formulation is correct
-( 7 points):  Lagrangian found correctly
-( 8 points):  Derivation of saddle point equations
-(10 points):  Derivation of the dual
-( 5 points):  Discussion of the implication of the form of the dual for SMO-like algorithms
+Part 1:  15 points.
+(15 points):  Correct implementation of the Golub score
  
-Part 2:  10 points.
+Part 2:  35 points
+(15 points):  Comparison of L1 chosen features with use of all features.
+(20 points):  Correct implementation of L1-SVM feature selection using sub-samples.
  
 Part 3:  40 points.
-(20 points):  Accuracy as a function of parameters and discussion of the results
-(15 points):  Comparison of normalized and non-normalized kernels and correct model selection
-( 5 points):  Visualization of the kernel matrix and observations made about it
+(25 points):  Accuracy as a function of number of features and discussion of the results
+(15 points):  Same, with model selection
  
 Report structure, grammar and spelling:  10 points
assignments/assignment5.txt · Last modified: 2016/10/18 09:18 by asa