assignments:assignment5 [CS545 fall 2016]


Differences

This shows you the differences between two versions of the page.

assignments:assignment5 [2015/10/30 19:10]
asa
assignments:assignment5 [2015/10/31 09:07]
asa
Line 1:
-========= Assignment 5: Feature selection ============
+======== Assignment 5: Feature selection ===========
  
 Due:  November 15th at 11pm
  
-In this assignment we will compare several feature selection methods on several datasets.
-The datasets we will use are the yeast gene expression dataset
 +
 +==== Data ====
 +
 +In this assignment you will compare several feature selection methods on several datasets.
 +The first dataset is the [[https://archive.ics.uci.edu/ml/datasets/Arcene| Arcene]] dataset, which was used in the 2003 NIPS feature selection competition.  The dataset is produced by mass spectrometry of biological samples that come from different types of cancer.
 + 
 +The second dataset describes the expression of human genes in two types of leukemia.  The original publication that describes the data is:
 + 
 +T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D. Bloomfield, and E. S. Lander.  
 +[[https://www.broadinstitute.org/mpr/publications/projects/Leukemia/Golub_et_al_1999.pdf | Molecular classification of cancer: class discovery and class prediction by gene expression monitoring]].
 +Science, 286(5439):531, 1999.
 + 
 +Download a processed version of the dataset in libsvm format from the [[https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html | libsvm data repository]].  Look for the dataset named "leukemia".  There are two files: one contains a training set and the other a test set.  Merge the two files into a single file for your experiments.
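
The merge step described above can be sketched in a few lines of Python. This is a minimal sketch, not the assignment's prescribed method: it assumes each file holds one example per line in libsvm's sparse `label index:value ...` format, so that merging amounts to concatenating the lines. The helper names `open_text` and `merge_libsvm_files` are hypothetical, and reading `.bz2` files directly is included only on the assumption that the downloads may be compressed.

```python
import bz2
from pathlib import Path


def open_text(path):
    """Open a plain-text or bz2-compressed file for reading, chosen by extension."""
    path = Path(path)
    if path.suffix == ".bz2":
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")


def merge_libsvm_files(input_paths, output_path):
    """Concatenate libsvm-format files into one file.

    Assumes one example per line with explicit feature indices, so no
    re-indexing is needed. Returns the number of examples written.
    """
    n_examples = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            with open_text(path) as handle:
                for line in handle:
                    line = line.strip()
                    if not line:
                        continue  # skip blank lines between or after examples
                    out.write(line + "\n")
                    n_examples += 1
    return n_examples
```

A call such as `merge_libsvm_files(["leu.bz2", "leu.t.bz2"], "leukemia_all.libsvm")` would then produce the single file; the file names here are placeholders and should be replaced with whatever the downloaded files are actually called. Checking that the returned count equals the sum of the two original set sizes is a quick sanity test.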
  
 ===== Part 1:  Filter methods =====
assignments/assignment5.txt · Last modified: 2016/10/18 09:18 by asa