assignments:assignment5 [CS545 fall 2016]


Differences

This shows you the differences between two versions of the page.

assignments:assignment5 [2013/11/04 20:18]
asa created
assignments:assignment5 [2016/10/17 19:20]
asa
Line 1: Line 1:
-========= Assignment 5: Naive Bayes ============+~~NOTOC~~
  
-Due November 17th at 6pm+======== Assignment 5: Neural networks ===========
  
-===== Part 1:  ​A few short questions about naive Bayes =====+Due:  ​October 31st at 11:59pm
  
 +===== Part 1:  Multi-layer perceptrons =====
  
-  - Can you use naive Bayes for data that contains both categorical and real-valued features? +In the first few slides about neural networks (also section 7.1 in chapter e-7) we discussed the expressive power of multi-layer perceptrons with a "sign" activation function.  Describe in detail a multi-layer perceptron that implements the following decision boundary: 
-  - The basic assumption in naive Bayes is that all attributes are independent given the label.  How can you model just 2 of $d$ features as dependent? + 
-  - Given a trained naive Bayes classifier, and without access to the training data, how would you select a subset of features that are most predictive of the class label? +{{ :assignments:boundary.png?200 |}} 
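As a reminder of how "sign" units compose, here is a minimal sketch of a two-layer perceptron for a hypothetical region, the band -1 <= x1 - x2 <= 1. This is not the boundary in the figure; the weights, the names ''layer'' and ''band_mlp'', and the convention sign(0) = +1 are all assumptions made for illustration. Each hidden unit tests one side of a line, and the output unit takes the AND of the two hidden outputs.

```python
def sign(z):
    # Assumed convention: sign(0) is treated as +1.
    return 1 if z >= 0 else -1


def layer(inputs, weights, biases):
    # One layer of sign units: each row of `weights` belongs to one unit.
    return [
        sign(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hidden unit 1 fires when x1 - x2 + 1 >= 0, hidden unit 2 when -(x1 - x2) + 1 >= 0.
HIDDEN_W = [[1.0, -1.0], [-1.0, 1.0]]
HIDDEN_B = [1.0, 1.0]
# Output unit is an AND of the two hidden units (both must be +1).
OUT_W = [[1.0, 1.0]]
OUT_B = [-1.5]


def band_mlp(x1, x2):
    # Returns +1 inside the band -1 <= x1 - x2 <= 1 and -1 outside it.
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUT_W, OUT_B)[0]
```

The same pattern (one hidden unit per line, then AND/OR units on top) is the kind of description the question asks for.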
 + 
 + 
 +===== Part 2:  Exploring neural networks for digit classification ===== 
 + 
 +In this segment of the assignment we will explore classification of handwritten digits with neural networks. ​ For that task, we will use part of the [[http://​yann.lecun.com/​exdb/​mnist/​ |MNIST]] dataset, which is very commonly used in the machine learning community. 
 +Your task is to explore various aspects of multi-layer neural networks using this dataset. 
 + 
 +For simplicity, use 25 percent of the data for evaluating network performance, and reserve the rest for training.   
 +Normalize the data by dividing the features by the maximum value, which will normalize them to the range [0,1] (since the minimum is 0). 
 +As a basis for your implementation use the neural network code I showed in class. 
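The preprocessing described above could be sketched as follows. This is only an illustration under stated assumptions: the function name ''normalize_and_split'' is hypothetical, the data is assumed to be a list of feature rows with non-negative values and a nonzero maximum, and a single global maximum is used for scaling (one reading of "the maximum value"). The class code may well use a different data layout.

```python
import random


def normalize_and_split(X, y, test_fraction=0.25, seed=0):
    # Scale every feature by the global maximum so values land in [0, 1]
    # (assumes the minimum is 0 and the maximum is nonzero).
    max_value = max(max(row) for row in X)
    X_scaled = [[v / max_value for v in row] for row in X]

    # Shuffle indices with a fixed seed so the split is reproducible.
    indices = list(range(len(X_scaled)))
    random.Random(seed).shuffle(indices)

    n_test = int(round(test_fraction * len(indices)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx):
        return [X_scaled[i] for i in idx], [y[i] for i in idx]

    return pick(train_idx), pick(test_idx)


# Toy usage with four made-up "images" of two pixels each.
X_toy = [[0, 255], [128, 64], [255, 0], [32, 16]]
y_toy = [0, 1, 2, 3]
(X_train, y_train), (X_test, y_test) = normalize_and_split(X_toy, y_toy)
```

Fixing the seed keeps the train/test split identical across runs, which makes the accuracy plots comparable between experiments.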
 + 
 +Here's what you need to do: 
 + 
 +  * Plot network accuracy as a function of the number of hidden units for a single-layer network with a logistic activation function. ​ Use a range of values where the network displays ​both under-fitting ​and over-fitting. 
 +  * Plot network accuracy as a function of the number of hidden units for a two-layer network with a logistic activation function.  Here, also demonstrate performance in a range of values where the network exhibits both under-fitting and over-fitting.  Does this dataset benefit from the use of more than one layer?
 +  ​* Add weight decay regularization to the neural network class you used (explain in your report how you did it).  Does the network demonstrate less over-fitting on this dataset with the addition of weight decay? 
 +  * The provided implementation uses the same activation function in each layer.  For solving regression problems we need to use a linear activation function to produce the output of the network.  Explain why, and what changes need to be made in the code. 
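For the weight decay item, one common formulation (stated here as an assumption, not necessarily the variant expected in your report) adds a squared-norm penalty to the training error. The symbols $\lambda$ (regularization strength) and $\eta$ (learning rate) are introduced here for illustration, and $E(\mathbf{w})$ stands for the unregularized error.

```latex
E_{\lambda}(\mathbf{w}) = E(\mathbf{w}) + \frac{\lambda}{2}\,\lVert \mathbf{w} \rVert^{2},
\qquad
\nabla E_{\lambda}(\mathbf{w}) = \nabla E(\mathbf{w}) + \lambda\,\mathbf{w}

\mathbf{w} \leftarrow \mathbf{w} - \eta\,\bigl(\nabla E(\mathbf{w}) + \lambda\,\mathbf{w}\bigr)
 = (1 - \eta\lambda)\,\mathbf{w} - \eta\,\nabla E(\mathbf{w})
```

The second form shows where the name comes from: each update first shrinks the weights by a factor of $(1 - \eta\lambda)$. Whether the bias weights are included in the penalty is a design choice worth stating in the report.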
 + 
 +The code that was provided includes a bias term only in the first layer.  For 5 extra points, modify the code so that it correctly uses a bias in all layers. 
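To make the idea concrete, here is a minimal sketch of a forward pass in which every layer has its own bias. It is not the class code: the function names and the weight layout (each row holds one unit's weights, with the bias stored as the last entry) are assumptions made for illustration.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, weights):
    # `weights` is a list with one matrix per layer. Each row holds the
    # weights of one unit, and the LAST entry of the row is that unit's bias.
    activations = list(x)
    for W in weights:
        # Append a constant 1 so the bias is applied at every layer,
        # not only at the input layer.
        extended = activations + [1.0]
        activations = [
            logistic(sum(w * a for w, a in zip(row, extended))) for row in W
        ]
    return activations


# Toy network: 2 inputs -> 2 hidden units -> 1 output, all weights zero.
zero_net = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [[0.0, 0.0, 0.0]]]
output = forward([1.0, 2.0], zero_net)
```

The backward pass needs the matching change: the gradient for each layer's weights is computed against the extended activations, and the error propagated to the previous layer skips the bias column.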
 + 
 + 
 + 
 +===== Submission ===== 
 + 
 +Submit your report via Canvas.  Python code can be displayed in your report if it is short and helps the reader understand what you have done. The sample LaTeX document provided in assignment 1 shows how to display Python code.  Submit the Python code that was used to generate the results as a file called ''assignment5.py'' (you can split the code into several .py files; Canvas allows you to submit multiple files).  Typing  
 + 
 +<code> 
 +$ python assignment5.py 
 +</code> 
 +should generate all the tables/plots used in your report.  
  
-===== Part 2:  naive Bayes implementation ===== 
  
-Implement a naive Bayes classifier for either categorical or continuous data.  Compare its performance to that of an SVM (make sure to perform proper model selection for classifier parameters using internal cross-validation). ​ Use two UCI repository datasets for this task.  There are several datasets that have categorical data: e.g. [[http://​archive.ics.uci.edu/​ml/​datasets/​Nursery | nursery school application ranking]], [[http://​archive.ics.uci.edu/​ml/​datasets/​Adult | census income prediction]],​ and [[http://​archive.ics.uci.edu/​ml/​datasets/​Molecular+Biology+(Splice-junction+Gene+Sequences)| splice junction detection]]. ​ If you are implementing naive Bayes for categorical data, make sure to include pseudo-counts to avoid over fitting. 
  
  
 ===== Grading ===== ===== Grading =====
  
-Here is what the grading sheet will look like for this assignment.  ​A few general guidelines for this and future assignments in the course:+A few general guidelines for this and future assignments in the course:
  
-  * Always provide a description of the method you used to produce a given result in sufficient detail such that the reader can reproduce your results on the basis of the description. ​ You can use a few lines of python code or pseudo-code.  If your code is more than a few lines, you can include it as an appendix to your report. ​ For example, for the first part of the assignment, provide the protocol you use to evaluate classifier accuracy.+  ​* Your answers should be concise and to the point. ​  
 +  * You need to use LaTeX to write the report. 
 +  * The report should be well structured and clearly written, with good grammar and correct spelling, and with good formatting of math, code, figures and captions (every figure and table needs to have a caption that explains what is being shown). 
 +  * Whenever you use information from the web or published papers, a reference should be provided. ​ Failure to do so is considered plagiarism. 
 +  ​* Always provide a description of the method you used to produce a given result in sufficient detail such that the reader can reproduce your results on the basis of the description. ​ You can use a few lines of python code or pseudo-code.
   * You can provide results in the form of tables, figures or text - whatever form is most appropriate for a given problem. ​ There are no rules about how much space each answer should take.  BUT we will take off points if we have to wade through a lot of redundant data.   * You can provide results in the form of tables, figures or text - whatever form is most appropriate for a given problem. ​ There are no rules about how much space each answer should take.  BUT we will take off points if we have to wade through a lot of redundant data.
-  * In any machine learning paper there is a discussion of the results. ​ There is a similar expectation from your assignments that you reason about your results. ​ For example, for the learning curve problem, what can you say on the basis of the observed learning curve?+  * In any machine learning paper there is a discussion of the results. ​ There is a similar expectation from your assignments that you reason about your results. 
 + 
 +We will take off points if these guidelines are not followed.
  
 <​code>​ <​code>​
-Grading sheet for assignment 5+Grading sheet for assignment ​
 + 
 +Part 1:  40 points. 
 +points): ​ Primal SVM formulation is correct 
 +(10 points): ​ Lagrangian found correctly 
 +(10 points): ​ Derivation of saddle point equations 
 +(15 points): ​ Derivation of the dual 
 + 
 +Part 2:  15 points.
  
-Part 1:  ​30 points. +Part 2:  ​15 points.
-(10 points): ​ 1st question +
-(10 points): ​ 1st question +
-(10 points): ​ 1st question+
  
-Part 2:  ​50 points. +Part 3:  ​30 points. 
-(10 points):  ​Experimental protocol +(15 points):  ​Accuracy as a function of parameters and discussion of the results 
-(20 points): ​ Correct classifier implementation +(10 points):  ​Comparison of normalized and non-normalized kernels and correct model selection 
-(10 points):  ​Results for the two classifiers on both datasets +points):  ​Visualization ​of the kernel matrix and observations made about it
-(10 points):  ​Discussion ​of the results+
  
-Report structure, grammar and spelling: ​ 10 points 
-( 3 points): ​ Heading and subheading structure easy to follow and 
-              clearly divides report into logical sections. 
-( 4 points): ​ Code, math, figure captions, and all other aspects of  ​ 
-              report are well-written and formatted. 
-( 3 points): ​ Grammar, spelling, and punctuation. 
 </​code>​ </​code>​
assignments/assignment5.txt · Last modified: 2016/10/18 09:18 by asa