Due: November 17th at 6pm
Implement a naive Bayes classifier for either categorical or continuous data. Compare its performance to that of an SVM (make sure to perform proper model selection for the classifier parameters using internal cross-validation). Use two UCI repository datasets for this task. Several UCI datasets have categorical data, e.g. nursery school application ranking, census income prediction, and splice junction detection. If you are implementing naive Bayes for categorical data, make sure to include pseudo-counts to avoid overfitting (for example, a zero probability estimate for a feature value that never occurs with a given class in the training data).
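To illustrate what pseudo-counts do in the categorical case, here is a minimal sketch in plain Python. It is not a reference solution: the class name `CategoricalNaiveBayes`, the parameter name `alpha`, and the toy data are all assumptions made for this example, and the smoothing used is ordinary additive (Laplace) smoothing over the feature values seen in training.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Sketch of naive Bayes for categorical features with pseudo-counts."""

    def __init__(self, alpha=1.0):
        # alpha is the pseudo-count added to every (feature value, class) cell.
        # The name and default are assumptions of this sketch; alpha must be > 0,
        # otherwise unseen combinations would lead to log(0).
        self.alpha = alpha

    def fit(self, X, y):
        n_features = len(X[0])
        self.classes_ = sorted(set(y))
        self.class_counts_ = Counter(y)
        self.n_samples_ = len(y)
        # Distinct values seen per feature, used to size the smoothing term.
        self.values_ = [set(row[j] for row in X) for j in range(n_features)]
        # counts_[j][c][v] = number of training rows of class c with feature j == v
        self.counts_ = [defaultdict(Counter) for _ in range(n_features)]
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts_[j][label][v] += 1
        return self

    def log_joint(self, row):
        """Return log P(c) + sum_j log P(x_j | c) for each class c."""
        scores = {}
        for c in self.classes_:
            score = math.log(self.class_counts_[c] / self.n_samples_)
            for j, v in enumerate(row):
                num = self.counts_[j][c][v] + self.alpha
                den = self.class_counts_[c] + self.alpha * len(self.values_[j])
                score += math.log(num / den)
            scores[c] = score
        return scores

    def predict(self, X):
        # Pick the class with the largest log joint score for each row.
        return [max(self.log_joint(row).items(), key=lambda kv: kv[1])[0]
                for row in X]


# Toy data (made up for illustration): two categorical features per row.
X_train = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
y_train = ["no", "no", "yes", "yes"]

model = CategoricalNaiveBayes(alpha=1.0).fit(X_train, y_train)
predictions = model.predict([("sunny", "cool"), ("rainy", "hot")])
```

In the toy data, "cool" never occurs with class "no" and "sunny" never occurs with class "yes". Without pseudo-counts both classes would receive probability zero for the query ("sunny", "cool"); with `alpha=1.0` every class keeps a finite score and the classes can still be compared.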
Here is what the grading sheet will look like for this assignment. A few general guidelines for this and future assignments in the course:
Grading sheet for assignment 5

Part 1: 40 points.
  (14 points): 1st question
  (13 points): 2nd question
  (13 points): 3rd question

Part 2: 50 points.
  (10 points): Experimental protocol
  (20 points): Correct classifier implementation
  (10 points): Results for the two classifiers on both datasets
  (10 points): Discussion of the results

Report structure, grammar and spelling: 10 points
  (3 points): Heading and subheading structure easy to follow and clearly divides report into logical sections.
  (4 points): Code, math, figure captions, and all other aspects of report are well-written and formatted.
  (3 points): Grammar, spelling, and punctuation.