Formulate a soft-margin SVM without the bias term, i.e. $f(\mathbf{x}) = \mathbf{w}^{\top} \mathbf{x}$. Derive the saddle-point conditions, the KKT conditions, and the dual problem. Compare it to the standard SVM formulation. What is the implication of the difference for the design of SMO-like algorithms? Recall that SMO algorithms work by iteratively optimizing two variables at a time. Hint: consider the difference in the constraints.
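As a starting point (a sketch in standard notation, not part of the required derivation; $C$ and $\xi_i$ denote the usual soft-margin trade-off parameter and slack variables), the bias-less primal can be written as

$$\min_{\mathbf{w},\,\boldsymbol{\xi}} \; \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.} \quad y_i\,\mathbf{w}^{\top}\mathbf{x}_i \ge 1 - \xi_i,\quad \xi_i \ge 0,$$

which differs from the standard primal only in the absence of the bias $b$ inside the margin constraints; note which stationarity condition of the Lagrangian therefore disappears.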
Discuss the merits of the bias-less formulation as the dimensionality of the data (or of the feature space) varies. When using this SVM formulation, it can be useful to add a constant to the kernel matrix. Explain why this can be beneficial.
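As a hint toward the last question, here is a quick numerical check (illustrative only; the data and the constant $c$ are arbitrary): adding a constant $c$ to a linear kernel matrix is the same as computing the kernel after appending the constant feature $\sqrt{c}$ to every sample.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # 5 arbitrary samples in R^3
c = 2.0                      # arbitrary positive constant

# Linear kernel on the original features
K = X @ X.T

# Append a constant feature sqrt(c) to every sample;
# the linear kernel of the augmented data equals K + c entrywise.
X_aug = np.hstack([X, np.full((len(X), 1), np.sqrt(c))])
K_aug = X_aug @ X_aug.T

assert np.allclose(K_aug, K + c)
```

The extra constant feature gives $\mathbf{w}$ one additional coordinate that plays the role of a (regularized) bias, which suggests why this trick helps the bias-less formulation.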