First let's import some modules and read in some data:
In [1]: import numpy as np
In [2]: from sklearn import cross_validation
In [3]: from sklearn import svm
In [4]: from sklearn import metrics
In [5]: data = np.genfromtxt("../data/heart_scale.data", delimiter=",")
In [6]: X = data[:, 1:]
In [7]: y = data[:, 0]
The simplest form of model evaluation uses a validation/test set:
In [9]: X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.4, random_state=0)
In [10]: classifier = svm.SVC(kernel='linear', C=1).fit(X_train, y_train)
In [11]: classifier.score(X_test, y_test)
Out[11]: 0.7592592592592593
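To make the hold-out idea concrete, here is a minimal sketch of what a shuffled train/test split does, written in plain Python. The helper name `holdout_split` and the sample count of 270 are assumptions for illustration only; this is not how scikit-learn implements `train_test_split`, and the indices it picks will differ from the library's.

```python
import random


def holdout_split(n_samples, test_size=0.4, seed=0):
    """Return (train_indices, test_indices) for a shuffled hold-out split.

    Hypothetical helper for illustration; not scikit-learn's implementation.
    """
    indices = list(range(n_samples))
    # A seeded generator makes the shuffle reproducible, like random_state.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_size))
    # The first n_test shuffled indices form the test set, the rest train.
    return indices[n_test:], indices[:n_test]


# 270 is an assumed sample count, chosen only for this example.
train_idx, test_idx = holdout_split(270, test_size=0.4, seed=0)
print(len(train_idx), len(test_idx))
```

The model is fitted only on the training indices and scored only on the test indices, so the reported accuracy is measured on samples the classifier has never seen.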
Next, let's perform cross-validation:
In [12]: scores = cross_validation.cross_val_score(classifier, X, y, cv=5, scoring='accuracy')
In [13]: scores = cross_validation.cross_val_score(classifier, X, y, cv=5, scoring='roc_auc')
In [14]: # you can also obtain the predictions by cross-validation and then compute the accuracy:
In [15]: y_predict = cross_validation.cross_val_predict(classifier, X, y, cv=5)
In [16]: metrics.accuracy_score(y, y_predict)
Out[16]: 0.83703703703703702
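The following sketch shows, in plain Python, roughly what happens behind `cv=5`: the data is cut into five folds, each fold serves once as the test set, and either the per-fold scores or the pooled out-of-fold predictions are evaluated. The helpers `kfold_indices` and `majority_label`, the toy labels, and the majority-class "model" are all assumptions made for illustration. Note also that for classifiers scikit-learn uses stratified folds when `cv` is an integer, whereas this sketch uses plain contiguous folds.

```python
from collections import Counter


def kfold_indices(n_samples, n_folds=5):
    """Yield (train_indices, test_indices) for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds get one additional sample each.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


def majority_label(labels):
    """Stand-in 'model': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


# Toy labels, made up for this example.
y = [1, 1, -1, 1, 1, -1, 1, 1, -1, 1]

fold_scores = []
pooled = [None] * len(y)  # out-of-fold prediction for every sample
for train, test in kfold_indices(len(y), n_folds=5):
    guess = majority_label([y[i] for i in train])
    for i in test:
        pooled[i] = guess
    fold_scores.append(sum(y[i] == guess for i in test) / len(test))

# Accuracy over the pooled predictions, analogous to scoring the
# output of cross_val_predict with accuracy_score.
pooled_accuracy = sum(p == t for p, t in zip(pooled, y)) / len(y)
print(fold_scores, pooled_accuracy)
```

Every sample is predicted exactly once by a model that did not see it during fitting, which is why the pooled predictions can be scored against the full label vector. In scikit-learn 0.20 and later the `cross_validation` module no longer exists; `train_test_split`, `cross_val_score` and `cross_val_predict` are available from `sklearn.model_selection` instead.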