This shows you the differences between two versions of the page.
code:model_selection [2015/10/05 13:49] asa |
code:model_selection [2015/10/05 13:56] asa |
||
---|---|---|---|
Line 38: | Line 38: | ||
<code python> | <code python> | ||
- | In [12]: scores = cross_validation.cross_val_score(classifier, X, y, cv=5, scoring='accuracy') | + | In [18]: cross_validation.cross_val_score(classifier, X, y, cv=5, scoring='accuracy') |
+ | Out[18]: array([ 0.7962963 , 0.83333333, 0.88888889, 0.83333333, 0.83333333]) | ||
- | In [13]: | + | In [19]: |
- | In [13]: scores = cross_validation.cross_val_score(classifier, X, y, cv=5, scoring='roc_auc') | + | In [19]: # you can obtain scores for other metrics, such as area under the ROC curve: |
- | In [14]: # you can also obtain the predictions by cross-validation and then compute the accuracy: | + | In [20]: cross_validation.cross_val_score(classifier, X, y, cv=5, scoring='roc_auc') |
+ | Out[20]: array([ 0.89166667, 0.89166667, 0.95833333, 0.87638889, 0.91388889]) | ||
- | In [15]: y_predict = cross_validation.cross_val_predict(classifier, X, y, cv=5) | + | In [21]: |
+ | |||
+ | In [21]: # you can also obtain the predictions by cross-validation and then compute the accuracy: | ||
+ | |||
+ | In [22]: y_predict = cross_validation.cross_val_predict(classifier, X, y, cv=5) | ||
+ | |||
+ | In [23]: metrics.accuracy_score(y, y_predict) | ||
+ | Out[23]: 0.83703703703703702 | ||
- | In [16]: metrics.accuracy_score(y, y_predict) | ||
- | Out[16]: 0.83703703703703702 | ||
</code> | </code> | ||
+ | Here's an alternative way of doing cross-validation. | ||
+ | |||
+ | <code python> | ||
+ | In [25]: # first divide the data into folds: | ||
+ | |||
+ | In [26]: cv = cross_validation.StratifiedKFold(y, 5) | ||
+ | |||
+ | In [27]: # now use these folds: | ||
+ | |||
+ | In [28]: print cross_validation.cross_val_score(classifier, X, y, cv=cv, scoring='roc_auc') | ||
+ | [ 0.89166667 0.89166667 0.95833333 0.87638889 0.91388889] | ||
+ | |||
+ | In [29]: | ||
+ | |||
+ | In [29]: # you can see how examples were divided into folds by looking at the test_folds attribute: | ||
+ | |||
+ | In [30]: print cv.test_folds | ||
+ | [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | ||
+ | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | ||
+ | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 | ||
+ | 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 | ||
+ | 2 2 2 2 2 2 2 2 2 2 2 2 3 3 2 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 | ||
+ | 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 | ||
+ | 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 | ||
+ | 4 4 4 4 4 4 4 4 4 4 4] | ||
+ | |||
+ | </code> | ||
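To make the fold listing above easier to read, here is a minimal pure-Python sketch of what a stratified assignment does: each class is cut into contiguous chunks of roughly equal size, one chunk per fold, so every fold gets about the same class proportions. The function name `stratified_test_folds` and the toy labels are made up for illustration, and this is a simplified stand-in, not scikit-learn's actual implementation.

```python
from collections import defaultdict

def stratified_test_folds(y, n_folds):
    """Return one test-fold id per example, spreading each class evenly over folds.

    Simplified illustration only; not scikit-learn's implementation.
    """
    # Group example indices by class label, keeping their original order.
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    test_folds = [None] * len(y)
    for indices in by_class.values():
        n = len(indices)
        for pos, i in enumerate(indices):
            # The example at position pos (out of n) within its class
            # goes to the contiguous chunk numbered pos * n_folds // n.
            test_folds[i] = pos * n_folds // n
    return test_folds

# Hypothetical labels: six examples of class 0 and four of class 1.
labels = [0, 0, 0, 0, 1, 1, 0, 0, 1, 1]
folds = stratified_test_folds(labels, 2)
print(folds)
```

Because fold ids are assigned per class rather than over the whole array, the ids can interleave wherever the classes interleave in `y`, which may be why the listing above is not strictly sorted.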