Implementing K-Fold Cross Validation with Python Libraries


k-fold cross validation

Example 1

The simplest way to run cross validation is to call cross_val_score directly.
But first, let's look at how KFold itself partitions the data: X contains four samples and is split into 2 folds. In each output line, the last array is the test set and the one before it is the training set; each line corresponds to one fold:

import numpy as np
from sklearn.model_selection import KFold

X = ["a", "b", "c", "d"]
kf = KFold(n_splits=2)
for train, test in kf.split(X):
 print("%s %s" % (train, test))
[2 3] [0 1]
[0 1] [2 3]
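
As mentioned above, the simplest approach is to call cross_val_score, which wraps this splitting together with model fitting and scoring in one call. A minimal sketch, assuming the iris dataset and a LogisticRegression classifier purely for illustration:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)

# cv=5 runs 5-fold cross validation and returns one accuracy score per fold
scores = cross_val_score(clf, X, y, cv=5)
print(scores)
print(scores.mean())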

Example 2

# The following code demonstrates how K-fold cross validation splits the data
# simulate splitting a dataset of 25 observations into 5 folds
# (the old sklearn.cross_validation module has been removed; KFold now lives in sklearn.model_selection)
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(25)
kf = KFold(n_splits=5, shuffle=False)

# print the contents of each training and testing set
print('{} {:^61} {}'.format('Iteration', 'Training set observations', 'Testing set observations'))
for iteration, (train, test) in enumerate(kf.split(X), start=1):
    print('{:^9} {} {:^25}'.format(iteration, str(train), str(test)))
Iteration                   Training set observations                   Testing set observations
    1     [ 5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]        [0 1 2 3 4]       
    2     [ 0  1  2  3  4 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]        [5 6 7 8 9]       
    3     [ 0  1  2  3  4  5  6  7  8  9 15 16 17 18 19 20 21 22 23 24]     [10 11 12 13 14]     
    4     [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 20 21 22 23 24]     [15 16 17 18 19]     
    5     [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]     [20 21 22 23 24]
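
With shuffle=False, each test fold is a contiguous block of indices, as shown above. Setting shuffle=True shuffles the observations before splitting. A short sketch; random_state is set only to make the split reproducible, and the exact fold contents depend on that seed:

import numpy as np
from sklearn.model_selection import KFold

X = np.arange(25)
kf = KFold(n_splits=5, shuffle=True, random_state=1)
for train, test in kf.split(X):
    # each test fold is now a random subset of 5 indices, not a contiguous block
    print(test)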

LeaveOneOut

Using a four-sample X again, let's see what LeaveOneOut produces: it effectively splits the data into 4 folds. In each output line, the last set is the test set and contains a single element, the one before it is the training set; each line corresponds to one fold:

from sklearn.model_selection import LeaveOneOut

X = [1, 2, 3, 4]
loo = LeaveOneOut()
for train, test in loo.split(X):
 print("%s %s" % (train, test))
[1 2 3] [0]
[0 2 3] [1]
[0 1 3] [2]
[0 1 2] [3]
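
As the description above suggests, on n samples LeaveOneOut yields the same splits as KFold with n_splits equal to n. A quick sketch for the same four-sample X:

from sklearn.model_selection import KFold

X = [1, 2, 3, 4]
kf = KFold(n_splits=4)
for train, test in kf.split(X):
    # prints the same four train/test pairs as the LeaveOneOut output above
    print("%s %s" % (train, test))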
