Nested CV: estimate performance AFTER tuning
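A GridSearchCV score (best_score_) is optimistic because the hyperparameters were chosen to maximize it on those very folds. In nested CV, the outer loop scores the full procedure, tuning included, on data the search never saw, which gives an approximately unbiased estimate of its performance.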
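To make the source of the optimism concrete, here is a minimal simulation sketch using only the standard library. It is not scikit-learn code, and every constant in it (TRUE_AUC, NOISE_SD, N_CANDIDATES, N_REPEATS) is a hypothetical value chosen for illustration. It assumes all candidates are equally good and that each cross-validated score is the true value plus Gaussian noise, then compares the average of the best score with the average score of a candidate fixed in advance.

```python
import random
import statistics

random.seed(0)

TRUE_AUC = 0.80     # hypothetical: every candidate has the same true AUC
NOISE_SD = 0.02     # hypothetical noise of a cross-validated estimate
N_CANDIDATES = 9    # e.g. a 3 x 3 hyperparameter grid
N_REPEATS = 2000    # number of simulated grid searches

best_scores = []
fixed_scores = []
for _ in range(N_REPEATS):
    # One noisy CV estimate per candidate configuration
    estimates = [random.gauss(TRUE_AUC, NOISE_SD) for _ in range(N_CANDIDATES)]
    best_scores.append(max(estimates))   # analogous to reporting the best score
    fixed_scores.append(estimates[0])    # a candidate chosen before seeing scores

mean_best = statistics.mean(best_scores)
mean_fixed = statistics.mean(fixed_scores)
print(f"mean best-of-{N_CANDIDATES} score: {mean_best:.3f}")
print(f"mean fixed-candidate score: {mean_fixed:.3f}")
```

Taking the maximum over noisy estimates pushes the reported score above the true value even though no candidate is actually better. Scoring the selected model on data that played no part in the selection removes this effect, which is the role of the outer loop.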
Prerequisites
scikit-learn
Python
from sklearn.model_selection import GridSearchCV, cross_val_score, KFold
from sklearn.ensemble import RandomForestClassifier
# X, y: feature matrix and binary labels, assumed to be loaded already
inner = KFold(n_splits=3, shuffle=True, random_state=1)  # tuning
outer = KFold(n_splits=5, shuffle=True, random_state=2)  # evaluation
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    {"max_depth": [4, 8, None], "min_samples_leaf": [1, 5, 20]},
    cv=inner, scoring="roc_auc", n_jobs=-1,
)
# The outer loop serves ONLY to evaluate the procedure (tuning included)
scores = cross_val_score(search, X, y, cv=outer, scoring="roc_auc")
print(f"Nested AUC (unbiased): {scores.mean():.3f} +/- {scores.std():.3f}")
# The inner score, by contrast, is typically higher (optimistic):
search.fit(X, y)
print(f"Inner tuning AUC: {search.best_score_:.3f}")
Result
Nested AUC (unbiased): 0.842 +/- 0.018
Inner tuning AUC: 0.871
>>> search.best_params_
{'max_depth': 8, 'min_samples_leaf': 5}
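For readers who want to see what the outer loop does, the sketch below writes it out by hand. It is roughly what cross_val_score(search, X, y, cv=outer) does for this setup, not a replacement for it. The data is a synthetic make_classification set, and the forest size and grid are reduced to keep the run short; those choices are assumptions for illustration, not the settings behind the results above.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data (assumption: the original X, y are not shown)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

inner = KFold(n_splits=3, shuffle=True, random_state=1)  # tuning
outer = KFold(n_splits=5, shuffle=True, random_state=2)  # evaluation
search = GridSearchCV(
    RandomForestClassifier(n_estimators=25, random_state=42),  # small forest for speed
    {"max_depth": [4, None], "min_samples_leaf": [1, 5]},      # reduced grid
    cv=inner, scoring="roc_auc",
)

fold_scores, fold_params = [], []
for train_idx, test_idx in outer.split(X, y):
    # Tune on the outer-training part only; the outer-test part stays unseen
    fold_search = clone(search)
    fold_search.fit(X[train_idx], y[train_idx])
    # Score the refitted best model on the held-out outer fold
    proba = fold_search.predict_proba(X[test_idx])[:, 1]
    fold_scores.append(roc_auc_score(y[test_idx], proba))
    fold_params.append(fold_search.best_params_)

for score, params in zip(fold_scores, fold_params):
    print(f"outer AUC {score:.3f} with {params}")
```

The selected hyperparameters may differ from one outer fold to the next. That is expected: nested CV evaluates the tuning procedure, not one fixed configuration, so the final model still comes from a single search.fit(X, y) on all the data.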