medacy.model.stratified_k_fold module
Partitions a data set of sequence labels and classifications into 10 stratified folds. See Dietterich (1997), “Approximate Statistical Tests for Comparing Supervised Classification Algorithms”, for an in-depth analysis.
Each partition should have an evenly distributed representation of sequence labels. Without stratification, under-represented labels may not appear in some folds.
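The idea can be illustrated with a minimal sketch. This is not medaCy's implementation. The function name `stratified_folds`, its parameters, and the "rarest label" grouping rule are all assumptions made for illustration. It assumes every sequence carries at least one label, groups each sequence under the rarest label it contains, and deals the groups round-robin across the folds so that scarce labels reach as many folds as possible.

```python
from collections import Counter, defaultdict


def stratified_folds(label_sequences, k=10):
    """Split sequence indices into k folds, spreading each label across folds.

    Hypothetical helper for illustration only; assumes every sequence
    contains at least one label.
    """
    # Count how many sequences contain each label.
    label_counts = Counter(
        label for seq in label_sequences for label in set(seq)
    )

    def rarity(label):
        # Sort key: rarest label first, ties broken alphabetically.
        return (label_counts[label], label)

    # Place each sequence in a single stratum: its rarest label.
    strata = defaultdict(list)
    for index, seq in enumerate(label_sequences):
        strata[min(set(seq), key=rarity)].append(index)

    # Deal the strata round-robin, rarest first. The position counter
    # carries over between strata, so fold sizes differ by at most one.
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(strata, key=rarity):
        for index in strata[label]:
            folds[position % k].append(index)
            position += 1
    return folds


# Four sequences mention a drug entity; six contain only the "O" label.
data = [["O", "B-DRUG"]] * 4 + [["O"]] * 6
folds = stratified_folds(data, k=2)
```

Each returned fold is a list of sequence indices that can serve as the held-out set for one round of cross-validation, with the remaining folds used for training. In the small example, each of the two folds receives two of the four drug-bearing sequences.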