medacy.model.stratified_k_fold module

Partitions a data set of sequence labels and classifications into stratified folds (10 by default). See Dietterich, 1997, “Approximate Statistical Tests for Comparing Supervised Classification Algorithms” for an in-depth analysis.

Each partition should contain an evenly distributed representation of sequence labels. Without stratification, under-represented labels may not appear in some folds.
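The idea of stratifying sequences by label can be illustrated with a minimal sketch. This is not medaCy's implementation: the function name `stratified_sequence_folds`, the `ignore_label` parameter, and the example labels `Drug` and `Dose` are all hypothetical, and the greedy "rarest label first" strategy is only one simple way to approximate an even distribution.

```python
from collections import defaultdict


def stratified_sequence_folds(label_sequences, folds=10, ignore_label="O"):
    """Assign each sequence index to one of `folds` folds, rarest labels first.

    Hypothetical helper for illustration only; not part of medaCy.
    """
    # Map each label to the indices of the sequences that contain it.
    sequences_by_label = defaultdict(list)
    for index, labels in enumerate(label_sequences):
        for label in set(labels):
            if label != ignore_label:
                sequences_by_label[label].append(index)

    fold_members = [[] for _ in range(folds)]
    assigned = set()
    next_fold = 0

    # Deal out sequences round-robin, starting with the rarest label, so that
    # scarce labels are placed before common labels claim their sequences.
    rarest_first = sorted(
        sequences_by_label,
        key=lambda label: (len(sequences_by_label[label]), label),
    )
    for label in rarest_first:
        for index in sequences_by_label[label]:
            if index not in assigned:
                fold_members[next_fold % folds].append(index)
                assigned.add(index)
                next_fold += 1

    # Sequences without any label of interest are dealt out the same way.
    for index in range(len(label_sequences)):
        if index not in assigned:
            fold_members[next_fold % folds].append(index)
            assigned.add(index)
            next_fold += 1

    return fold_members


# Hypothetical label sequences; "O" marks tokens outside any entity.
example_sequences = [
    ["Drug", "O"],
    ["O", "Dose"],
    ["Drug"],
    ["O"],
    ["Dose", "Drug"],
    ["O", "O"],
]
example_folds = stratified_sequence_folds(example_sequences, folds=3)
```

Each fold can then serve once as the held-out test set while the remaining folds form the training set. With only two `Dose` sequences and three folds, one fold necessarily lacks that label, which is the situation stratification tries to minimize rather than eliminate.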

class medacy.model.stratified_k_fold.SequenceStratifiedKFold(folds=10)

Bases: object