medacy.model.model module

A medaCy named entity recognition model wraps together three functionalities

class medacy.model.model.Model(medacy_pipeline=None, model=None, n_jobs=4)[source]

Bases: object

_extract_features(data_file, medacy_pipeline, is_metamapped)[source]

A multi-processed method for extracting features from a given DataFile instance.

Parameters
  • conn – pipe to pass data back to the parent process

  • data_file – an instance of DataFile

Returns
  Updates the queue with features for the given file.

cross_validate(num_folds=10)[source]

Performs stratified k-fold cross-validation using our model and pipeline.

Parameters
  • num_folds – number of folds to split the training data into for cross-validation

Returns
  Prints performance metrics.
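As a rough illustration of what a stratified k-fold split does, the following standard-library sketch deals item indices into folds so that each fold receives a similar share of every label. It is not medaCy's implementation: the helper names stratified_folds and train_test_splits are hypothetical, and it assumes one label per item, whereas a sequence-labelling model has to stratify over whole sequences.

```python
from collections import defaultdict


def stratified_folds(labels, num_folds=10):
    """Split item indices into num_folds folds with similar label proportions."""
    if num_folds < 2:
        raise ValueError("num_folds must be at least 2")

    # Group item indices by their label (assumption: one label per item).
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds = [[] for _ in range(num_folds)]
    # Deal the indices out round-robin, label by label, so that the count of
    # any one label differs by at most one between folds.
    position = 0
    for indices in by_label.values():
        for index in indices:
            folds[position % num_folds].append(index)
            position += 1
    return folds


def train_test_splits(labels, num_folds=10):
    """Yield one (train_indices, test_indices) pair per fold."""
    folds = stratified_folds(labels, num_folds)
    for held_out in range(num_folds):
        test = sorted(folds[held_out])
        train = sorted(
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        )
        yield train, test
```

Each item is held out exactly once, so metrics averaged over the folds cover the whole training set.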

dump(path)[source]

Dumps a model into a pickle file.

Parameters
  • path – directory path to dump the model

fit(dataset)[source]

Runs dataset through the designated pipeline, extracts features, and fits a conditional random field.

Parameters
  • dataset – an instance of Dataset

Returns
  model – a trained instance of a sklearn_crfsuite.CRF model

get_info(return_dict=False)[source]

Retrieves information about a Model, including details about the feature extraction pipeline, the features utilized, and the learning model.

Parameters
  • return_dict – if True, returns a raw dictionary of information instead of a formatted string

Returns
  Structured information about the model.

load(path)[source]

Loads a pickled model.

Parameters
  • path – file path of the pickled model to load
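The dump and load descriptions above both refer to pickle files. The sketch below shows the generic pickle round trip with Python's standard library. The helpers dump_model and load_model and the dictionary standing in for a fitted model are hypothetical; how medaCy itself lays out the pickled file is not specified here.

```python
import os
import pickle
import tempfile


def dump_model(model, path):
    """Serialize any picklable object to the file at path."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Read back an object previously written by dump_model."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# A plain dictionary stands in for a fitted model (assumption for illustration).
stand_in_model = {"algorithm": "crf", "weights": [0.25, -1.5]}

with tempfile.TemporaryDirectory() as directory:
    model_path = os.path.join(directory, "model.pkl")
    dump_model(stand_in_model, model_path)
    restored_model = load_model(model_path)
```

Unpickling can execute arbitrary code, so pickled models should only be loaded from sources you trust.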

static load_external(package_name)[source]

Loads an external medaCy-compatible Model. Requires the model's package to be installed. Alternatively, you can import the package directly and call its .load() method.

Parameters
  • package_name – the package name of the model

Returns
  An instance of Model that is configured and loaded, ready for prediction.
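The "import the package and call its load method" route described above can be pictured with importlib. This is only a sketch of the general pattern, not medaCy's code: load_from_package is a hypothetical helper, and hypothetical_medacy_model is a fake in-memory package registered so that the example runs without any real model installed.

```python
import importlib
import sys
import types


def load_from_package(package_name):
    """Import package_name and return whatever its load() function returns."""
    package = importlib.import_module(package_name)
    if not hasattr(package, "load"):
        raise AttributeError(f"{package_name} does not expose a load() function")
    return package.load()


# Register a fake package in sys.modules (assumption: a real model package
# would be installed with pip and expose a module-level load()).
fake_package = types.ModuleType("hypothetical_medacy_model")
fake_package.load = lambda: "loaded stand-in model"
sys.modules[fake_package.__name__] = fake_package

loaded = load_from_package("hypothetical_medacy_model")
```

If the named package is not installed, importlib.import_module raises ModuleNotFoundError, which matches the requirement that the model's package be installed first.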

predict(dataset, prediction_directory=None)[source]
Parameters
  • dataset – a string or Dataset to predict

  • prediction_directory – the directory to write predictions if doing bulk prediction (default: /prediction sub-directory of Dataset)

Returns