medacy.pipeline_components.tokenization.clinical_tokenizer module

class medacy.pipeline_components.tokenization.clinical_tokenizer.ClinicalTokenizer(nlp)

Bases: object

A tokenizer for clinical text

_get_infix_regex()

Custom infix tokenization rules.

_get_prefix_regex()

Custom prefix tokenization rules.

_get_suffix_regex()

Custom suffix tokenization rules.
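The actual patterns behind these three methods are not documented here. As a rough illustration of how prefix, suffix, and infix rules usually cooperate in a rule-based tokenizer, the following standard-library sketch may help. The patterns and the names PREFIX_RE, SUFFIX_RE, INFIX_RE, split_chunk, and tokenize are hypothetical and are not medaCy's implementation.

```python
import re

# Hypothetical patterns for illustration only; medaCy's real rules may differ.
PREFIX_RE = re.compile(r"""^[\[\("']""")        # one opening bracket or quote
SUFFIX_RE = re.compile(r"""[\]\)"',.;:]$""")    # one closing bracket, quote, or punctuation mark
INFIX_RE = re.compile(r"[/\-]")                 # slash or hyphen inside a chunk


def split_chunk(chunk):
    """Split one whitespace-delimited chunk using prefix, suffix, and infix rules."""
    prefixes, suffixes = [], []

    # Peel characters off the front for as long as the prefix rule matches.
    while chunk:
        match = PREFIX_RE.search(chunk)
        if match is None:
            break
        prefixes.append(match.group())
        chunk = chunk[match.end():]

    # Peel characters off the end for as long as the suffix rule matches.
    while chunk:
        match = SUFFIX_RE.search(chunk)
        if match is None:
            break
        suffixes.insert(0, match.group())
        chunk = chunk[:match.start()]

    # Split the remainder at every infix match, keeping each infix as a token.
    middle, pos = [], 0
    for match in INFIX_RE.finditer(chunk):
        if match.start() > pos:
            middle.append(chunk[pos:match.start()])
        middle.append(match.group())
        pos = match.end()
    if pos < len(chunk):
        middle.append(chunk[pos:])

    return prefixes + middle + suffixes


def tokenize(text):
    """Tokenize text by splitting on whitespace, then applying the three rules."""
    tokens = []
    for chunk in text.split():
        tokens.extend(split_chunk(chunk))
    return tokens
```

In this sketch the prefix and suffix rules are applied repeatedly from the outside in, and the infix rule only sees what is left in the middle, so a chunk such as "(aspirin/ibuprofen)," separates into brackets, drug names, the slash, and the trailing comma.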

add_exceptions(exceptions)

Adds exceptions for the tokenizer to ignore.

Parameters:
    exceptions: an array of terms that the tokenizer should not split during tokenization
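To show why such exceptions matter for clinical text, here is a minimal standard-library sketch of the idea. ToyTokenizer and its splitting rule are hypothetical stand-ins, not medaCy's ClinicalTokenizer; only the method name add_exceptions and its exceptions parameter are taken from the entry above.

```python
import re


class ToyTokenizer:
    """Hypothetical stand-in used only to illustrate tokenizer exceptions."""

    def __init__(self):
        # Terms that must survive tokenization as single tokens.
        self._exceptions = set()

    def add_exceptions(self, exceptions):
        """Register terms that should not be split during tokenization."""
        self._exceptions.update(exceptions)

    def tokenize(self, text):
        tokens = []
        for chunk in text.split():
            if chunk in self._exceptions:
                # Protected term: emit it unchanged.
                tokens.append(chunk)
            else:
                # Assumed rule: split on periods, slashes, and hyphens,
                # keeping each separator as its own token.
                tokens.extend(t for t in re.split(r"([./\-])", chunk) if t)
        return tokens


tokenizer = ToyTokenizer()
before = tokenizer.tokenize("q.i.d. dosing")
tokenizer.add_exceptions(["q.i.d."])
after = tokenizer.tokenize("q.i.d. dosing")
```

Without the exception, the abbreviation "q.i.d." is broken into letters and periods; once registered, it is kept as a single token while other text is still split by the usual rule.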