health_multimodal.text.data.io

Classes

TextInput(tokenizer)

Text input class that can be used for inference and deployment.

class health_multimodal.text.data.io.TextInput(tokenizer)[source]

Text input class that can be used for inference and deployment.

Implements tokenizer-related operations and ensures that input strings conform to the standards expected by a BERT model.

Parameters

tokenizer (BertTokenizer) – A BertTokenizer object.

assert_special_tokens_not_present(prompt)[source]

Check whether the input prompt contains special tokens.

Return type

None
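The check described above can be sketched without the library. This is an illustrative stand-in, not the package's implementation: the names `BERT_SPECIAL_TOKENS`, `find_special_tokens` and `check_prompt_is_clean` are hypothetical, the token list is hard-coded (a real BertTokenizer exposes its own through `all_special_tokens`), and raising `ValueError` on a match is an assumption based on the method name and its `None` return type.

```python
from typing import List

# Hypothetical stand-in for the tokenizer's special tokens; a real
# BertTokenizer reports its own list via `all_special_tokens`.
BERT_SPECIAL_TOKENS: List[str] = ["[UNK]", "[SEP]", "[PAD]", "[CLS]", "[MASK]"]


def find_special_tokens(prompt: str, special_tokens: List[str]) -> List[str]:
    """Return the special tokens that occur as substrings of `prompt`."""
    return [token for token in special_tokens if token in prompt]


def check_prompt_is_clean(prompt: str, special_tokens: List[str]) -> None:
    """Raise ValueError if `prompt` contains any of `special_tokens`.

    Assumption: the documented method signals a violation by raising;
    the reference page only states that it returns None.
    """
    found = find_special_tokens(prompt, special_tokens)
    if found:
        raise ValueError(f"Prompt {prompt!r} contains special tokens: {found}")


# A plain sentence passes silently.
check_prompt_is_clean("No pleural effusion.", BERT_SPECIAL_TOKENS)
```

Running such a check before tokenization keeps user text from injecting markers like [CLS] or [SEP], which the tokenizer is expected to add itself.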

tokenize_input_prompts(prompts, verbose)[source]

Tokenizes the input sentence(s) and adds special tokens as defined by the tokenizer.

Parameters

prompts (Union[str, List[str]]) – Either a string containing a single sentence, or a list of strings each containing a single sentence. Note that this method will not correctly tokenize multiple sentences if they are input as a single string.

verbose (bool) – If set to True, will log the sentence after tokenization.

Return type

Any

Returns

A 2D tensor containing the tokenized sentences
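To make the shape of the result concrete, here is a toy sketch of how either a single string or a list of strings can become a rectangular (2D) batch. It is not the library's code: `to_prompt_list` and `toy_tokenize_batch` are hypothetical names, whitespace splitting stands in for BERT's WordPiece tokenizer, and rows hold token strings rather than the integer ids a real tokenizer would return.

```python
from typing import List, Union


def to_prompt_list(prompts: Union[str, List[str]]) -> List[str]:
    """Wrap a single sentence in a list so both input forms are handled alike."""
    return [prompts] if isinstance(prompts, str) else list(prompts)


def toy_tokenize_batch(prompts: Union[str, List[str]]) -> List[List[str]]:
    """Build one row per sentence: [CLS] tokens... [SEP], padded to equal width.

    Assumption: whitespace splitting replaces WordPiece, purely for illustration.
    """
    rows = [["[CLS]", *sentence.split(), "[SEP]"] for sentence in to_prompt_list(prompts)]
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad shorter rows so every row has the same length (a 2D layout).
    return [row + ["[PAD]"] * (width - len(row)) for row in rows]


batch = toy_tokenize_batch(["No pneumothorax", "Heart size is normal"])
for row in batch:
    print(row)
```

Because each list element is treated as exactly one sentence, a string that holds several sentences would still produce a single row with one [CLS] and one [SEP], which is why multiple sentences should be passed as separate list items.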