module onnxrt.ops_cpu.op_tokenizer#
Short summary#
module mlprodict.onnxrt.ops_cpu.op_tokenizer
Runtime operator.
Classes#

class | truncated documentation
---|---
Tokenizer | See Tokenizer.
TokenizerSchema | Defines a schema for operators added in this package such as TreeEnsembleClassifierDouble.
Properties#

property | truncated documentation
---|---
 | Returns the list of arguments as well as the list of parameters with the default values (close to the signature). …
 | Returns the list of modified parameters.
 | Returns the list of optional arguments.
 | Returns the list of optional arguments.
 | Returns all parameters in a dictionary.
Methods#

method | truncated documentation
---|---
_run_char_tokenization | Tokenizes by characters.
_run_regex_tokenization | Tokenizes using separators. The function should use a trie to find text.
_run_sep_tokenization | Tokenizes using separators. The function should use a trie to find text.
_run_tokenization | Tokenizes at char level.
Documentation#
Runtime operator.
- class mlprodict.onnxrt.ops_cpu.op_tokenizer.Tokenizer(onnx_node, desc=None, **options)#
Bases: OpRunUnary
See Tokenizer.
- __init__(onnx_node, desc=None, **options)#
- _find_custom_operator_schema(op_name)#
- _run(text, attributes=None, verbose=0, fLOG=None)#
Should be overwritten.
- _run_char_tokenization(text, stops)#
Tokenizes by characters.
- _run_regex_tokenization(text, stops, exp)#
Tokenizes using separators. The function should use a trie to find text.
- _run_sep_tokenization(text, stops, separators)#
Tokenizes using separators. The function should use a trie to find text.
- _run_tokenization(text, stops, split)#
Tokenizes at char level.
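The char-level mode conceptually splits every input string into single-character tokens and drops any character listed among the stop words. A minimal pure-Python sketch of that behaviour (the function name `char_tokenize` and the `stops` argument here are illustrative, not part of the mlprodict API):

```python
def char_tokenize(text, stops):
    # Split each string into single-character tokens,
    # skipping any character that appears in `stops`.
    return [[c for c in s if c not in stops] for s in text]

tokens = char_tokenize(["ab#c", "de"], stops={"#"})
# tokens -> [['a', 'b', 'c'], ['d', 'e']]
```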
- class mlprodict.onnxrt.ops_cpu.op_tokenizer.TokenizerSchema#
Bases: OperatorSchema
Defines a schema for operators added in this package such as TreeEnsembleClassifierDouble.
- __init__()#
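The separator-based mode is documented as ideally using a trie to match separators in the text. The sketch below illustrates that idea with a nested-dict trie and greedy longest-match splitting; it is a simplified illustration under those assumptions, not the mlprodict implementation, and the names `build_trie` and `sep_tokenize` are invented here:

```python
def build_trie(separators):
    # Nested-dict trie; the empty-string key marks the end of a separator.
    trie = {}
    for sep in separators:
        node = trie
        for ch in sep:
            node = node.setdefault(ch, {})
        node[""] = True
    return trie

def sep_tokenize(text, separators):
    trie = build_trie(separators)
    tokens, current, i = [], [], 0
    while i < len(text):
        # Greedily look for the longest separator starting at position i.
        node, j, last = trie, i, -1
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if "" in node:
                last = j
        if last >= 0:
            # A separator matched: flush the current token and skip it.
            if current:
                tokens.append("".join(current))
                current = []
            i = last
        else:
            current.append(text[i])
            i += 1
    if current:
        tokens.append("".join(current))
    return tokens

print(sep_tokenize("a;;b;c", [";;", ";"]))  # ['a', 'b', 'c']
```

The trie lets overlapping separators such as `";"` and `";;"` be resolved in a single left-to-right pass, always preferring the longest match.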