module onnxrt.ops_cpu.op_string_normalizer#

Inheritance diagram of mlprodict.onnxrt.ops_cpu.op_string_normalizer

Short summary#

module mlprodict.onnxrt.ops_cpu.op_string_normalizer

Runtime operator.

source on GitHub

Classes#

class

truncated documentation

StringNormalizer

The operator is not really thread-safe, as Python cannot work with two locales at the same time. Stop words should …

Properties#

property

truncated documentation

args_default

Returns the list of arguments as well as the list of parameters with the default values (close to the signature). …

args_default_modified

Returns the list of modified parameters.

args_mandatory

Returns the list of mandatory arguments.

args_optional

Returns the list of optional arguments.

atts_value

Returns all parameters in a dictionary.

Methods#

method

truncated documentation

__init__

_remove_stopwords

_run

Normalizes strings.

_run_column

Normalizes the strings in a column.

strip_accents_unicode

Transforms accented Unicode characters into their unaccented counterparts. Source: sklearn/feature_extraction/text.py. …

Documentation#

Runtime operator.


class mlprodict.onnxrt.ops_cpu.op_string_normalizer.StringNormalizer(onnx_node, desc=None, **options)#

Bases: OpRunUnary

The operator is not really thread-safe, as Python cannot work with two locales at the same time. Stop words should not be handled here, as tokenization usually happens after this step.
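The locale caveat comes from the fact that Python's locale setting is process-wide: changing it in one thread changes it for every thread. The sketch below is not taken from this class; it is one possible way for calling code to serialize locale changes. The names `temporary_locale` and `_LOCALE_LOCK` are hypothetical, and the approach assumes that every piece of code touching the locale goes through the same lock.

```python
import locale
import threading
from contextlib import contextmanager

# Hypothetical module-level lock: only one thread may change the locale at a time.
_LOCALE_LOCK = threading.Lock()


@contextmanager
def temporary_locale(name, category=locale.LC_CTYPE):
    """Hypothetical helper: switch the process-wide locale, then restore it."""
    with _LOCALE_LOCK:
        # Calling setlocale without a new value returns the current setting.
        previous = locale.setlocale(category)
        try:
            locale.setlocale(category, name)
            yield
        finally:
            # Restore the previous setting even if the body raised.
            locale.setlocale(category, previous)


# Usage: the "C" locale is always available.
with temporary_locale("C"):
    text = "Some Text".lower()
```

The lock only protects code that uses this helper; any other thread calling `locale.setlocale` directly can still interfere.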


__init__(onnx_node, desc=None, **options)#
_remove_stopwords(text, stops)#
_run(x, attributes=None, verbose=0, fLOG=None)#

Normalizes strings.


_run_column(cin, cout)#

Normalizes the strings in a column.

strip_accents_unicode(s)#

Transforms accented Unicode characters into their unaccented counterparts. Source: sklearn/feature_extraction/text.py.

Parameters:

s – string, the string to strip

Returns:

the cleaned string
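A minimal sketch of the technique, modeled on the scikit-learn approach this method cites: decompose each character with NFKD normalization, then drop the combining marks. The function name `strip_accents_sketch` is hypothetical, and the method's actual code may differ in detail.

```python
import unicodedata


def strip_accents_sketch(s):
    """Hypothetical helper: remove accents by dropping combining marks."""
    try:
        # Fast path: a pure-ASCII string contains no accents.
        s.encode("ascii")
        return s
    except UnicodeEncodeError:
        # NFKD splits an accented character into a base character
        # followed by one or more combining marks.
        decomposed = unicodedata.normalize("NFKD", s)
        return "".join(c for c in decomposed if not unicodedata.combining(c))


print(strip_accents_sketch("café crème"))  # prints "cafe creme"
```

Characters with no decomposition (for example "ø") are left unchanged by this approach.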
