.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "gyexamples/plot_woe_transformer.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_gyexamples_plot_woe_transformer.py:

.. _example-woe-transformer:

Converter for WOE
=================

WOE stands for Weights of Evidence. It consists in checking whether
a feature X belongs to a series of regions (intervals).
The result is the label of every interval containing the feature.

.. index:: WOE, WOETransformer

.. contents::
    :local:

A simple example
++++++++++++++++

X is a vector made of the first ten integers (0 to 9).
Class :class:`WOETransformer ` checks whether each of them belongs to
one of two intervals, `]1, 3[` (open on both sides) and `[5, 7]`
(closed on both sides). The first interval is associated with
weight 55 and the second one with weight 107.

.. GENERATED FROM PYTHON SOURCE LINES 26-50

.. code-block:: default

    import os
    import numpy
    import pandas as pd
    from onnx.tools.net_drawer import GetPydotGraph, GetOpNodeProducer
    from onnxruntime import InferenceSession
    import matplotlib.pyplot as plt
    from skl2onnx import to_onnx
    from skl2onnx.sklapi import WOETransformer
    # automatically registers the converter for WOETransformer
    import skl2onnx.sklapi.register  # noqa

    X = numpy.arange(10).astype(numpy.float32).reshape((-1, 1))

    intervals = [
        [(1., 3., False, False),
         (5., 7., True, True)]]
    weights = [[55, 107]]

    woe1 = WOETransformer(intervals, onehot=False, weights=weights)
    woe1.fit(X)
    prd = woe1.transform(X)
    df = pd.DataFrame({'X': X.ravel(), 'woe': prd.ravel()})
    df

==  ===  =====
..  X    woe
==  ===  =====
0   0.0  0.0
1   1.0  0.0
2   2.0  55.0
3   3.0  0.0
4   4.0  0.0
5   5.0  107.0
6   6.0  107.0
7   7.0  107.0
8   8.0  0.0
9   9.0  0.0
==  ===  =====

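
The membership test behind the table above can be sketched in plain NumPy.
The helpers ``in_interval`` and ``woe_column`` are hypothetical names used for
illustration only and are not part of *skl2onnx*. The tuple layout
``(low, high, left_closed, right_closed)`` is inferred from the example, and
the intervals are assumed to be disjoint so that adding the weights is
unambiguous.

.. code-block:: python

    import numpy


    def in_interval(x, low, high, left_closed, right_closed):
        """Boolean mask of the values of x lying in the interval."""
        above = (x >= low) if left_closed else (x > low)
        below = (x <= high) if right_closed else (x < high)
        return above & below


    def woe_column(x, intervals, weights):
        """Weight of the interval containing each value, 0 if none."""
        out = numpy.zeros(x.shape, dtype=numpy.float32)
        for interval, weight in zip(intervals, weights):
            out[in_interval(x, *interval)] += weight
        return out


    x = numpy.arange(10).astype(numpy.float32)
    intervals = [(1., 3., False, False), (5., 7., True, True)]
    weights = [55, 107]
    woe = woe_column(x, intervals, weights)
    print(woe)

Under these assumptions the result agrees with the ``woe`` column shown above:
only 2 falls strictly inside `]1, 3[`, whereas 5, 6 and 7 all fall inside
`[5, 7]`.
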
.. GENERATED FROM PYTHON SOURCE LINES 51-56

One Hot
+++++++

The transformer outputs a single column containing the weights.
It can also return one column per interval.

.. GENERATED FROM PYTHON SOURCE LINES 56-65

.. code-block:: default

    woe2 = WOETransformer(intervals, onehot=True, weights=weights)
    woe2.fit(X)
    prd = woe2.transform(X)
    df = pd.DataFrame(prd)
    df.columns = ['I1', 'I2']
    df['X'] = X
    df

==  ====  =====  ===
..  I1    I2     X
==  ====  =====  ===
0   0.0   0.0    0.0
1   0.0   0.0    1.0
2   55.0  0.0    2.0
3   0.0   0.0    3.0
4   0.0   0.0    4.0
5   0.0   107.0  5.0
6   0.0   107.0  6.0
7   0.0   107.0  7.0
8   0.0   0.0    8.0
9   0.0   0.0    9.0
==  ====  =====  ===

.. GENERATED FROM PYTHON SOURCE LINES 66-68

With ``onehot=True``, the weights can be omitted.
The output is then binary.

.. GENERATED FROM PYTHON SOURCE LINES 68-77

.. code-block:: default

    woe = WOETransformer(intervals, onehot=True)
    woe.fit(X)
    prd = woe.transform(X)
    df = pd.DataFrame(prd)
    df.columns = ['I1', 'I2']
    df['X'] = X
    df

==  ===  ===  ===
..  I1   I2   X
==  ===  ===  ===
0   0.0  0.0  0.0
1   0.0  0.0  1.0
2   1.0  0.0  2.0
3   0.0  0.0  3.0
4   0.0  0.0  4.0
5   0.0  1.0  5.0
6   0.0  1.0  6.0
7   0.0  1.0  7.0
8   0.0  0.0  8.0
9   0.0  0.0  9.0
==  ===  ===  ===

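
The three outputs shown so far are closely related, which the following
NumPy sketch tries to make explicit. The helper ``interval_indicators`` is a
hypothetical name, not a *skl2onnx* function, and the sketch assumes disjoint
intervals: it builds the binary matrix first, scales each column by its weight,
then adds the columns together.

.. code-block:: python

    import numpy


    def interval_indicators(x, intervals):
        """Binary matrix with one column per interval (hypothetical helper)."""
        columns = []
        for low, high, left_closed, right_closed in intervals:
            above = (x >= low) if left_closed else (x > low)
            below = (x <= high) if right_closed else (x < high)
            columns.append((above & below).astype(numpy.float32))
        return numpy.stack(columns, axis=1)


    x = numpy.arange(10).astype(numpy.float32)
    intervals = [(1., 3., False, False), (5., 7., True, True)]
    weights = numpy.array([55, 107], dtype=numpy.float32)

    binary = interval_indicators(x, intervals)  # like onehot=True, no weights
    weighted = binary * weights                 # like onehot=True, with weights
    single = weighted.sum(axis=1)               # like onehot=False, with weights
    print(binary.shape, weighted[2], single[5])

Broadcasting multiplies column ``I1`` by 55 and column ``I2`` by 107. Because
the intervals do not overlap, each row holds at most one non-zero value and
the row sum gives back the single-column output.
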
.. GENERATED FROM PYTHON SOURCE LINES 78-84

Conversion to ONNX
++++++++++++++++++

*skl2onnx* implements a converter for all cases.

``onehot=False``

.. GENERATED FROM PYTHON SOURCE LINES 84-89

.. code-block:: default

    onx1 = to_onnx(woe1, X)
    sess = InferenceSession(onx1.SerializeToString(),
                            providers=['CPUExecutionProvider'])
    print(sess.run(None, {'X': X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[  0.]
     [  0.]
     [ 55.]
     [  0.]
     [  0.]
     [107.]
     [107.]
     [107.]
     [  0.]
     [  0.]]

.. GENERATED FROM PYTHON SOURCE LINES 90-91

``onehot=True``

.. GENERATED FROM PYTHON SOURCE LINES 91-97

.. code-block:: default

    onx2 = to_onnx(woe2, X)
    sess = InferenceSession(onx2.SerializeToString(),
                            providers=['CPUExecutionProvider'])
    print(sess.run(None, {'X': X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[  0.   0.]
     [  0.   0.]
     [ 55.   0.]
     [  0.   0.]
     [  0.   0.]
     [  0. 107.]
     [  0. 107.]
     [  0. 107.]
     [  0.   0.]
     [  0.   0.]]

.. GENERATED FROM PYTHON SOURCE LINES 98-102

ONNX Graphs
+++++++++++

``onehot=False``

.. GENERATED FROM PYTHON SOURCE LINES 102-116

.. code-block:: default

    pydot_graph = GetPydotGraph(
        onx1.graph, name=onx1.graph.name, rankdir="TB",
        node_producer=GetOpNodeProducer(
            "docstring", color="yellow", fillcolor="yellow", style="filled"))
    pydot_graph.write_dot("woe1.dot")

    os.system('dot -O -Gdpi=300 -Tpng woe1.dot')

    image = plt.imread("woe1.dot.png")
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.imshow(image)
    ax.axis('off')

.. image-sg:: /gyexamples/images/sphx_glr_plot_woe_transformer_001.png
   :alt: plot woe transformer
   :srcset: /gyexamples/images/sphx_glr_plot_woe_transformer_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (-0.5, 2066.5, 3321.5, -0.5)

.. GENERATED FROM PYTHON SOURCE LINES 117-118

``onehot=True``

.. GENERATED FROM PYTHON SOURCE LINES 118-132

.. code-block:: default

    pydot_graph = GetPydotGraph(
        onx2.graph, name=onx2.graph.name, rankdir="TB",
        node_producer=GetOpNodeProducer(
            "docstring", color="yellow", fillcolor="yellow", style="filled"))
    pydot_graph.write_dot("woe2.dot")

    os.system('dot -O -Gdpi=300 -Tpng woe2.dot')

    image = plt.imread("woe2.dot.png")
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.imshow(image)
    ax.axis('off')

.. image-sg:: /gyexamples/images/sphx_glr_plot_woe_transformer_002.png
   :alt: plot woe transformer
   :srcset: /gyexamples/images/sphx_glr_plot_woe_transformer_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (-0.5, 2226.5, 5696.5, -0.5)

.. GENERATED FROM PYTHON SOURCE LINES 133-138

Half-line
+++++++++

An interval may be bounded on one side only,
the other bound being infinite.

.. GENERATED FROM PYTHON SOURCE LINES 138-150

.. code-block:: default

    intervals = [
        [(-numpy.inf, 3., True, True),
         (5., numpy.inf, True, True)]]
    weights = [[55, 107]]

    woe1 = WOETransformer(intervals, onehot=False, weights=weights)
    woe1.fit(X)
    prd = woe1.transform(X)
    df = pd.DataFrame({'X': X.ravel(), 'woe': prd.ravel()})
    df

==  ===  =====
..  X    woe
==  ===  =====
0   0.0  55.0
1   1.0  55.0
2   2.0  55.0
3   3.0  55.0
4   4.0  0.0
5   5.0  107.0
6   6.0  107.0
7   7.0  107.0
8   8.0  107.0
9   9.0  107.0
==  ===  =====

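
Infinite bounds need no special handling in a comparison-based membership
test, because every finite value compares greater than ``-numpy.inf`` and less
than ``numpy.inf``. The following check is a minimal NumPy sketch written for
illustration; it does not use *skl2onnx* and makes no claim about how the
transformer is implemented.

.. code-block:: python

    import numpy

    x = numpy.arange(10).astype(numpy.float32)

    # ]-inf, 3]: the lower comparison is true for every finite value.
    left_half_line = (x >= -numpy.inf) & (x <= 3.)
    # [5, +inf[: the upper comparison is true for every finite value.
    right_half_line = (x >= 5.) & (x <= numpy.inf)

    # Boolean masks act as 0/1 when multiplied by the weights.
    woe = 55 * left_half_line + 107 * right_half_line
    print(woe)

Only the value 4 lies outside both half-lines, which is why it is the only row
with a zero weight in the table above.
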
.. GENERATED FROM PYTHON SOURCE LINES 151-152

The conversion to ONNX uses the same instruction.

.. GENERATED FROM PYTHON SOURCE LINES 152-157

.. code-block:: default

    onxinf = to_onnx(woe1, X)
    sess = InferenceSession(onxinf.SerializeToString(),
                            providers=['CPUExecutionProvider'])
    print(sess.run(None, {'X': X})[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 55.]
     [ 55.]
     [ 55.]
     [ 55.]
     [  0.]
     [107.]
     [107.]
     [107.]
     [107.]
     [107.]]

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 9.771 seconds)

.. _sphx_glr_download_gyexamples_plot_woe_transformer.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_woe_transformer.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_woe_transformer.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_