Train, convert and predict with ONNX Runtime

This example demonstrates an end-to-end scenario, from training a machine learning model to using it in its converted form.

Train a logistic regression

The first step consists of retrieving the Iris dataset.

from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y)

Then we fit a model.

from sklearn.linear_model import LogisticRegression

clr = LogisticRegression()
clr.fit(X_train, y_train)
somewhere/.local/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
LogisticRegression()
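
The warning above is harmless here, but it can be avoided by following its own advice: give lbfgs more iterations or scale the features first. A minimal sketch, kept separate from the model used in the rest of this example:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Scale the features and allow more iterations so lbfgs converges cleanly.
clr_scaled = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clr_scaled.fit(X_train, y_train)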


We compute the predictions on the test set and show the confusion matrix.

from sklearn.metrics import confusion_matrix

pred = clr.predict(X_test)
print(confusion_matrix(y_test, pred))
[[13  0  0]
 [ 0 10  1]
 [ 0  1 13]]

Conversion to ONNX format

We use the sklearn-onnx module to convert the model into ONNX format.

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

initial_type = [("float_input", FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type)
with open("logreg_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())
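
Optionally, the saved file can be validated before it is handed to ONNX Runtime. This is a small sketch, assuming the onnx package is installed:

import onnx

onnx_model = onnx.load("logreg_iris.onnx")
onnx.checker.check_model(onnx_model)  # raises an exception if the model is malformed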

We load the model with ONNX Runtime and look at its input and output.

import onnxruntime as rt

sess = rt.InferenceSession("logreg_iris.onnx", providers=rt.get_available_providers())

print("input name='{}' and shape={}".format(sess.get_inputs()[0].name, sess.get_inputs()[0].shape))
print("output name='{}' and shape={}".format(sess.get_outputs()[0].name, sess.get_outputs()[0].shape))
input name='float_input' and shape=[None, 4]
output name='output_label' and shape=[None]
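
The converted classifier actually exposes two outputs, the label and the probabilities. They can all be listed from the session with the same API calls:

for out in sess.get_outputs():
    print("output name='{}' and shape={}".format(out.name, out.shape))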

We compute the predictions.

input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

import numpy

pred_onx = sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]
print(confusion_matrix(pred, pred_onx))
[[13  0  0]
 [ 0 11  0]
 [ 0  0 14]]

The predictions are identical.
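
The same check can be made explicit with an assertion, a small sketch reusing the arrays computed above:

# Raises an AssertionError if the two label vectors differ anywhere.
numpy.testing.assert_array_equal(pred, pred_onx)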

Probabilities

Probabilities are needed to compute other relevant metrics such as the ROC Curve. Let’s see how to get them first with scikit-learn.

prob_sklearn = clr.predict_proba(X_test)
print(prob_sklearn[:3])
[[6.78847088e-04 3.60544494e-01 6.38776659e-01]
 [7.91789065e-06 4.83436050e-02 9.51648477e-01]
 [9.82746231e-01 1.72536996e-02 6.95009084e-08]]

And then with ONNX Runtime. The probabilities come back as a list of dictionaries, one per observation.

prob_name = sess.get_outputs()[1].name
prob_rt = sess.run([prob_name], {input_name: X_test.astype(numpy.float32)})[0]

import pprint

pprint.pprint(prob_rt[0:3])
[{0: 0.0006788466707803309, 1: 0.360544353723526, 2: 0.6387768387794495},
 {0: 7.917877155705355e-06, 1: 0.04834353178739548, 2: 0.9516485929489136},
 {0: 0.9827462434768677, 1: 0.017253704369068146, 2: 6.950092767965543e-08}]
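
To feed these probabilities into metrics such as the ROC curve mentioned above, the list of dictionaries can be turned back into a matrix. A minimal sketch, assuming scikit-learn's roc_auc_score for the scoring:

from sklearn.metrics import roc_auc_score

# Each row is a dictionary {class_index: probability}; rebuild a matrix.
prob_rt_matrix = numpy.array([[row[i] for i in sorted(row)] for row in prob_rt])
print(roc_auc_score(y_test, prob_rt_matrix, multi_class="ovr"))

skl2onnx also offers an option to skip the ZipMap operator so that probabilities are returned as a plain array; see its documentation for the zipmap option.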

Let’s benchmark.

from timeit import Timer


def speed(inst, number=10, repeat=20):
    timer = Timer(inst, globals=globals())
    raw = numpy.array(timer.repeat(repeat, number=number))
    ave = raw.sum() / len(raw) / number
    mi, ma = raw.min() / number, raw.max() / number
    print("Average %1.3g min=%1.3g max=%1.3g" % (ave, mi, ma))
    return ave


print("Execution time for clr.predict")
speed("clr.predict(X_test)")

print("Execution time for ONNX Runtime")
speed("sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]")
Execution time for clr.predict
Average 0.000332 min=0.000305 max=0.000432
Execution time for ONNX Runtime
Average 0.000125 min=0.000123 max=0.000136

0.00012542196549475193

Let’s benchmark a scenario similar to what a web service experiences: the model has to do one prediction at a time, as opposed to a batch of predictions.

def loop(X_test, fct, n=None):
    nrow = X_test.shape[0]
    if n is None:
        n = nrow
    for i in range(0, n):
        im = i % nrow
        fct(X_test[im : im + 1])


print("Execution time for clr.predict")
speed("loop(X_test, clr.predict, 100)")


def sess_predict(x):
    return sess.run([label_name], {input_name: x.astype(numpy.float32)})[0]


print("Execution time for sess_predict")
speed("loop(X_test, sess_predict, 100)")
Execution time for clr.predict
Average 0.029 min=0.0289 max=0.0296
Execution time for sess_predict
Average 0.00822 min=0.00821 max=0.00832

0.008224687550391536

Let’s do the same for the probabilities.

print("Execution time for predict_proba")
speed("loop(X_test, clr.predict_proba, 100)")


def sess_predict_proba(x):
    return sess.run([prob_name], {input_name: x.astype(numpy.float32)})[0]


print("Execution time for sess_predict_proba")
speed("loop(X_test, sess_predict_proba, 100)")
Execution time for predict_proba
Average 0.0446 min=0.0444 max=0.0449
Execution time for sess_predict_proba
Average 0.00826 min=0.00824 max=0.00833

0.008257166115217842

This second comparison is fairer since ONNX Runtime, in this experiment, computes both the label and the probabilities in every case.
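
Both outputs can also be retrieved in a single call, which is essentially what happens internally anyway. A small sketch reusing the names defined above:

labels, probabilities = sess.run([label_name, prob_name], {input_name: X_test.astype(numpy.float32)})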

Benchmark with RandomForest

We first train and save a model in ONNX format.

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier()
rf.fit(X_train, y_train)

initial_type = [("float_input", FloatTensorType([1, 4]))]
onx = convert_sklearn(rf, initial_types=initial_type)
with open("rf_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())

We compare.

sess = rt.InferenceSession("rf_iris.onnx", providers=rt.get_available_providers())


def sess_predict_proba_rf(x):
    return sess.run([prob_name], {input_name: x.astype(numpy.float32)})[0]


print("Execution time for predict_proba")
speed("loop(X_test, rf.predict_proba, 100)")

print("Execution time for sess_predict_proba")
speed("loop(X_test, sess_predict_proba_rf, 100)")
Execution time for predict_proba
Average 4.75 min=4.74 max=4.77
Execution time for sess_predict_proba
Average 0.0103 min=0.00946 max=0.0137

0.010250672480324283

Let’s see how the comparison evolves with different numbers of trees.

measures = []

for n_trees in range(5, 51, 5):
    print(n_trees)
    rf = RandomForestClassifier(n_estimators=n_trees)
    rf.fit(X_train, y_train)
    initial_type = [("float_input", FloatTensorType([1, 4]))]
    onx = convert_sklearn(rf, initial_types=initial_type)
    with open("rf_iris_%d.onnx" % n_trees, "wb") as f:
        f.write(onx.SerializeToString())
    sess = rt.InferenceSession("rf_iris_%d.onnx" % n_trees, providers=rt.get_available_providers())

    def sess_predict_proba_loop(x):
        return sess.run([prob_name], {input_name: x.astype(numpy.float32)})[0]

    tsk = speed("loop(X_test, rf.predict_proba, 100)", number=5, repeat=5)
    trt = speed("loop(X_test, sess_predict_proba_loop, 100)", number=5, repeat=5)
    measures.append({"n_trees": n_trees, "sklearn": tsk, "rt": trt})

from pandas import DataFrame

df = DataFrame(measures)
ax = df.plot(x="n_trees", y="sklearn", label="scikit-learn", c="blue", logy=True)
df.plot(x="n_trees", y="rt", label="onnxruntime", ax=ax, c="green", logy=True)
ax.set_xlabel("Number of trees")
ax.set_ylabel("Prediction time (s)")
ax.set_title("Speed comparison between scikit-learn and ONNX Runtime\nFor a random forest on Iris dataset")
ax.legend()
[Figure: Speed comparison between scikit-learn and ONNX Runtime for a random forest on the Iris dataset]
5
Average 0.347 min=0.346 max=0.351
Average 0.00773 min=0.00768 max=0.00781
10
Average 0.578 min=0.577 max=0.581
Average 0.00774 min=0.00773 max=0.00777
15
Average 0.811 min=0.809 max=0.812
Average 0.00778 min=0.00777 max=0.00782
20
Average 1.04 min=1.03 max=1.04
Average 0.00788 min=0.00786 max=0.00791
25
Average 1.27 min=1.27 max=1.28
Average 0.00791 min=0.00789 max=0.00794
30
Average 1.5 min=1.5 max=1.5
Average 0.00791 min=0.0079 max=0.00793
35
Average 1.75 min=1.75 max=1.76
Average 0.0082 min=0.00818 max=0.00824
40
Average 1.97 min=1.97 max=1.98
Average 0.0084 min=0.00839 max=0.00841
45
Average 2.21 min=2.2 max=2.21
Average 0.00831 min=0.00829 max=0.00833
50
Average 2.42 min=2.42 max=2.43
Average 0.00834 min=0.00832 max=0.00837

<matplotlib.legend.Legend object at 0x7fd17011cd00>

Total running time of the script: ( 22 minutes 6.471 seconds)
