module mlmodel.interval_regressor

Short summary

module mlinsights.mlmodel.interval_regressor

Implements a regressor that trains multiple estimators to provide a confidence interval on its predictions.

Classes

class – truncated documentation

  • IntervalRegressor – Trains multiple regressors to provide a confidence interval on prediction. It only works for single regression. …

Properties

property – truncated documentation

  • _repr_html_ – HTML representation of estimator. This is redundant with the logic of _repr_mimebundle_. The latter should …

  • n_estimators_ – Returns the number of estimators = the number of buckets the data was split in.

Methods

method – truncated documentation

  • __init__

  • fit – Trains the binner and an estimator on every bucket.

  • predict – Computes the average predictions.

  • predict_all – Computes the predictions for all estimators.

  • predict_sorted – Computes the predictions for all estimators. Sorts them for all observations.

Documentation

Implements a regressor that trains multiple estimators to provide a confidence interval on its predictions.

class mlinsights.mlmodel.interval_regressor.IntervalRegressor(estimator=None, n_estimators=10, n_jobs=None, alpha=1.0, verbose=False)

Bases: BaseEstimator, RegressorMixin

Trains multiple regressors to provide a confidence interval on the prediction. It only works for single-target regression. Every estimator is trained on a new sample of the training data; the parameter alpha lets the user choose the size of this sample. A smaller alpha increases the variance of the predictions. The current implementation draws samples at random but keeps the weight associated with each of them. An alternative would be to draw a weighted sample and give every drawn observation a uniform weight.
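
The resampling step described above can be sketched with NumPy alone. This is an illustration, not the mlinsights implementation: the helper name draw_subsample is hypothetical, and it assumes that rows are drawn with replacement and that the sample size is alpha times the number of rows, rounded.

```python
import numpy as np


def draw_subsample(X, y, sample_weight=None, alpha=1.0, rng=None):
    """Draw round(alpha * n) rows at random, with replacement.

    Each drawn row keeps its original weight (hypothetical helper,
    not part of mlinsights).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    size = max(1, int(round(alpha * n)))
    idx = rng.integers(0, n, size=size)  # row indices, duplicates allowed
    weights = None if sample_weight is None else sample_weight[idx]
    return X[idx], y[idx], weights


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=100)
w = rng.uniform(0.5, 1.5, size=100)

# alpha=0.5 draws half as many rows as the training set contains
Xs, ys, ws = draw_subsample(X, y, w, alpha=0.5, rng=rng)
print(Xs.shape, ys.shape, ws.shape)
```

Each estimator would then be fitted on its own draw, which is why a smaller alpha gives noisier individual fits.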

Parameters:
  • estimator – predictor trained on every bucket

  • n_estimators – number of estimators to train

  • n_jobs – number of parallel jobs (for training and predicting)

  • alpha – proportion of samples resampled for each training

  • verbose – boolean, or 'tqdm' to display a tqdm progress bar while the estimators are fitted

__init__(estimator=None, n_estimators=10, n_jobs=None, alpha=1.0, verbose=False)
Parameters:
  • estimator – predictor trained on every bucket

  • n_estimators – number of estimators to train

  • n_jobs – number of parallel jobs (for training and predicting)

  • alpha – proportion of samples resampled for each training

  • verbose – boolean, or 'tqdm' to display a tqdm progress bar while the estimators are fitted

fit(X, y, sample_weight=None)

Trains the binner and an estimator on every bucket.

Parameters:
  • X – features, X is converted into an array if X is a dataframe

  • y – target

  • sample_weight – sample weights

Returns:

self: returns an instance of self.

Fitted attributes:

  • binner_: binner

  • estimators_: dictionary of estimators, each of them mapped to a leaf of the tree

  • mean_estimator_: estimator trained on the whole dataset, used when the binner cannot find a bucket for a new observation

  • dim_: dimension of the output

  • mean_: average targets

property n_estimators_

Returns the number of estimators, which equals the number of buckets the data was split into.

predict(X)

Computes the average of the predictions across all estimators.

Parameters:

X – features, X is converted into an array if X is a dataframe

Returns:

predictions

predict_all(X)

Computes the predictions for all estimators.

Parameters:

X – features, X is converted into an array if X is a dataframe

Returns:

predictions
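
The relationship between per-estimator predictions and their average can be sketched as follows. This is a NumPy-only illustration rather than the library's code: the least-squares helpers are hypothetical stand-ins for the wrapped estimator, and the layout of one row per observation and one column per estimator is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=200)


def fit_linear(Xr, yr):
    """Least-squares fit with an intercept (stand-in for any regressor)."""
    A = np.column_stack([Xr, np.ones(len(Xr))])
    coef, *_ = np.linalg.lstsq(A, yr, rcond=None)
    return coef


def predict_linear(coef, Xn):
    """Apply the fitted coefficients to new rows."""
    return np.column_stack([Xn, np.ones(len(Xn))]) @ coef


# Train one model per random resample of the training rows.
n_estimators = 10
models = []
for _ in range(n_estimators):
    idx = rng.integers(0, len(X), size=len(X))
    models.append(fit_linear(X[idx], y[idx]))

X_new = np.array([[-0.5], [0.0], [0.5]])

# One column per model: assumed shape (n_observations, n_estimators).
all_preds = np.column_stack([predict_linear(c, X_new) for c in models])

# Averaging across models gives one prediction per observation.
mean_pred = all_preds.mean(axis=1)
print(all_preds.shape, mean_pred.shape)
```

The spread of each row of all_preds reflects how much the fitted models disagree for that observation.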

predict_sorted(X)

Computes the predictions of all estimators and sorts them for each observation.

Parameters:

X – features, X is converted into an array if X is a dataframe

Returns:

predictions sorted for each observation
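
Sorted per-estimator predictions make it easy to read off an empirical interval. The sketch below assumes a matrix with one row per observation and one column per estimator; the values are synthetic stand-ins, and trimming one prediction at each end is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for per-estimator predictions:
# 4 observations (rows), 20 estimators (columns).
centers = np.array([[1.0], [2.0], [3.0], [4.0]])
all_preds = rng.normal(loc=centers, scale=0.2, size=(4, 20))

# Sort every row independently, smallest prediction first.
sorted_preds = np.sort(all_preds, axis=1)

# Dropping the smallest and largest of 20 predictions leaves 18,
# which gives a rough empirical band for each observation.
lower = sorted_preds[:, 1]
upper = sorted_preds[:, -2]

for lo, hi in zip(lower, upper):
    print(f"[{lo:.3f}, {hi:.3f}]")
```

With more estimators, the trimmed columns can be chosen to match a target coverage more closely.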
