module ml.ml_grid_benchmark#

Inheritance diagram of mlstatpy.ml.ml_grid_benchmark

Short summary#

module mlstatpy.ml.ml_grid_benchmark

About Machine Learning Benchmark


Classes#

MlGridBenchMark – The class tests a list of models over a list of datasets.

Properties#

Appendix – Returns the appendix.

Graphs – Returns images of graphs.

Metadata – Returns the metadata.

Metrics – Returns the metrics.

Name – Returns the name of the benchmark.

Methods#

__init__ – Constructor.

bench_experiment – Calls method fit.

end – Nothing to do.

fit – Trains a model.

graphs – Plots multiple graphs.

plot_graphs – Plots all graphs on the same grid of axes.

predict_score_experiment – Calls method score.

preprocess_dataset – Splits the dataset into train and test.

score – Scores a model.

Documentation#

About Machine Learning Benchmark


class mlstatpy.ml.ml_grid_benchmark.MlGridBenchMark(name, datasets, clog=None, fLOG=<function noLOG>, path_to_images='.', cache_file=None, progressbar=None, graphx=None, graphy=None, **params)#

Bases: GridBenchMark

The class tests a list of models over a list of datasets.


Parameters:
  • name – name of the test

  • datasets – list of dictionaries of dataframes

  • clog – see CustomLog or string

  • fLOG – logging function

  • params – extra parameters

  • path_to_images – path to images and intermediate results

  • cache_file – cache file

  • progressbar – relies on tqdm, for example tnrange

  • graphx – list of variables to use as X axis

  • graphy – list of variables to use as Y axis

If cache_file is specified, the class stores the results of the method bench. On a second run, it loads the cache and only executes the modified or new runs (listed in param_list).

Each entry of datasets should be a dictionary with dataframes as values and the following keys:

  • 'X': features

  • 'Y': labels (optional)

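A minimal construction sketch, not part of the original documentation: it assumes scikit-learn and pandas are available and uses only the constructor parameters and the dataset keys documented above; the graphx and graphy values are hypothetical variable names, and the call that actually launches the benchmark is inherited from GridBenchMark, so it is only hinted at in a comment.

    import pandas
    from sklearn.datasets import load_iris
    from mlstatpy.ml.ml_grid_benchmark import MlGridBenchMark

    iris = load_iris()
    datasets = [
        {
            # each dataset is a dictionary of dataframes:
            # 'X' holds the features, 'Y' the (optional) labels
            "X": pandas.DataFrame(iris.data, columns=iris.feature_names),
            "Y": pandas.DataFrame(iris.target, columns=["label"]),
        }
    ]

    bench = MlGridBenchMark(
        "demo",              # name of the test
        datasets,            # list of dictionaries of dataframes
        path_to_images=".",  # where images and intermediate results are stored
        cache_file=None,     # set a file name to reuse results on a second run
        graphx=["N"],        # hypothetical variable used as X axis of the plots
        graphy=["time"],     # hypothetical variable used as Y axis of the plots
    )
    # The benchmark itself is launched through the method ``bench`` inherited
    # from GridBenchMark; its exact signature is not shown on this page.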

__init__(name, datasets, clog=None, fLOG=<function noLOG>, path_to_images='.', cache_file=None, progressbar=None, graphx=None, graphy=None, **params)#
Parameters:
  • name – name of the test

  • datasets – list of dictionaries of dataframes

  • clog – see CustomLog or string

  • fLOG – logging function

  • params – extra parameters

  • path_to_images – path to images and intermediate results

  • cache_file – cache file

  • progressbar – relies on tqdm, for example tnrange

  • graphx – list of variables to use as X axis

  • graphy – list of variables to use as Y axis

If cache_file is specified, the class stores the results of the method bench. On a second run, it loads the cache and only executes the modified or new runs (listed in param_list).

Each entry of datasets should be a dictionary with dataframes as values and the following keys:

  • 'X': features

  • 'Y': labels (optional)


bench_experiment(ds, **params)#

Calls method fit.


end()#

Nothing to do.


fit(ds, model, **params)#

Trains a model.

Parameters:
  • ds – dictionary with the data to use for training

  • model – model to train

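A plausible sketch of what fit does, not taken from the source: it assumes a scikit-learn-style estimator and the dataset layout described above ('X' features, 'Y' labels).

    def fit_sketch(ds, model, **params):
        # train the estimator on the features and labels stored in the
        # dataset dictionary ('X' / 'Y' as documented above)
        model.fit(ds["X"], ds["Y"])
        return model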

graphs(path_to_images)#

Plots multiple graphs.

Parameters:

path_to_images – where to store images

Returns:

list of tuples (image_name, function to create the graph)

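A hypothetical way to consume the value returned by graphs; only the tuple structure is documented above, so the assumption that calling each returned function with no argument produces the graph is mine (bench is the instance from the construction sketch above).

    def build_all_graphs(bench, folder="."):
        # ``graphs`` returns tuples (image_name, function to create the graph)
        names = []
        for image_name, make_graph in bench.graphs(folder):
            make_graph()  # assumption: calling the function produces the graph
            names.append(image_name)
        return names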

plot_graphs(grid=None, text=True, **kwargs)#

Plots all graphs on the same grid of axes.

Parameters:
  • grid – grid of axes

  • text – adds the legend title on the graph

Returns:

grid

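A short usage sketch for plot_graphs, assuming the grid of axes is created by the caller with matplotlib (bench is the instance from the construction sketch above).

    import matplotlib.pyplot as plt

    def show_benchmark_graphs(bench, nrows=2, ncols=2):
        # create the grid of axes and let the benchmark fill it
        fig, grid = plt.subplots(nrows, ncols, figsize=(10, 8))
        grid = bench.plot_graphs(grid=grid, text=True)
        fig.tight_layout()
        return grid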

predict_score_experiment(ds, model, **params)#

Calls method score.


preprocess_dataset(dsi, **params)#

Splits the dataset into train and test.

Parameters:
  • dsi – dataset index

  • params – additional parameters

Returns:

dataset (like info), dictionary for metrics


score(ds, model, **params)#

Scores a model.

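As with fit, a plausible sketch under the same assumptions (scikit-learn-style estimator, 'X'/'Y' dataset keys); the real method may return several metrics rather than a single value.

    def score_sketch(ds, model, **params):
        # evaluate the trained estimator on the dataset's features and labels
        return model.score(ds["X"], ds["Y"])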