module ml.roc
Methods
method | truncated documentation |
---|---|
__init__ | Initialisation with a dataframe and two or three columns: |
__len__ | usual |
__repr__ | Shows first elements, precision rate. |
__str__ | Shows first elements, precision rate. |
auc | Computes the area under the curve (AUC). |
auc_interval | Determines a confidence interval for the AUC with bootstrap. |
compute_roc_curve | Computes a ROC curve with nb points; if nb == -1, there are as many points as the data contains, … |
confusion | Computes the confusion matrix for a specific score or all if score is None. |
plot | Plots a ROC curve. |
precision | Computes the precision. |
random_cloud | Resamples among the data. |
roc_intersect | The ROC curve is defined by a set of points. This function interpolates those points to determine … |
roc_intersect_interval | Computes a confidence interval for the value returned by roc_intersect. |
Documentation
About ROC.
-
class mlstatpy.ml.roc.ROC(y_true=None, y_score=None, sample_weight=None, df=None)
Bases: object
Helper to draw a ROC curve.
Initialisation with a dataframe and two or three columns:
column 1: score (y_score)
column 2: expected answer (boolean) (y_true)
column 3: weight (optional) (sample_weight)
- Parameters
y_true – if df is None, y_true, y_score and sample_weight must be filled; y_true indicates whether or not the answer is true, a true value meaning the prediction is right
y_score – predicted score
sample_weight – weights
df – dataframe, array or list; it must contain 2 or 3 columns, always in the same order
-
class CurveType(value)
Bases: enum.Enum
Curve types:
PROBSCORE: 1 - False Positive / True Positive
ERRPREC: error / recall
RECPREC: precision / recall
ROC: False Positive / True Positive
SKROC: False Positive / True Positive (scikit-learn)
-
property Data
Returns the underlying dataframe.
-
__init__(y_true=None, y_score=None, sample_weight=None, df=None)
Initialisation with a dataframe and two or three columns:
column 1: score (y_score)
column 2: expected answer (boolean) (y_true)
column 3: weight (optional) (sample_weight)
- Parameters
y_true – if df is None, y_true, y_score and sample_weight must be filled; y_true indicates whether or not the answer is true, a true value meaning the prediction is right
y_score – predicted score
sample_weight – weights
df – dataframe, array or list; it must contain 2 or 3 columns, always in the same order
-
__len__()
Usual len() behaviour.
-
__repr__()
Shows first elements, precision rate.
-
__str__()
Shows first elements, precision rate.
-
auc(cloud=None)
Computes the area under the curve (AUC).
- Parameters
cloud – data, or None to use self.data; the function assumes the data is sorted.
- Returns
AUC
The first column is the label, the second one is the score, the third one is the weight.
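To illustrate what an AUC computation involves, here is a minimal pure-Python sketch. It is not mlstatpy's implementation: the helper name `weighted_auc` is hypothetical, and it uses the pairwise (Mann-Whitney) definition of the AUC instead of integrating a sorted curve, so it does not need sorted input. Rows follow the column order stated above: label, score, weight.

```python
def weighted_auc(rows):
    """AUC as the weighted probability that a positive row outscores a negative one.

    rows: iterable of (label, score, weight); tied scores count for one half.
    Hypothetical helper for illustration, quadratic on purpose for readability.
    """
    rows = list(rows)
    positives = [(s, w) for label, s, w in rows if label]
    negatives = [(s, w) for label, s, w in rows if not label]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative row")
    total = 0.0
    wins = 0.0
    for sp, wp in positives:
        for sn, wn in negatives:
            pair_weight = wp * wn
            total += pair_weight
            if sp > sn:
                wins += pair_weight
            elif sp == sn:
                wins += 0.5 * pair_weight
    return wins / total


rows = [(1, 0.9, 1.0), (1, 0.4, 1.0), (0, 0.5, 1.0), (0, 0.1, 1.0)]
auc_value = weighted_auc(rows)
print(auc_value)  # 3 of the 4 (positive, negative) pairs are ranked correctly: 0.75
```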
-
auc_interval(bootstrap=10, alpha=0.95)
Determines a confidence interval for the AUC with bootstrap.
- Parameters
bootstrap – number of random estimations
alpha – defines the confidence interval
- Returns
dictionary of values
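The bootstrap idea behind such an interval can be sketched in pure Python. This is an illustration under stated assumptions, not the library's code: the function names and dictionary keys are hypothetical, `alpha` is assumed to be the coverage level of the interval, and the bounds are plain percentiles of the resampled AUC values.

```python
import random


def pairwise_auc(labels, scores):
    """Unweighted AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(labels, scores, bootstrap=10, alpha=0.95, seed=0):
    """Percentile interval covering a share `alpha` of the bootstrap AUC values."""
    if all(labels) or not any(labels):
        raise ValueError("both classes are required")
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < bootstrap:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if all(ys) or not any(ys):
            continue  # a single-class resample has no AUC; draw again
        estimates.append(pairwise_auc(ys, [scores[i] for i in idx]))
    estimates.sort()
    tail = (1.0 - alpha) / 2.0
    lower = estimates[int(tail * (bootstrap - 1))]
    upper = estimates[int(round((1.0 - tail) * (bootstrap - 1)))]
    return {"auc": pairwise_auc(labels, scores), "lower": lower, "upper": upper}


labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.65, 0.5, 0.4, 0.35, 0.2, 0.1]
interval = bootstrap_auc_interval(labels, scores, bootstrap=200)
print(interval)
```

With only a handful of resamples the bounds are coarse; a larger `bootstrap` value gives a smoother estimate at a proportional cost.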
-
compute_roc_curve(nb=100, curve=<CurveType.ROC: 5>, bootstrap=False)
Computes a ROC curve with nb points. If nb == -1, there are as many points as the data contains. If bootstrap == True, it draws random numbers to create a confidence interval based on the bootstrap method.
- Parameters
nb – number of points for the curve
curve – see CurveType
bootstrap – builds the curve after resampling
- Returns
DataFrame (metrics and threshold)
If curve is SKROC, the parameter nb is not taken into account. It should be set to 0.
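As a rough picture of what a False Positive / True Positive curve contains, the sketch below builds one point per distinct score, which resembles the nb == -1 case. It is a generic construction and an assumption on my part, not the method's actual code: `roc_points` is a hypothetical name, and a row counts as predicted positive when its score reaches the threshold.

```python
def roc_points(labels, scores):
    """Return [(threshold, false_positive_rate, true_positive_rate)].

    One point per distinct score, from the highest threshold to the lowest.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and not y)
        points.append((threshold, fp / n_neg, tp / n_pos))
    return points


labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
curve = roc_points(labels, scores)
for threshold, fpr, tpr in curve:
    print(threshold, fpr, tpr)
```

Both rates can only grow as the threshold decreases, so the points are already ordered along the curve.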
-
confusion(score=None, nb=10, curve=<CurveType.ROC: 5>, bootstrap=False)
Computes the confusion matrix for a specific score or all if score is None.
- Parameters
score – score or None.
nb – number of scores (if score is None)
curve – see CurveType
bootstrap – builds the curve after resampling
- Returns
One row if score is specified, many rows if score is None
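The one-row versus many-rows behaviour can be pictured with a small pure-Python sketch. The helper names are hypothetical and the thresholding rule (score >= threshold means predicted positive) is an assumption; unlike the method above, this sketch uses every distinct score when score is None and has no nb parameter.

```python
def confusion_at(labels, scores, threshold):
    """Count TP, FP, FN, TN when rows with score >= threshold are predicted positive."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for y, s in zip(labels, scores):
        if s >= threshold:
            counts["TP" if y else "FP"] += 1
        else:
            counts["FN" if y else "TN"] += 1
    return counts


def confusion_rows(labels, scores, score=None):
    """One row for a given score, or one row per distinct score when score is None."""
    thresholds = sorted(set(scores)) if score is None else [score]
    return [{"threshold": t, **confusion_at(labels, scores, t)} for t in thresholds]


labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
single = confusion_rows(labels, scores, score=0.7)
table = confusion_rows(labels, scores)
print(single)
print(len(table))
```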
-
plot(nb=100, curve=<CurveType.ROC: 5>, bootstrap=0, ax=None, thresholds=False, **kwargs)
Plots a ROC curve.
- Parameters
nb – number of points
curve – see CurveType
bootstrap – number of curves for the bootstrap (0 for none)
ax – axis
thresholds – use thresholds for the X axis
kwargs – sent to pandas.plot
- Returns
ax
-
precision()
Computes the precision.
-
random_cloud()
Resamples among the data.
- Returns
DataFrame
-
roc_intersect(roc, x)
The ROC curve is defined by a set of points. This function interpolates those points to determine y for any x.
- Parameters
roc – ROC curve
x – x
- Returns
y
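Reading y for an arbitrary x from a curve given as points can be sketched as follows. The sketch assumes linear interpolation between neighbouring points, which the description above does not specify, and `interpolate_curve` is a hypothetical name, not part of the module.

```python
def interpolate_curve(points, x):
    """Linearly interpolate y at x on a curve given as (x, y) points.

    Values of x outside the curve are clamped to the first or last point.
    """
    pts = sorted(points)
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return max(y0, y1)  # vertical segment: keep the upper value
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise AssertionError("unreachable: x lies inside the sorted curve")


curve = [(0.0, 0.0), (0.5, 0.8), (1.0, 1.0)]
y_low = interpolate_curve(curve, 0.25)
y_high = interpolate_curve(curve, 0.75)
print(y_low, y_high)
```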
-
roc_intersect_interval(x, nb, curve=<CurveType.ROC: 5>, bootstrap=10, alpha=0.05)
Computes a confidence interval for the value returned by roc_intersect.
- Parameters
roc – ROC curve
x – x
curve – see CurveType
- Returns
dictionary