module mlmodel._kmeans_constraint_
Short summary
module mlinsights.mlmodel._kmeans_constraint_
Implements the class ConstraintKMeans.
Functions
| function | truncated documentation |
|---|---|
| _adjust_weights | Changes weights mapped to every cluster. weights < 1 are used for big clusters, weights > 1 are used for small … |
| _compute_balance | Computes weights difference. |
| _compute_strategy_coefficient | Creates a matrix. |
| _constraint_association | Completes the constraint k-means. |
| _constraint_association_distance | Completes the constraint k-means, the function sorts points by distance to the closest cluster and associates … |
| _constraint_association_gain | Completes the constraint k-means. Follows the method described in Same-size k-Means Variation. … |
| _constraint_association_weights | Associates points to clusters. |
| _constraint_kmeans_weights | Runs the KMeans iterations but weights the clusters against one another. |
| _inertia | Computes total weighted inertia. |
| _labels_inertia_weights | Computes weighted inertia. It also adds a fraction of the whole inertia depending on how balanced the clusters are. … |
| _randomize_index | Randomizes index depending on the value. Swaps indexes. Modifies index. |
| _switch_clusters | Tries to switch clusters. Modifies labels in place. |
| constraint_kmeans | Completes the constraint k-means. |
| constraint_predictions | Computes the predictions but tries to associate the same number of points with each cluster. |
| linearize_matrix | Linearizes a matrix into a new one with 3 columns: value, row, column. The output format is similar to csr_matrix … |
Documentation
Implements the class ConstraintKMeans.
- mlinsights.mlmodel._kmeans_constraint_._adjust_weights(X, sw, weights, labels, lr)
Changes weights mapped to every cluster. weights < 1 are used for big clusters, weights > 1 are used for small clusters.
- Parameters:
X – features
sw – sample weights
weights – cluster weights
labels – known labels
lr – learning rate
- Returns:
labels
- mlinsights.mlmodel._kmeans_constraint_._compute_balance(X, sw, labels, nbc=None)
Computes weights difference.
- Parameters:
X – features
sw – sample weights
labels – known labels
nbc – number of clusters
- Returns:
(weights per cluster, expected weight, total weight)
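The returned triple can be illustrated with a short NumPy sketch. This is a hypothetical re-implementation written for this page, not the library's code: the name compute_balance_sketch is invented, X is left out because the sketch only needs the weights and labels, and the default for nbc (largest label plus one) is an assumption.

```python
import numpy as np


def compute_balance_sketch(sw, labels, nbc=None):
    """Illustrative only: returns (weights per cluster, expected weight, total weight)."""
    if nbc is None:
        # Assumption: without nbc, infer the number of clusters from the labels.
        nbc = int(labels.max()) + 1
    # Sum of the sample weights that fall into each cluster.
    per_cluster = np.bincount(labels, weights=sw, minlength=nbc)
    total = float(sw.sum())
    # A perfectly balanced clustering gives every cluster the same share.
    expected = total / nbc
    return per_cluster, expected, total


sw = np.array([1.0, 1.0, 2.0, 1.0, 1.0])
labels = np.array([0, 0, 0, 1, 2])
per_cluster, expected, total = compute_balance_sketch(sw, labels)
# Positive values mark clusters heavier than the balanced share.
difference = per_cluster - expected
print(difference)
```

In this example cluster 0 carries a weight of 4 while a balanced clustering would give each of the three clusters a weight of 2.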
- mlinsights.mlmodel._kmeans_constraint_._compute_strategy_coefficient(distances, strategy, labels)
Creates a matrix.
- mlinsights.mlmodel._kmeans_constraint_._constraint_association(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)
Completes the constraint k-means.
- Parameters:
X – features
labels – allocated array of labels (initial values are unused)
centers – initialized centers
x_squared_norms – norm of X
limit – number of points to associate per cluster
leftover – number of points to associate at the end
counters – allocated array
leftclose – allocated array
distances_close – allocated array
strategy – strategy used to sort points before mapping them to a cluster
state – random state
- mlinsights.mlmodel._kmeans_constraint_._constraint_association_distance(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)
Completes the constraint k-means. The function sorts points by distance to the closest cluster and assigns them in that order. It handles the farthest point first and maps it to the closest center.
- Parameters:
X – features
labels – allocated array of labels (initial values are unused)
centers – initialized centers
x_squared_norms – norm of X
limit – number of points to associate per cluster
leftover – number of points to associate at the end
counters – allocated array
leftclose – allocated array
distances_close – allocated array
strategy – strategy used to sort points before mapping them to a cluster
state – random state (unused)
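The ordering described for this function can be sketched as a greedy, capacity-limited assignment. The code below is a simplified illustration under stated assumptions, not the library's implementation: the name assign_by_distance_sketch is hypothetical, the preallocated arrays and the leftover handling from the real signature are ignored, and a point whose closest cluster is full is assumed to fall back to the next closest one.

```python
import numpy as np


def assign_by_distance_sketch(X, centers, limit):
    """Illustrative only: assigns at most `limit` points to each cluster."""
    # Squared Euclidean distances, shape (n_points, n_clusters).
    diff = X[:, None, :] - centers[None, :, :]
    distances = (diff ** 2).sum(axis=2)
    # Distance of every point to its closest center.
    closest = distances.min(axis=1)
    # Process the points farthest from their closest center first.
    order = np.argsort(-closest, kind="stable")
    labels = np.full(X.shape[0], -1, dtype=int)
    counters = np.zeros(centers.shape[0], dtype=int)
    for i in order:
        # Assumption: try clusters from nearest to farthest, skipping full ones.
        for c in np.argsort(distances[i], kind="stable"):
            if counters[c] < limit:
                labels[i] = c
                counters[c] += 1
                break
    return labels, counters


X = np.array([[0.0], [1.0], [2.0], [10.0]])
centers = np.array([[0.0], [10.0]])
labels, counters = assign_by_distance_sketch(X, centers, limit=2)
print(labels, counters)
```

With limit=2 both clusters end up with two points. The point located exactly on the first center is processed late and lands in the second cluster, which shows the price a greedy ordering can pay for balance. Points keep the label -1 if limit times the number of clusters is smaller than the number of points.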
- mlinsights.mlmodel._kmeans_constraint_._constraint_association_gain(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)
Completes the constraint k-means. Follows the method described in Same-size k-Means Variation.
- Parameters:
X – features
labels – allocated array of labels (initial values are unused)
centers – initialized centers
x_squared_norms – norm of X
limit – number of points to associate per cluster
leftover – number of points to associate at the end
counters – allocated array
leftclose – allocated array
distances_close – allocated array
strategy – strategy used to sort points before mapping them to a cluster
state – random state
- mlinsights.mlmodel._kmeans_constraint_._constraint_association_weights(X, centers, sw, weights)
Associates points to clusters.
- Parameters:
X – features
centers – centers
sw – sample weights
weights – cluster weights
- Returns:
labels
- mlinsights.mlmodel._kmeans_constraint_._constraint_kmeans_weights(X, labels, sample_weight, centers, inertia, it, max_iter, verbose=0, state=None, learning_rate=1.0, history=False, fLOG=None)
Runs the KMeans iterations but weights the clusters against one another.
- Parameters:
X – features
labels – initialized labels (unused)
sample_weight – sample weights
centers – initialized centers
inertia – initialized inertia (unused)
it – number of iterations already done
max_iter – maximum number of iterations
verbose – verbosity level
state – random state
learning_rate – learning rate
history – keeps all centers across iterations
fLOG – logging function (must be specified, otherwise verbose has no effect)
- Returns:
tuple (best_labels, best_centers, best_inertia, weights, it)
- mlinsights.mlmodel._kmeans_constraint_._inertia(X, sw)
Computes total weighted inertia.
- Parameters:
X – features
sw – sample weights
- Returns:
inertia
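As a point of reference, one common definition of total weighted inertia is the weighted sum of squared distances to the barycenter of the data. The sketch below follows that definition and is only an assumption about what the function computes. In particular, the use of a weighted barycenter is a choice made here, and the name total_weighted_inertia_sketch is hypothetical.

```python
import numpy as np


def total_weighted_inertia_sketch(X, sw):
    """Illustrative only: weighted sum of squared distances to the barycenter."""
    # Assumption: the reference point is the barycenter weighted by sw.
    bary = np.average(X, axis=0, weights=sw)
    # Squared Euclidean distance of every point to the barycenter.
    sq_dist = ((X - bary) ** 2).sum(axis=1)
    return float((sw * sq_dist).sum())


X = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
uniform = total_weighted_inertia_sketch(X, np.array([1.0, 1.0, 1.0]))
skewed = total_weighted_inertia_sketch(X, np.array([1.0, 1.0, 2.0]))
print(uniform, skewed)
```

With unit weights the weighted and unweighted barycenters coincide, so both conventions give the same value in the first call.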
- mlinsights.mlmodel._kmeans_constraint_._labels_inertia_weights(X, centers, sw, weights, labels, total_inertia)
Computes weighted inertia. It also adds a fraction of the whole inertia depending on how balanced the clusters are.
- Parameters:
X – features
centers – centers
sw – sample weights
weights – cluster weights
labels – labels
total_inertia – total inertia
- Returns:
inertia
- mlinsights.mlmodel._kmeans_constraint_._randomize_index(index, weights)
Randomizes index depending on the value. Swaps indexes. Modifies index.
- mlinsights.mlmodel._kmeans_constraint_._switch_clusters(labels, distances)
Tries to switch clusters. Modifies labels in place.
- Parameters:
labels – labels
distances – distances
- mlinsights.mlmodel._kmeans_constraint_.constraint_kmeans(X, labels, sample_weight, centers, inertia, iter, max_iter, strategy='gain', verbose=0, state=None, learning_rate=1.0, history=False, fLOG=None)
Completes the constraint k-means.
- Parameters:
X – features
labels – initialized labels (unused)
sample_weight – sample weights
centers – initialized centers
inertia – initialized inertia (unused)
iter – number of iterations already done
max_iter – maximum number of iterations
strategy – strategy used to sort observations before mapping them to clusters
verbose – verbosity level
state – random state
learning_rate – used by strategy ‘weights’
history – returns the list of centers across iterations
fLOG – logging function (must be specified, otherwise verbose has no effect)
- Returns:
tuple (best_labels, best_centers, best_inertia, iter, all_centers)
- mlinsights.mlmodel._kmeans_constraint_.constraint_predictions(X, centers, strategy, state=None)
Computes the predictions but tries to associate the same number of points with each cluster.
- Parameters:
X – features
centers – centers of each cluster
strategy – strategy used to sort points before mapping them to a cluster
state – random state
- Returns:
labels, distances, distances_close
- mlinsights.mlmodel._kmeans_constraint_.linearize_matrix(mat, *adds)
Linearizes a matrix into a new one with 3 columns: value, row, column. The output format is similar to csr_matrix but null values are kept.
- Parameters:
mat – matrix
adds – additional square matrices
- Returns:
new matrix
adds defines additional matrices. Each of them adds a column on the right side, filled with the corresponding values taken from that additional matrix.
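The layout described above can be reproduced for dense arrays with a few NumPy calls. This is a minimal sketch, not the library's code: the name linearize_matrix_sketch is hypothetical, inputs are assumed to be dense NumPy arrays, and every additional matrix is assumed to have the same shape as mat.

```python
import numpy as np


def linearize_matrix_sketch(mat, *adds):
    """Illustrative only: one output row per cell with columns value, row, column."""
    # Row and column index of every cell, in row-major order.
    rows, cols = np.indices(mat.shape)
    columns = [mat.ravel(), rows.ravel(), cols.ravel()]
    for add in adds:
        # Assumption: additional matrices share the shape of mat.
        if add.shape != mat.shape:
            raise ValueError("additional matrices must have the same shape as mat")
        # Each additional matrix contributes one extra column on the right.
        columns.append(add.ravel())
    return np.column_stack(columns)


mat = np.array([[1.0, 0.0], [3.0, 4.0]])
extra = np.array([[10.0, 20.0], [30.0, 40.0]])
flat = linearize_matrix_sketch(mat, extra)
print(flat)
```

The zero in mat still produces a row of its own, which is the difference from a sparse representation pointed out above.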