module graphhelper.corrplot
Short summary¶
module pyensae.graphhelper.corrplot
CorrPlot functionalities.
It comes from corrplot.py, which I copied here because the original module does not work properly on Python 3 (import issues). See also the biokit license.
- author: Thomas Cokelaer
- references: http://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html
Classes¶
class | truncated documentation |
---|---|
Corrplot | An implementation of correlation plotting tools (corrplot). the class requires scipy. … |
Methods¶
method | truncated documentation |
---|---|
__init__ | Plots the content of square matrix that contains correlation values. |
order | Rearranges the order of rows and columns after clustering. |
plot | Plots the correlation matrix from the content of df |
Documentation¶
- class pyensae.graphhelper.corrplot.Corrplot(data, na=0)¶
Bases: Linkage
An implementation of correlation plotting tools (corrplot). The class requires scipy.
Here is a simple example with a pandas dataframe as input. Its values fall outside [-1, 1], so the correlation is computed from it:
    import pandas
    import numpy
    import matplotlib.pyplot as plt
    from pyensae.graphhelper.corrplot import Corrplot

    # ten columns A..J, each shifted by its letter index
    letters = "ABCDEFGHIJKLM"[0:10]
    df = pandas.DataFrame(dict((
        (k, numpy.random.random(10) + ord(k) - 65) for k in letters)))

    plt.close('all')
    plt.style.use('ggplot')
    c = Corrplot(df)
    c.plot()
    plt.show()
This class requires the colormap module.
Plots the content of a square matrix that contains correlation values.
- Parameters:
data – input can be a pandas dataframe, a list of lists, or a numpy matrix. Values must be between -1 and 1. If they are not, or if the matrix (or list of lists) is not square, the correlation is computed. The data or computed correlation is stored in the df attribute.
na – replace NA values with this value (default 0)
The params attribute contains some tunable parameters for the colorbar in the plot() method.
    # can be a list of lists, the correlation matrix is then a 2x2 matrix
    c = Corrplot([[1,1], [2,4], [3,3], [4,4]])
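The input handling described above can be sketched with pandas alone. This is a minimal illustration, not the class's actual code: `to_correlation` is a hypothetical helper, and the order of the steps (correlation first, NA replacement second) is an assumption.

```python
import pandas


def to_correlation(data, na=0):
    """Hypothetical helper mirroring the documented input handling."""
    df = pandas.DataFrame(data, dtype=float)
    rows, cols = df.shape
    out_of_range = df.max().max() > 1 or df.min().min() < -1
    if rows != cols or out_of_range:
        # Not a ready-made correlation matrix: correlate the columns.
        df = df.corr()
    # Assumed order: NA values are replaced after the correlation step.
    return df.fillna(na)


# Four observations of two variables give a 2x2 correlation matrix.
corr = to_correlation([[1, 1], [2, 4], [3, 3], [4, 4]])
print(corr.shape)  # (2, 2)

# A square matrix with values in [-1, 1] is kept unchanged.
same = to_correlation([[1, 0.5], [0.5, 1]])
```

The list of lists has four rows and two columns, so it is not square and its columns are correlated with each other, which is why the result is 2x2.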
- __init__(data, na=0)¶
- _add_patches(df, method, fill, ax, diagonal=True)¶
- _set_default_cmap()¶
- order(method='complete', metric='euclidean', inplace=False)¶
Rearranges the order of rows and columns after clustering.
- Parameters:
method – any scipy linkage method (e.g., single, average, centroid, median, ward). See scipy.cluster.hierarchy.linkage.
metric – any scipy distance (e.g., euclidean, hamming, jaccard). See scipy.spatial.distance or scipy.cluster.hierarchy.
inplace (bool) – if set to True, the dataframe is replaced
You probably do not need to call this method directly. Use plot() with its order_method and order_metric parameters instead.
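To show what reordering after clustering means in practice, here is a minimal sketch built on scipy's hierarchical clustering. `reorder` is a hypothetical helper, and using `leaves_list` to obtain the new order is an assumption about one reasonable approach, not a description of what `order()` does internally.

```python
import pandas
from scipy.cluster import hierarchy


def reorder(df, method="complete", metric="euclidean"):
    """Hypothetical helper: reorder rows and columns by cluster leaves."""
    # Each row of the correlation matrix is treated as one observation.
    links = hierarchy.linkage(df.values, method=method, metric=metric)
    leaves = hierarchy.leaves_list(links)
    return df.iloc[leaves, leaves]


labels = ["A", "B", "C", "D"]
corr = pandas.DataFrame(
    [[1.0, 0.1, 0.9, 0.0],
     [0.1, 1.0, 0.2, 0.8],
     [0.9, 0.2, 1.0, 0.1],
     [0.0, 0.8, 0.1, 1.0]],
    index=labels, columns=labels)

ordered = reorder(corr)
print(list(ordered.columns))
```

In this toy matrix A and C are strongly correlated, as are B and D, so each pair ends up in adjacent rows and columns and the plot shows two visible blocks.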
- plot(fig=None, grid=True, rotation=30, lower=None, upper=None, shrink=0.9, axisbg='white', colorbar=True, label_color='black', fontsize='small', edgecolor='black', method='ellipse', order_method='complete', order_metric='euclidean', cmap=None, ax=None, binarise_color=False, figsize=None)¶
Plots the correlation matrix from the content of df (dataframe).
By default, the correlation is shown on both the upper and lower triangles and is symmetric with respect to the diagonal. The symbols are ellipses by default and can be changed, e.g. to rectangles. The same symbol is used on both sides unless you choose one symbol for the upper side and another for the lower side using the upper and lower parameters.
- Parameters:
fig – creates a new figure by default. If an existing figure instance is provided, the corrplot is overlaid on that figure. Can also be the number of the figure.
grid – add grid (Defaults to grey color). You can set it to False or a color.
rotation – rotate labels on y-axis
lower – if set to a valid method, plots the data on the lower left triangle
upper – if set to a valid method, plots the data on the upper right triangle
shrink (float) – maximum space used by a symbol, as a fraction of its cell. If a negative value is provided, its absolute value is taken. If greater than 1, the symbols will overlap.
axisbg – color of the background (defaults to white).
colorbar – add the colorbar (defaults to True).
label_color (str) – (defaults to black).
fontsize – size of the fonts (defaults to ‘small’).
method – shape to be used in ‘ellipse’, ‘square’, ‘rectangle’, ‘color’, ‘text’, ‘circle’, ‘number’, ‘pie’.
order_method – see order().
order_metric – see order().
cmap – a valid cmap from matplotlib or colormap package (e.g., ‘jet’, or ‘copper’). Default is red/white/blue colors.
binarise_color – two colors only, negative, positive
ax – a matplotlib axes.
figsize – passed to the newly created figure
- Returns:
ax (matplotlib axes)
The colorbar can be tuned with the parameters stored in params. Here is an example. See notebook for other examples:
    c = corrplot.Corrplot(dataframe)
    c.plot(cmap=('Orange', 'white', 'green'))
    c.plot(method='circle')
    c.plot(colorbar=False, shrink=.8, upper='circle')
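To make the method='ellipse' and shrink parameters more concrete, the sketch below draws one ellipse per matrix cell with plain matplotlib. It is an illustration under assumptions, not the code used by plot(): the mapping from correlation to ellipse height and tilt, the 3x3 sample matrix, and the use of the coolwarm colormap as a stand-in for the default red/white/blue colors are all choices made here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

# Sample correlation matrix (made up for the illustration).
corr = [[1.0, 0.8, -0.3],
        [0.8, 1.0, 0.1],
        [-0.3, 0.1, 1.0]]
shrink = 0.9  # each symbol fills at most 90% of its cell

fig, ax = plt.subplots(figsize=(4, 4))
for i, row in enumerate(corr):
    for j, r in enumerate(row):
        # Strong correlations give thin ellipses; the sign sets the tilt.
        ellipse = Ellipse((j, i), width=shrink, height=shrink * (1 - abs(r)),
                          angle=45 if r > 0 else -45,
                          facecolor=plt.cm.coolwarm((r + 1) / 2),
                          edgecolor="black")
        ax.add_patch(ellipse)

ax.set_xlim(-0.5, 2.5)
ax.set_ylim(2.5, -0.5)  # row 0 at the top, like a matrix
ax.set_aspect("equal")
fig.canvas.draw()  # render the figure in memory
```

Because every ellipse is at most `shrink` wide, values of shrink above 1 would make neighbouring symbols overlap, which matches the description of that parameter.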