module mlhelper.table_formula
Short summary¶
module pyensae.mlhelper.table_formula
Adds functionalities to a dataframe.
Classes¶
class | truncated documentation |
---|---|
TableFormula | Extends class pandas.DataFrame or proposes extensions to existing functions using lambda functions. See … |
Properties¶
property | truncated documentation |
---|---|
_can_fast_transpose | Can we transpose this DataFrame without creating any new array objects. |
_is_homogeneous_type | Whether all the columns in a DataFrame have the same type. Returns: bool. See Also … |
_is_view | Return boolean indicating if self is view of another array |
_values | Analogue to ._values that may return a 2D ExtensionArray. |
at | Access a single value for a row/column label pair. Similar to … |
attrs | Dictionary of global attributes of this dataset. |
axes | Return a list representing the axes of the DataFrame. It has the row axis labels and column axis labels as the … |
dtypes | Return the dtypes in the DataFrame. This returns a Series with the data type of each column. The result’s … |
empty | Indicator whether Series/DataFrame is empty. True if Series/DataFrame is entirely empty (no items), meaning any … |
flags | Get the properties associated with this pandas object. The available flags are … |
iat | Access a single value for a row/column pair by integer position. Similar to … |
iloc | Purely integer-location based indexing for selection by position. |
loc | Access a group of rows and columns by label(s) or a boolean array. |
ndim | Return an int representing the number of axes / array dimensions. Return 1 if Series. Otherwise return 2 if DataFrame. … |
shape | Return a tuple representing the dimensionality of the DataFrame. See Also: ndarray.shape … |
size | Return an int representing the number of elements in this object. Return the number of rows if Series. Otherwise … |
style | Returns a Styler object. Contains methods for building a styled HTML representation of the DataFrame. … |
T | The transpose of the DataFrame. Returns: DataFrame, the transposed DataFrame. … |
values | Return a Numpy representation of the DataFrame. |
Methods¶
method | truncated documentation |
---|---|
add_column_index | Changes the index. |
add_column_vector | Adds a column knowing its name and a vector of values. |
addc | Adds a column knowing its name and a lambda function. |
fgroupby | Groups information based on columns defined by lambda functions. |
sort | Sorts rows based on the values returned by function_sort. |
Documentation¶
Adds functionalities to a dataframe.
- class pyensae.mlhelper.table_formula.TableFormula(data=None, index: Axes | None = None, columns: Axes | None = None, dtype: Dtype | None = None, copy: bool | None = None)¶
Bases: DataFrame
Extends class pandas.DataFrame or proposes extensions to existing functions using lambda functions. See Extending Pandas.
- property _constructor¶
Used when a manipulation result has the same dimensions as the original.
- _mgr: BlockManager | ArrayManager¶
- add_column_index(index, name=None)¶
Changes the index.
- Parameters:
index – new index
name – name of the index
The changes happen in place.
- add_column_vector(name, values)¶
Adds a column knowing its name and a vector of values.
- Parameters:
name – name of the column
values – values
The changes happen in place.
- addc(name, function_value)¶
Adds a column knowing its name and a lambda function.
- Parameters:
name – name of the column
function_value – function
The changes happen in place.
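As a rough illustration of what adding a column from a lambda function amounts to, the sketch below uses plain pandas rather than TableFormula itself. It assumes that function_value receives one row at a time and reads values by column name, as the lambdas in the examples on this page do; the column names a, b and a_plus_b are made up for the illustration.

```python
import pandas as pd

# Small illustrative table; the column names are hypothetical.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Plain-pandas analogue of: table.addc("a_plus_b", lambda v: v["a"] + v["b"])
# apply(..., axis=1) calls the lambda once per row and collects the results.
df["a_plus_b"] = df.apply(lambda v: v["a"] + v["b"], axis=1)

print(df)
```

The assignment modifies df directly, which mirrors the in-place behaviour described for addc.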
- fgroupby(function_key, function_values, columns=None, function_agg=None, function_weight=None)¶
Groups information based on columns defined by lambda functions.
- Parameters:
function_key – defines the key
function_values – defines the values
columns – name of the columns, if None, new ones will be created
function_agg – how to aggregate the data; if None, the default is pandas DataFrame.sum.
function_weight – defines weights, can be None
The function uses columns __key__ and __weight__; you should not use these names. Other columns, __value_{0}__ and __weight_{0}__, are also created. All of them are created and removed before returning the result.
Example:
group = table.fgroupby(lambda v: v["name"], [lambda v: v["d_a"]], ["sum_d_a"], [lambda vec, w: sum(vec) / w], lambda v: v["d_b"])
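The temporary-column mechanism described above can be sketched in plain pandas. This is only an approximation under stated assumptions: the helper group_by_lambdas is hypothetical, it always aggregates with a sum, and it ignores function_agg and function_weight, whose exact semantics are not spelled out on this page.

```python
import pandas as pd


def group_by_lambdas(df, function_key, function_values, columns):
    """Hypothetical helper: group rows by a computed key and sum computed values."""
    tmp = df.copy()  # work on a copy so the temporary columns never leak out
    tmp["__key__"] = tmp.apply(function_key, axis=1)
    value_cols = []
    for i, function_value in enumerate(function_values):
        col = "__value_{0}__".format(i)
        tmp[col] = tmp.apply(function_value, axis=1)
        value_cols.append(col)
    grouped = tmp.groupby("__key__")[value_cols].sum()
    grouped.columns = columns  # replace the temporary names with the requested ones
    return grouped


df = pd.DataFrame({"name": ["a", "b", "a"], "d_a": [1.0, 2.0, 3.0]})
result = group_by_lambdas(df, lambda v: v["name"], [lambda v: v["d_a"]], ["sum_d_a"])
print(result)
```

Working on a copy is a design choice of this sketch; it keeps the reserved names out of the caller's table without needing a cleanup step.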
- graph_XY(curves, xlabel=None, ylabel=None, marker=True, link_point=False, title=None, format_date='%Y-%m-%d', legend_loc=0, figsize=None, ax=None)¶
- Parameters:
curves – list of 3-tuples (generator for X, generator for Y, label); for some layouts, it can also be a 4-tuple: (generator for X, generator for Y, generator for labels, label)
xlabel – label for X axis
ylabel – label for Y axis
marker – add a marker for each point
link_point – link points between them
title – graph title
format_date – if X axis is a datetime object, the function will use this format to print dates
legend_loc – location of the legend
figsize – size of the figure
ax – matplotlib Axes object, or None to create a new one
- Returns:
For the legend position, see matplotlib.
Example:
table.graph_XY([[lambda v: v["sum_a"], lambda v: v["sum_b"], "xy label 1"], [lambda v: v["sum_b"], lambda v: v["sum_c"], "xy label 2"]])
- sort(function_sort, reverse=False)¶
Sorts rows based on the values returned by function_sort.
- Parameters:
function_sort – lambda function
reverse – reverse order
The function creates a column __key__ and removes it later. The changes happen in place.
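The create-sort-remove sequence described above can be approximated in plain pandas as follows. The helper sort_by_function is hypothetical and is not the library's implementation; it assumes function_sort receives one row at a time and that reverse=True means descending order.

```python
import pandas as pd


def sort_by_function(df, function_sort, reverse=False):
    """Hypothetical helper: sort df in place by a key computed for each row."""
    df["__key__"] = df.apply(function_sort, axis=1)  # temporary sort key
    df.sort_values("__key__", ascending=not reverse, inplace=True)
    del df["__key__"]  # remove the temporary column again


df = pd.DataFrame({"name": ["b", "c", "a"], "score": [2, 3, 1]})
sort_by_function(df, lambda v: -v["score"])  # highest score first
print(df)
```

Because the key lives in a column named __key__, a table that already has a column with that name would lose it, which is why the name is best treated as reserved.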