module `pandashelper.tblformat`

Short summary

module pyquickhelper.pandashelper.tblformat

Functions to format a pandas dataframe.

Functions

function	truncated documentation
`df2html`	Converts the table into a html string.
`df2rst`	Builds a string in RST format from a dataframe.
`enumerate_split_df`	Splits a dataframe by columns to display shorter dataframes.

Documentation

pyquickhelper.pandashelper.tblformat.df2html(self, class_table=None, class_td=None, class_tr=None, class_th=None)

Converts the table into a html string.

Parameters:

self – dataframe (to be added as a class method)
class_table – class attribute added to the table tag (None to add nothing)
class_td – class attribute added to every td tag (None to add nothing)
class_tr – class attribute added to every tr tag (None to add nothing)
class_th – class attribute added to every th tag (None to add nothing)

Returns:

HTML
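
The effect of the four class parameters can be illustrated without the library. The sketch below is not the implementation of `df2html`; `rows_to_html` is a hypothetical helper that takes plain lists instead of a dataframe and only shows how an optional class could end up on each tag.

```python
from html import escape


def rows_to_html(columns, rows, class_table=None, class_td=None,
                 class_tr=None, class_th=None):
    """Hypothetical helper: render rows as an HTML table with optional classes."""

    def tag(name, css_class):
        # Emit <name> or <name class="..."> depending on css_class.
        if css_class is None:
            return "<%s>" % name
        return '<%s class="%s">' % (name, escape(css_class, quote=True))

    lines = [tag("table", class_table)]
    header = "".join(tag("th", class_th) + escape(str(c)) + "</th>"
                     for c in columns)
    lines.append(tag("tr", class_tr) + header + "</tr>")
    for row in rows:
        # Cell values are escaped so that characters like '<' stay valid HTML.
        cells = "".join(tag("td", class_td) + escape(str(v)) + "</td>"
                        for v in row)
        lines.append(tag("tr", class_tr) + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html = rows_to_html(["A", "B"], [[0, "text"], [1, "a < b"]],
                    class_table="dataframe")
print(html)
```

Leaving a class parameter to None keeps the corresponding tag bare, which mirrors the "None" convention described in the parameter list.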

pyquickhelper.pandashelper.tblformat.df2rst(df, add_line=True, align='l', column_size=None, index=False, list_table=False, title=None, header=True, sep=',', number_format=None, replacements=None, split_row=None, split_row_level='+', split_col_common=None, split_col_subsets=None, filter_rows=None, label_pattern=None)

Builds a string in RST format from a dataframe.

Parameters:

df – dataframe
add_line – (bool) add a line separator between each row
align – alignment, one of 'r', 'l' or 'c'
column_size – a list such as [1, 2, 5] to multiply the column sizes, or a dictionary (if list_table is False) to overwrite the size of specific columns, such as {'col_name1': 20} or {3: 20}
index – add the index
list_table – use the list-table directive instead of a grid table
title – used only if list_table is True
header – add one header
sep – separator if df is a string and is a filename to load
number_format – formats numbers in a specific way; if number_format is an integer, the pattern is replaced by {numpy.float64: '{:.2g}'} (if number_format is 2), see also pyformat.info
replacements – replacements just before converting into RST (dictionary)
split_row – displays several tables; one column is used as the name of each section
split_row_level – title level if option split_row is used
split_col_common – splits the dataframe by columns, see enumerate_split_df
split_col_subsets – splits the dataframe by columns, see enumerate_split_df
filter_rows – None or a function that removes rows, with signature def filter_rows(df: DataFrame) -> DataFrame
label_pattern – if split_row is used, the function may insert a label in front of every section, example: ".. _lpy-{section}:"
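
Two of the parameters above are string patterns. The sketch below uses only built-in string formatting to show what the pattern '{:.2g}' does to a few floats and how a label template with a {section} placeholder expands. That `df2rst` fills the placeholder with `str.format` is an assumption, and the section name "ReduceMax" is only an illustrative value.

```python
# number_format=2 is described as the pattern '{:.2g}':
# general format with two significant digits.
pattern = "{:.2g}"
formatted = [pattern.format(v) for v in (0.0, 1e-5, 3.14159, 1234.5)]
print(formatted)  # ['0', '1e-05', '3.1', '1.2e+03']

# label_pattern is a template with a {section} placeholder
# (assumed here to be filled with str.format).
label = ".. _lpy-{section}:".format(section="ReduceMax")
print(label)  # .. _lpy-ReduceMax:
```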

Returns:

string

If list_table is False, the function produces an RST grid table. None values are replaced by an empty string (4 spaces). The result looks as follows:

+------------------------+------------+----------+----------+
| Header row, column 1   | Header 2   | Header 3 | Header 4 |
| (header rows optional) |            |          |          |
+========================+============+==========+==========+
| body row 1, column 1   | column 2   | column 3 | column 4 |
+------------------------+------------+----------+----------+
| body row 2             | ...        | ...      |          |
+------------------------+------------+----------+----------+
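
The grid layout above can be reproduced with a few lines of plain Python. This is a minimal sketch of the format, not the code of `df2rst`: `grid_table` is a hypothetical helper that takes a list of dictionaries instead of a dataframe, sizes each column to its longest cell, and renders missing values as empty cells.

```python
def grid_table(columns, rows):
    """Hypothetical helper: build an RST grid table from a list of dict rows."""
    # Missing or None values become empty cells.
    cells = [["" if r.get(c) is None else str(r[c]) for c in columns]
             for r in rows]
    # Each column is as wide as its longest cell, header included.
    widths = [max([len(str(c))] + [len(row_cells[i]) for row_cells in cells])
              for i, c in enumerate(columns)]

    def border(char):
        return "+" + "+".join(char * (w + 2) for w in widths) + "+"

    def format_row(values):
        padded = (" " + v.ljust(w) + " " for v, w in zip(values, widths))
        return "|" + "|".join(padded) + "|"

    # The header is separated from the body by a line of '='.
    out = [border("-"), format_row([str(c) for c in columns]), border("=")]
    for values in cells:
        out.append(format_row(values))
        out.append(border("-"))
    return "\n".join(out)


rows = [{"A": 0.0, "B": "text"}, {"A": 1e-05, "C": "longer text"}]
table = grid_table(["A", "B", "C"], rows)
print(table)
```

The sketch always emits a separator after each row, which corresponds to add_line=True; alignment and column_size are not handled.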

If list_table is True, the format is the following:

.. list-table:: title
    :widths: 15 10 30
    :header-rows: 1

    * - Treat
      - Quantity
      - Description
    * - Albatross
      - 2.99
      - anythings
    ...
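
The list-table variant is simpler to generate because no column has to be padded. The following sketch, again independent of the library, uses a hypothetical helper `list_table` to emit the directive, its options, and one bullet group per row, with the header as the first group.

```python
def list_table(columns, rows, title="", widths=None):
    """Hypothetical helper: build an RST list-table directive from rows."""
    out = [".. list-table:: " + title if title else ".. list-table::"]
    if widths is not None:
        out.append("    :widths: " + " ".join(str(w) for w in widths))
    out.append("    :header-rows: 1")
    out.append("")
    # The header is the first group; every row starts with '* -',
    # the remaining cells of the row with '  -'.
    for values in [columns] + list(rows):
        for i, value in enumerate(values):
            prefix = "    * - " if i == 0 else "      - "
            out.append(prefix + str(value))
    return "\n".join(out)


text = list_table(["Treat", "Quantity", "Description"],
                  [["Albatross", 2.99, "anythings"]],
                  title="title", widths=[15, 10, 30])
print(text)
```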

Convert a dataframe into RST

<<<

from pandas import DataFrame
from pyquickhelper.pandashelper import df2rst

df = DataFrame([{'A': 0, 'B': 'text'},
                {'A': 1e-5, 'C': 'longer text'}])
print(df2rst(df))

>>>

    +-------+------+-------------+
    | A     | B    | C           |
    +=======+======+=============+
    | 0.0   | text |             |
    +-------+------+-------------+
    | 1e-05 |      | longer text |
    +-------+------+-------------+

Convert a dataframe into markdown

<<<

from io import StringIO
from textwrap import dedent
import pandas

from_excel = dedent('''
Op;axes;shape;SpeedUp
ReduceMax;(3,);(8, 24, 48, 8);2.96
ReduceMax;(3,);(8, 24, 48, 16);2.57
ReduceMax;(3,);(8, 24, 48, 32);2.95
ReduceMax;(3,);(8, 24, 48, 64);3.28
ReduceMax;(3,);(8, 24, 48, 100);3.05
ReduceMax;(3,);(8, 24, 48, 128);3.11
ReduceMax;(3,);(8, 24, 48, 200);2.86
ReduceMax;(3,);(8, 24, 48, 256);2.50
ReduceMax;(3,);(8, 24, 48, 400);2.48
ReduceMax;(3,);(8, 24, 48, 512);2.90
ReduceMax;(3,);(8, 24, 48, 1024);2.76
ReduceMax;(0,);(8, 24, 48, 8);19.29
ReduceMax;(0,);(8, 24, 48, 16);11.83
ReduceMax;(0,);(8, 24, 48, 32);5.69
ReduceMax;(0,);(8, 24, 48, 64);5.49
ReduceMax;(0,);(8, 24, 48, 100);6.13
ReduceMax;(0,);(8, 24, 48, 128);6.27
ReduceMax;(0,);(8, 24, 48, 200);5.46
ReduceMax;(0,);(8, 24, 48, 256);4.76
ReduceMax;(0,);(8, 24, 48, 400);2.21
ReduceMax;(0,);(8, 24, 48, 512);4.52
ReduceMax;(0,);(8, 24, 48, 1024);4.38
ReduceSum;(3,);(8, 24, 48, 8);1.79
ReduceSum;(3,);(8, 24, 48, 16);0.79
ReduceSum;(3,);(8, 24, 48, 32);1.67
ReduceSum;(3,);(8, 24, 48, 64);1.19
ReduceSum;(3,);(8, 24, 48, 100);2.08
ReduceSum;(3,);(8, 24, 48, 128);2.96
ReduceSum;(3,);(8, 24, 48, 200);1.66
ReduceSum;(3,);(8, 24, 48, 256);2.26
ReduceSum;(3,);(8, 24, 48, 400);1.76
ReduceSum;(3,);(8, 24, 48, 512);2.61
ReduceSum;(3,);(8, 24, 48, 1024);2.21
ReduceSum;(0,);(8, 24, 48, 8);2.56
ReduceSum;(0,);(8, 24, 48, 16);2.05
ReduceSum;(0,);(8, 24, 48, 32);3.04
ReduceSum;(0,);(8, 24, 48, 64);2.57
ReduceSum;(0,);(8, 24, 48, 100);2.41
ReduceSum;(0,);(8, 24, 48, 128);2.77
ReduceSum;(0,);(8, 24, 48, 200);2.02
ReduceSum;(0,);(8, 24, 48, 256);1.61
ReduceSum;(0,);(8, 24, 48, 400);1.59
ReduceSum;(0,);(8, 24, 48, 512);1.48
ReduceSum;(0,);(8, 24, 48, 1024);1.50
''')

df = pandas.read_csv(StringIO(from_excel), sep=";")
print(df.columns)

sub = df[["Op", "axes", "shape", "SpeedUp"]]
piv = df.pivot_table(values="SpeedUp", index=['axes', "shape"], columns="Op")
piv = piv.reset_index(drop=False)

print(piv.to_markdown(index=False))

>>>

    Index(['Op', 'axes', 'shape', 'SpeedUp'], dtype='object')
    | axes   | shape             |   ReduceMax |   ReduceSum |
    |:-------|:------------------|------------:|------------:|
    | (0,)   | (8, 24, 48, 100)  |        6.13 |        2.41 |
    | (0,)   | (8, 24, 48, 1024) |        4.38 |        1.5  |
    | (0,)   | (8, 24, 48, 128)  |        6.27 |        2.77 |
    | (0,)   | (8, 24, 48, 16)   |       11.83 |        2.05 |
    | (0,)   | (8, 24, 48, 200)  |        5.46 |        2.02 |
    | (0,)   | (8, 24, 48, 256)  |        4.76 |        1.61 |
    | (0,)   | (8, 24, 48, 32)   |        5.69 |        3.04 |
    | (0,)   | (8, 24, 48, 400)  |        2.21 |        1.59 |
    | (0,)   | (8, 24, 48, 512)  |        4.52 |        1.48 |
    | (0,)   | (8, 24, 48, 64)   |        5.49 |        2.57 |
    | (0,)   | (8, 24, 48, 8)    |       19.29 |        2.56 |
    | (3,)   | (8, 24, 48, 100)  |        3.05 |        2.08 |
    | (3,)   | (8, 24, 48, 1024) |        2.76 |        2.21 |
    | (3,)   | (8, 24, 48, 128)  |        3.11 |        2.96 |
    | (3,)   | (8, 24, 48, 16)   |        2.57 |        0.79 |
    | (3,)   | (8, 24, 48, 200)  |        2.86 |        1.66 |
    | (3,)   | (8, 24, 48, 256)  |        2.5  |        2.26 |
    | (3,)   | (8, 24, 48, 32)   |        2.95 |        1.67 |
    | (3,)   | (8, 24, 48, 400)  |        2.48 |        1.76 |
    | (3,)   | (8, 24, 48, 512)  |        2.9  |        2.61 |
    | (3,)   | (8, 24, 48, 64)   |        3.28 |        1.19 |
    | (3,)   | (8, 24, 48, 8)    |        2.96 |        1.79 |

NaN values are replaced by an empty string even if number_format is not None.

pyquickhelper.pandashelper.tblformat.enumerate_split_df(df, common, subsets)

Splits a dataframe by columns to display shorter dataframes.

Parameters:

df – dataframe
common – common columns
subsets – subsets of columns

Returns:

split dataframes

<<<

from pandas import DataFrame
from pyquickhelper.pandashelper.tblformat import enumerate_split_df

df = DataFrame([{'A': 0, 'B': 'text'},
                {'A': 1e-5, 'C': 'longer text'}])
res = list(enumerate_split_df(df, ['A'], [['B'], ['C']]))
print(res[0])
print('-----')
print(res[1])

>>>

             A     B
    0  0.00000  text
    1  0.00001   NaN
    -----
             A            C
    0  0.00000          NaN
    1  0.00001  longer text
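
The idea behind the split can be written down in a few lines. The sketch below is an illustration on plain dictionaries rather than the library's code: `split_columns` is a hypothetical generator that yields, for each subset, the common columns followed by the columns of that subset, with None where a row has no value.

```python
def split_columns(rows, common, subsets):
    """Hypothetical helper: yield one narrower table per subset of columns."""
    for subset in subsets:
        keep = list(common) + list(subset)
        # A column missing from a row is reported as None.
        yield [{c: row.get(c) for c in keep} for row in rows]


rows = [{"A": 0, "B": "text"}, {"A": 1e-5, "C": "longer text"}]
parts = list(split_columns(rows, ["A"], [["B"], ["C"]]))
print(parts[0])
print("-----")
print(parts[1])
```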
