.. _10plottinglibrariesrst:
=====================
10 plotting libraries
=====================
.. only:: html
**Links:** :download:`notebook <10_plotting_libraries.ipynb>`, :downloadlink:`html <10_plotting_libraries2html.html>`, :download:`PDF <10_plotting_libraries.pdf>`, :download:`python <10_plotting_libraries.py>`, :downloadlink:`slides <10_plotting_libraries.slides.html>`, :githublink:`GitHub|_doc/notebooks/2016/pydata/10_plotting_libraries.ipynb|*`
Review of plotting libraries.
`Xavier Dupré `__
``xavier.dupre AT gmail.com``
Senior Engineer at **Microsoft France** on `Azure
ML `__,
**Teacher in Computer Science** at the `ENSAE `__
|Azure ML| |ENSAE|
.. |Azure ML| image:: logo_azureml.png
.. |ENSAE| image:: ENSAE_logo_developpe.jpg
**Objectives of this talk**
Nobody makes a plot without an existing library anymore.
- How to choose a plotting library?
- List of available options
- How to extend an existing library?
- How to wrap a javascript library?
.. code:: ipython3
from jyquickhelper import add_notebook_menu
add_notebook_menu(last_level=2)
.. contents::
:local:
**Material**
- Notebooks for this talk:
`http://www.xavierdupre.fr/… `__
- Azure ML: `Introducing Jupyter Notebooks in Azure ML
Studio `__
- Teachings at ENSAE: `Python pour un Data
Scientist `__
**Microsoft, Python and Open Source**
- 2014/11: `.NET Core is Open
Source `__
- 2015/07: `Introducing Jupyter Notebooks in Azure ML
Studio `__
- 2015/07: `Python Tools for Visual
Studio `__ moves to Github
- 2016/02: `Creating web apps with Flask in
Azure `__
- 2016/06: `Build Machine Learning applications to run on Apache Spark
clusters on HDInsight
Linux `__
- 2016/06: `azure-sdk-python
2.0.rc4 `__: Python
interface to access Azure services
.. figure:: img_ptvs.png
:alt: ptvs
ptvs
**Microsoft in Data Science**
- `Developing the Next Wave of Data
Scientists `__
- Microsoft is one of the sponsors of the
`DataScienceGame `__
`Microsoft - ENSAE -
Hackathon `__
Elements of decision
--------------------
.. code:: ipython3
add_notebook_menu(keep_item=0)
.. contents::
:local:
Graph language
~~~~~~~~~~~~~~
We like graphs because we read them faster.
.. code:: ipython3
%matplotlib inline
.. code:: ipython3
from jupytalk.talk_examples.pydata2016 import example_cartopy
ax = example_cartopy()
ax.set_title("map", size=20);
.. image:: 10_plotting_libraries_11_0.png
.. code:: ipython3
import numpy, matplotlib.pyplot as plt
N = 150
x, y = numpy.random.normal(0, 1, N), numpy.random.normal(0, 1, N)
x[-1], y[-1] = 8, 5
plt.scatter(x, y, alpha=0.5)
plt.title("outlier", size=20)
.. parsed-literal::
Text(0.5,1,'outlier')
.. image:: 10_plotting_libraries_12_1.png
.. code:: ipython3
import numpy, matplotlib.pyplot as plt
N = 150
x = numpy.random.normal(0, 1, N)
y = x + numpy.random.normal(0, 0.5, N) + 1
plt.scatter(x, y, alpha=0.5)
plt.title("correlation", size=20)
.. parsed-literal::
Text(0.5,1,'correlation')
.. image:: 10_plotting_libraries_13_1.png
.. code:: ipython3
from jupytalk.talk_examples.pydata2016 import example_confidence_interval
ax = example_confidence_interval()
# https://github.com/sdpython/jupytalk/blob/master/src/jupytalk/talk_examples/pydata2016.py
ax.set_title("uncertainty", size=20)
.. parsed-literal::
Text(0.5,1,'uncertainty')
.. image:: 10_plotting_libraries_14_1.png
.. code:: ipython3
from jupytalk.talk_examples.pydata2016 import example_networkx
ax = example_networkx()
# https://github.com/sdpython/jupytalk/blob/master/src/jupytalk/talk_examples/pydata2016.py
ax.set_title("network", size=20)
.. parsed-literal::
Text(0.5,1,'network')
.. image:: 10_plotting_libraries_15_1.png
Why so many?
~~~~~~~~~~~~
- Every domain has its own data representation (statistics, machine
learning, biology, maps…)
- Many output media (images, web sites, notebooks)
- High volumes of data require specific solutions (maps)
Example: seaborn
~~~~~~~~~~~~~~~~
`seaborn `__
- a collection of plots useful for any new project
- See
`regplot `__.
.. code:: ipython3
import seaborn; seaborn.set(color_codes=True)
tips = seaborn.load_dataset("tips")
ax = seaborn.regplot(x="total_bill", y="tip", data=tips)
ax.set_title("regplot")
.. parsed-literal::
Text(0.5,1,'regplot')
.. image:: 10_plotting_libraries_18_2.png
Why use a programming language to plot?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+----------------------------------------------------+-----------------+
| Justification | Case |
+====================================================+=================+
| **automate** complex graph | **update** a |
| | presentation |
+----------------------------------------------------+-----------------+
| **share** customized graph | easier to read |
| | among a team, |
| | build a common |
| | **graph |
| | language** |
+----------------------------------------------------+-----------------+
| **combine** data processing and plotting | handle **huge |
| | volume** of |
| | data |
+----------------------------------------------------+-----------------+
What did the Internet change?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- **Remote access:** interacting with the graph is cheaper than drawing it again
- Many plotting libraries: `javascript plotting
libraries `__
- `20 best JavaScript charting
libraries `__
Impact of notebook on Python
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- **Before:** graph libraries were **mostly static** (images)
- **After:** graphs are now **interactive**
- Notebooks can easily leverage javascript libraries
Decisions
~~~~~~~~~
**Decision 1: the audience?**
- The plot is just for you?
- The plot will be inserted in a report? In a PowerPoint presentation?
- The plot will be internally shared?
- The plot will be shared with customers on a website?
**Decision 2: which volume of data to plot?**
- How many points to draw: 10,000, 1M, 1B?
- How fast do you need to draw?
- Do you need to preprocess the data?
**Decision 3: which technology?**
- **static** *(image, PDF, no zoom)*
- `matplotlib `__ based
- `reportlab `__ based
- `Pillow `__ based
- **interactive** *(zoom, move, not always great in a book)*
- javascript based
- Python and javascript based
- **pure javascript** *(if you don’t find what you want)*
- from a notebook
- from a web page
**Final check: is the library maintained?**
- License: is it free only for research?
- Sources are available on GitHub: is the last commit recent?
- The library was mentioned in a conference.
- The library is used by many others to create customized graphs?
- It works on many platforms.
- The documentation is great.
Libraries for static plots
--------------------------
.. code:: ipython3
add_notebook_menu(keep_item=1)
.. contents::
:local:
Static never fails
~~~~~~~~~~~~~~~~~~
- Images work anywhere
- Images are self-contained
- Easy to combine
.. figure:: img_combine.png
:alt: combine
combine
Five steps to plot
~~~~~~~~~~~~~~~~~~
1. Create a **figure**: pixel system.
2. Create **Axis**: coordinate system.
3. Draw **inside** the plotting area
4. Add elements **outside** the plotting area
5. **Render** the image.
.. figure:: img_step5.png
:alt: step5
step5
matplotlib for all
^^^^^^^^^^^^^^^^^^
`matplotlib `__: the standard
.. code:: ipython3
import numpy as np, matplotlib.pyplot as plt
N = 50
x, y, colors = np.random.rand(N), np.random.rand(N), np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2
fig, ax = plt.subplots() # steps 1, 2
ax.scatter(x, y, s=area, c=colors, alpha=0.5) # step 3
ax.set_title("scatter plot") # step 4
fig.savefig("example_scatterplot.png") # step 5
.. image:: 10_plotting_libraries_31_0.png
networkx for networks
^^^^^^^^^^^^^^^^^^^^^
`networkx `__
.. figure:: img_networkx.png
:alt: networkx
networkx
seaborn for statistics
^^^^^^^^^^^^^^^^^^^^^^
`seaborn `__
.. figure:: img_seaborn.png
:alt: seaborn
seaborn
basemap for maps
^^^^^^^^^^^^^^^^
+----------------------------------------------+------------+
| `basemap `__ | . |
+==============================================+============+
| |basemap| | |basemap2| |
+----------------------------------------------+------------+
See also `cartopy `__
.. |basemap| image:: img_basemap.png
.. |basemap2| image:: img_basemap2.png
ete3 for trees
^^^^^^^^^^^^^^
`ete3 `__
.. figure:: img_ete3.png
:alt: ete3
ete3
reportlab for pdf
^^^^^^^^^^^^^^^^^
`reportlab `__: standard for PDF
.. figure:: img_reportlab.png
:alt: reportlab
reportlab
plotnine for the syntax
^^^^^^^^^^^^^^^^^^^^^^^
`plotnine `__
.. figure:: img_ggplot_code.png
:alt: plotnine
plotnine
.. figure:: img_ggplot.png
:alt: plotnine
plotnine
missingno for the missing values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`missingno `__
.. figure:: img_missingno.png
:alt: missingno
missingno
biopython for genes
^^^^^^^^^^^^^^^^^^^
`biopython `__
.. figure:: img_biopython.png
:alt: biopython
biopython
lifelines for survival analysis
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`lifelines `__
.. figure:: img_lifelines.png
:alt: lifelines
lifelines
In short
~~~~~~~~
- Many libraries available in many domains.
- Many scripts available (github, stackoverflow)
- Look for the gallery.
- Pick the closest graph to your needs
- Tweak
Libraries for interactivity
---------------------------
.. code:: ipython3
add_notebook_menu(keep_item=2)
.. contents::
:local:
Interactivity is javascript
~~~~~~~~~~~~~~~~~~~~~~~~~~~
- A browser is needed
- A server might be needed (bqplot)
- Better to know javascript
Steps to plot
~~~~~~~~~~~~~
1. Create a figure
2. Create Axis: coordinate system
3. Draw inside the plotting area
4. Add elements outside the plotting area
5. **Implement interactivity if not automated**
6. Write the corresponding HTML, Javascript code
bokeh for all
^^^^^^^^^^^^^
`bokeh `__ **default interactivity:** zoom,
move, reset; **custom** python, javascript
.. figure:: img_bokeh.png
:alt: bokeh
bokeh
plotly for its design
^^^^^^^^^^^^^^^^^^^^^
`plotly `__ **default interactivity:** zoom, move,
reset, text popup; **plus** integration with pandas
.. figure:: img_plotly.png
:alt: plotly
plotly
mpld3 for matplotlib
^^^^^^^^^^^^^^^^^^^^
`mpld3 `__ = matplotlib in javascript; **default
interactivity:** zoom, move, reset; **custom** python, javascript
(simple)
.. figure:: img_mpld3.png
:alt: mpld3
mpld3
python-lightning for its simplicity
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`python-lightning `__ also works with R
**default interactivity:** zoom, move, reset, text popup
.. figure:: img_lightning2.png
:alt: lightning
lightning
pygal, leather for SVG
^^^^^^^^^^^^^^^^^^^^^^
`pygal `__
`leather `__
**default interactivity:** text popup
.. figure:: img_pygal.png
:alt: pygal
pygal
vega for its simplicity
^^^^^^^^^^^^^^^^^^^^^^^
`vega `__ **default interactivity:**
text popup
.. figure:: img_vega2.png
:alt: vega
vega
folium for maps
^^^^^^^^^^^^^^^
`folium `__ = map with
`OpenStreetMap `__; **default
interactivity:** zoom, move, reset; **custom** text popup, marker
.. code:: ipython3
import folium
center = [48.862, 2.346]
paris = folium.Map(center, zoom_start=13)
folium.Marker(center, popup='Les Halles').add_to(paris)
paris
pythreejs for 3D
^^^^^^^^^^^^^^^^
`pythreejs `__ **default
interactivity:** zoom, move, rotate, reset
.. figure:: screencast.gif
:alt: pythreejs
pythreejs
pydy for mechanics
^^^^^^^^^^^^^^^^^^
`pydy `__ **default
interactivity:** visualize a scene
.. figure:: img_pydy.png
:alt: pydy
pydy
In short
~~~~~~~~
What are you looking for?
- Standard interactivity (all of them)
- Custom interactivity (Python, Javascript) (bokeh)
- Easy export to websites (SVG, vega)
Libraries mixing Javascript, Python, …
--------------------------------------
.. code:: ipython3
add_notebook_menu(keep_item=3)
.. contents::
:local:
Hide the complexity
~~~~~~~~~~~~~~~~~~~
- Mix of technologies
- Wrapped in one module
- Easy examples
- But cryptic bugs for newbies
bqplot for the interactions in python
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`bqplot `__
.. figure:: img_bqplot.png
:alt: bqplot
bqplot
brython, brythonmagic to avoid javascript
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Magic command ``%%brython``
- easy to modify the notebook with Python
- no javascript
- place to start if you don’t like javascript
.. figure:: img_brython.png
:alt: brython
brython
geoplotlib for maps in a GUI
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`geoplotlib `__
.. figure:: img_geoplotlib.png
:alt: geoplotlib
geoplotlib
vispy for computational graphics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`vispy <http://vispy.org/installation.html>`__ creates graphs that demand
heavy computation. It requires some knowledge of C++.
.. figure:: img_vispy_mandelbrot.png
:alt: vispy
vispy
In short
~~~~~~~~
- Well suited to research purposes
- Results are not easy to export
Libraries for high volume of data
---------------------------------
.. code:: ipython3
add_notebook_menu(keep_item=4)
.. contents::
:local:
Challenge
~~~~~~~~~
Two extremes:
- Plotting a huge volume of data takes time to process
- Interactivity requires fast processing
Compromise?
datashader
~~~~~~~~~~
`datashader `__ = bokeh + Python
interaction + data interpolation
.. figure:: img_datashader.png
:alt: datashader
datashader
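
One way to reconcile both extremes is to aggregate the data on a grid the
size of the image before drawing anything, so that the cost of rendering
depends on the number of pixels and not on the number of points. The
sketch below only illustrates that idea in pure Python. It does not use
datashader's API, and the function name ``aggregate`` and the grid size
are assumptions made for this example.

.. code:: ipython3

    import random
    from collections import Counter


    def aggregate(points, xlim, ylim, width=80, height=40):
        """Counts the points falling in each cell of a width x height grid."""
        (xmin, xmax), (ymin, ymax) = xlim, ylim
        counts = Counter()
        for x, y in points:
            if not (xmin <= x < xmax and ymin <= y < ymax):
                continue  # outside the visible window
            col = int((x - xmin) / (xmax - xmin) * width)
            row = int((y - ymin) / (ymax - ymin) * height)
            # guard against floating point rounding on the upper border
            counts[min(row, height - 1), min(col, width - 1)] += 1
        return counts


    random.seed(0)
    points = ((random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200000))
    grid = aggregate(points, xlim=(-4, 4), ylim=(-4, 4))
    print(len(grid), "cells to draw instead of", sum(grid.values()), "points")

Zooming then means calling the aggregation again with a narrower window,
which is why this kind of interactivity needs Python running behind the
graph.
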
In short
~~~~~~~~
Work in progress.
Deeper into programming
-----------------------
.. code:: ipython3
add_notebook_menu(keep_item=5)
.. contents::
:local:
Extend an existing library
~~~~~~~~~~~~~~~~~~~~~~~~~~
- Follow existing design
- Constraints:
- Add the plot to an existing one
- Add complementary elements
Wrong design
^^^^^^^^^^^^
.. figure:: img_wrong2.png
:alt: wrong
wrong
Right design
^^^^^^^^^^^^
.. figure:: img_right.png
:alt: right
right
Parameter ``ax``
^^^^^^^^^^^^^^^^
.. figure:: img_subplots.png
:alt: subplots
subplots
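
The two constraints above are commonly met with an optional ``ax``
parameter: the function draws on the axes it receives, creates them only
when none is given, and returns them. The sketch below is a minimal
illustration of that convention. The function ``plot_with_mean`` is
hypothetical and is not taken from the talk's code.

.. code:: ipython3

    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, not needed in a notebook
    import matplotlib.pyplot as plt
    import numpy as np


    def plot_with_mean(x, y, ax=None, **kwargs):
        """Hypothetical custom graph: a scatter plot and a cross on the mean.

        Draws on *ax* if given, otherwise creates a new figure.
        """
        if ax is None:
            _, ax = plt.subplots()
        ax.scatter(x, y, alpha=0.5, **kwargs)
        ax.plot([np.mean(x)], [np.mean(y)], "r+", markersize=15)
        return ax  # returned so the caller can add complementary elements


    rng = np.random.RandomState(0)
    fig, axes = plt.subplots(1, 2, figsize=(8, 3))
    for ax, scale in zip(axes, [0.2, 1.0]):
        x = rng.normal(0, 1, 50)
        y = x + rng.normal(0, scale, 50)
        out = plot_with_mean(x, y, ax=ax)       # added to an existing figure
        out.set_title("noise=%1.1f" % scale)    # complementary element

Because the function never calls ``plt.figure`` or ``plt.show`` on its
own, the same code works alone, inside a grid of subplots, or on top of
another graph.
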
Wrapping a javascript library
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Chosen from `10 JavaScript libraries to draw your own
diagrams `__
- Search for ``python + ``
- `JointJS `__
- `ChartJS `__
Dummy example with treant
^^^^^^^^^^^^^^^^^^^^^^^^^
- `treant-js `__: `tennis
draw `__
- `notebook `__
- `code `__
.. figure:: img_treant.png
:alt: treant
treant
Part 1: HTML
^^^^^^^^^^^^
- a DIV with an id
- a script for the library
.. figure:: img_thtml.png
:alt: html
html
Part 2: JSON data
^^^^^^^^^^^^^^^^^
- JSON most of the time
.. figure:: img_tdata.png
:alt: json
json
Part 3: javascript
^^^^^^^^^^^^^^^^^^
.. figure:: img_tjs2.png
:alt: js
js
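
The three parts can be assembled from Python with a string template and
the ``json`` module. The sketch below assumes treant-js is loaded from
the files ``Treant.css``, ``raphael.js`` and ``Treant.js`` in the current
folder; these paths and the function name ``render_tree`` are assumptions
made for this example and should be adapted to where the library is
actually served from.

.. code:: ipython3

    import json
    import uuid

    # Part 1 (HTML) and part 3 (javascript) in one template.
    TEMPLATE = """<link rel="stylesheet" href="{css}">
    <div id="{div_id}" style="width:{width}px;height:{height}px;"></div>
    <script src="{raphael}"></script>
    <script src="{treant}"></script>
    <script>
    var chart_config = {config};
    new Treant(chart_config);
    </script>"""


    def render_tree(node_structure, width=600, height=300, css="Treant.css",
                    raphael="raphael.js", treant="Treant.js"):
        """Returns the HTML string drawing *node_structure* with treant-js."""
        div_id = "tree-" + uuid.uuid4().hex[:8]  # unique id, one DIV per graph
        # Part 2: the data, converted from Python to JSON.
        config = {"chart": {"container": "#" + div_id},
                  "nodeStructure": node_structure}
        return TEMPLATE.format(css=css, div_id=div_id, width=width,
                               height=height, raphael=raphael, treant=treant,
                               config=json.dumps(config))


    draw = {"text": {"name": "final"},
            "children": [{"text": {"name": "semi-final 1"}},
                         {"text": {"name": "semi-final 2"}}]}
    html = render_tree(draw)

In a notebook, the string can be displayed with
``IPython.display.HTML(html)``. Depending on how the notebook loads
scripts, the library may need to be loaded differently (for example with
``require``), so this is a starting point and not a complete wrapper.
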
In short
~~~~~~~~
- Practice with existing libraries first
- Think about other users
Conclusion
----------
- Static images are not obsolete!
- Interactivity still requires a bit of work.
- Handling huge volumes of data is still a work in progress
- Easy to create your own library
**A good sketch is better than a long speech.** *Napoléon Bonaparte*
**This is only the beginning**
*Thank you*
- http://www.xavierdupre.fr/
- ``xavier.dupre AT gmail.com``