.. _10plottinglibrariesrst:

=====================
10 plotting libraries
=====================

.. only:: html

    **Links:** :download:`notebook <10_plotting_libraries.ipynb>`, :downloadlink:`html <10_plotting_libraries2html.html>`, :download:`PDF <10_plotting_libraries.pdf>`, :download:`python <10_plotting_libraries.py>`, :downloadlink:`slides <10_plotting_libraries.slides.html>`, :githublink:`GitHub|_doc/notebooks/2016/pydata/10_plotting_libraries.ipynb|*`

Review of plotting libraries.

`Xavier Dupré `__ ``xavier.dupre AT gmail.com``

Senior Engineer at **Microsoft France** on `Azure ML `__, **Teacher in Computer Science** at the `ENSAE `__

|Azure ML| |ENSAE|

.. |Azure ML| image:: logo_azureml.png
.. |ENSAE| image:: ENSAE_logo_developpe.jpg

**Objectives of this talk**

Nobody makes a plot without an existing library anymore.

- How to choose a plotting library?
- List of available options
- How to extend an existing library?
- How to wrap a javascript library?

.. code:: ipython3

    from jyquickhelper import add_notebook_menu
    add_notebook_menu(last_level=2)

.. contents::
    :local:

**Material**

- Notebooks for this talk: `http://www.xavierdupre.fr/… `__
- Azure ML: `Introducing Jupyter Notebooks in Azure ML Studio `__
- Teachings at ENSAE: `Python pour un Data Scientist `__

**Microsoft, Python and Open Source**

- 2014/11: `.NET Core is Open Source `__
- 2015/07: `Introducing Jupyter Notebooks in Azure ML Studio `__
- 2015/07: `Python Tools for Visual Studio `__ moves to GitHub
- 2016/02: `Creating web apps with Flask in Azure `__
- 2016/06: `Build Machine Learning applications to run on Apache Spark clusters on HDInsight Linux `__
- 2016/06: `azure-sdk-python 2.0.rc4 `__: Python interface to access Azure services

.. figure:: img_ptvs.png
   :alt: ptvs

   ptvs

**Microsoft in Data Science**

- `Developing the Next Wave of Data Scientists `__
- Microsoft is one of the sponsors of the `DataScienceGame `__
  `Microsoft - ENSAE - Hackathon `__

Elements of decision
--------------------

.. code:: ipython3

    add_notebook_menu(keep_item=0)

.. contents::
    :local:

Graph language
~~~~~~~~~~~~~~

We like graphs because we read them faster.

.. code:: ipython3

    %matplotlib inline

.. code:: ipython3

    from jupytalk.talk_examples.pydata2016 import example_cartopy
    ax = example_cartopy()
    ax.set_title("map", size=20);

.. image:: 10_plotting_libraries_11_0.png

.. code:: ipython3

    import numpy, matplotlib.pyplot as plt
    N = 150
    x, y = numpy.random.normal(0, 1, N), numpy.random.normal(0, 1, N)
    # move the last point far away from the cloud
    x[-1], y[-1] = 8, 5
    plt.scatter(x, y, alpha=0.5)
    plt.title("outlier", size=20)

.. parsed-literal::

    Text(0.5,1,'outlier')

.. image:: 10_plotting_libraries_12_1.png

.. code:: ipython3

    import numpy, matplotlib.pyplot as plt
    N = 150
    x = numpy.random.normal(0, 1, N)
    # y depends linearly on x, plus some noise
    y = x + numpy.random.normal(0, 0.5, N) + 1
    plt.scatter(x, y, alpha=0.5)
    plt.title("correlation", size=20)

.. parsed-literal::

    Text(0.5,1,'correlation')

.. image:: 10_plotting_libraries_13_1.png

.. code:: ipython3

    from jupytalk.talk_examples.pydata2016 import example_confidence_interval
    ax = example_confidence_interval()
    # https://github.com/sdpython/jupytalk/blob/master/src/jupytalk/talk_examples/pydata2016.py
    ax.set_title("uncertainty", size=20)

.. parsed-literal::

    Text(0.5,1,'uncertainty')

.. image:: 10_plotting_libraries_14_1.png

.. code:: ipython3

    from jupytalk.talk_examples.pydata2016 import example_networkx
    ax = example_networkx()
    # https://github.com/sdpython/jupytalk/blob/master/src/jupytalk/talk_examples/pydata2016.py
    ax.set_title("network", size=20)

.. parsed-literal::

    Text(0.5,1,'network')

.. image:: 10_plotting_libraries_15_1.png

Why so many?
~~~~~~~~~~~~

- Every domain has its own data representation (statistics, machine learning, biology, maps…)
- Many output media (images, websites, notebooks)
- High volumes of data require specific solutions (maps)

Example: seaborn
~~~~~~~~~~~~~~~~

`seaborn `__

- a collection of plots useful in any new project
- See `regplot `__.

.. code:: ipython3

    import seaborn; seaborn.set(color_codes=True)
    tips = seaborn.load_dataset("tips")
    # scatter plot with a fitted regression line
    ax = seaborn.regplot(x="total_bill", y="tip", data=tips)
    ax.set_title("regplot")

.. parsed-literal::

    Text(0.5,1,'regplot')

.. image:: 10_plotting_libraries_18_2.png

Why use a programming language to plot?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+------------------------------------------+------------------------------------+
| Justification                            | Case                               |
+==========================================+====================================+
| **automate** complex graphs              | **update** a presentation          |
+------------------------------------------+------------------------------------+
| **share** customized graphs              | easier to read among a team,       |
|                                          | build a common **graph language**  |
+------------------------------------------+------------------------------------+
| **combine** data processing and plotting | handle **huge volumes** of data    |
+------------------------------------------+------------------------------------+

What did the Internet change?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- **Remote access:** interacting with a graph is cheaper than drawing it again
- Many plotting libraries: `javascript plotting libraries `__
- `20 best JavaScript charting libraries `__

Impact of notebooks on Python
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- **Before:** graph libraries were **mostly static** (images)
- **After:** graphs are now **interactive**
- Notebooks can easily leverage javascript libraries

Decisions
~~~~~~~~~

**Decision 1: the audience?**

- The plot is just for you?
- The plot will be inserted in a report? In a PowerPoint presentation?
- The plot will be internally shared?
- The plot will be shared with customers on a website?

**Decision 2: which volume of data to plot?**

- How many points to draw: 10,000, 1M, 1B?
- How fast do you need to draw?
- Do you need to preprocess the data?

**Decision 3: which technology?**

- **static** *(image, PDF, no zoom)*

  - `matplotlib `__ based
  - `reportlab `__ based
  - `Pillow `__ based

- **interactive** *(zoom, move, not always great in a book)*

  - javascript based
  - Python and javascript based

- **pure javascript** *(if you don’t find what you want)*

  - from a notebook
  - from a web page

**Final check: is the library maintained?**

- License: is it free only for research?
- Sources are available on GitHub: is the last commit recent?
- The library was mentioned at a conference.
- The library is used by many others to create customized graphs.
- It works on many platforms.
- The documentation is great.

Libraries for static plots
----------------------------

.. code:: ipython3

    add_notebook_menu(keep_item=1)

.. contents::
    :local:

Static never fails
~~~~~~~~~~~~~~~~~~

- Images work anywhere
- Images are self-contained
- Easy to combine

.. figure:: img_combine.png
   :alt: combine

   combine

Five steps to plot
~~~~~~~~~~~~~~~~~~

1. Create a **figure**: pixel system.
2. Create **axes**: coordinate system.
3. Draw **inside** the plotting area.
4. Add elements **outside** the plotting area.
5. **Render** the image.

.. figure:: img_step5.png
   :alt: step5

   step5

matplotlib for all
^^^^^^^^^^^^^^^^^^

`matplotlib `__: the standard

.. code:: ipython3

    import numpy as np, matplotlib.pyplot as plt
    N = 50
    x, y, colors = np.random.rand(N), np.random.rand(N), np.random.rand(N)
    area = np.pi * (15 * np.random.rand(N))**2     # marker sizes
    fig, ax = plt.subplots()                       # steps 1, 2
    ax.scatter(x, y, s=area, c=colors, alpha=0.5)  # step 3
    ax.set_title("scatter plot")                   # step 4
    fig.savefig("example_scatterplot.png")         # step 5

.. image:: 10_plotting_libraries_31_0.png

networkx for networks
^^^^^^^^^^^^^^^^^^^^^

`networkx `__

.. figure:: img_networkx.png
   :alt: networkx

   networkx

seaborn for statistics
^^^^^^^^^^^^^^^^^^^^^^

`seaborn `__

.. figure:: img_seaborn.png
   :alt: seaborn

   seaborn

basemap for maps
^^^^^^^^^^^^^^^^

+----------------------------------------------+------------+
| `basemap `__                                 | .          |
+==============================================+============+
| |basemap|                                    | |basemap2| |
+----------------------------------------------+------------+

See also `cartopy `__

.. |basemap| image:: img_basemap.png
.. |basemap2| image:: img_basemap2.png

ete3 for trees
^^^^^^^^^^^^^^

`ete3 `__

.. figure:: img_ete3.png
   :alt: ete3

   ete3

reportlab for pdf
^^^^^^^^^^^^^^^^^

`reportlab `__: standard for PDF

.. figure:: img_reportlab.png
   :alt: reportlab

   reportlab

plotnine for the syntax
^^^^^^^^^^^^^^^^^^^^^^^

`plotnine `__

.. figure:: img_ggplot_code.png
   :alt: plotnine

   plotnine

.. figure:: img_ggplot.png
   :alt: plotnine

   plotnine

missingno for the missing values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`missingno `__

.. figure:: img_missingno.png
   :alt: missingno

   missingno

biopython for genes
^^^^^^^^^^^^^^^^^^^

`biopython `__

.. figure:: img_biopython.png
   :alt: biopython

   biopython

lifelines for survival analysis
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`lifelines `__

.. figure:: img_lifelines.png
   :alt: lifelines

   lifelines

In short
~~~~~~~~

- Many libraries are available in many domains.
- Many scripts are available (GitHub, Stack Overflow).
- Look for the gallery.
- Pick the graph closest to your needs.
- Tweak it.

Libraries for interactivity
-----------------------------

.. code:: ipython3

    add_notebook_menu(keep_item=2)

.. contents::
    :local:

Interactivity is javascript
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- A browser is needed
- A server might be needed (bqplot)
- Knowing javascript helps

Steps to plot
~~~~~~~~~~~~~

1. Create a figure: pixel system
2. Create axes: coordinate system
3. Draw inside the plotting area
4. Add elements outside the plotting area
5. **Implement interactivity if not automated**
6. Write the corresponding HTML and javascript code

bokeh for all
^^^^^^^^^^^^^

`bokeh `__

**default interactivity:** zoom, move, reset; **custom** Python, javascript

.. figure:: img_bokeh.png
   :alt: bokeh

   bokeh

plotly for its design
^^^^^^^^^^^^^^^^^^^^^

`plotly `__

**default interactivity:** zoom, move, reset, text popup; **plus** integration with pandas

.. figure:: img_plotly.png
   :alt: plotly

   plotly

mpld3 for matplotlib
^^^^^^^^^^^^^^^^^^^^

`mpld3 `__ = matplotlib in javascript

**default interactivity:** zoom, move, reset

**custom** Python, javascript (simple)

.. figure:: img_mpld3.png
   :alt: mpld3

   mpld3

python-lightning for its simplicity
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`python-lightning `__ also works with R

**default interactivity:** zoom, move, reset, text popup

.. figure:: img_lightning2.png
   :alt: lightning

   lightning

pygal, leather for SVG
^^^^^^^^^^^^^^^^^^^^^^

`pygal `__ `leather `__

**default interactivity:** text popup

.. figure:: img_pygal.png
   :alt: pygal

   pygal

vega for its simplicity
^^^^^^^^^^^^^^^^^^^^^^^

`vega `__

**default interactivity:** text popup

.. figure:: img_vega2.png
   :alt: vega

   vega

folium for maps
^^^^^^^^^^^^^^^

`folium `__ = map with `OpenStreetMap `__

**default interactivity:** zoom, move, reset

**custom** text popup, marker

.. code:: ipython3

    import folium
    center = [48.862, 2.346]                   # latitude, longitude
    paris = folium.Map(center, zoom_start=13)
    folium.Marker(center, popup='Les Halles').add_to(paris)
    paris                                      # displays the map in the notebook

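
The text popup listed for the SVG-based libraries above does not always need javascript. As a minimal sketch of steps 1 to 6 done by hand, using only the standard library, the hypothetical function ``svg_scatter`` below writes an SVG scatter plot in which every point carries a ``<title>`` child, which most browsers show as a tooltip on hover. This is only an illustration of the idea, not how pygal or leather are implemented.

```python
from xml.sax.saxutils import escape


def svg_scatter(points, labels, width=300, height=200, radius=5):
    """Return an SVG scatter plot (as a string) with one tooltip per point.

    points: list of (x, y) pairs in data coordinates
    labels: one text label per point, displayed by the browser on hover
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, ymin = min(xs), min(ys)
    # Avoid a division by zero when all points share a coordinate.
    dx = (max(xs) - xmin) or 1.0
    dy = (max(ys) - ymin) or 1.0
    margin = 2 * radius

    circles = []
    for (x, y), label in zip(points, labels):
        # Map data coordinates to pixels; the SVG y axis points down.
        px = margin + (x - xmin) / dx * (width - 2 * margin)
        py = height - margin - (y - ymin) / dy * (height - 2 * margin)
        circles.append(
            '<circle cx="%.1f" cy="%.1f" r="%d" fill="steelblue" '
            'fill-opacity="0.5"><title>%s</title></circle>'
            % (px, py, radius, escape(label)))

    return ('<svg xmlns="http://www.w3.org/2000/svg" '
            'width="%d" height="%d">\n%s\n</svg>'
            % (width, height, "\n".join(circles)))


svg = svg_scatter([(0, 0), (1, 2), (8, 5)], ["a", "b", "outlier <8, 5>"])
```

The resulting string can be written to a ``.svg`` file or, in a notebook, passed to ``IPython.display.SVG``. Anything beyond a tooltip (zoom, move, reset) is where javascript, and the libraries of this section, come in.
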

pythreejs for 3D
^^^^^^^^^^^^^^^^

`pythreejs `__

**default interactivity:** zoom, move, rotate, reset

.. figure:: screencast.gif
   :alt: pythreejs

   pythreejs

pydy for mechanics
^^^^^^^^^^^^^^^^^^

`pydy `__

**default interactivity:** visualize a scene

.. figure:: img_pydy.png
   :alt: pydy

   pydy

In short
~~~~~~~~

What are you looking for?

- Standard interactivity (all of them)
- Custom interactivity (Python, javascript) (bokeh)
- Easy export to websites (SVG, vega)

Libraries mixing Javascript, Python, …
----------------------------------------

.. code:: ipython3

    add_notebook_menu(keep_item=3)

.. contents::
    :local:

Hide the complexity
~~~~~~~~~~~~~~~~~~~

- Mix of technologies
- Wrapped in one module
- Easy examples
- But cryptic bugs for newcomers

bqplot for the interactions in python
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`bqplot `__

.. figure:: img_bqplot.png
   :alt: bqplot

   bqplot

brython, brythonmagic to avoid javascript
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Magic command ``%%brython``

- easy to modify the notebook with Python
- no javascript
- a good place to start if you don’t like javascript

.. figure:: img_brython.png
   :alt: brython

   brython

geoplotlib for maps in a GUI
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`geoplotlib `__

.. figure:: img_geoplotlib.png
   :alt: geoplotlib

   geoplotlib

vispy for computational graphics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`http://vispy.org/installation.html `__ creates graphs that demand heavy computation. It requires knowledge of C++.

.. figure:: img_vispy_mandelbrot.png
   :alt: vispy

   vispy

In short
~~~~~~~~

- Well suited to research purposes
- Results are not easy to export

Libraries for high volume of data
-----------------------------------

.. code:: ipython3

    add_notebook_menu(keep_item=4)

.. contents::
    :local:

Challenge
~~~~~~~~~

Two extremes:

- Plotting a huge volume of data takes time to process
- Interactivity requires fast processing

Compromise?

datashader
~~~~~~~~~~~~

`datashader `__ = bokeh + Python interaction + data interpolation

.. figure:: img_datashader.png
   :alt: datashader

   datashader

In short
~~~~~~~~

Work in progress.

Deeper into programming
-----------------------

.. code:: ipython3

    add_notebook_menu(keep_item=5)

.. contents::
    :local:

Extend an existing library
~~~~~~~~~~~~~~~~~~~~~~~~~~

- Follow the existing design
- Constraints:

  - Add the plot to an existing one
  - Add complementary elements

Wrong design
^^^^^^^^^^^^

.. figure:: img_wrong2.png
   :alt: wrong

   wrong

Right design
^^^^^^^^^^^^

.. figure:: img_right.png
   :alt: right

   right

Parameter ``ax``
^^^^^^^^^^^^^^^^

.. figure:: img_subplots.png
   :alt: subplots

   subplots

Wrapping a javascript library
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Chosen from `10 JavaScript libraries to draw your own diagrams `__
- Search for ``python + <library name>``
- `JointJS `__
- `ChartJS `__

Dummy example with treant
^^^^^^^^^^^^^^^^^^^^^^^^^

- `treant-js `__: `tennis draw `__
- `notebook `__
- `code `__

.. figure:: img_treant.png
   :alt: treant

   treant

Part 1: HTML
^^^^^^^^^^^^

- a DIV with an id
- a script for the library

.. figure:: img_thtml.png
   :alt: html

   html

Part 2: JSON data
^^^^^^^^^^^^^^^^^

- JSON most of the time

.. figure:: img_tdata.png
   :alt: json

   json

Part 3: javascript
^^^^^^^^^^^^^^^^^^

.. figure:: img_tjs2.png
   :alt: js

   js

In short
~~~~~~~~

- Practice with existing libraries first
- Think about other users

Conclusion
----------

- Static images are not obsolete!
- Interactivity still requires a bit of work.
- Plotting huge volumes of data is still a work in progress.
- It is easy to create your own library.

**A good sketch is better than a long speech** (*Un bon croquis vaut mieux qu’un long discours*). *Napoléon Bonaparte*

**This is only the beginning**

*Thank you*

- http://www.xavierdupre.fr/
- ``xavier.dupre AT gmail.com``
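
To make the three parts of the treant example more concrete (a DIV with an id plus a script tag, JSON data, a javascript call), here is a minimal sketch that uses only the standard library. The function name ``wrap_js_library``, the URL ``https://example.com/treelib.js`` and the javascript function ``drawTree`` are hypothetical placeholders, not part of treant-js or of the notebook linked above; a real library has its own entry point and configuration format.

```python
import json
import uuid

# Part 1: a DIV with an id and a script tag for the library.
# Part 2: the data, serialized as JSON.
# Part 3: the javascript call that draws into the DIV.
HTML_TEMPLATE = """<div id="{div_id}" style="width:{width};height:{height};"></div>
<script src="{script_url}"></script>
<script>
var data = {payload};
{js_function}(document.getElementById("{div_id}"), data);
</script>"""


def wrap_js_library(data, script_url, js_function,
                    width="100%", height="300px"):
    """Return an HTML fragment that hands `data` to a javascript function.

    js_function is assumed to accept (container_element, data); this
    signature is an assumption made for the sketch.
    """
    # A unique id lets several charts coexist on the same page.
    div_id = "chart-" + uuid.uuid4().hex[:8]
    # Escape "</" so that a string containing "</script>" cannot close
    # the script tag early; "<\/" is still valid inside a JSON string.
    payload = json.dumps(data).replace("</", "<\\/")
    return HTML_TEMPLATE.format(
        div_id=div_id, width=width, height=height,
        script_url=script_url, payload=payload, js_function=js_function)


tree = {"text": "final",
        "children": [{"text": "semi-final 1"}, {"text": "semi-final 2"}]}
html = wrap_js_library(tree, "https://example.com/treelib.js", "drawTree")
```

In a notebook, the returned fragment can typically be displayed with ``IPython.display.HTML(html)``; whether the external script actually loads depends on the notebook front end. Keeping the Python side limited to building the id, the JSON payload and one call keeps most of the logic in the javascript library itself, which is the point of wrapping it.
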