# 10 plotting libraries

Review of plotting libraries.

Xavier Dupré

`xavier.dupre AT gmail.com`

Senior Engineer at Microsoft France on Azure ML, Teacher in Computer Science at the ENSAE

Objectives of this talk

Nobody draws a plot without an existing library anymore.

• How to choose a plotting library?

• List of available options

• How to extend an existing library?

• How to wrap a javascript library?

Material

Microsoft, Python and Open Source

Microsoft in Data Science

Microsoft - ENSAE - Hackathon

## Elements of decision

### Graph language

We like graphs because they are faster to read.

```python
%matplotlib inline
```

```python
from jupytalk.talk_examples.pydata2016 import example_cartopy
ax = example_cartopy()
ax.set_title("map", size=20);
```

```python
import numpy, matplotlib.pyplot as plt
N = 150
x, y = numpy.random.normal(0, 1, N), numpy.random.normal(0, 1, N)
x[-1], y[-1] = 8, 5  # force one obvious outlier
plt.scatter(x, y, alpha=0.5)
plt.title("outlier", size=20)
```

```python
import numpy, matplotlib.pyplot as plt
N = 150
x = numpy.random.normal(0, 1, N)
y = x + numpy.random.normal(0, 0.5, N) + 1  # y depends linearly on x, plus noise
plt.scatter(x, y, alpha=0.5)
plt.title("correlation", size=20)
```

```python
from jupytalk.talk_examples.pydata2016 import example_confidence_interval
ax = example_confidence_interval()
# https://github.com/sdpython/jupytalk/blob/master/src/jupytalk/talk_examples/pydata2016.py
ax.set_title("uncertainty", size=20)
```

```python
from jupytalk.talk_examples.pydata2016 import example_networkx
ax = example_networkx()
# https://github.com/sdpython/jupytalk/blob/master/src/jupytalk/talk_examples/pydata2016.py
ax.set_title("network", size=20)
```

### Why so many?

• Every domain has its own data representation (statistics, machine learning, biology, maps…)

• Many output media (images, web sites, notebooks)

• High volumes of data require specific solutions (maps)

### Example: seaborn

seaborn

• a collection of plots useful for any new project

• See regplot.

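```python
import seaborn; seaborn.set(color_codes=True)
# "tips" is one of seaborn's example datasets (fetched over the network on first use)
tips = seaborn.load_dataset("tips")
ax = seaborn.regplot(x="total_bill", y="tip", data=tips)
ax.set_title("regplot")
```

### Why use a programming language to plot?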

| Justification | Case |
|---|---|
| automate complex graphs | update a presentation |
| share customized graphs | easier to read among a team, build a common graph language |
| combine data processing and plotting | handle huge volumes of data |

### Impact of notebooks on Python

• Before: graph libraries were mostly static (images)

• After: graphs are now interactive

• Notebooks can easily leverage javascript libraries

### Decisions

Decision 1: the audience?

• The plot is just for you?

• The plot will be inserted in a report? In a PowerPoint presentation?

• The plot will be internally shared?

• The plot will be shared with customers on a website?

Decision 2: which volume of data to plot?

• How many points to draw: 10,000, 1M, 1B?

• How fast do you need to draw?

• Do you need to preprocess the data?

Decision 3: which technology?

• static (image, PDF, no zoom)

• interactive (zoom, move, not always great in a book)

• javascript based

• Python and javascript based

• pure javascript (if you don’t find what you want)

• from a notebook

• from a web page

Final check: is the library maintained?

• License: is it free only for research?

• Sources are available on github: is the last commit recent?

• Was the library mentioned in a conference?

• Is the library used by many others to create customized graphs?

• It works on many platforms.

• The documentation is great.

## Libraries for static plots

### Static never fails

• Images work anywhere

• Images are self-contained

• Easy to combine

### Five steps to plot

1. Create a figure: pixel system.

2. Create axes: coordinate system.

3. Draw inside the plotting area.

4. Add elements outside the plotting area.

5. Render the image.

#### matplotlib for all

matplotlib: the standard

```python
import numpy as np, matplotlib.pyplot as plt
N = 50
x, y, colors = np.random.rand(N), np.random.rand(N), np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2
fig, ax = plt.subplots()                       # steps 1, 2
ax.scatter(x, y, s=area, c=colors, alpha=0.5)  # step 3
ax.set_title("scatter plot")                   # step 4
fig.savefig("example_scatterplot.png")         # step 5
```

networkx

seaborn

#### basemap for maps

basemap

ete3

#### reportlab for pdf

reportlab: standard for PDF

plotnine

missingno

biopython

lifelines

### In short

• Many libraries available in many domains.

• Many scripts available (github, stackoverflow)

• Look for the gallery.

• Pick the closest graph to your needs

• Tweak

## Libraries for interactivity

### Interactivity is javascript

• A browser is needed

• A server might be needed (bqplot)

• Better to know javascript

### Steps to plot

1. Create a figure

2. Create axes: coordinate system

3. Draw inside the plotting area

4. Add elements outside the plotting area

5. Implement interactivity if not automated

6. Write the corresponding HTML, Javascript code

#### bokeh for all

bokeh

• default interactivity: zoom, move, reset

• custom interactivity: Python, javascript

#### plotly for its design

plotly

• default interactivity: zoom, move, reset, text popup

• integration with pandas

#### mpld3 for matplotlib

mpld3 = matplotlib in javascript

• default interactivity: zoom, move, reset

• custom interactivity: Python, javascript (simple)

#### python-lightning for its simplicity

python-lightning (also works with R)

• default interactivity: zoom, move, reset, text popup

#### pygal, leather for SVG

pygal, leather

• default interactivity: text popup
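SVG output is plain XML, which is one reason it embeds easily in a web page. The sketch below does not use pygal's or leather's API; it relies only on the standard library to write a small bar chart in which each bar carries a `<title>` child, which browsers typically show as a text popup on hover. The function name `bar_chart_svg`, the output file name and the layout constants are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def bar_chart_svg(values, labels, width=300, height=150):
    """Return an SVG bar chart as a string (hypothetical helper)."""
    ET.register_namespace("", SVG_NS)  # emit <svg xmlns="..."> without a prefix
    svg = ET.Element(f"{{{SVG_NS}}}svg", width=str(width), height=str(height))
    bar_width = width / len(values)
    scale = height / max(values)
    for i, (value, label) in enumerate(zip(values, labels)):
        bar_height = value * scale
        rect = ET.SubElement(
            svg, f"{{{SVG_NS}}}rect",
            x=f"{i * bar_width:.1f}", y=f"{height - bar_height:.1f}",
            width=f"{bar_width * 0.9:.1f}", height=f"{bar_height:.1f}",
            fill="steelblue",
        )
        # The <title> child holds the text of the popup.
        title = ET.SubElement(rect, f"{{{SVG_NS}}}title")
        title.text = f"{label}: {value}"
    return ET.tostring(svg, encoding="unicode")


svg_text = bar_chart_svg([3, 7, 5], ["a", "b", "c"])
with open("example_bars.svg", "w", encoding="utf-8") as f:
    f.write(svg_text)
```

Opening `example_bars.svg` in a browser and hovering over a bar should display its label and value.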

#### vega for its simplicity

vega

• default interactivity: text popup

#### folium for maps

folium = map with OpenStreetMap

• default interactivity: zoom, move, reset

• custom text popup, marker

```python
import folium
center = [48.862, 2.346]
paris = folium.Map(center, zoom_start=13)
paris
```

#### pythreejs for 3D

pythreejs

• default interactivity: zoom, move, rotate, reset

#### pydy for mechanics

pydy

• default interactivity: visualize a scene

### In short

Are you looking for?

• Standard interactivity (all of them)

• Custom interactivity (Python, Javascript) (bokeh)

• Easy export to websites (SVG, vega)

## Libraries mixing Javascript, Python, …

### Hide the complexity

• Mix of technologies

• Wrapped in one module

• Easy examples

• But cryptic bugs for newbies

bqplot

#### brython, brythonmagic to avoid javascript

Magic command `%%brython`

• easy to modify the notebook with Python

• no javascript

• place to start if you don’t like javascript

geoplotlib

#### vispy for computational graphics

vispy (http://vispy.org/installation.html) creates graphs that demand heavy computation. It requires knowledge of C++.

### In short

• Well suited to research purposes

• Results are not easy to export

## Libraries for high volume of data

### Challenge

Two extremes:

• Plotting a huge volume of data takes time

• Interactivity requires fast processing

Compromise?

datashader = bokeh + Python interaction + data interpolation
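One way to picture this compromise is to aggregate the points onto a fixed grid before drawing, so that the rendering cost depends on the number of cells rather than on the number of points. The sketch below only uses numpy and does not call datashader; the function name `aggregate_to_grid` and the grid size are assumptions made for illustration.

```python
import numpy as np


def aggregate_to_grid(x, y, width=300, height=200):
    """Count the points falling in each cell of a grid (hypothetical helper)."""
    counts, xedges, yedges = np.histogram2d(x, y, bins=(width, height))
    # histogram2d indexes its result as [x, y]; transpose to image layout [row, column].
    return counts.T, xedges, yedges


rng = np.random.default_rng(0)
n_points = 1_000_000
x = rng.normal(0, 1, n_points)
y = x + rng.normal(0, 0.5, n_points)

image, xedges, yedges = aggregate_to_grid(x, y)
# `image` keeps the same size whatever the number of points.
print(image.shape, int(image.sum()))
```

The resulting array can then be drawn like any image, for example with matplotlib's `ax.imshow(image, origin="lower")`. Zooming would require recomputing the grid on the visible range, which is the part an interactive library has to automate.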

### In short

Work in progress.

## Deeper into programming

### Extend an existing library

• Constraints:

• Add the plot to an existing one
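With matplotlib, a common way to meet this constraint is to let the plotting function take an optional `ax` argument and return it, so the new plot can be drawn on axes that already hold another one. A minimal sketch, where `plot_running_mean`, its parameters and the output file name are hypothetical names chosen for the example:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_running_mean(values, window=10, ax=None, **kwargs):
    """Draw the running mean of `values` on `ax` (hypothetical helper)."""
    if ax is None:
        # No axes given: create a new figure and its axes.
        _, ax = plt.subplots()
    kernel = np.ones(window) / window
    smoothed = np.convolve(values, kernel, mode="valid")
    # Align each mean with the last point of its window.
    ax.plot(np.arange(window - 1, len(values)), smoothed, **kwargs)
    return ax


rng = np.random.default_rng(0)
values = rng.normal(0, 1, 200).cumsum()

fig, ax = plt.subplots()
ax.plot(values, alpha=0.4, label="raw")                 # existing plot
plot_running_mean(values, ax=ax, label="running mean")  # added on the same axes
ax.legend()
fig.savefig("example_running_mean.png")
```

Returning `ax` lets the caller keep customizing the result (title, legend, limits) with the usual matplotlib calls.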

### Wrapping a javascript library

#### Part 1: HTML

• a DIV with an id

• a script for the library

#### Part 2: Json data

• JSON most of the time
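Both parts can be produced from Python with the standard library alone. The sketch below builds the HTML fragment as a string; the function name `render_js_plot`, the library URL and the javascript entry point `MyLib.plot` are placeholders and do not refer to a real library.

```python
import json
import uuid


def render_js_plot(data, library_url, js_call="MyLib.plot"):
    """Return an HTML fragment wrapping a javascript plot (hypothetical helper).

    `library_url` and `js_call` stand for a real javascript library.
    """
    div_id = "plot-" + uuid.uuid4().hex  # unique id so several plots can coexist
    # Part 2: the data is serialized as JSON; "</" is escaped so that
    # the data cannot close the <script> tag early.
    payload = json.dumps(data).replace("</", "<\\/")
    # Part 1: a DIV with an id and a script tag loading the library.
    return (
        f'<div id="{div_id}"></div>\n'
        f'<script src="{library_url}"></script>\n'
        "<script>\n"
        f"var data = {payload};\n"
        f'{js_call}(document.getElementById("{div_id}"), data);\n'
        "</script>"
    )


html = render_js_plot(
    {"x": [1, 2, 3], "y": [4, 1, 7]},
    library_url="https://example.com/mylib.js",  # placeholder URL
)
print(html)
```

In a notebook, such a fragment can be displayed with `IPython.display.HTML(html)`, although the loading order of the javascript library may need extra care depending on the front end.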

### In short

• Practice with existing libraries first

## Conclusion

• Static images are not obsolete!

• Interactivity still requires a bit of work.

• Plotting huge volumes of data is still a work in progress.

• It is easy to create your own library.

A good sketch is better than a long speech. (Napoléon Bonaparte)

This is only the beginning

Thank you