How to make beautiful offline plots in a few lines of code in Python

We all prefer looking at beautiful, self-explanatory plots to poring over data presented in tables. Plots also make the data we present more convincing.

There are numerous platforms and tools to help us with this job, from common office suites to dedicated software packages and online platforms. Many of you are already familiar with plotly, the Python data visualization library built on top of plotly.js, which in turn builds on the splendid D3.js library. But using bare plotly takes a lot of effort (counted in lines of code) to get the expected results. The cufflinks library comes to the rescue here: it is a bridge between plotly and pandas dataframes, probably the most commonly used data source in the Python ecosystem. This explosive mixture can be used simply, without setting up any complex development environment, with the help of Jupyter notebooks. So let’s start without delay...

First, let’s check whether the Python 3 binaries are installed. Python 3 comes preinstalled with the desktop edition of Ubuntu 18.

$ python3 --version
Python 3.6.8
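
The same check can be made from inside Python, which is handy later on in a notebook. This is only a minimal sketch; the helper name `python_is_at_least` is made up for illustration.

```python
import sys


def python_is_at_least(major, minor=0):
    """Return True if the running interpreter is at least version major.minor."""
    return sys.version_info >= (major, minor)


# Print the full version string of the interpreter that runs this code
print(sys.version)
print("Python 3 available:", python_is_at_least(3))
```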

Then we have to install pip3 – a package manager for Python3 packages.

$ sudo apt install python3-pip

To keep the system Python configuration clean, we will install Jupyter in an isolated environment, so we have to install the virtualenv package first.

$ sudo -H pip3 install --upgrade pip
$ sudo -H pip3 install virtualenv

Let’s create a folder dedicated to the Jupyter setup, create a virtual environment in it and activate that environment.

$ mkdir ~/jupyter
$ cd ~/jupyter/
$ virtualenv .
$ source ./bin/activate
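
To confirm that the activation worked, we can ask the interpreter itself. The sketch below is an assumption-laden helper, not part of virtualenv: the name `in_virtual_environment` is invented, and it relies on the usual convention that an isolated environment changes `sys.prefix`.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer tools make
    # sys.prefix differ from sys.base_prefix instead.
    if getattr(sys, "real_prefix", None) is not None:
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter prefix:", sys.prefix)
print("Inside a virtual environment:", in_virtual_environment())
```

Run with the `python` from the activated environment, the prefix should point into `~/jupyter`.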

Now, we are ready to install jupyter.

$ pip install jupyter

After that, we need to install our data visualization components.

$ pip install plotly
$ pip install cufflinks
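
Before starting the notebook, it may be worth checking that the packages can actually be found by the interpreter. A minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages this article relies on (pandas and numpy are normally
# pulled in as dependencies of cufflinks)
required = ["plotly", "cufflinks", "pandas", "numpy"]
missing = missing_packages(required)
print("Missing packages:", missing if missing else "none")
```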

To start Jupyter Notebook, we just have to type the command below.

(jupyter) user@ubuntu18:~/jupyter$ jupyter notebook

The application will start and open the web browser window.

It is worth knowing that plotly has recently been moving toward an online service named Chart Studio. To draw offline, without setting up an account on the Chart Studio platform, we have to use plotly.offline and explicitly switch to offline mode, as shown below.

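import plotly.offline as py
import cufflinks as cf
import pandas as pd
import numpy as np

# Print the installed cufflinks version
print(cf.__version__)

# Switch cufflinks to offline mode, so no Chart Studio account is needed
cf.go_offline()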

Now we can compare the two approaches. First, the traditional way of using plotly:

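# Generate a sample dataframe of random lines
df = cf.datagen.lines()

# Build one trace per column and plot them all together
py.iplot([{
    'x': df.index,
    'y': df[col],
    'name': col
} for col in df.columns], filename='draw')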

And here is the same plot made with the help of cufflinks, leading to exactly the same result:

df.iplot(kind='scatter', filename='draw')

Isn’t it lovely?
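
To see what the manual version really hands over to plotly, the trace-building step can be tried on a hand-made dataframe, without cufflinks or plotly installed. This is a sketch under assumptions: `make_random_walks` and `frame_to_traces` are invented helpers, and the random walks only roughly imitate the kind of data `cf.datagen.lines()` produces.

```python
import numpy as np
import pandas as pd


def make_random_walks(n_traces=3, n_points=50, seed=0):
    """Return a dataframe of random walks, one column per trace (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    index = pd.date_range("2020-01-01", periods=n_points, freq="D")
    data = rng.standard_normal((n_points, n_traces)).cumsum(axis=0)
    columns = [f"series_{i}" for i in range(n_traces)]
    return pd.DataFrame(data, index=index, columns=columns)


def frame_to_traces(df):
    """Build one plain-dict trace per column, as in the manual plotly call."""
    return [
        {"x": list(df.index), "y": df[col].tolist(), "name": col}
        for col in df.columns
    ]


df = make_random_walks()
traces = frame_to_traces(df)
print(len(traces), [t["name"] for t in traces])
# prints: 3 ['series_0', 'series_1', 'series_2']
```

Each trace is just a dictionary with `x`, `y` and `name` keys. This is the bookkeeping that cufflinks takes off our hands with `df.iplot()`.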

Other perspectives

  1. In addition to the one described above (plotly), there are many other data visualization libraries in the Python ecosystem. At least Matplotlib, Seaborn (based on the former), ggplot and Bokeh should be mentioned here. However, plotly is based on the famous and powerful D3.js library and also offers rich online services, which makes it a first-class player.
  2. There is also a ‘more native’ data binder for the plotly library, Plotly Express, but at least for now it is much less powerful.
  3. It should also be noted that cufflinks does not give us access to everything plotly offers. For example, we will not be able to create a sunburst chart.
  4. Finally, it is worth mentioning that there is an even simpler way to install Jupyter notebooks: containerization, namely the Docker platform.
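
As an illustration of point 3, a sunburst chart can still be described directly as a plotly trace, in the same plain-dict style as the manual example earlier. The sketch below only prepares the data. The helper `sunburst_trace` and the sample hierarchy are invented for illustration, and it assumes a plotly release that supports the `sunburst` trace type.

```python
def sunburst_trace(parent_of):
    """Build a plain-dict sunburst trace from a {label: parent} mapping.

    The root label maps to the empty string.
    """
    labels = list(parent_of)
    parents = [parent_of[label] for label in labels]
    # Every parent must be the root marker "" or one of the labels
    unknown = [p for p in parents if p and p not in parent_of]
    if unknown:
        raise ValueError(f"unknown parents: {unknown}")
    return {"type": "sunburst", "labels": labels, "parents": parents}


# Invented sample hierarchy: label -> parent label
hierarchy = {
    "Plots": "",
    "Lines": "Plots",
    "Bars": "Plots",
    "Scatter": "Lines",
    "Spline": "Lines",
    "Stacked": "Bars",
}
trace = sunburst_trace(hierarchy)
print(trace["labels"])
print(trace["parents"])
```

In a notebook where offline mode is set up as described in this article, the resulting dictionary should be usable like the traces in the manual example, for instance with `py.iplot([trace])`.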
