Python Environments

This notebook describes and creates the default Python 2 & 3 environments in Nextjournal. Check out the showcase if you want to see what the environment contains. To see how it’s built, see setup.

Showcase

The Python 3 environment runs Python 3.7.7. Here’s a list of all included packages:

pip freeze
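
The same listing can also be produced from inside Python. The sketch below is an illustration, not the notebook's own code. It uses the standard-library importlib.metadata module, which assumes Python 3.8 or newer; on the 3.7 environment described here, the importlib_metadata backport should offer an equivalent interface. The helper name installed_packages is hypothetical.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings, similar to `pip freeze`."""
    entries = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            entries.add("{}=={}".format(name, dist.version))
    return sorted(entries, key=str.lower)


if __name__ == "__main__":
    for line in installed_packages():
        print(line)
```

Unlike pip freeze, this sketch does not special-case editable or VCS installs; it only reports what the installed metadata says.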

System Packages and Basics

A wide variety of support libraries are installed, as well as gcc v7.

Python packages are installed using conda or pip version 20.1.1. setuptools version 47.3.1.post20200622 is also included for convenience. Please refer to the Python section of Installing Software and Packages for more detailed information.

Plotting

The default environment comes with plotly version 4.8.2 and matplotlib version 3.2.2. Here are some examples of how they are used in Nextjournal:

Plotly

Plot a histogram using Plotly, a plotting library for making interactive graphs online.

import plotly.graph_objs as go
import numpy as np
x0 = np.random.randn(500)
x1 = np.random.randn(500)+1
trace1 = go.Histogram(x=x0, opacity=0.75)
trace2 = go.Histogram(x=x1, opacity=0.75)
layout = go.Layout(barmode='overlay')
go.Figure(data=[trace1, trace2], layout=layout)

Matplotlib

Plot a 5 hertz sine wave using matplotlib, a Python plotting library.

import matplotlib.pyplot as plt, numpy as np
# Data for plotting
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin((5 * 2) * np.pi * t)
# Note that using plt.subplots below is equivalent to using
# fig = plt.figure() and then ax = fig.add_subplot(111)
_, ax = plt.subplots()
ax.plot(t, s)
ax.set(xlabel='time (s)', ylabel='voltage (mV)', title='Sine Wave')
ax.grid()
plt.show()

Data Structures

Nextjournal's default Python environment contains several packages for data manipulation and parsing.

  • The SciPy ecosystem is available, including scipy, numpy, and pandas.

  • simplejson makes it easy to encode/decode JSON data structures.

  • six is included to help smooth differences between Python 2 and 3.

Numpy

NumPy's main object is an N-dimensional array, and the library adds linear algebra, Fourier transform, and random number capabilities on top of it. Here it is used to compute a Mandelbrot set, which is then plotted using matplotlib.

import numpy as np, matplotlib.pyplot as plt
def mandelbrot( h,w, maxit=10):
    y,x = np.ogrid[ -1.4:1.4:h*1j, -2:0.8:w*1j ]
    c = x+y*1j
    z = c
    divtime = maxit + np.zeros(z.shape, dtype=int)
    for i in range(maxit):
        z  = z**2 + c
        diverge = z * np.conj(z) > 2**2       # who is diverging
        div_now = diverge & (divtime==maxit)  # who is diverging now
        divtime[div_now] = i + 100            # note when
        z[diverge] = 2                        # avoid diverging too much
    return divtime
plt.subplots(1,figsize=(20,20))
plt.imshow(mandelbrot(1000,1000)) 
plt.axis('off')
plt.show()

Pandas

Pandas makes data analysis easier in Python. For example, a single pandas Series holds both the index labels and the data. Below, numpy generates 1000 random values, the Series indexes them by date and accumulates them with cumsum, and matplotlib plots the result.

import pandas as pd, matplotlib.pyplot as plt, numpy as np
ts = pd.Series(np.random.randn(1000), 
               index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()
_, ax = plt.subplots()
ts.plot(ax=ax)  # draw on the axes created above
plt.show()

Simplejson

Import and export JSON on Nextjournal using simplejson. In the example below, a Python data structure is serialized to a JSON string; the change from None to null is a clear indicator.

import simplejson as json
json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
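
A round trip makes the conversion easier to see. This sketch is not part of the original notebook. It falls back to the standard-library json module when simplejson is not installed, on the assumption that both expose the dumps and loads calls used here in the same way.

```python
try:
    import simplejson as json  # installed in the default environment
except ImportError:
    import json  # standard-library fallback with the same dumps/loads calls

data = ['foo', {'bar': ('baz', None, 1.0, 2)}]

encoded = json.dumps(data)     # Python objects -> JSON text
decoded = json.loads(encoded)  # JSON text -> Python objects

print(encoded)
print(decoded)
```

The tuple comes back as a list, since JSON has only one array type, and null is decoded back to None.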

Six

Six makes it easy to write Python code that is compatible with both Python 2 and Python 3.

For example, Python 2's urllib, urllib2, and urlparse modules have been combined in the urllib package in Python 3. The six.moves.urllib package is a version-independent location for this functionality.

Python 2:

from __future__ import print_function
from six.moves.urllib.request import urlopen
url = urlopen("http://nextjournal.com")
print(url.read())

Python 3:

from __future__ import print_function
from six.moves.urllib.request import urlopen
url = urlopen("http://nextjournal.com")
print(url.read())
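
For comparison, the same kind of compatibility can be written by hand with a guarded import. This is a sketch of the general idea, not a description of how six is implemented. It uses urlparse rather than urlopen so that it runs without network access, and the path and query string in the example URL are made up for illustration.

```python
from __future__ import print_function

try:
    # Python 3: the function lives in the urllib package
    from urllib.parse import urlparse
except ImportError:
    # Python 2: the same function lives in the urlparse module
    from urlparse import urlparse

# Hypothetical URL; only the host name comes from the examples above
parts = urlparse("http://nextjournal.com/path?q=1")
print(parts.scheme, parts.netloc, parts.path, parts.query)
```

With six, the try/except collapses to a single line, from six.moves.urllib.parse import urlparse, which is the main convenience the package offers.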

Data Storage

Apache Arrow

import numpy as np
import pandas as pd
import pyarrow as pa
# Converting Pandas Dataframe to Apache Arrow Table
df = pd.DataFrame({"one": [20, np.nan, 2.5],
                   "two": ["january", "february", "march"],
                   "three": [True, False, True]},index=list("abc"))
table = pa.Table.from_pandas(df)
# Writing a parquet file from Apache Arrow
import pyarrow.parquet as pq
pq.write_table(table, "/shared/example.parquet")
# Reading a parquet file
table2 = pq.read_table("/shared/example.parquet")
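# Converting the Arrow Table back to a pandas DataFrame
df_new = table2.to_pandas()
# equals() treats NaNs in matching positions as equal, unlike ==
df_new.equals(df)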

Setup

Build a Minimal Python 3 Environment

Download and install conda.

CONDA_VER="4.8.3"
PYTHON_VER="py37"
file="Miniconda3-${PYTHON_VER}_${CONDA_VER}-Linux-x86_64.sh"
wget -q --show-progress --progress=bar:force -P /results \
  https://repo.continuum.io/miniconda/${file}
bash Miniconda3-py37_4.8.3-Linux-x86_64.sh -b -p /opt/conda

Create symlinks so that conda's Python supersedes the system Python for non-absolute, non-versioned calls.

ln -s /opt/conda/bin/pip /opt/conda/bin/pip3
ln -s /opt/conda/bin/pip /opt/conda/bin/pip3.7
ln -s /opt/conda/bin/python3.7 /opt/conda/bin/python3m
ln -s /opt/conda/bin/python3.7m-config /opt/conda/bin/python3m-config

Add conda's library directory so ldconfig will pick it up, set conda config, and ensure pip is reasonably updated. We also pin Python to the installed minor version, allowing only patch-version up/downgrades.

# make this the last alphabetically => lowest precedence libraries
echo "/opt/conda/lib" >> /etc/ld.so.conf.d/zz-conda.conf
mkdir -p ~/.conda/pkgs # prevent a warning; -p also creates ~/.conda if missing
conda config --set always_yes True
pip_ver=$(pip --version | sed 's/pip \(.*\) from.*/\1/')
echo "pip >=$pip_ver" > /opt/conda/conda-meta/pinned # prevent pip downgrade
# upgrade Python within minor version
python_minor=$(python --version | sed 's/Python \(.*\)\..*/\1/')
echo "python =$python_minor" >> /opt/conda/conda-meta/pinned
conda update python pip
conda update -yn base conda
conda clean -qtipy
ldconfig
python -V
pip -V
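
The two sed expressions are terse, so here is a Python sketch of the same parsing. It is illustrative only: the function name pinned_specs is hypothetical, and the sample strings are assumed to follow the usual output format of pip --version and python --version.

```python
import re


def pinned_specs(pip_output, python_output):
    """Build the lines written to conda-meta/pinned from version output."""
    # 'pip 20.1.1 from /opt/conda/...' -> '20.1.1'
    pip_ver = re.sub(r"pip (.*) from.*", r"\1", pip_output)
    # 'Python 3.7.7' -> '3.7' (the greedy group stops at the last dot)
    python_minor = re.sub(r"Python (.*)\..*", r"\1", python_output)
    return ["pip >=" + pip_ver, "python =" + python_minor]


# Sample inputs (assumed, not captured from the notebook run)
specs = pinned_specs(
    "pip 20.1.1 from /opt/conda/lib/python3.7/site-packages/pip (python 3.7)",
    "Python 3.7.7",
)
print("\n".join(specs))
```

The first spec keeps conda from downgrading pip, and the second restricts Python to the 3.7 series, so that only patch releases can change.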

Package up the installation for use in other environments.

du -hsx /
tar -zcPf /results/minimal-python3.tgz /opt/conda

Build the Default Python 3 Environment

Install

Only a few system libraries are needed, mainly for HDF5 support.

apt-get -qq update
DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends \
  libxext6 libhdf5-100
apt-get clean
rm -r /var/lib/apt/lists/*

This default image has support for a number of general-use packages, including pandas, scipy, scikit-learn, scikit-image, and opencv-python. For graphical output, matplotlib and plotly are installed. We'll also install some basic utilities, as well as setuptools to make any additional installs less difficult. We're installing Jedi to have code completions for Python, and Jupyter to support notebook imports.

conda install -c plotly \
  setuptools six simplejson dill pillow pytables h5py \
  plotly matplotlib tqdm termcolor tabulate \
  python-dateutil more-itertools toolz cython cffi attrs decorator jedi \
  numpy scipy patsy statsmodels pandas pandas-datareader seaborn \
  scikit-learn scikit-image \
  jupyter
conda clean -qtipy
ldconfig
# make sure jupyter components are up-to-date
# also add non-anaconda-main packages here (conda-forge packages can be broken)
pip install --upgrade altair pandas pyarrow feather-format \
  pipenv jupyter-client jupyter-core
python -V
pip -V
jupyter --version
jupyter kernelspec list
jupyter --paths

We'll also install opencv-python-headless, the unofficial wheel of OpenCV.

pip install opencv-python-headless

Pre-import packages to speed up cold boot time.

PI_PKGS="altair, backcall, bleach, certifi, cffi, chardet, cloudpickle, conda, conda_package_handling, cryptography, cycler, cython, cytoolz, dask, decorator, defusedxml, dill, entrypoints, feather, h5py, idna, imageio, importlib_metadata, ipykernel, ipython_genutils, ipywidgets, jedi, jinja2, joblib, jsonschema, jupyter, jupyter_client, jupyter_console, jupyter_core, kiwisolver, lxml, markupsafe, matplotlib, mistune, mkl_fft, mkl_random, mock, more_itertools, nbconvert, nbformat, networkx, notebook, numexpr, numpy, olefile, cv2, pandas, pandas_datareader, pandocfilters, parso, patsy, pexpect, pickleshare, PIL, pipenv, plotly, prometheus_client, prompt_toolkit, ptyprocess, pyarrow, pycosat, pycparser, pygments, OpenSSL, pyparsing, pyrsistent, socks, dateutil, pytz, pywt, zmq, qtconsole, requests, retrying, ruamel_yaml, skimage, sklearn, scipy, seaborn, send2trash, simplejson, six, statsmodels, tables, tabulate, termcolor, terminado, testpath, toolz, tornado, tqdm, traitlets, urllib3, virtualenv, wcwidth, webencodings, widgetsnbextension, zipp, ipykernel.pylab.backend_inline"
python -c "import $PI_PKGS"
pkgs=$(echo $PI_PKGS | sed 's/,//g')
for pkg in $pkgs; do
  python -c "from $pkg import *"
done
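
The shell loop gives little feedback when one import fails partway through. As a hedged alternative, the sketch below performs the same kind of warm-up from Python and records which modules could not be imported. The function name preimport and the short module list are placeholders, not the notebook's code.

```python
import importlib


def preimport(module_names):
    """Import each module once; return (loaded, failed) name lists."""
    loaded, failed = [], []
    for name in module_names:
        try:
            importlib.import_module(name)
            loaded.append(name)
        except ImportError:
            failed.append(name)
    return loaded, failed


# Placeholder list; the PI_PKGS list above is much longer
loaded, failed = preimport(["json", "csv", "not_a_real_module_xyz"])
print("loaded:", loaded)
print("failed:", failed)
```

Importing a module once lets Python write its bytecode cache, which is presumably where the cold-boot speedup comes from.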

Finally, set up default fonts for matplotlib.

mkdir -p ~/.config/matplotlib/
echo 'font.family: sans-serif
font.sans-serif: Fira Sans, PT Sans, Open Sans, Roboto, DejaVu Sans, Liberation Sans, sans-serif
font.serif: PT Serif, Noto Serif, DejaVu Serif, Liberation Serif, serif
font.monospace: Fira Mono, Roboto Mono, DejaVu Sans Mono, Liberation Mono, Fixed, Terminal, monospace' > ~/.config/matplotlib/matplotlibrc
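
The same file can be written from Python, which avoids the multi-line shell quoting. This is a sketch under assumptions: the helper write_matplotlibrc is hypothetical, and the target directory is passed in rather than hard-coded, so the demo writes to a temporary directory instead of ~/.config/matplotlib/.

```python
import os
import tempfile

# Font fallbacks copied from the shell cell above
FONT_SETTINGS = {
    "font.family": "sans-serif",
    "font.sans-serif": "Fira Sans, PT Sans, Open Sans, Roboto, "
                       "DejaVu Sans, Liberation Sans, sans-serif",
    "font.serif": "PT Serif, Noto Serif, DejaVu Serif, Liberation Serif, serif",
    "font.monospace": "Fira Mono, Roboto Mono, DejaVu Sans Mono, "
                      "Liberation Mono, Fixed, Terminal, monospace",
}


def write_matplotlibrc(config_dir):
    """Write the font settings to <config_dir>/matplotlibrc; return the path."""
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "matplotlibrc")
    with open(path, "w") as handle:
        for key, value in FONT_SETTINGS.items():
            handle.write("{}: {}\n".format(key, value))
    return path


# Demo in a temporary directory rather than the real config location
demo_path = write_matplotlibrc(os.path.join(tempfile.mkdtemp(), "matplotlib"))
with open(demo_path) as handle:
    content = handle.read()
print(content)
```

Each list is a fallback chain: matplotlib uses the first font in the list that it can find on the system.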

Check size and final tests.

python -V
pip -V
conda -V
du -hsx /

Incremental Additions

pip install vega_datasets

Test

python --version
jupyter kernelspec list
jupyter --version
jupyter --paths

import platform; platform.python_version()
import pip; pip.__version__
import plotly; plotly.__version__
import numpy as np; np.__version__
import matplotlib; matplotlib.__version__
import setuptools; setuptools.__version__
import six; six.__version__
import simplejson; simplejson.__version__
import pandas; pandas.__version__
import scipy; scipy.__version__

Minimal Python 2

Download and install conda.

CONDA_VER="4.8.3"
PYTHON_VER="py27"
file="Miniconda2-${PYTHON_VER}_${CONDA_VER}-Linux-x86_64.sh"
wget -q --show-progress --progress=bar:force -P /results \
  https://repo.continuum.io/miniconda/${file}
bash Miniconda2-py27_4.8.3-Linux-x86_64.sh -b -p /opt/conda

Set up conda, ld, and pip.

# make this the last alphabetically => lowest precedence libraries
echo "/opt/conda/lib" >> /etc/ld.so.conf.d/zz-conda.conf
mkdir -p ~/.conda/pkgs # prevent a warning; -p also creates ~/.conda if missing
conda config --set always_yes True
pip_ver=$(pip --version | sed 's/pip \(.*\) from.*/\1/')
echo "pip >=$pip_ver" > /opt/conda/conda-meta/pinned # prevent pip downgrade
echo "python =2.7" >> /opt/conda/conda-meta/pinned # Stick to Python 2.7
conda update python pip
conda update -yn base conda
conda clean -qtipy
ldconfig
python -V
pip -V
du -hsx /

Default Python 2

Install

apt-get -qq update
DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends \
  libxext6 libhdf5-100
apt-get clean
rm -r /var/lib/apt/lists/*
conda install -c plotly \
  setuptools six simplejson dill pillow pytables h5py \
  plotly matplotlib tqdm termcolor tabulate \
  python-dateutil more-itertools toolz cython cffi attrs decorator jedi \
  numpy scipy patsy statsmodels pandas pandas-datareader seaborn \
  scikit-learn scikit-image \
  jupyter
conda clean -qtipy
ldconfig
# make sure jupyter components are up-to-date
# also add non-anaconda-main packages here (conda-forge packages can be broken)
pip install --upgrade altair pandas pyarrow feather-format \
  pipenv jupyter-client jupyter-core
pip install opencv-python-headless
mkdir -p ~/.config/matplotlib/
echo 'font.family: sans-serif
font.sans-serif: Fira Sans, PT Sans, Open Sans, Roboto, DejaVu Sans, Liberation Sans, sans-serif
font.serif: PT Serif, Noto Serif, DejaVu Serif, Liberation Serif, serif
font.monospace: Fira Mono, Roboto Mono, DejaVu Sans Mono, Liberation Mono, Fixed, Terminal, monospace' > ~/.config/matplotlib/matplotlibrc
python -V
pip -V
jupyter --version
jupyter kernelspec list
jupyter --paths
du -hsx /
PI_PKGS="altair, bleach, certifi, cffi, chardet, cloudpickle, conda, conda_package_handling, cryptography, cycler, cython, cytoolz, dask, decorator, defusedxml, dill, entrypoints, feather, h5py, idna, imageio, importlib_metadata, ipykernel, ipython_genutils, ipywidgets, jedi, jinja2, jsonschema, jupyter, jupyter_client, jupyter_console, jupyter_core, kiwisolver, lxml, markupsafe, matplotlib, mistune, mkl_fft, mkl_random, mock, more_itertools, nbconvert, nbformat, networkx, notebook, numexpr, numpy, olefile, cv2, pandas, pandas_datareader, pandocfilters, parso, patsy, pexpect, pickleshare, PIL, pipenv, plotly, prometheus_client, prompt_toolkit, ptyprocess, pyarrow, pycosat, pycparser, pygments, OpenSSL, pyparsing, pyrsistent, socks, dateutil, pytz, pywt, zmq, qtconsole, requests, retrying, ruamel_yaml, skimage, sklearn, scipy, seaborn, send2trash, simplejson, six, statsmodels, tables, tabulate, termcolor, terminado, testpath, toolz, tornado, tqdm, traitlets, urllib3, virtualenv, wcwidth, webencodings, widgetsnbextension, zipp, ipykernel.pylab.backend_inline"
python -c "import $PI_PKGS"

Test

python --version
jupyter kernelspec list
jupyter --version
jupyter --paths

import platform; platform.python_version()