No module named 'matplotlib' on hailctl dataproc

I can't seem to run
import matplotlib.pyplot as plt
on a Hail Dataproc cluster. NumPy, SciPy, and other packages work, and matplotlib is installed on my own computer. Is there somewhere on the VM where I can run pip install matplotlib?


You can pass packages to install at cluster creation time with --pkgs, e.g. --pkgs matplotlib.

If you're using a Jupyter notebook running on an existing cluster, you can run shell commands in a notebook cell:

! pip install matplotlib

I ran
! pip install matplotlib
and got this error:
/bin/sh: 1: pip: not found

Try python3 -m pip install matplotlib?

I get a similar error: there is no module named pip. I also tried taking out the pip, after which the error was that there is no module named install.

I am currently restarting the cluster with the --pkgs matplotlib flag.

Looking at our init scripts, it appears the Python installation running the notebook is /opt/conda/default/bin/python. So I think /opt/conda/default/bin/python -m pip install matplotlib should work
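
If you'd rather not hard-code that path, a rough sketch of the same idea from inside a notebook cell is below. It asks the running kernel for its own interpreter via sys.executable and runs pip through that. The helper names (pip_install_command, pip_is_available, install) are made up for illustration and are not part of Hail, and I'm assuming pip is actually present in that conda environment, which the check guards against.

```python
import importlib.util
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel."""
    # sys.executable is the Python binary behind the notebook, so the
    # package lands in the same environment the notebook imports from.
    return [sys.executable, "-m", "pip", "install", package]


def pip_is_available():
    """Return True if the current interpreter can import pip."""
    return importlib.util.find_spec("pip") is not None


def install(package):
    """Install a package into the current interpreter's environment."""
    if not pip_is_available():
        # Assumption: fail loudly rather than try to bootstrap pip.
        raise RuntimeError(f"pip is not installed for {sys.executable}")
    subprocess.check_call(pip_install_command(package))


# In a notebook cell (hypothetical usage):
# print(sys.executable)
# install("matplotlib")
```

Printing sys.executable first is a quick way to confirm which Python the notebook is really using. Note this only installs on the node running the notebook; --pkgs at cluster creation is still the way to get a package onto every node.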

Restarting the cluster with the --pkgs flag worked, thank you so much!