How do I add Python libraries to a hailctl dataproc cluster?

I ran Hail on the cloud using hailctl dataproc, but I had trouble installing Python packages.

How do I install packages like:

  • seaborn;
  • matplotlib;
  • imblearn;
  • scikit-learn?

I have already searched for a solution, but I could not work out how to apply what I found to hailctl.

We should document this better on the website. Right now hailctl dataproc's documentation is mostly found by using -h on the command line.

If you run hailctl dataproc start -h, you’ll see that one of the options is --packages.

Based on that, I think this should work (replace CLUSTER_NAME with the name of your cluster):

hailctl dataproc start CLUSTER_NAME --packages seaborn,matplotlib,imblearn,scikit-learn
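
Once the cluster is up, one way to confirm that the packages were actually installed is to check whether they can be imported from a notebook or a submitted script. Here is a minimal sketch, not something hailctl provides: missing_modules is a helper name I made up, and the list uses import names rather than pip names (scikit-learn is imported as sklearn).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not pip names: scikit-learn is imported as "sklearn".
wanted = ["seaborn", "matplotlib", "imblearn", "sklearn"]
print("missing:", missing_modules(wanted))
```

If the list printed after "missing:" is empty, all four packages are available to the Python interpreter that ran the check.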

Thanks, all worked well!