Hail tutorials work, but otherwise hail does not import

Hi,
I am having problems importing the hail package into Python.

Weirdly, I can run the Jupyter Hail tutorials with no problem.

However, creating a new Jupyter notebook in that same tutorials folder and trying to import hail there produces the error
ImportError: No module named hail

Running interactive Python from elsewhere and attempting to import hail produces the same ‘No module named hail’ error.

I have set SPARK_HOME and PATH variables as in the ‘Getting started’ instructions, and they now contain the locations of the relevant packages.

I generally use Anaconda environments for running Python. Is there a way to easily make Hail available within an Anaconda environment that I have created?

Help is appreciated, many thanks!

Since we haven’t registered Hail on PyPI, I don’t think there’s an easy way to write a conda environment YAML file that sets everything up automatically.

The problem is probably that the Hail Python zip isn’t on your Python path – are you using the jhail script in the distribution? That script sets it up for you. Otherwise, you’ll need to add the Hail Python library to the PYTHONPATH environment variable yourself.
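A quick way to check what your interpreter actually sees is to print the module search path from the shell (nothing Hail-specific here):

python -c 'import sys; print("\n".join(sys.path))'

If none of those entries points at the Hail Python library (directory or zip), the import will fail with exactly the error you’re seeing.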

Let me know if this is confusing (I’ve slightly confused myself).

Yes, I was using the jhail command, I had forgotten that.

What exactly should I add to my PYTHONPATH variable? This variable is indeed empty right now, since setting it is discouraged if you are using Anaconda. However, I am not afraid to change it.

Thank you for the almost instantaneous response!

Hmm, interesting – I should really look into what Anaconda prefers; it’s a great Python installation and we should totally play nicely with it. I use Anaconda and PYTHONPATH on my local machine and everything seems to work (I have Hail and py4j in my PYTHONPATH, and they persist between virtual envs).

Here’s what mine looks like:

export PYTHONPATH="$PYTHONPATH:$HAIL_HOME/python:$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.3-src.zip"

It sounds like you have the Python .zip, so in your case the $HAIL_HOME/python above should instead be the path to that .zip.

Also, if you have a different Spark, the py4j version might be different.
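For example, something like this (hail-python.zip is an assumption – use whatever the Python zip in your distribution is actually named – and the ls just shows which py4j zip your Spark ships, so you don’t have to hard-code 0.10.3):

export PYTHONPATH="$PYTHONPATH:$HAIL_HOME/python/hail-python.zip:$SPARK_HOME/python"
ls "$SPARK_HOME"/python/lib/py4j-*-src.zip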

I really like Anaconda. They say it’s OK to combine it with a populated PYTHONPATH variable, but that it can sometimes cause problems, depending on what you’re doing.

So I have HAIL_HOME and SPARK_HOME variables set already, according to the getting started instructions.

Are you saying those might need to be altered, or do you think I can copy what you have, except for the ‘python/lib/py4j-0.10.3-src.zip’ part? (Or maybe even that will be the same for me; I just have to look and see what corresponds on my system.)

I should have the ‘correct’ Spark; I hadn’t used Spark before, so I just downloaded the one that is supposed to match Hail.

Thanks again!

Oh fun.

I copied what you had exactly, and I got a new error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Applications/hail/python/hail/__init__.py", line 1, in <module>
    import hail.expr
  File "/Applications/hail/python/hail/expr.py", line 2, in <module>
    from hail.java import scala_object, Env, jset
  File "/Applications/hail/python/hail/java.py", line 7, in <module>
    from decorator import decorator
ImportError: No module named decorator

Oh, good – just pip install decorator.

(I thought decorator was bundled with Anaconda… I guess not)
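If it’s missing from a particular conda environment, installing it into the active environment should do it – decorator is available through Anaconda’s default channel, so either of these should work:

conda install decorator
pip install decorator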

I see Decorator 4.1.2 seems to be available through Anaconda. Is a specific version required? Thanks again.

I think anything will work.

OK, got decorator.

This now doesn’t throw an error:

from hail import *

But this does:

hc = HailContext()

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<decorator-gen-2>", line 2, in __init__
  File "/Applications/hail/python/hail/typecheck/check.py", line 245, in _typecheck
    return f(*args, **kwargs)
  File "/Applications/hail/python/hail/context.py", line 88, in __init__
    parquet_compression, min_block_size, branching_factor, tmp_dir)
TypeError: 'JavaPackage' object is not callable

OK, that probably means that the Hail jar isn’t accessible on your classpath. You’ll need to set SPARK_CLASSPATH.

I just found the script used in jhail/ihail/etc. It sets these:

export HAIL_HOME="$(dirname "$(cd "$(dirname $0)" && pwd)")"
export PYTHONPATH="$HAIL_HOME/python:$SPARK_HOME/python:$(echo ${SPARK_HOME}/python/lib/py4j-*-src.zip | tr '\n' ':')$PYTHONPATH"
export SPARK_CLASSPATH=$HAIL_HOME/jars/hail-all-spark.jar

Since you’ve got the other two working, just copy in the SPARK_CLASSPATH one.
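Putting it all together, a complete setup in your shell profile might look something like this (the SPARK_HOME value is a placeholder for wherever you unpacked Spark; /Applications/hail is just the location from your tracebacks):

export HAIL_HOME=/Applications/hail
export SPARK_HOME=/path/to/your/spark
export PYTHONPATH="$HAIL_HOME/python:$SPARK_HOME/python:$(echo ${SPARK_HOME}/python/lib/py4j-*-src.zip | tr '\n' ':')$PYTHONPATH"
export SPARK_CLASSPATH="$HAIL_HOME/jars/hail-all-spark.jar"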

Yeah it worked:

Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.1-b374ef3

Thanks very much!!

Great! Good to know what got you stuck; we’ll try to clarify the docs / Getting Started page in the future.

It just comes down to those pesky environment variables. Perhaps you could include something about them in a troubleshooting section? Also a note that people should not assume that, because the tutorials run, everything is set up?
I’m looking forward to messing with some VDSs now. Thanks again.

Yeah, that’s a good idea.

Just to add: I think decorator does come with Anaconda (I had an old version), but even so, I needed to add it to the specific Anaconda environment I was working in for it to be available.

How did you get Hail?

The files in the scripts folder are not meant to be used with the source distribution of Hail (from git / GitHub). Those files are used by the packaging mechanism to produce the pre-packaged Hail distributions.

If you downloaded the pre-built Hail distribution from the Getting Started page and you encountered these problems, then we should understand why and fix the pre-built distribution.

I strongly recommend using the pre-built distribution mentioned on the Getting Started page.
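For what it’s worth, those launcher scripts set all of these variables themselves before starting Jupyter or IPython, which is why the tutorials worked even before your environment was configured. Assuming they live in a bin/ directory inside the distribution (the HAIL_HOME computation above suggests they sit one level below it), you can launch via:

$HAIL_HOME/bin/jhail   # Jupyter notebook with Hail already set up
$HAIL_HOME/bin/ihail   # interactive Python with Hail already set up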

I did get it from the Getting Started page, not from GitHub.