Error in Terra When Initializing HAIL

I was unable to run the 01-genome-wide-association-study.ipynb notebook in Hail-Notebook-Tutorials. I received the following warning after the hl.init() command:

/opt/conda/lib/python3.10/site-packages/hailtop/aiocloud/aiogoogle/ UserWarning: Reading spark-defaults.conf to determine GCS requester pays configuration. This is deprecated. Please use `hailctl config set gcs_requester_pays/project` and `hailctl config set gcs_requester_pays/buckets`.
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See for further details.
SLF4J: Class path contains SLF4J bindings targeting slf4j-api versions 1.7.x or earlier.
SLF4J: Ignoring binding found at [jar:file:/usr/lib/spark/jars/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See for an explanation.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Running on Apache Spark version 3.3.0
SparkUI available at http://saturn-26616ce0-7b3e-4578-bb69-4567cf490d15-m.c.terra-67a6826d.internal:37893

Then I got the following error when attempting to download the file:

2023-10-03 21:56:25.636 Hail: INFO: downloading 1KG VCF ...
2023-10-03 21:56:28.692 Hail: INFO: importing VCF and writing to matrix table...
Py4JNetworkError                          Traceback (most recent call last)
TypeError: Log._log() got an unexpected keyword argument 'exc_info'

Hi @beneopp !

This looks like a broken Hail installation. Can you describe how you started your Hail cluster and how you installed Hail?

I launched the Jupyter Cloud Environment on Terra. I used the standard Hail application configuration and reduced the number of CPUs to 2. Attached is a screenshot of the configuration I used.

@beneopp hmm. Most folks use clusters, not “Spark single node”, so it’s possible that path is less well tested in Terra.

Can you trigger the error again and send us the Hail .log file? The easiest way to get that is to SSH to the master node of the cluster (always named CLUSTER_NAME-m). If you don’t have the ability to SSH, you’ll have to run gsutil cp ...log gs://your-bucket/logfile from the notebook and then download the file to your laptop.
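If SSH isn't available, the notebook route could look roughly like the sketch below. It assumes the log sits in the notebook's working directory and is named hail-*.log (the pattern of the file attached later in this thread); both helper names are made up for illustration, and gs://your-bucket/logfile is a placeholder for a bucket you can write to.

```python
import glob
import os
import subprocess


def newest_hail_log(directory="."):
    """Return the most recently modified hail-*.log in `directory`, or None."""
    # Assumption: Hail wrote its log into this directory with its usual name.
    candidates = glob.glob(os.path.join(directory, "hail-*.log"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def copy_log(destination, directory=".", run=False):
    """Build the `gsutil cp` command for the newest log; execute it if run=True."""
    log_path = newest_hail_log(directory)
    if log_path is None:
        raise FileNotFoundError(f"no hail-*.log file found in {directory!r}")
    command = ["gsutil", "cp", log_path, destination]
    if run:
        # Requires gsutil on the PATH and write access to the destination bucket.
        subprocess.run(command, check=True)
    return command
```

Calling copy_log("gs://your-bucket/logfile") only returns the command so you can check which file it picked; pass run=True to actually upload it.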

Hail is definitely failing to start properly and the reason why should be in that log file.

Thank you for your prompt reply.

I tried the environment with a “Spark single node” and 4 CPUs, and the starting step worked fine. I think the error might be caused by the machine used. Attached is the log file from when 2 CPUs are used and the error occurs.

In general, I am wondering what makes the “Spark single node” parameter important. I am new to using this tool and don’t know anything about Spark. If it is important, I think it would be helpful to explain this in the Hail-Notebook-Tutorials workspace.


hail-20231004-1700-0.2.120-f00f916faf78.log (21.6 KB)

@beneopp, which workspace are you referring to? The Hail team doesn’t own any workspaces; those are produced by Terra. I can ask them to update the workspaces, though, if there is specific feedback.

We have some general advice on using the cloud in the Hail docs. In general, people use “Spark clusters”. Google has a “Dataproc” product which allows you to start and stop clusters. We provide a tool that helps you do that called hailctl dataproc.

In Terra, you have to use their UI instead. For most analyses, you don’t want to use “Spark single node” because that means you’re using one “node” (aka “VM” aka “computer”) rather than a cluster of nodes. That means you’re limited to the number of cores on that one computer. Most sequencing datasets are too large to wait for a single computer to read all the data.
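To make the core limit concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which each core works on one partition at a time and all partitions take about the same time; the partition count of 1,000 and the helper name task_waves are hypothetical, not taken from any real dataset or from Hail.

```python
import math
import os


def task_waves(n_partitions, n_cores):
    """Sequential rounds needed when at most n_cores partitions are processed at once."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return math.ceil(n_partitions / n_cores)


# Cores visible on the machine running this notebook (the single node).
local_cores = os.cpu_count() or 1
print(f"this machine has {local_cores} cores")

# Hypothetical dataset split into 1,000 partitions.
n_partitions = 1000
for cores in (2, 4, 64):
    print(f"{cores} cores -> {task_waves(n_partitions, cores)} rounds of work")
```

Under this model, going from a 2-core single node to a cluster with many more cores cuts the number of sequential rounds roughly in proportion to the core count.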

Hmm, this log indicates that you have two SparkContexts and that one of them is stopped and the other is active. This shouldn’t normally happen. If you start a new notebook from scratch and run

import hail as hl
hl.balding_nichols_model(1, 10, 10).show(10, 10)

Do you see a matrix of genotypes? If not, can you copy all the output you see here? We need to sort out why your notebook somehow has two SparkContexts.