is.hail.kryo.HailKryoRegistrator ClassNotFoundException

atebbe · April 26, 2018, 3:07pm

We are getting an exception when calling the hl.import_vcf() function. The hl.init() function completes without issue, which makes me think the hail jar is loading properly. The import_vcf function is the first one we are running that creates a job on the cluster. We are running this on an AWS EMR cluster with hail (master branch), and spark 2.2.1.

Exception from task attempt logs:
org.apache.spark.SparkException: Failed to register classes with Kryo
Caused by: java.lang.ClassNotFoundException: is.hail.kryo.HailKryoRegistrator

Relevant settings in spark-defaults.conf
spark.driver.extraClassPath …:./hail-all-spark.jar
spark.executor.extraClassPath …:./hail-all-spark.jar
spark.kryo.registrator is.hail.kryo.HailKryoRegistrator
spark.serializer org.apache.spark.serializer.KryoSerializer

tpoterba · April 26, 2018, 4:14pm

I suspect that this problem is caused by the hail jar not being visible on the Spark workers. hl.init() will fail if the driver isn’t configured correctly, but if the workers are misconfigured, I’d expect a java.lang.ClassNotFoundException at the first point where Hail classes are loaded on the worker machines.

I think maybe you need to pass --jars or the appropriate config in the spark-defaults to get the jar to ship correctly to the workers.
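As a rough illustration of the "appropriate config" mentioned above, here is a minimal sketch of the properties involved. It assumes that `spark.jars` (the property behind `--jars`) is the right mechanism for this cluster, which the thread does not confirm; the helper names `hail_spark_settings` and `as_submit_args` are hypothetical, and the jar path is taken from a later post.

```python
def hail_spark_settings(jar_path):
    """Return Spark properties intended to ship the Hail jar to the workers.

    `spark.jars` is Spark's comma-separated list of jars to distribute to
    the driver and executors. The two Kryo keys are copied from the
    spark-defaults.conf shown in the original post.
    """
    return {
        "spark.jars": jar_path,
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryo.registrator": "is.hail.kryo.HailKryoRegistrator",
    }


def as_submit_args(settings):
    """Render the settings as `--conf key=value` command-line arguments."""
    args = []
    for key in sorted(settings):
        args += ["--conf", f"{key}={settings[key]}"]
    return args


if __name__ == "__main__":
    # Assumed jar location; adjust to wherever the jar lives on the driver.
    settings = hail_spark_settings("/home/hadoop/hail-all-spark.jar")
    print(" ".join(as_submit_args(settings)))
```

The same dictionary could also be written into spark-defaults.conf, or applied with `SparkConf().setAll(list(settings.items()))` before the SparkContext is created. Whether a Toree kernel picks these up depends on how the kernel launches Spark.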

danking · April 27, 2018, 2:18pm

Hi @atebbe,

I’m sorry you’re running into this issue! Since hl.init() executed successfully, I suspect the Hail jar is located in the correct location on the driver node. However, the import_vcf function must actually communicate with the executors (worker nodes). Based on the error message, I suspect hail-all-spark.jar is not located in the current working directory of the Spark processes on your executors. If you are using spark-shell, are you also passing the --jars parameter? If you’re not using spark-shell what command are you using to start interacting with the cluster?

atebbe · April 27, 2018, 5:13pm

We are interacting with hail using the Apache Toree - Pyspark kernel for Jupyter. The jar is on the namenode of the cluster in /home/hadoop. The following is the first cell of our notebooks:

sc.addFile('/home/hadoop/hail-all-spark.jar')
sc.addPyFile('/home/hadoop/hail-python.zip')
import hail as hl
hl.init(sc)

My assumption was sc.addFile was adding the jar to hdfs. This worked fine with 0.1 - this is our first attempt with 0.2.

tpoterba · April 27, 2018, 5:36pm

Does sc.addJar have different semantics from sc.addFile? Maybe try that?

atebbe · April 30, 2018, 12:25pm

I get an error that addJar does not exist

sc.addJar('/home/hadoop/hail-all-spark.jar')
sc.addPyFile('/home/hadoop/hail-python.zip')
…

Name: org.apache.toree.interpreter.broker.BrokerException
Message: Traceback (most recent call last):
File "/tmp/kernel-PySpark-583e9b04-451e-4331-9e04-300634d28644/pyspark_runner.py", line 194, in
eval(compiled_code)
File "", line 1, in
AttributeError: 'SparkContext' object has no attribute 'addJar'

tpoterba · April 30, 2018, 2:15pm

Ah, okay. addJar must have been something from Spark 1.X that was removed in 2.0. This syntax looks right…

danking · May 4, 2018, 2:41pm

Sorry for the long delay on my reply, @atebbe

Let’s recall your spark class path settings:

spark.driver.extraClassPath …:./hail-all-spark.jar
spark.executor.extraClassPath …:./hail-all-spark.jar

These assert that, on both the driver and the executors, the jar is located in the working directory of the respective Spark process. If you ssh to one of your executors and find the Spark job working directory (try looking in /var/run/spark/work), I suspect you will not find hail-all-spark.jar in that directory. While you’re at it, can you open a terminal in your Jupyter notebook and verify that hail-all-spark.jar is indeed in the working directory of your driver?

This StackOverflow post suggests that addFile is inappropriate for “runtime dependencies”.

So. Assuming the jar is indeed missing from the working directory of your executors, we need to figure out how to get it there.

First, try sc._jsc.addJar instead of sc.addFile.

If that fails, Apache Toree suggests using the %AddJar magics invocation to add a jar.
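For reference, the first suggestion could look roughly like the sketch below when applied to the notebook cell from earlier in the thread. The helper name `add_hail_artifacts` is hypothetical. `sc._jsc` is the JavaSparkContext that PySpark wraps; the leading underscore marks it as private API, so this is a workaround that may break between Spark releases.

```python
def add_hail_artifacts(sc, jar_path, zip_path):
    """Register the Hail jar and Python zip on an existing SparkContext.

    `sc` is expected to be a live pyspark SparkContext. The jar goes
    through the underlying JavaSparkContext because the Python
    SparkContext has no addJar method of its own.
    """
    sc._jsc.addJar(jar_path)  # JVM side: make the jar available to executors
    sc.addPyFile(zip_path)    # Python side: ship the Hail Python package


# Intended use as the first notebook cell (paths taken from the thread):
#
#   add_hail_artifacts(sc,
#                      '/home/hadoop/hail-all-spark.jar',
#                      '/home/hadoop/hail-python.zip')
#   import hail as hl
#   hl.init(sc)
```

The only change from the original cell is that the jar is added with `sc._jsc.addJar` rather than `sc.addFile`.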

atebbe · May 4, 2018, 7:09pm

Thanks for following up on this. sc._jsc.addJar did the trick! My worker nodes don’t have /var/run/spark. I searched for the jar on the entire filesystem of the worker node and did not find it. Is it recommended to use _jsc?

Thanks,

Adam

tpoterba · May 4, 2018, 7:10pm

I don’t know why they don’t expose it in Python! Clearly it’s necessary…
