Hail 0.2 class not found exception on EMR


#1

We are getting an error when trying to import a large(ish) VCF file:

sc._jsc.addJar('/home/hadoop/hail-all-spark.jar')
sc.addPyFile('/home/hadoop/hail-python.zip')

import hail as hl
import hail.expr.aggregators as agg
hl.init(sc)

ds3 = hl.import_vcf('s3://out_bucket/ALL.chr*.b38_AIMs.vcf.bgz', reference_genome='GRCh38')

I would note that we are also the ones who had this issue, which is likely related. It seems like the worker nodes still don't have the jar on the classpath. I confirmed that the class in question is indeed present in the jar file that we have on the master node of the cluster.
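For anyone else debugging this, here is a minimal sketch of that confirmation step: a jar is just a zip, so you can check for the class entry with the standard library (the path in the comment is where our jar happens to live; adjust for your cluster).

```python
import zipfile

def jar_has_class(jar_path, class_name):
    """Return True if the jar contains a .class entry for the given class."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

# e.g., on the master node:
# jar_has_class("/home/hadoop/hail-all-spark.jar",
#               "is.hail.utils.SerializableHadoopConfiguration")
```

Running this on each worker node (not just the master) is the quickest way to tell whether the jar actually made it out to the executors.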

I saw similar issues here:


Stack Trace
Name: org.apache.toree.interpreter.broker.BrokerException
Message: Traceback (most recent call last):
File "/tmp/kernel-PySpark-381de7a9-f2fa-4e85-aac0-9f64e8dabc0d/pyspark_runner.py", line 194, in <module>
eval(compiled_code)
File "<string>", line 2, in <module>
File "<string>", line 2, in import_vcf
File "/mnt/tmp/spark-91033a4b-a19c-40a0-9485-9c070d20705d/userFiles-7fa922d8-4d81-4f59-ade1-72de2efa3087/hail-python.zip/hail/typecheck/check.py", line 490, in typecheck
return orig_func(*args_, **kwargs_)
File "/mnt/tmp/spark-91033a4b-a19c-40a0-9485-9c070d20705d/userFiles-7fa922d8-4d81-4f59-ade1-72de2efa3087/hail-python.zip/hail/methods/impex.py", line 1666, in import_vcf
joption(rg), joption(contig_recoding))
File "/usr/lib/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/mnt/tmp/spark-91033a4b-a19c-40a0-9485-9c070d20705d/userFiles-7fa922d8-4d81-4f59-ade1-72de2efa3087/hail-python.zip/hail/utils/java.py", line 196, in deco
'Error summary: %s' % (deepest, full, hail.version, deepest)) from None
hail.utils.java.FatalError: ClassNotFoundException: is.hail.utils.SerializableHadoopConfiguration

Java stack trace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 8.0 failed 4 times, most recent failure: Lost task 1.3 in stage 8.0 (TID 71, ip-10-66-115-121.goldfinch.lan, executor 7): java.io.IOException: com.esotericsoftware.kryo.KryoException: Error during Java deserialization.
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1310)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at is.hail.io.vcf.LoadVCF$$anonfun$apply$3.apply(LoadVCF.scala:798)
at is.hail.io.vcf.LoadVCF$$anonfun$apply$3.apply(LoadVCF.scala:797)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:918)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:918)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.esotericsoftware.kryo.KryoException: Error during Java deserialization.
at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:65)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:246)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$8.apply(TorrentBroadcast.scala:293)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1337)
at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:294)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:226)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
… 19 more
Caused by: java.lang.ClassNotFoundException: is.hail.utils.SerializableHadoopConfiguration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:681)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1858)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1744)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2032)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1566)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:426)
at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:63)
… 26 more

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1708)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1696)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1695)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1695)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:855)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:855)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1923)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1878)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1867)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:671)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2029)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2069)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:918)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:916)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.foreach(RDD.scala:916)
at is.hail.io.vcf.LoadVCF$.apply(LoadVCF.scala:797)
at is.hail.HailContext$$anonfun$importVCFs$2.apply(HailContext.scala:564)
at is.hail.HailContext$$anonfun$importVCFs$2.apply(HailContext.scala:562)
at is.hail.HailContext.forceBGZip(HailContext.scala:532)
at is.hail.HailContext.importVCFs(HailContext.scala:562)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)

Hail version: devel-24923b8
Error summary: ClassNotFoundException: is.hail.utils.SerializableHadoopConfiguration

StackTrace: org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
scala.Option.foreach(Option.scala:257)
org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:280)
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
py4j.commands.CallCommand.execute(CallCommand.java:79)
py4j.GatewayConnection.run(GatewayConnection.java:214)
java.lang.Thread.run(Thread.java:748)


#2

This error means that the Hail jar is visible on the driver but has not been successfully distributed to the worker nodes.

sc._jsc.addJar('/home/hadoop/hail-all-spark.jar') probably doesn't distribute the jar because it's a local path.

Which version of Spark are you using? If you're on 2.3.0+, we can make this easier by using our HTTPS paths.


#3

We are currently on Spark 2.2.0 because I believe I had issues with 2.3.0, but I'm happy to try again if it will make life easier. Do I need to manually add the jar to HDFS and reference it from that path instead?


#4

Er, sorry, we need to deploy 2.3.0 jars first. I'll report back on this after tomorrow.


#5

Is there any update on this? Did you deploy the 2.3.0 jars?


#6

Ah, thanks for the prompt. We discussed this and agreed that we need to start deploying to AWS, but haven’t totally set things up yet.


#7

That’s great to hear! Do you have any tentative timeline for this? We really need to make progress on our project and have deadlines approaching…


#8

Ah, sorry to leave this over the weekend. I don’t have a timeline right now, but I’ll try to find a temporary workaround. This will probably involve compiling one-off jars for Spark 2.3.0, having you download them and put them in S3, then using those URIs during Spark startup.


#9

I made a 2.3.0 jar and zip:

gs://hail-common/hail-ae9e34fb3cbf.zip
gs://hail-common/hail-ae9e34fb3cbf-spark-2.3.0.jar
https://storage.googleapis.com/hail-common/hail-ae9e34fb3cbf.zip
https://storage.googleapis.com/hail-common/hail-ae9e34fb3cbf-spark-2.3.0.jar

#10

I’m now rereading and realizing you’re on Spark 2.2.0, which is great because it means we can use the deployed GS jars, though you’ll need to copy them over.

gs://hail-common/builds/devel/jars/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar
gs://hail-common/builds/devel/python/hail-devel-ae9e34fb3cbf.zip

https://storage.googleapis.com/hail-common/builds/devel/jars/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar
https://storage.googleapis.com/hail-common/builds/devel/python/hail-devel-ae9e34fb3cbf.zip

If you download these files and put them in S3, then you should be able to use addPyFile and _jsc.addJar easily.


#11

Hi Tim,

Thanks for your continued help with this. I made the following change to my code after transferring the jar and zip files to one of our s3 buckets:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')
sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

import hail as hl
import hail.expr.aggregators as agg
hl.init(sc)

I also created /mnt/tmp in hdfs with 777 perms.

Running that block above produces the following error:

Name: org.apache.toree.interpreter.broker.BrokerException
Message: Traceback (most recent call last):
  File "/tmp/kernel-PySpark-66f08197-35ee-41a5-a533-a8f67337ca3c/pyspark_runner.py", line 194, in <module>
    eval(compiled_code)
  File "<string>", line 5, in <module>
  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper
    return f(*args_, **kwargs_)
  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 158, in init
    default_reference)
  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper
    return f(*args_, **kwargs_)
  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 53, in __init__
    min_block_size, branching_factor, tmp_dir)
  File "/usr/lib/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/usr/lib/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 323, in get_return_value
    format(target_id, ".", name, value))
py4j.protocol.Py4JError: An error occurred while calling z:is.hail.HailContext.apply. Trace:
 py4j.Py4JException: Method apply([class org.apache.spark.SparkContext, class java.lang.String, class scala.None$, class java.lang.String, class java.lang.String, class java.lang.Boolean, class java.lang.Boolean, class java.lang.Integer, class java.lang.Integer, class java.lang.String]) does not exist
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:339)
        at py4j.Gateway.invoke(Gateway.java:274)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)

StackTrace: org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
scala.Option.foreach(Option.scala:257)
org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:280)
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
py4j.commands.CallCommand.execute(CallCommand.java:79)
py4j.GatewayConnection.run(GatewayConnection.java:214)
java.lang.Thread.run(Thread.java:748)

#12

You don't have other Hail packages around, right? This error usually means that there's a mismatch between the Hail Python files and the jar.
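One way to check for stray copies, sketched with the standard library (nothing Hail-specific is assumed here beyond the module name): ask Python where it would import hail from, without actually importing it, and make sure that path is the zip you added with addPyFile rather than an older install.

```python
import importlib.util

def locate_module(name):
    """Return the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return getattr(spec, "origin", None)

# e.g. locate_module("hail") should point into the single hail-*.zip added
# via sc.addPyFile, not a leftover copy elsewhere on sys.path.
```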


#13

I cleaned up one copy from HDFS and one from the Hadoop user's home directory on the master node, and now I get the following:

Name: org.apache.toree.interpreter.broker.BrokerException
Message: Traceback (most recent call last):
File "/tmp/kernel-PySpark-c09cc1bc-fc7d-4dc6-a4ab-6b54eeed1999/pyspark_runner.py", line 194, in <module>
eval(compiled_code)
File "<string>", line 5, in <module>
File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper
return f(*args_, **kwargs_)
File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 158, in init
default_reference)
File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper
return f(*args_, **kwargs_)
File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 53, in __init__
min_block_size, branching_factor, tmp_dir)
TypeError: 'JavaPackage' object is not callable

StackTrace: org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
scala.Option.foreach(Option.scala:257)
org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:280)
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
py4j.commands.CallCommand.execute(CallCommand.java:79)
py4j.GatewayConnection.run(GatewayConnection.java:214)
java.lang.Thread.run(Thread.java:748)


#14

TypeError: 'JavaPackage' object is not callable normally happens when the Hail jar file is not found or fails to load. The sc.addJar documentation says it "Adds a JAR dependency for all tasks to be executed on this SparkContext in the future", but that means tasks (workers), not the driver, so you also need to add the jar to the driver. How are you invoking the driver? You may need to set, for example, spark.driver.extraClassPath.
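For illustration, a hedged sketch of what that might look like in spark-defaults.conf (the paths are hypothetical; keep any existing entries your distribution already sets). Note that extraClassPath entries are plain filesystem paths resolved on each node at JVM startup, so an s3:// URI listed there will not be loaded:

```
spark.driver.extraClassPath   /home/hadoop/hail-all-spark.jar:<existing entries>
spark.executor.extraClassPath ./hail-all-spark.jar:<existing entries>
```

The executor-side relative path works because jars shipped via sc.addJar (or --jars) are fetched into each YARN container's working directory.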


#15

I tried using ./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar, and I also put the devel jar in /home/hadoop under the standard name. The following are the relevant settings in my spark-defaults.conf:

spark.driver.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native

spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

bash$ ls -l /home/hadoop/
total 28108
-rw-rw-r-- 1 hadoop hadoop 27424722 Jul 2 11:52 hail-all-spark.jar
-rw-rw-r-- 1 hadoop hadoop  1196210 Jul 2 11:52 hail-python.zip
-rw-rw-r-- 1 hadoop hadoop    21055 Jul 5 21:22 Untitled.ipynb

It inits fine:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')
sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

import hail as hl
import hail.expr.aggregators as agg
hl.init(sc)

When I run the following command:

ds = hl.import_vcf('s3://gfb-registry-raw/controls/1000genomes/human/dna/vcf.b38/AIMs/ALL*.vcf.bgz', reference_genome='GRCh38')

I still get the exception:

hail.utils.java.FatalError: ClassNotFoundException: is.hail.utils.SerializableHadoopConfiguration

I noticed that if I don't have ./hail-all-spark.jar in the classpaths in spark-defaults.conf, the init fails. Any other suggestions?

Thanks,

Adam


#16

I checked the task attempt logs from YARN and found the following. It looks like the executor is downloading the jar from S3. I also put the jar in /home/hadoop on all nodes in the cluster.

18/07/07 01:39:55 INFO Executor: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip with timestamp 1530925905396
18/07/07 01:39:57 INFO S3NativeFileSystem: Opening 's3://gfb-hail/hail-devel-ae9e34fb3cbf.zip' for reading
18/07/07 01:39:57 INFO Utils: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp4806657294271057220.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/3525771591530925905396_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-devel-ae9e34fb3cbf.zip
18/07/07 01:39:57 INFO Executor: Fetching spark://10.66.117.154:39293/jars/toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar with timestamp 1530925892206
18/07/07 01:39:57 INFO TransportClientFactory: Successfully created connection to /10.66.117.154:39293 after 0 ms (0 ms spent in bootstraps)
18/07/07 01:39:57 INFO Utils: Fetching spark://10.66.117.154:39293/jars/toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp2212807853478377914.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/-10019953601530925892206_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar
18/07/07 01:39:57 INFO Executor: Adding file:/mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar to class loader
18/07/07 01:39:57 INFO Executor: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar with timestamp 1530925903635
18/07/07 01:39:57 INFO S3NativeFileSystem: Opening 's3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar' for reading
18/07/07 01:39:57 INFO Utils: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp4183466651204311484.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/-14685433961530925903635_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar
18/07/07 01:39:57 INFO Executor: Adding file:/mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar to class loader
18/07/07 01:39:57 INFO Executor: Fetching spark://10.66.117.154:39293/jars/hail-all-spark.jar with timestamp 1530926976355
18/07/07 01:39:57 INFO Utils: Fetching spark://10.66.117.154:39293/jars/hail-all-spark.jar to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp8017925226355738872.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/20445175021530926976355_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-all-spark.jar
18/07/07 01:39:57 INFO Executor: Adding file:/mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-all-spark.jar to class loader
18/07/07 01:39:58 INFO TorrentBroadcast: Started reading broadcast variable 17
18/07/07 01:39:58 INFO TransportClientFactory: Successfully created connection to /10.66.117.154:46805 after 1 ms (0 ms spent in bootstraps)
18/07/07 01:39:58 INFO MemoryStore: Block broadcast_17_piece0 stored as bytes in memory (estimated size 1940.0 B, free 2.8 GB)
18/07/07 01:39:58 INFO TorrentBroadcast: Reading broadcast variable 17 took 113 ms
18/07/07 01:39:58 INFO MemoryStore: Block broadcast_17 stored as values in memory (estimated size 4.8 KB, free 2.8 GB)
18/07/07 01:39:58 INFO TorrentBroadcast: Started reading broadcast variable 15
18/07/07 01:39:58 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes in memory (estimated size 23.6 KB, free 2.8 GB)
18/07/07 01:39:58 INFO TorrentBroadcast: Reading broadcast variable 15 took 5 ms
18/07/07 01:39:58 INFO TorrentBroadcast: Started reading broadcast variable 15
18/07/07 01:39:58 ERROR Utils: Exception encountered
com.esotericsoftware.kryo.KryoException: Error during Java deserialization.
at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:65)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:246)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$8.apply(TorrentBroadcast.scala:293)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1337)
at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:294)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:226)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at is.hail.io.vcf.LoadVCF$$anonfun$apply$3.apply(LoadVCF.scala:836)
at is.hail.io.vcf.LoadVCF$$anonfun$apply$3.apply(LoadVCF.scala:835)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:918)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:918)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: is.hail.utils.SerializableHadoopConfiguration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:685)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1865)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1748)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2039)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1570)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:430)
at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:63)
… 26 more
18/07/07 01:39:58 INFO TorrentBroadcast: Reading broadcast variable 15 took 1 ms
18/07/07 01:39:58 ERROR Utils: Exception encountered


#17

OK, let’s review the relevant portions of the class paths:

spark.driver.extraClassPath ...:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar
spark.executor.extraClassPath ...:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

AFAIK, the JDK has no idea what an s3:// URL is, so adding one as a classpath entry does not help.

Second point to keep in mind: relative entries in spark.driver.extraClassPath are resolved against the working directory of the driver, i.e. wherever Spark was started. Moreover, addJar and addFile do not copy the file to the working directory of the driver (only to the working directories of the executors).

Action: Let’s remove the s3:// paths.


OK, so, this works:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')
sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')
import hail as hl
import hail.expr.aggregators as agg
hl.init(sc)

This means that we’re successfully starting a hail context on the driver node. Ergo, Spark is finding the hail JAR on the driver.
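One way (an assumption on my part, not something tried in this thread) to guarantee that each executor container ends up with a jar whose file name matches the relative classpath entry is to ship the jar via spark.jars: Spark fetches everything listed there into each executor's working directory under its original file name, which is exactly where a relative ./hail-all-spark.jar entry is resolved. A sketch of the relevant spark-defaults.conf lines (the /home/hadoop path is a placeholder, and the ... stands for the existing EMR entries):

spark.jars                    /home/hadoop/hail-all-spark.jar
spark.driver.extraClassPath   ...:./hail-all-spark.jar
spark.executor.extraClassPath ...:./hail-all-spark.jar

The key point is that the file name in spark.jars and the relative entry in extraClassPath must agree.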


This fails:

ds = hl.import_vcf('s3://gfb-registry-raw/controls/1000genomes/human/dna/vcf.b38/AIMs/ALL*.vcf.bgz',
                   reference_genome='GRCh38')

because there is no Hail JAR on the executor class path. Recall:

spark.executor.extraClassPath ...:./hail-all-spark.jar

There is a jar on the worker nodes. It’s named hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar. Instead of adding the JAR from s3://, can you try sc.addJar('./hail-all-spark.jar')?
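As a quick sanity check before wiring a jar into the classpath, you can confirm with Python's stdlib that it actually contains the missing class. This is a hypothetical helper, not from the thread; the demonstration runs against an in-memory jar so the sketch is self-contained, but on a cluster node you would point it at the real path, e.g. './hail-all-spark.jar':

```python
import io
import zipfile

def jar_contains_class(jar, class_name):
    """Return True if the jar has an entry for the given fully qualified class."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar) as zf:
        return entry in zf.namelist()

# Build a tiny in-memory "jar" (a jar is just a zip) with one fake class entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("is/hail/utils/SerializableHadoopConfiguration.class", b"\xca\xfe\xba\xbe")

print(jar_contains_class(buf, "is.hail.utils.SerializableHadoopConfiguration"))  # True
print(jar_contains_class(buf, "is.hail.utils.Missing"))  # False
```

Running this on the master node and on a worker node against the same path tells you whether the jar the executors see is actually the one you built.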


#18

Hi Dan,

Thank you for following up on this thread – I really appreciate your help. Unfortunately, I’m still running into the same error.

I launched a new cluster with a cleaned-up spark-defaults.conf:

spark.driver.extraClassPath /usr/lib/hadoop-lzo/lib/:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/:/usr/share/aws/emr/emrfs/auxlib/:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar

spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/:/usr/share/aws/emr/emrfs/auxlib/:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar

I tried each of the following (separately):

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')

sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

sc._jsc.addJar('/home/hadoop/hail-all-spark.jar')

sc.addPyFile('/home/hadoop/hail-python.zip')

sc._jsc.addJar('./hail-all-spark.jar')

sc.addPyFile('./hail-python.zip')

Thanks,

Adam


#19

Could you try these:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')

sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

with ./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar on the class paths?


#20

I tried putting both on the classpath but still get the same error:

spark.driver.extraClassPath /usr/lib/hadoop-lzo/lib/:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/:/usr/share/aws/emr/emrfs/auxlib/:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/:/usr/share/aws/emr/emrfs/auxlib/:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar