Hail 0.2 class not found exception on EMR

Hi Tim,

Thanks for your continued help with this. I made the following change to my code after transferring the jar and zip files to one of our S3 buckets:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')

sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

import hail as hl

import hail.expr.aggregators as agg

hl.init(sc)

I also created /mnt/tmp in HDFS with 777 permissions.
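
For reference, that was roughly:

hdfs dfs -mkdir -p /mnt/tmp
hdfs dfs -chmod 777 /mnt/tmp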

Running the Python block above produces the following error:

Name: org.apache.toree.interpreter.broker.BrokerException

Message: Traceback (most recent call last):

  File "/tmp/kernel-PySpark-66f08197-35ee-41a5-a533-a8f67337ca3c/pyspark_runner.py", line 194, in <module>

    eval(compiled_code)

  File "<string>", line 5, in <module>

  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper

    return f(*args_, **kwargs_)

  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 158, in init

    default_reference)

  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper

    return f(*args_, **kwargs_)

  File "/mnt/tmp/spark-9a35d007-a112-411c-8df4-cbaaee260e19/userFiles-4844298c-a848-4a94-aabf-123a97a6cdf8/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 53, in __init__

    min_block_size, branching_factor, tmp_dir)

  File "/usr/lib/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__

    answer, self.gateway_client, self.target_id, self.name)

  File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco

    return f(*a, **kw)

  File "/usr/lib/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 323, in get_return_value

    format(target_id, ".", name, value))

py4j.protocol.Py4JError: An error occurred while calling z:is.hail.HailContext.apply. Trace:

 py4j.Py4JException: Method apply([class org.apache.spark.SparkContext, class java.lang.String, class scala.None$, class java.lang.String, class java.lang.String, class java.lang.Boolean, class java.lang.Boolean, class java.lang.Integer, class java.lang.Integer, class java.lang.String]) does not exist

        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)

        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:339)

        at py4j.Gateway.invoke(Gateway.java:274)

        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)

        at py4j.commands.CallCommand.execute(CallCommand.java:79)

        at py4j.GatewayConnection.run(GatewayConnection.java:214)

        at java.lang.Thread.run(Thread.java:748)

StackTrace: org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)

org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)

scala.Option.foreach(Option.scala:257)

org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)

sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

java.lang.reflect.Method.invoke(Method.java:498)

py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)

py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)

py4j.Gateway.invoke(Gateway.java:280)

py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)

py4j.commands.CallCommand.execute(CallCommand.java:79)

py4j.GatewayConnection.run(GatewayConnection.java:214)

java.lang.Thread.run(Thread.java:748)

You don’t have other Hail packages around, right? This error usually means that there’s a mismatch between the Hail Python package and the jar.
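
A quick way to check which copy of the Hail Python package you're actually importing (the path should resolve inside the zip you added, not some stray install):

import hail
print(hail.__file__)  # should point inside hail-devel-ae9e34fb3cbf.zip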

I cleaned up one copy from HDFS and one from the Hadoop user’s home directory on the master node, and now I get the following:

Name: org.apache.toree.interpreter.broker.BrokerException

Message: Traceback (most recent call last):

  File "/tmp/kernel-PySpark-c09cc1bc-fc7d-4dc6-a4ab-6b54eeed1999/pyspark_runner.py", line 194, in <module>

    eval(compiled_code)

  File "<string>", line 5, in <module>

  File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper

    return f(*args_, **kwargs_)

  File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 158, in init

    default_reference)

  File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/typecheck/check.py", line 547, in wrapper

    return f(*args_, **kwargs_)

  File "/mnt/tmp/spark-51a0a7f3-5444-4063-9680-fe6df8de0c84/userFiles-b446ce40-be84-4a01-9851-e842923ac1cd/hail-devel-ae9e34fb3cbf.zip/hail/context.py", line 53, in __init__

    min_block_size, branching_factor, tmp_dir)

TypeError: 'JavaPackage' object is not callable

StackTrace: org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)

org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)

scala.Option.foreach(Option.scala:257)

org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)

sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

java.lang.reflect.Method.invoke(Method.java:498)

py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)

py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)

py4j.Gateway.invoke(Gateway.java:280)

py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)

py4j.commands.CallCommand.execute(CallCommand.java:79)

py4j.GatewayConnection.run(GatewayConnection.java:214)

java.lang.Thread.run(Thread.java:748)

TypeError: 'JavaPackage' object is not callable normally happens when the Hail jar file is not found or fails to load. The sc.addJar documentation says it "Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.", but that means tasks (workers), not the driver. So you also need to add the jar to the driver. How are you invoking the driver? You may need to set, for example, spark.driver.extraClassPath.
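
As a sketch (the /home/hadoop path is just an example; use wherever you put the jar on the master node, appended to whatever EMR already has in that setting):

spark.driver.extraClassPath <existing EMR entries>:/home/hadoop/hail-all-spark.jar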

I have the following set in spark-defaults.conf. I tried using ./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar, and I also put the devel jar in /home/hadoop under the standard name (hail-all-spark.jar). These are the relevant settings in my spark-defaults.conf file:

spark.driver.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native

spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

bash$ ls -l /home/hadoop/

total 28108

-rw-rw-r-- 1 hadoop hadoop 27424722 Jul 2 11:52 hail-all-spark.jar

-rw-rw-r-- 1 hadoop hadoop 1196210 Jul 2 11:52 hail-python.zip

-rw-rw-r-- 1 hadoop hadoop 21055 Jul 5 21:22 Untitled.ipynb

It inits fine:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')

sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

import hail as hl

import hail.expr.aggregators as agg

hl.init(sc)

When I run the following command:

ds = hl.import_vcf('s3://gfb-registry-raw/controls/1000genomes/human/dna/vcf.b38/AIMs/ALL*.vcf.bgz', reference_genome='GRCh38')

I still get the exception:

hail.utils.java.FatalError: ClassNotFoundException: is.hail.utils.SerializableHadoopConfiguration

I noticed that if I don’t have :./hail-all-spark.jar in the paths in spark-defaults.conf, the init fails. Any other suggestions?

Thanks,

Adam

I checked the task attempt logs from YARN and found the following. It looks like the executor is downloading the jar from S3. I also put the jar in /home/hadoop on all nodes in the cluster.

18/07/07 01:39:55 INFO Executor: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip with timestamp 1530925905396
18/07/07 01:39:57 INFO S3NativeFileSystem: Opening 's3://gfb-hail/hail-devel-ae9e34fb3cbf.zip' for reading
18/07/07 01:39:57 INFO Utils: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp4806657294271057220.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/3525771591530925905396_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-devel-ae9e34fb3cbf.zip
18/07/07 01:39:57 INFO Executor: Fetching spark://10.66.117.154:39293/jars/toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar with timestamp 1530925892206
18/07/07 01:39:57 INFO TransportClientFactory: Successfully created connection to /10.66.117.154:39293 after 0 ms (0 ms spent in bootstraps)
18/07/07 01:39:57 INFO Utils: Fetching spark://10.66.117.154:39293/jars/toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp2212807853478377914.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/-10019953601530925892206_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar
18/07/07 01:39:57 INFO Executor: Adding file:/mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./toree-assembly-0.3.0.dev1-incubating-SNAPSHOT.jar to class loader
18/07/07 01:39:57 INFO Executor: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar with timestamp 1530925903635
18/07/07 01:39:57 INFO S3NativeFileSystem: Opening 's3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar' for reading
18/07/07 01:39:57 INFO Utils: Fetching s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp4183466651204311484.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/-14685433961530925903635_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar
18/07/07 01:39:57 INFO Executor: Adding file:/mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar to class loader
18/07/07 01:39:57 INFO Executor: Fetching spark://10.66.117.154:39293/jars/hail-all-spark.jar with timestamp 1530926976355
18/07/07 01:39:57 INFO Utils: Fetching spark://10.66.117.154:39293/jars/hail-all-spark.jar to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/fetchFileTemp8017925226355738872.tmp
18/07/07 01:39:57 INFO Utils: Copying /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/spark-8c7c40a4-c509-4ea3-a625-663c153c52ba/20445175021530926976355_cache to /mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-all-spark.jar
18/07/07 01:39:57 INFO Executor: Adding file:/mnt/yarn/usercache/hadoop/appcache/application_1530925830138_0001/container_1530925830138_0001_01_000020/./hail-all-spark.jar to class loader
18/07/07 01:39:58 INFO TorrentBroadcast: Started reading broadcast variable 17
18/07/07 01:39:58 INFO TransportClientFactory: Successfully created connection to /10.66.117.154:46805 after 1 ms (0 ms spent in bootstraps)
18/07/07 01:39:58 INFO MemoryStore: Block broadcast_17_piece0 stored as bytes in memory (estimated size 1940.0 B, free 2.8 GB)
18/07/07 01:39:58 INFO TorrentBroadcast: Reading broadcast variable 17 took 113 ms
18/07/07 01:39:58 INFO MemoryStore: Block broadcast_17 stored as values in memory (estimated size 4.8 KB, free 2.8 GB)
18/07/07 01:39:58 INFO TorrentBroadcast: Started reading broadcast variable 15
18/07/07 01:39:58 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes in memory (estimated size 23.6 KB, free 2.8 GB)
18/07/07 01:39:58 INFO TorrentBroadcast: Reading broadcast variable 15 took 5 ms
18/07/07 01:39:58 INFO TorrentBroadcast: Started reading broadcast variable 15
18/07/07 01:39:58 ERROR Utils: Exception encountered
com.esotericsoftware.kryo.KryoException: Error during Java deserialization.
at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:65)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:246)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$8.apply(TorrentBroadcast.scala:293)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1337)
at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:294)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:226)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at is.hail.io.vcf.LoadVCF$$anonfun$apply$3.apply(LoadVCF.scala:836)
at is.hail.io.vcf.LoadVCF$$anonfun$apply$3.apply(LoadVCF.scala:835)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:918)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:918)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: is.hail.utils.SerializableHadoopConfiguration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:685)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1865)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1748)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2039)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1570)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:430)
at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:63)
... 26 more
18/07/07 01:39:58 INFO TorrentBroadcast: Reading broadcast variable 15 took 1 ms
18/07/07 01:39:58 ERROR Utils: Exception encountered

OK, so let’s review the relevant portions of the paths:

spark.driver.extraClassPath ...:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar
spark.executor.extraClassPath ...:./hail-all-spark.jar:s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

AFAIK, the JDK has no idea what s3:// is, so adding that as a class path entry does not help.
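
If you want the jar from S3 on a class path, you first have to copy it to a local file, e.g. (a sketch reusing the bucket from this thread; the destination path is your choice):

aws s3 cp s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar /home/hadoop/hail-all-spark.jar

and then reference /home/hadoop/hail-all-spark.jar in the class path settings.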

Second point to keep in mind: relative entries in spark.driver.extraClassPath are resolved against the working directory of the driver, i.e. wherever Spark was started. Moreover, addJar and addFile do not copy the file to the working directory of the driver (only to the working directories of the executors).
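
If you're not sure what that working directory is, you can check it from the driver (plain Python, nothing Hail-specific):

import os
print(os.getcwd())  # the directory a relative entry like ./hail-all-spark.jar resolves against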

Action: Let’s remove the s3:// paths.


OK, so this works:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')
sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')
import hail as hl
import hail.expr.aggregators as agg
hl.init(sc)

This means that we’re successfully starting a Hail context on the driver node. Ergo, Spark is finding the Hail JAR on the driver.


This fails:

ds = hl.import_vcf('s3://gfb-registry-raw/controls/1000genomes/human/dna/vcf.b38/AIMs/ALL*.vcf.bgz',
                   reference_genome='GRCh38')

because there is no Hail JAR on the executor class path. Recall:

spark.executor.extraClassPath ...:./hail-all-spark.jar

There is a jar on the worker nodes. It’s named hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar. Instead of adding the JAR from s3://, can you try sc.addJar('./hail-all-spark.jar')?

Hi Dan,

Thank you for following up on this thread – I really appreciate your help. Unfortunately, I’m still running into the same error.

I launched a new cluster with a cleaned-up spark-defaults.conf:

spark.driver.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar

spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar

I tried each of the following (separately):

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')

sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

sc._jsc.addJar('/home/hadoop/hail-all-spark.jar')

sc.addPyFile('/home/hadoop/hail-python.zip')

sc._jsc.addJar('./hail-all-spark.jar')

sc.addPyFile('./hail-python.zip')

Thanks,

Adam

Could you try these:

sc._jsc.addJar('s3://gfb-hail/hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar')

sc.addPyFile('s3://gfb-hail/hail-devel-ae9e34fb3cbf.zip')

with ./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar on the class paths?

I tried putting both on the classpath but still get the same error:

spark.driver.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:./hail-all-spark.jar:./hail-devel-ae9e34fb3cbf-Spark-2.2.0.jar

Could you share the Hail log from one of these failing runs? I’d like to see what Spark thinks it’s copying and where it is putting the jars.

I don’t see how to attach anything other than an image here, so I dumped stdout and stderr from one of the task attempt logs on S3.

Stderr:
https://gfb-external-access.s3.amazonaws.com/stderr.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJABOW7QPSUAFJBZA/20180713/us-east-1/s3/aws4_request&X-Amz-Date=20180713T165750Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=8800faa5de50d3402077e4877513cbae38a40441adb178455aae1109b1c7d604

Stdout:
https://gfb-external-access.s3.amazonaws.com/stdout.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJABOW7QPSUAFJBZA/20180713/us-east-1/s3/aws4_request&X-Amz-Date=20180713T165815Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=f604a2571be9a8d2dddc1c6685f17e62c9dc799b1dae07f80b140ab6dadcbde5

We were able to get past this by manually distributing the jar to all of the cluster nodes and adding the absolute path to the jar to the classpath variables in spark-defaults.conf.
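
Roughly (hostnames are placeholders), something like:

for host in node1 node2 node3; do scp /home/hadoop/hail-all-spark.jar hadoop@$host:/home/hadoop/; done

plus, in spark-defaults.conf on every node:

spark.driver.extraClassPath <existing entries>:/home/hadoop/hail-all-spark.jar
spark.executor.extraClassPath <existing entries>:/home/hadoop/hail-all-spark.jar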

That’s pretty horrible. Maybe we should talk to the Spark people about this stuff?

Things like addJar not being exposed in Python, not doing the right thing for s3 paths, etc.

Hi Tim,

Is there a way to compile our own files with gradlew so we can get the latest Hail 0.2 version using Spark 2.3.0?

I used your .jar and .zip files (version ae9e34fb3cbf) and they do work on emr-5.13.0 with Spark 2.3.0; we just want to get Hail with the latest updates.

Thanks,

Carlos

It’s totally possible to compile your own! I haven’t done it in a while (since making that .jar and .zip) so I could be wrong about specifics, but all you need to do is pass versions for Spark, Breeze, and py4j:

./gradlew -Dspark.version=2.3.0 -Dbreeze.version=0.13.2 -Dpy4j.version=0.10.6 shadowJar archiveZip

I just looked up the Breeze / py4j versions for Spark 2.3.0, so these should be correct.
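
If I remember right, the outputs land in build/libs/hail-all-spark.jar (from shadowJar) and build/distributions/hail-python.zip (from archiveZip); those are the two files to ship to the cluster in place of mine.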

Also note that you’ll need to compile on the same OS that the EMR VMs are using.

Thanks, Tim. I’ll give it a shot.

It worked, np. Thanks Tim!