import_vcf() on Databricks results in NoClassDefFoundError

Hi,

I am just getting started with Hail. I set up the libraries and cluster configuration as described in the tutorial.

When I run: vds = hc.import_vcf(vcf_path), I get the following error:

java.lang.NoClassDefFoundError: Could not initialize class is.hail.driver.ToplevelCommands$
Py4JJavaError: An error occurred while calling z:is.hail.driver.ToplevelCommands.lookup.
: java.lang.NoClassDefFoundError: Could not initialize class is.hail.driver.ToplevelCommands$
at is.hail.driver.ToplevelCommands.lookup(Command.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)

Is there a mistake in my configuration? Please help.
Shiva

Hi Shiva,
This isn’t your fault – you followed the directions perfectly! There’s an issue with the Databricks library configuration, which can be resolved by following the steps in this post:

https://forums.databricks.com/questions/11391/can-i-run-hail-on-databricks.html

Also, you should know that the Databricks tutorial uses an older version of Hail, so not everything you find in the documentation will work in that build. We’ll go back and update the Databricks deployment soon.

Thank you, Tim. I will try installing it on my machine.