While working through ‘Getting Started’, I am trying to import the included sample.vcf into Hail’s .vds format, but when I run the command in the Python interpreter I get this error:
[cloudera@quickstart hail]$ python2.7
Python 2.7.13 (default, Mar 28 2017, 10:26:54)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
from hail import *
hc = HailContext()
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
hail: info: SparkUI:
hail: warning: `src/test/resources/sample.vcf' refers to no files
Traceback (most recent call last):
File "", line 1, in
File "", line 2, in import_vcf
File "/home/cloudera/hail/python/hail/", line 110, in handle_py4j
raise FatalError(msg) arguments refer to no files
I cannot get beyond that without running into more errors. Any thoughts?
Hi Michael,
I’ve seen this kind of error before on systems with HDFS installed. HDFS is then the default file system in Spark, so the relative path src/test/resources/sample.vcf is looked up in your HDFS home directory rather than on the local disk.
If this is the case, try importing the file by using file:// followed by the fully qualified path to sample.vcf.
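As a rough sketch of that suggestion, the snippet below turns a relative path into a file:// URI. The helper name local_file_uri is made up for illustration and is not part of Hail. It assumes a Linux-style path and that the interpreter was started from the Hail checkout directory. The Hail call itself is left as a comment so the snippet runs on its own.

```python
import os


def local_file_uri(path):
    """Return a file:// URI for a path on the local disk.

    Hypothetical helper for illustration; not part of Hail.
    os.path.abspath resolves the path against the current directory.
    """
    return "file://" + os.path.abspath(path)


uri = local_file_uri("src/test/resources/sample.vcf")
print(uri)

# With the HailContext from the session above, the import would then
# be attempted with the explicit local URI instead of the bare path:
# vds = hc.import_vcf(uri)
```

The explicit file:// scheme tells Spark to read from the local file system instead of the default one.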
Traceback (most recent call last):
File "", line 1, in
File "", line 2, in import_vcf
File "/home/cloudera/hail/python/hail/", line 110, in handle_py4j
raise FatalError(msg) UnsupportedClassVersionError: is/hail/io/compress/BGzipCodec : Unsupported major.minor version 52.0
Class file version 52.0 corresponds to Java 8, so this error means the Hail classes are being loaded by an older JVM. Please see this Stack Overflow post about setting the JDK version for a Spark cluster. Setting JAVA_HOME only changes the JDK on the driver node (the node on which you’ve set JAVA_HOME). You must also set the appropriate JDK for all of the worker nodes. This can be done with a spark2-submit --conf parameter:
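A minimal sketch of what such parameters can look like, assuming a YARN cluster and a Java 8 JDK installed at the same location on every node. The path /usr/java/jdk1.8.0 and the helper java_home_conf are placeholders for illustration. spark.executorEnv.JAVA_HOME and spark.yarn.appMasterEnv.JAVA_HOME are standard Spark properties for passing environment variables to the executors and the YARN application master.

```python
def java_home_conf(java_home):
    """Build --conf arguments that set JAVA_HOME for the Spark executors
    and for the YARN application master.

    Hypothetical helper for illustration only.
    """
    props = [
        ("spark.executorEnv.JAVA_HOME", java_home),
        ("spark.yarn.appMasterEnv.JAVA_HOME", java_home),
    ]
    args = []
    for key, value in props:
        args += ["--conf", "{}={}".format(key, value)]
    return args


# Placeholder path; replace with the Java 8 location on your nodes.
JAVA8_HOME = "/usr/java/jdk1.8.0"

# Print the start of the command line; the application arguments
# would follow the --conf flags.
print(" ".join(["spark2-submit"] + java_home_conf(JAVA8_HOME)))
```

These settings only help if a JDK really exists at that path on each worker node.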