Java error when trying to write a VDS

Hi there,

I’m trying to get familiar with Hail and am running a few basic commands to get my feet wet.

I have been able to import_vcf(), report(), and count() the VDS. However, when I try to write out the VDS I get a java.FatalError.
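
For reference, W_06_hailImport.py is more or less just the following (a rough sketch; the exact calls and the VCF path may differ slightly from what I actually ran):

from hail import HailContext

hc = HailContext()

# Only the file name appears in the log below; the directory prefix is omitted here.
vds = hc.import_vcf('all_17102017_vqsr.normalized.Y.vcf.bgz')
hc.report()

# Sample and variant counts.
print(vds.count())

vds.write('/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds')

And here is the console output from the spark-submit run: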

[eddip@victorchang.edu.au@vclefkas01 /short/software 16:59:01 j:0] $ spark-submit /mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/W_06_hailImport.py
Running on Apache Spark version 2.1.0
SparkUI available at http://129.94.111.229:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.1-69ec30f
[Stage 0:============================================>              (3 + 1) / 4]2017-10-25 16:59:57 Hail: INFO: Multiallelic variants detected. Some methods require splitting or filtering multiallelics first.
2017-10-25 16:59:58 Hail: INFO: Coerced sorted dataset
2017-10-25 16:59:58 Hail: INFO: while importing:
    all_17102017_vqsr.normalized.Y.vcf.bgz  import clean
[Stage 2:============================================>              (3 + 1) / 4]364 75577
Traceback (most recent call last):
  File "/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/W_06_hailImport.py", line 10, in <module>
    vds.write('/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds') #
  File "<decorator-gen-285>", line 2, in write
  File "/short/software/hail/python/hail/java.py", line 121, in handle_py4j
    'Error summary: %s' % (deepest, full, Env.hc().version, deepest))
hail.java.FatalError

I saw a few other posts about write errors, and there were suggestions of configuration additions to the spark-submit call, but they didn’t work for me.

Hope you can help.
Thanks.
Eddie

Is this the full error message? If so, we need to first fix whatever is causing the real message to get dropped.

Looking a bit closer, I bet that what’s happened is the Hail jar isn’t visible to the executors. I definitely want to see the rest of the error message if it exists though!

Hi Tim,

That’s the only error message I see on the console. However, I found the hail.log, and within it I see an error about changing permissions:

The command: vds.write('/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds')
The error in the hail.log:

2017-10-26 10:42:20 root: ERROR: ExitCodeException: chmod: changing permissions of ‘/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds/metadata.json.gz’: Operation not permitted

From org.apache.hadoop.util.Shell$ExitCodeException: chmod: changing permissions of ‘/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds/metadata.json.gz’: Operation not permitted

	at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
	at org.apache.hadoop.util.Shell.run(Shell.java:479)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
	at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
	at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
	at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
	at is.hail.utils.richUtils.RichHadoopConfiguration$.is$hail$utils$richUtils$RichHadoopConfiguration$$create$extension(RichHadoopConfiguration.scala:22)
	at is.hail.utils.richUtils.RichHadoopConfiguration$.writeTextFile$extension(RichHadoopConfiguration.scala:243)
	at is.hail.variant.VariantSampleMatrix.writeMetadata(VariantSampleMatrix.scala:2118)
	at is.hail.variant.VariantDatasetFunctions$.write$extension(VariantDataset.scala:730)
	at is.hail.variant.VariantDatasetFunctions.write(VariantDataset.scala:722)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:280)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:214)
	at java.lang.Thread.run(Thread.java:745)

I changed the write location to a local directory on my system, and the write now completes with no error; the VDS folder is created.
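
The working call now looks roughly like this (the local path below is just illustrative, not the real one):

# Writing to a local (non-SMB-mounted) directory succeeds.
vds.write('/home/eddip/hail_data/all_17102017_vqsr.normalized.vds')

# The new VDS can also be read back in.
vds_check = hc.read('/home/eddip/hail_data/all_17102017_vqsr.normalized.vds')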

So I’m now able to write out my VDS.

Eddie

Well, we’ve never seen this error before. I’m also confused about how it didn’t bubble back up into the Python error message.

It definitely looks like the user that Spark executes as doesn’t have permission to write to:

/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/

What are the permissions on this folder?

Hi Dan,

[eddip@victorchang.edu.au@vclefkas01 /mnt/smb/giannoulatou_lab/Eddie_Ip/working/software 09:05:37 j:0] $ ls -l
total 240640
-rwxrwxrwx 1 root root  29797459 Oct 23 18:22 Hail-0.1-69ec30fa6c74-Spark-2.1.0.zip
drwxrwxrwx 2 root root         0 Oct 26 10:46 hail_data
-rwxrwxrwx 1 root root  19741785 Oct 24 12:09 scala-2.12.4.tgz
-rwxrwxrwx 1 root root 195636829 Oct 23 18:27 spark-2.1.0-bin-hadoop2.7.tgz

The write does create a folder "all_17102017_vqsr.normalized.vds", and I can see an empty "metadata.json.gz" within the VDS folder.

Eddie

Everything in that listing, including the hail_data directory, is owned by root. According to the chmod man page:

Only the owner of a file or the super-user is permitted to change the mode of a file.

I’m fairly confident that your Spark workers are not running as the root user (running them as root is not a good idea anyway). Those directories should really be owned by whichever user your Spark jobs run as. I think that’s usually your own username (you can check it with whoami). If your username doesn’t work, ask whoever administers your Spark cluster which user Spark jobs run as.
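
As a quick check, something along these lines (run on the machine doing the write, as the user that launches the job) will show whether the directory owner and the running user actually match; the path is copied from your log, and this is just an illustrative snippet:

import getpass
import os
import pwd

# Directory from the chmod error above.
target = '/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data'

st = os.stat(target)
print('directory owner: ' + pwd.getpwuid(st.st_uid).pw_name)
print('running as:      ' + getpass.getuser())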


Aside: Spark really shouldn’t set the file mode like this. Unfortunately, Spark does not make this easy for us to change. I’ll look into mitigation strategies. Sorry about this.

Thanks for your response Dan.

Eddie