Java error when trying to write a VDS

Hi there,

I’m trying to get familiar with Hail and am running a few basic commands to get my feet wet.

I have been able to import_vcf(), report(), and count() the VDS. However, when I try to write out the VDS I get a java.FatalError.
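
For reference, W_06_hailImport.py is more or less just the following (a rough sketch; the exact calls and the VCF path may differ slightly from what I actually ran):

from hail import HailContext

hc = HailContext()

# Only the file name appears in the log below; the directory prefix is omitted here.
vds = hc.import_vcf('all_17102017_vqsr.normalized.Y.vcf.bgz')
hc.report()

# Sample and variant counts.
print(vds.count())

vds.write('/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds')

And here is the console output from the spark-submit run: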

[eddip@victorchang.edu.au@vclefkas01 /short/software 16:59:01 j:0] $ spark-submit /mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/W_06_hailImport.py
Running on Apache Spark version 2.1.0
SparkUI available at http://129.94.111.229:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.1-69ec30f
[Stage 0:============================================>              (3 + 1) / 4]2017-10-25 16:59:57 Hail: INFO: Multiallelic variants detected. Some methods require splitting or filtering multiallelics first.
2017-10-25 16:59:58 Hail: INFO: Coerced sorted dataset
2017-10-25 16:59:58 Hail: INFO: while importing:
    all_17102017_vqsr.normalized.Y.vcf.bgz  import clean
[Stage 2:============================================>              (3 + 1) / 4]364 75577
Traceback (most recent call last):
  File "/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/W_06_hailImport.py", line 10, in <module>
    vds.write('/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds') #
  File "<decorator-gen-285>", line 2, in write
  File "/short/software/hail/python/hail/java.py", line 121, in handle_py4j
    'Error summary: %s' % (deepest, full, Env.hc().version, deepest))
hail.java.FatalError

I saw a few other posts about write errors, and there were suggestions of configuration additions to the spark-submit call, but they didn’t work for me.

Hope you can help.
Thanks.
Eddie

Is this the full error message? If so, we need to first fix whatever is causing the real message to get dropped.

Looking a bit closer, I bet that what’s happened is the Hail jar isn’t visible to the executors. I definitely want to see the rest of the error message if it exists though!

Hi Tim,

That’s the only error message I see on the console. However, I found the hail.log, and within it I see an error about changing permissions:

The command: vds.write('/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds')
The error in the hail.log:

2017-10-26 10:42:20 root: ERROR: ExitCodeException: chmod: changing permissions of ‘/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds/metadata.json.gz’: Operation not permitted

From org.apache.hadoop.util.Shell$ExitCodeException: chmod: changing permissions of ‘/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/all_17102017_vqsr.normalized.vds/metadata.json.gz’: Operation not permitted

	at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
	at org.apache.hadoop.util.Shell.run(Shell.java:479)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
	at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
	at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
	at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
	at is.hail.utils.richUtils.RichHadoopConfiguration$.is$hail$utils$richUtils$RichHadoopConfiguration$$create$extension(RichHadoopConfiguration.scala:22)
	at is.hail.utils.richUtils.RichHadoopConfiguration$.writeTextFile$extension(RichHadoopConfiguration.scala:243)
	at is.hail.variant.VariantSampleMatrix.writeMetadata(VariantSampleMatrix.scala:2118)
	at is.hail.variant.VariantDatasetFunctions$.write$extension(VariantDataset.scala:730)
	at is.hail.variant.VariantDatasetFunctions.write(VariantDataset.scala:722)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:280)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:214)
	at java.lang.Thread.run(Thread.java:745)

I changed the write location to a local directory on my system, and the write now completes with no error; the VDS folder is created.
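
The working call now looks roughly like this (the local path below is just illustrative, not the real one):

# Writing to a local (non-SMB-mounted) directory succeeds.
vds.write('/home/eddip/hail_data/all_17102017_vqsr.normalized.vds')

# The new VDS can also be read back in.
vds_check = hc.read('/home/eddip/hail_data/all_17102017_vqsr.normalized.vds')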

So I’m now able to write out my VDS.

Eddie

Well, we’ve never seen this error before. I’m also confused about how it didn’t bubble back up into the Python error message.

It definitely looks like the user that Spark executes as doesn’t have permission to write to:

/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data/

What are the permissions on this folder?

Hi Dan,

[eddip@victorchang.edu.au@vclefkas01 /mnt/smb/giannoulatou_lab/Eddie_Ip/working/software 09:05:37 j:0] $ ls -l
total 240640
-rwxrwxrwx 1 root root  29797459 Oct 23 18:22 Hail-0.1-69ec30fa6c74-Spark-2.1.0.zip
drwxrwxrwx 2 root root         0 Oct 26 10:46 hail_data
-rwxrwxrwx 1 root root  19741785 Oct 24 12:09 scala-2.12.4.tgz
-rwxrwxrwx 1 root root 195636829 Oct 23 18:27 spark-2.1.0-bin-hadoop2.7.tgz

The write does create a folder "all_17102017_vqsr.normalized.vds", and I can see an empty "metadata.json.gz" within the VDS folder.

Eddie

Everything in that listing, including the hail_data directory, is owned by root. According to the chmod man page:

Only the owner of a file or the super-user is permitted to change the mode of a file.

I’m fairly confident that your Spark workers are not running as the root user (running them as root is not a good idea anyway). Those directories should really be owned by whichever user your Spark jobs run as. I think that’s usually your own username (you can check it with whoami). If your username doesn’t work, ask whoever administers your Spark cluster which user Spark jobs run as.
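
As a quick check, something along these lines (run on the machine doing the write, as the user that launches the job) will show whether the directory owner and the running user actually match; the path is copied from your log, and this is just an illustrative snippet:

import getpass
import os
import pwd

# Directory from the chmod error above.
target = '/mnt/smb/giannoulatou_lab/Eddie_Ip/working/software/hail_data'

st = os.stat(target)
print('directory owner: ' + pwd.getpwuid(st.st_uid).pw_name)
print('running as:      ' + getpass.getuser())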


Aside: Spark really shouldn’t set the file mode like this. Unfortunately, Spark does not make this easy for us to change. I’ll look into mitigation strategies. Sorry about this.

Thanks for your response Dan.

Eddie