VEP annotation (IOException: error=13, Permission denied)


#1

Dear Hail team,

We are trying to annotate a MatrixTable using VEP. The command is as follows:
mt = hl.vep(mt, "s3://gfb-genomics/vep-configuration.json")
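
For context, the surrounding pipeline looks roughly like this (the dataset paths below are placeholders; only the config path is the real one):

import hail as hl

hl.init()

# Read the dataset, annotate with VEP, and write the result.
# 'dataset.mt' / 'dataset.vep.mt' are placeholder paths.
mt = hl.read_matrix_table("s3://gfb-genomics/dataset.mt")
mt = hl.vep(mt, "s3://gfb-genomics/vep-configuration.json")
mt.write("s3://gfb-genomics/dataset.vep.mt", overwrite=True)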

But we got the following error message:

FatalError: IOException: error=13, Permission denied

Java stack trace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 9 in stage 0.0 failed 4 times, most recent failure: Lost task 9.3 in stage 0.0 (TID 15, ip-10-66-50-177.goldfinch.lan, executor 5): java.io.IOException: Cannot run program "/vep": error=13, Permission denied

Could you please let us know how to solve this? I can paste the full error message if that's helpful.

Best regards,
Wei


#2

How did you set up VEP?


#3

What are the permissions on /vep (ls -al /vep) on every one of your worker nodes?
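
One thing worth knowing: on Linux, exec'ing a directory fails with errno 13 (Permission denied), so if /vep ends up as a plain directory rather than an executable script, you get exactly the error above. If it helps, you can inspect every worker from the driver with a quick PySpark map (a sketch, assuming Hail is already initialized; it oversubscribes small tasks so each executor runs at least one):

import hail as hl

hl.init()
sc = hl.spark_context()

def list_vep(_):
    # Runs on an executor: capture `ls -al /vep` along with the hostname.
    import socket
    import subprocess
    result = subprocess.run(['ls', '-al', '/vep'],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return socket.gethostname(), result.stdout.decode()

# Many small partitions so the tasks spread across all the executors.
for host, listing in sorted(set(sc.parallelize(range(64), 64).map(list_vep).collect())):
    print(host)
    print(listing)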


#4

Thank you for the reply! Yes, it seems to be an installation issue. We are trying to solve it.


#5

#!/usr/bin/env bash

# Copy VEP

mkdir -p /vep/homo_sapiens
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/loftee /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/ensembl-tools-release-85 /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/loftee_data /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/Plugins /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/homo_sapiens/85_GRCh38 /vep/homo_sapiens/
/usr/local/gsutil/gsutil cp gs://hail-common/vep/vep/vep85-gcloud.json /vep/vep85-gcloud.json

# legacy

/usr/local/gsutil/gsutil cp gs://hail-common/vep/vep/vep85-gcloud.properties /vep/vep-gcloud.properties

# Create symlink to vep

ln -s /vep/ensembl-tools-release-85/scripts/variant_effect_predictor /vep
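# NB: since mkdir -p above already created /vep as a directory, the ln -s above
# creates /vep/variant_effect_predictor inside it rather than replacing /vep itself.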

# Give perms

chmod -R 777 /vep

# Copy perl JSON module

/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/perl-JSON/* /usr/share/perl/5.20/

# Copy perl DBD::SQLite module

/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/perl-SQLITE/* /usr/share/perl/5.20/

# Copy htslib and samtools

/usr/local/gsutil/gsutil rsync gs://hail-common/vep/htslib /usr/bin/
/usr/local/gsutil/gsutil rsync gs://hail-common/vep/samtools /usr/bin/
chmod a+rx /usr/bin/tabix
chmod a+rx /usr/bin/bgzip
chmod a+rx /usr/bin/htsfile
chmod a+rx /usr/bin/samtools

# Run VEP on the 1-variant VCF to create the fasta.index file
# (caution: do not make the fasta.index file writable afterwards!)

/usr/local/gsutil/gsutil cp gs://hail-common/vep/vep/1var.vcf /vep
# The following is a local copy of the shell script that references GRCh38 instead of GRCh37
aws s3 cp --region=us-east-1 s3://out-bucket/hail/build/run_hail_vep85_vcf.sh /vep/
#gsutil cp gs://hail-common/vep/vep/run_hail_vep85_vcf.sh /vep
chmod a+rx /vep/run_hail_vep85_vcf.sh

/vep/run_hail_vep85_vcf.sh /vep/1var.vcf
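
After the script runs, a quick sanity check on a node that the path Hail will exec is actually an executable file (a sketch; /vep is the program path from the stack trace above):

import os

vep_path = '/vep'  # the program path from the error message
print('exists:       ', os.path.exists(vep_path))
print('regular file: ', os.path.isfile(vep_path))  # exec of a directory fails with errno 13
print('executable:   ', os.access(vep_path, os.X_OK))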


#6
[ec2-user@ip-10-66-51-124 ~]$ ls -la /vep
total 52
drwxrwxrwx  8 hadoop hadoop 4096 Jan 25 04:23 .
dr-xr-xr-x 26 root   root   4096 Jan 25 04:21 ..
-rwxrwxrwx  1 hadoop hadoop   87 Jan 25 02:13 1var.vcf
drwxrwxrwx  4 hadoop hadoop 4096 Jan 25 04:21 ensembl-tools-release-85
drwxrwxrwx  3 hadoop hadoop 4096 Jan 25 04:21 homo_sapiens
drwxrwxrwx  4 hadoop hadoop 4096 Jan 25 04:22 loftee
drwxrwxrwx  2 hadoop hadoop 4096 Jan 25 04:23 loftee_data
drwxrwxrwx  4 hadoop hadoop 4096 Jan 25 04:21 Plugins
-rwxrwxr-x  1 hadoop hadoop  571 Dec 30 02:24 run_hail_vep85_vcf.sh
drwxrwxrwx  6 hadoop hadoop 4096 Jan 25 04:23 variant_effect_predictor
-rwxrwxrwx  1 hadoop hadoop 3109 Jan 25 02:16 vep85-gcloud.json
-rwxrwxrwx  1 hadoop hadoop 1720 Jan 25 02:16 vep85-init.sh
-rwxrwxrwx  1 hadoop hadoop  389 Jan 25 02:16 vep-gcloud.properties

#7

I should also mention that we see the following in the Spark history log on the namenode:

log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /stdout (Permission denied)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
    at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:672)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:516)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:120)
    at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:108)
    at org.apache.spark.deploy.history.HistoryServer$.initializeLogIfNecessary(HistoryServer.scala:265)
    at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:99)
    at org.apache.spark.deploy.history.HistoryServer$.initializeLogIfNecessary(HistoryServer.scala:265)
    at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
    at org.apache.spark.deploy.history.HistoryServer$.log(HistoryServer.scala:265)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:271)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA-stdout].
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /stdout (Permission denied)
    ... (same stack trace as above) ...
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA-stdout].

#8

@atebbe, it looks like a logging system is misconfigured. I don't think this is related to Hail. Do you see this issue with plain PySpark jobs?
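
For example, something minimal like this (a sketch), submitted with spark-submit, should tell you whether the /stdout log4j error shows up without Hail in the picture:

from pyspark.sql import SparkSession

# Minimal job: if the log4j "/stdout (Permission denied)" error still appears,
# the problem is in the cluster's logging configuration, not in Hail.
spark = SparkSession.builder.appName('log4j-smoke-test').getOrCreate()
print(spark.sparkContext.parallelize(range(100)).sum())
spark.stop()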