VEP annotation (IOException: error=13, Permission denied)

Dear Hail team,

We are trying to annotate a MatrixTable using VEP. The command is as follows:
mt = hl.vep(mt, "s3://gfb-genomics/vep-configuration.json")

But we got the following error message:

FatalError: IOException: error=13, Permission denied

Java stack trace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 9 in stage 0.0 failed 4 times, most recent failure: Lost task 9.3 in stage 0.0 (TID 15, ip-10-66-50-177.goldfinch.lan, executor 5): java.io.IOException: Cannot run program "/vep": error=13, Permission denied

Could you please let us know how to solve this? I can paste the full error message if that’s helpful.

Best regards,
Wei

How did you set up VEP?

What are the permissions on /vep (ls -al /vep) on every one of your worker nodes?
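
If it helps, something like this run from the master node will check every worker in one go (a sketch: it assumes yarn is on the PATH and that passwordless SSH to the workers is set up):

# List the RUNNING YARN workers, then show the permissions on /vep on each.
for host in $(yarn node -list 2>/dev/null | awk -F: '/RUNNING/ {print $1}'); do
  echo "== $host =="
  ssh -o StrictHostKeyChecking=no "$host" 'ls -al /vep'
done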

Thank you for the reply! Yes, it looks like an installation issue; we are trying to solve it. Here is the setup script we use:

#!/usr/bin/env bash

# Copy VEP

mkdir -p /vep/homo_sapiens
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/loftee /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/ensembl-tools-release-85 /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/loftee_data /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/Plugins /vep/
/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/vep/homo_sapiens/85_GRCh38 /vep/homo_sapiens/
/usr/local/gsutil/gsutil cp gs://hail-common/vep/vep/vep85-gcloud.json /vep/vep85-gcloud.json

# legacy

/usr/local/gsutil/gsutil cp gs://hail-common/vep/vep/vep85-gcloud.properties /vep/vep-gcloud.properties

# Create symlink to vep

ln -s /vep/ensembl-tools-release-85/scripts/variant_effect_predictor /vep
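# NB: /vep already exists as a directory at this point, so the ln above creates
# the symlink /vep/variant_effect_predictor rather than replacing /vep itself.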

# Give perms

chmod -R 777 /vep

# Copy perl JSON module

/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/perl-JSON/* /usr/share/perl/5.20/

# Copy perl DBD::SQLite module

/usr/local/gsutil/gsutil cp -r gs://hail-common/vep/perl-SQLITE/* /usr/share/perl/5.20/

# Copy htslib and samtools

/usr/local/gsutil/gsutil rsync gs://hail-common/vep/htslib /usr/bin/
/usr/local/gsutil/gsutil rsync gs://hail-common/vep/samtools /usr/bin/
chmod a+rx /usr/bin/tabix
chmod a+rx /usr/bin/bgzip
chmod a+rx /usr/bin/htsfile
chmod a+rx /usr/bin/samtools

# Run VEP on the 1-variant VCF to create the fasta.index file (caution: do not make the fasta.index file writeable afterwards!)

/usr/local/gsutil/gsutil cp gs://hail-common/vep/vep/1var.vcf /vep
# The following is a local copy of the shell script that references GRCh38 instead of GRCh37
aws s3 cp --region=us-east-1 s3://out-bucket/hail/build/run_hail_vep85_vcf.sh /vep/
#gsutil cp gs://hail-common/vep/vep/run_hail_vep85_vcf.sh /vep
chmod a+rx /vep/run_hail_vep85_vcf.sh

/vep/run_hail_vep85_vcf.sh /vep/1var.vcf
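
To confirm what the path Hail execs actually resolves to on a node (a diagnostic sketch; readlink and file are standard coreutils — a directory at /vep would fail with exactly this errno 13):

# /vep must resolve to an executable file, not a directory:
# exec-ing a directory fails with errno 13, i.e. "Permission denied".
ls -ld /vep
readlink -f /vep
file "$(readlink -f /vep)"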

[ec2-user@ip-10-66-51-124 ~]$ ls -la /vep
total 52
drwxrwxrwx  8 hadoop hadoop 4096 Jan 25 04:23 .
dr-xr-xr-x 26 root   root   4096 Jan 25 04:21 ..
-rwxrwxrwx  1 hadoop hadoop   87 Jan 25 02:13 1var.vcf
drwxrwxrwx  4 hadoop hadoop 4096 Jan 25 04:21 ensembl-tools-release-85
drwxrwxrwx  3 hadoop hadoop 4096 Jan 25 04:21 homo_sapiens
drwxrwxrwx  4 hadoop hadoop 4096 Jan 25 04:22 loftee
drwxrwxrwx  2 hadoop hadoop 4096 Jan 25 04:23 loftee_data
drwxrwxrwx  4 hadoop hadoop 4096 Jan 25 04:21 Plugins
-rwxrwxr-x  1 hadoop hadoop  571 Dec 30 02:24 run_hail_vep85_vcf.sh
drwxrwxrwx  6 hadoop hadoop 4096 Jan 25 04:23 variant_effect_predictor
-rwxrwxrwx  1 hadoop hadoop 3109 Jan 25 02:16 vep85-gcloud.json
-rwxrwxrwx  1 hadoop hadoop 1720 Jan 25 02:16 vep85-init.sh
-rwxrwxrwx  1 hadoop hadoop  389 Jan 25 02:16 vep-gcloud.properties

I should also mention that we see the following in the Spark history server log on the name node:

log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /stdout (Permission denied)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
    at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:672)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:516)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:120)
    at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:108)
    at org.apache.spark.deploy.history.HistoryServer$.initializeLogIfNecessary(HistoryServer.scala:265)
    at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:99)
    at org.apache.spark.deploy.history.HistoryServer$.initializeLogIfNecessary(HistoryServer.scala:265)
    at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
    at org.apache.spark.deploy.history.HistoryServer$.log(HistoryServer.scala:265)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:271)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA-stdout].
log4j:ERROR setFile(null,true) call failed.
    (the same java.io.FileNotFoundException: /stdout (Permission denied) stack trace repeats here)
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA-stdout].

@atebbe, it looks like there’s a log system that’s misconfigured? I don’t think this is related to Hail. Do you see this issue with plain pyspark jobs?
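
A quick smoke test from the master node (assuming the stock EMR example script is present at this path):

spark-submit --master yarn /usr/lib/spark/examples/src/main/python/pi.py 10

If that also trips the log4j appender error, the logging config is the culprit rather than Hail.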

Hi Dan,

We launched a new EMR cluster using last week’s EMR release, which fixed a bunch of log4j issues. However, we are continuing to see issues when using VEP. Have you seen this before, or do you have any suggestions?

2019-03-01 16:45:22 DAGScheduler: INFO: ResultStage 0 (collect at RVD.scala:622) failed in 12.999 s due to Job aborted due to stage failure: Task 10 in stage 0.0 failed 4 times, most recent failure: Lost task 10.3 in stage 0.0 (TID 17, ip-10-66-51-150.goldfinch.lan, executor 6): java.io.IOException: Cannot run program "/vep": error=13, Permission denied
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at is.hail.utils.richUtils.RichIterator$.pipe$extension(RichIterator.scala:47)
at is.hail.methods.VEP$$anonfun$7$$anonfun$apply$4.apply(VEP.scala:159)
at is.hail.methods.VEP$$anonfun$7$$anonfun$apply$4.apply(VEP.scala:155)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:220)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:298)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1165)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at is.hail.sparkextras.ContextRDD.iterator(ContextRDD.scala:599)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:60)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:59)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$18.hasNext(Iterator.scala:762)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at is.hail.io.RichContextRDDRegionValue$$anonfun$boundary$extension$1$$anon$1.hasNext(RowStore.scala:1606)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:1002)
at is.hail.utils.richUtils.RichIterator$$anon$5.isValid(RichIterator.scala:22)
at is.hail.utils.StagingIterator.isValid(FlipbookIterator.scala:48)
at is.hail.utils.FlipbookIterator$$anon$9.setValue(FlipbookIterator.scala:331)
at is.hail.utils.FlipbookIterator$$anon$9.<init>(FlipbookIterator.scala:344)
at is.hail.utils.FlipbookIterator.leftJoinDistinct(FlipbookIterator.scala:323)
at is.hail.annotations.OrderedRVIterator.leftJoinDistinct(OrderedRVIterator.scala:62)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:98)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:95)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$36.apply(ContextRDD.scala:469)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$36.apply(ContextRDD.scala:469)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$32$$anonfun$apply$33.apply(ContextRDD.scala:422)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$32$$anonfun$apply$33.apply(ContextRDD.scala:422)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:390)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at scala.collection.AbstractIterator.to(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1334)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: error=13, Permission denied
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
… 91 more

Driver stacktrace:
2019-03-01 16:45:22 DAGScheduler: INFO: Job 0 failed: collect at RVD.scala:622, took 13.086885 s
2019-03-01 16:45:22 root: ERROR: IOException: error=13, Permission denied
From org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 0.0 failed 4 times, most recent failure: Lost task 10.3 in stage 0.0 (TID 17, ip-10-66-51-150.goldfinch.lan, executor 6): java.io.IOException: Cannot run program "/vep": error=13, Permission denied
    (the same executor stack trace as above, ending in Caused by: java.io.IOException: error=13, Permission denied)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:2039)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2027)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2026)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2026)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:966)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2260)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2209)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2198)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:777)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:945)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.collect(RDD.scala:944)
at is.hail.rvd.RVD.collectAsBytes(RVD.scala:622)
at is.hail.rvd.RVD.collect(RVD.scala:605)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:768)
at is.hail.expr.ir.Interpret$.is$hail$expr$ir$Interpret$$interpret$1(Interpret.scala:102)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:659)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:93)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:63)
at is.hail.expr.ir.Interpret$.interpretJSON(Interpret.scala:22)
at is.hail.expr.ir.Interpret.interpretJSON(Interpret.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)

(the java.io.IOException: Cannot run program "/vep" trace and its Caused by: java.io.IOException: error=13 trace repeat here, identical to the ones above)

I am wondering if the original posters were running into the same thing. Were you both also attempting to run VEP on an AWS EMR cluster?

For anyone else who, like me, Googled this: I had the same problem, and I ended up having to change vep85-gcloud.json to point directly at variant_effect_predictor.pl rather than relying on symlinks or relative paths:

{"command": [
    "/usr/bin/perl",
    "/vep/variant_effect_predictor/variant_effect_predictor.pl",
    "--format", "vcf",
    "__OUTPUT_FORMAT_FLAG__",
    "--everything",
    "--allele_number",
    "--no_stats",
    "--cache", "--offline",
    "--dir", "/vep",
    "--fasta", "/vep/homo_sapiens/85_GRCh37/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa",
    "--minimal",
    "--assembly", "GRCh37",
    "--plugin", "LoF,human_ancestor_fa:/vep/loftee_data/human_ancestor.fa.gz,filter_position:0.05,min_intron_size:15,conservation_file:/vep/loftee_data/phylocsf_gerp.sql",
    "-o", "STDOUT"
],
 "env": {
     "PERL5LIB": "/vep/loftee"
 },

I am not sure what the issue was - I got the impression from the Hail source that VEP would automatically try to find variant_effect_predictor.pl in the supplied folder, and that I wouldn’t need to point to it directly. Perhaps I was misreading that, and /vep is actually expected to be a symlink to variant_effect_predictor.pl. But after I made this change (and a few other config changes) I was able to run on EMR.
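
One check that helped me (a sketch using the paths from the JSON above): run the wrapped command by hand on a worker node, as the same user the Spark executors run as, before invoking hl.vep:

/usr/bin/perl /vep/variant_effect_predictor/variant_effect_predictor.pl \
    --format vcf --cache --offline --dir /vep --assembly GRCh37 \
    -i /vep/1var.vcf -o STDOUT --no_stats

If that runs cleanly under the hadoop/yarn user, the fork from Hail should succeed as well.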

Adding one note here now that I’m no longer behind on annotating variants :slight_smile:

I believe the Java “permission denied” errors on a Spark cluster are extremely ambiguous and sometimes misleading: often the real problem is that the path does not point at an executable file at all (a wrong path, a directory, a broken symlink) rather than something you need to go chmod.
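
A minimal demonstration of how misleading the errno can be (perl is used here only to surface the raw error string; /tmp stands in for any directory):

# exec-ing a directory reports errno 13, "Permission denied",
# even when its mode is wide open:
perl -e 'exec "/tmp"; print "exec failed: $!\n"'
# ...whereas a genuinely missing file reports errno 2:
perl -e 'exec "/no/such/file"; print "exec failed: $!\n"'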

I am still a bit confused about why I needed to do this - I wonder whether there are subtle differences in how symlinks are handled between Debian-based systems (Ubuntu) and CentOS-like ones (Amazon Linux) that were confusing Hail. I am pretty sure this worked out of the box when I was testing locally on macOS.

Are you both on Amazon Linux 2? Did you build Hail on the cluster as part of the deployment, or did you use a prebuilt JAR? I built Hail, but I did not build VEP from source; perhaps I should. Anyway, @atebbe, if editing your vep85 JSON to explicitly specify the variant_effect_predictor.pl location fixes your problem, that would be very useful to know for my team and me in the future.