Can't write out VEP-annotated VCF file

I think the Jupyter visual issue is unrelated.

Can you paste the error message now? The code change I linked above should have added some extra output from VEP.

2019-03-29 13:37:19 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:37:22 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:38:57 Hail: INFO: Coerced sorted dataset
2019-03-29 13:39:00 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:39:03 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:40:45 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:40:46 Hail: INFO: Coerced sorted dataset
2019-03-29 13:40:46 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:40:49 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:42:15 Hail: INFO: Coerced sorted dataset
2019-03-29 13:42:18 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:42:21 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:43:49 Hail: INFO: Coerced sorted dataset
2019-03-29 13:43:50 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:43:50 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:43:53 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:45:18 Hail: INFO: Coerced sorted dataset
2019-03-29 13:45:20 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:45:23 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:46:52 Hail: INFO: Coerced sorted dataset
2019-03-29 13:46:53 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:46:53 Hail: INFO: Ordering unsorted dataset with network shuffle
2019-03-29 13:46:54 Hail: INFO: Coerced sorted dataset


FatalError                                Traceback (most recent call last)
<ipython-input-...> in <module>
----> 1 vcf1_ann.vep.show(5)

<decorator-gen-...> in show(self, n, width, truncate, types, handler)

~/programs/anaconda2/envs/hail/opt/hail/python/hail/typecheck/check.py in wrapper(__original_func, *args, **kwargs)
    558 def wrapper(__original_func, *args, **kwargs):
    559     args_, kwargs_ = check_all(__original_func, args, kwargs, checkers, is_method=is_method)
--> 560     return __original_func(*args_, **kwargs_)
    561
    562 return wrapper

~/programs/anaconda2/envs/hail/opt/hail/python/hail/expr/expressions/base_expression.py in show(self, n, width, truncate, types, handler)
    686             Print an extra header line with the type of each field.
    687         """
--> 688         handler(self._show(n, width, truncate, types))
    689
    690     def _show(self, n=10, width=90, truncate=None, types=True):

~/programs/anaconda2/envs/hail/opt/hail/python/hail/expr/expressions/base_expression.py in _show(self, n, width, truncate, types)
    712         if name in t.key:
    713             t = t.key_by(name).select()
--> 714         return t._show(n, width, truncate, types)
715
716

~/programs/anaconda2/envs/hail/opt/hail/python/hail/table.py in _show(self, n, width, truncate, types)
   1190
   1191     def _show(self, n=10, width=90, truncate=None, types=True):
--> 1192         return self._jt.showString(n, joption(truncate), types, width)
   1193
   1194     def index(self, *exprs):

~/programs/anaconda2/envs/hail/opt/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1131         answer = self.gateway_client.send_command(command)
   1132         return_value = get_return_value(
--> 1133             answer, self.gateway_client, self.target_id, self.name)
   1134
   1135         for temp_arg in temp_args:

~/programs/anaconda2/envs/hail/opt/hail/python/hail/utils/java.py in deco(*args, **kwargs)
    208             raise FatalError('%s\n\nJava stack trace:\n%s\n'
    209                              'Hail version: %s\n'
--> 210                              'Error summary: %s' % (deepest, full, hail.version, deepest)) from None
    211     except pyspark.sql.utils.CapturedException as e:
    212         raise FatalError('%s\n\nJava stack trace:\n%s\n'

FatalError: HailException: vep command 'vep --format vcf --json --everything --allele_number --no_stats --cache --offline --minimal --assembly GRCh37 --plugin LoF,human_ancestor_fa:/home/imbhs/src/VEP/.vep/pluginData/human_ancestor.fa.gz,filter_position:0.05,min_intron_size:15,conservation_file:/home/imbhs/src/VEP/.vep/pluginData/phylocsf.sql -o STDOUT' failed with non-zero exit status 2

Java stack trace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 740.0 failed 4 times, most recent failure: Lost task 0.3 in stage 740.0 (TID 82654, 10.222.5.30, executor 2): is.hail.utils.HailException: vep command 'vep --format vcf --json --everything --allele_number --no_stats --cache --offline --minimal --assembly GRCh37 --plugin LoF,human_ancestor_fa:/home/imbhs/src/VEP/.vep/pluginData/human_ancestor.fa.gz,filter_position:0.05,min_intron_size:15,conservation_file:/home/imbhs/src/VEP/.vep/pluginData/phylocsf.sql -o STDOUT' failed with non-zero exit status 2
at is.hail.utils.ErrorHandling$class.fatal(ErrorHandling.scala:9)
at is.hail.utils.package$.fatal(package.scala:26)
at is.hail.methods.VEP$$anonfun$7$$anonfun$apply$4.apply(VEP.scala:187)
at is.hail.methods.VEP$$anonfun$7$$anonfun$apply$4.apply(VEP.scala:128)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:215)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1038)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:969)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:760)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at is.hail.sparkextras.ContextRDD.iterator(ContextRDD.scala:532)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:60)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:59)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$18.hasNext(Iterator.scala:764)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.io.RichContextRDDRegionValue$$anonfun$boundary$extension$1$$anon$1.hasNext(RowStore.scala:1633)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:1004)
at is.hail.utils.richUtils.RichIterator$$anon$5.isValid(RichIterator.scala:22)
at is.hail.utils.StagingIterator.isValid(FlipbookIterator.scala:48)
at is.hail.utils.FlipbookIterator$$anon$9.setValue(FlipbookIterator.scala:331)
at is.hail.utils.FlipbookIterator$$anon$9.<init>(FlipbookIterator.scala:344)
at is.hail.utils.FlipbookIterator.leftJoinDistinct(FlipbookIterator.scala:323)
at is.hail.annotations.OrderedRVIterator.leftJoinDistinct(OrderedRVIterator.scala:48)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:98)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:95)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$31.apply(ContextRDD.scala:402)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$31.apply(ContextRDD.scala:402)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$27$$anonfun$apply$28.apply(ContextRDD.scala:355)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$27$$anonfun$apply$28.apply(ContextRDD.scala:355)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.utils.package$.getIteratorSizeWithMaxN(package.scala:357)
at is.hail.sparkextras.ContextRDD$$anonfun$14.apply(ContextRDD.scala:480)
at is.hail.sparkextras.ContextRDD$$anonfun$14.apply(ContextRDD.scala:480)
at is.hail.sparkextras.ContextRDD$$anonfun$runJob$1.apply(ContextRDD.scala:510)
at is.hail.sparkextras.ContextRDD$$anonfun$runJob$1.apply(ContextRDD.scala:508)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
at is.hail.sparkextras.ContextRDD.runJob(ContextRDD.scala:506)
at is.hail.sparkextras.ContextRDD.head(ContextRDD.scala:480)
at is.hail.rvd.RVD.head(RVD.scala:436)
at is.hail.expr.ir.TableHead.execute(TableIR.scala:322)
at is.hail.expr.ir.TableMapGlobals.execute(TableIR.scala:656)
at is.hail.expr.ir.TableMapRows.execute(TableIR.scala:514)
at is.hail.expr.ir.TableMapGlobals.execute(TableIR.scala:656)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:55)
at is.hail.table.Table.x$2$lzycompute(Table.scala:200)
at is.hail.table.Table.x$2(Table.scala:200)
at is.hail.table.Table.value$lzycompute(Table.scala:200)
at is.hail.table.Table.value(Table.scala:200)
at is.hail.table.Table.rdd$lzycompute(Table.scala:204)
at is.hail.table.Table.rdd(Table.scala:204)
at is.hail.table.Table.collect(Table.scala:522)
at is.hail.table.Table.showString(Table.scala:566)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)

is.hail.utils.HailException: vep command 'vep --format vcf --json --everything --allele_number --no_stats --cache --offline --minimal --assembly GRCh37 --plugin LoF,human_ancestor_fa:/home/imbhs/src/VEP/.vep/pluginData/human_ancestor.fa.gz,filter_position:0.05,min_intron_size:15,conservation_file:/home/imbhs/src/VEP/.vep/pluginData/phylocsf.sql -o STDOUT' failed with non-zero exit status 2
at is.hail.utils.ErrorHandling$class.fatal(ErrorHandling.scala:9)
at is.hail.utils.package$.fatal(package.scala:26)
at is.hail.methods.VEP$$anonfun$7$$anonfun$apply$4.apply(VEP.scala:187)
at is.hail.methods.VEP$$anonfun$7$$anonfun$apply$4.apply(VEP.scala:128)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:215)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1038)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:969)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:760)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at is.hail.sparkextras.ContextRDD.iterator(ContextRDD.scala:532)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:60)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:59)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$18.hasNext(Iterator.scala:764)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.io.RichContextRDDRegionValue$$anonfun$boundary$extension$1$$anon$1.hasNext(RowStore.scala:1633)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:1004)
at is.hail.utils.richUtils.RichIterator$$anon$5.isValid(RichIterator.scala:22)
at is.hail.utils.StagingIterator.isValid(FlipbookIterator.scala:48)
at is.hail.utils.FlipbookIterator$$anon$9.setValue(FlipbookIterator.scala:331)
at is.hail.utils.FlipbookIterator$$anon$9.<init>(FlipbookIterator.scala:344)
at is.hail.utils.FlipbookIterator.leftJoinDistinct(FlipbookIterator.scala:323)
at is.hail.annotations.OrderedRVIterator.leftJoinDistinct(OrderedRVIterator.scala:48)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:98)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:95)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$31.apply(ContextRDD.scala:402)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$31.apply(ContextRDD.scala:402)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$27$$anonfun$apply$28.apply(ContextRDD.scala:355)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$27$$anonfun$apply$28.apply(ContextRDD.scala:355)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.rvd.RVD$$anonfun$apply$25$$anon$3.hasNext(RVD.scala:1262)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at is.hail.utils.package$.getIteratorSizeWithMaxN(package.scala:357)
at is.hail.sparkextras.ContextRDD$$anonfun$14.apply(ContextRDD.scala:480)
at is.hail.sparkextras.ContextRDD$$anonfun$14.apply(ContextRDD.scala:480)
at is.hail.sparkextras.ContextRDD$$anonfun$runJob$1.apply(ContextRDD.scala:510)
at is.hail.sparkextras.ContextRDD$$anonfun$runJob$1.apply(ContextRDD.scala:508)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Hail version: 0.2-b6f8a15434b4
Error summary: HailException: vep command 'vep --format vcf --json --everything --allele_number --no_stats --cache --offline --minimal --assembly GRCh37 --plugin LoF,human_ancestor_fa:/home/imbhs/src/VEP/.vep/pluginData/human_ancestor.fa.gz,filter_position:0.05,min_intron_size:15,conservation_file:/home/imbhs/src/VEP/.vep/pluginData/phylocsf.sql -o STDOUT' failed with non-zero exit status 2

That version of Hail is from November – can you update and run on the latest?
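It can also help to run the failing vep command by hand on a few VCF lines. VEP writes its real error message to stderr, while Hail only reports the exit status. A rough sketch (the test VCF path is a placeholder, and the LoF plugin arguments are dropped for brevity):

import subprocess

# Same invocation as in the error message, minus the LoF plugin,
# reading a small VCF on stdin and capturing VEP's stderr.
cmd = ['vep', '--format', 'vcf', '--json', '--everything',
       '--allele_number', '--no_stats', '--cache', '--offline',
       '--minimal', '--assembly', 'GRCh37', '-o', 'STDOUT']
with open('/path/to/small_test.vcf') as f:  # placeholder test file
    result = subprocess.run(cmd, stdin=f,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
print('exit status:', result.returncode)
print(result.stderr)

Whatever stderr prints there is very likely the real cause of the exit status 2.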

Okay, so now I've finally installed your version from April. It runs out of memory (a Java error) while calling split_multi_hts(), even though I have 384 GB of RAM. This wasn't a problem in the prior version.

The VEP execution could have become lazy between the November build and the April one, which would mean the old VEP error is only surfacing now – can you share the pipeline you're running and the stack trace for the error?
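If laziness is the culprit, the original VEP failure may simply be resurfacing at split_multi_hts(). One way to isolate it is to force the VEP step to execute on its own by writing the annotated dataset out and reading it back before continuing. A sketch, with placeholder paths:

import hail as hl

# Writing the dataset forces the lazy VEP annotation to actually run,
# so any VEP error surfaces here rather than later in the pipeline.
mt = hl.vep(mt, 'vep_config.json')            # placeholder config path
mt.write('/tmp/vcf1_ann.mt', overwrite=True)
mt = hl.read_matrix_table('/tmp/vcf1_ann.mt')
mt = hl.split_multi_hts(mt)                   # now runs on the written copy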

First I do some annotation, then some interval filtering, and then I try to use the mentioned split_multi_hts() on the filtered data.
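Roughly, the steps look like this (a sketch only; the paths and intervals are placeholders, not my real ones):

import hail as hl

# 1. import and VEP-annotate
mt = hl.import_vcf('/path/to/input.vcf.bgz', reference_genome='GRCh37')
mt = hl.vep(mt, 'vep_config.json')

# 2. filter to intervals of interest
intervals = [hl.parse_locus_interval('1:100M-200M', reference_genome='GRCh37')]
mt = hl.filter_intervals(mt, intervals)

# 3. split multi-allelic sites: this is where it runs out of memory
mt = hl.split_multi_hts(mt)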
This screen dump is the best I can do:
[screenshot of the Java out-of-memory error]

What's the full pipeline you're running?