Unable to run Nirvana locally

I’m trying to run Nirvana on some of the sample VCF files locally, but I’m getting a rather opaque error. My code is simple:

import hail as hl
hl.init()
vcf = hl.import_vcf('data/1kg.vcf.bgz', min_partitions=8)
vds = vcf.nirvana('data/nirvana.properties')

When I try to view the result, I get the following output:

File "<stdin>", line 1, in <module>
File "</Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/decorator.py:decorator-gen-352>", line 2, in show
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 561, in wrapper
return original_func(*args, **kwargs)
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/expr/expressions/base_expression.py", line 704, in show
handler(self._show(n, width, truncate, types))
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/table.py", line 1232, in __str__
return self.table._ascii_str(self.n, self.width, self.truncate, self.types)
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/table.py", line 1293, in _ascii_str
rows = t.take(n + 1)
File "</Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/decorator.py:decorator-gen-828>", line 2, in take
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 561, in wrapper
return original_func(*args, **kwargs)
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/table.py", line 1991, in take
return self.head(n).collect(_localize)
File "</Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/decorator.py:decorator-gen-822>", line 2, in collect
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 561, in wrapper
return original_func(*args, **kwargs)
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/table.py", line 1805, in collect
return Env.backend().execute(e._ir)
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/backend/backend.py", line 106, in execute
result = json.loads(Env.hail().backend.spark.SparkBackend.executeJSON(self._to_java_ir(ir)))
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/Users/username/miniconda3/envs/hail/lib/python3.7/site-packages/hail/utils/java.py", line 240, in deco
'Error summary: %s' % (deepest, full, hail.__version__, deepest)) from None
hail.utils.java.FatalError: HailException: nirvana command failed with non-zero exit status 1

Java stack trace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 9, localhost, executor driver): is.hail.utils.HailException: nirvana command failed with non-zero exit status 1
at is.hail.utils.ErrorHandling$class.fatal(ErrorHandling.scala:9)
at is.hail.utils.package$.fatal(package.scala:28)
at is.hail.methods.Nirvana$$anonfun$1$$anonfun$apply$2.apply(Nirvana.scala:447)
at is.hail.methods.Nirvana$$anonfun$1$$anonfun$apply$2.apply(Nirvana.scala:428)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1165)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at is.hail.sparkextras.ContextRDD.iterator(ContextRDD.scala:611)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:60)
at is.hail.sparkextras.RepartitionedOrderedRDD2$$anonfun$compute$1$$anonfun$apply$1.apply(RepartitionedOrderedRDD2.scala:59)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$18.hasNext(Iterator.scala:762)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at is.hail.io.RichContextRDDRegionValue$$anonfun$boundary$extension$1$$anon$1.hasNext(RowStore.scala:1867)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:1002)
at is.hail.utils.richUtils.RichIterator$$anon$5.isValid(RichIterator.scala:22)
at is.hail.utils.StagingIterator.isValid(FlipbookIterator.scala:48)
at is.hail.utils.FlipbookIterator$$anon$9.setValue(FlipbookIterator.scala:331)
at is.hail.utils.FlipbookIterator$$anon$9.<init>(FlipbookIterator.scala:344)
at is.hail.utils.FlipbookIterator.leftJoinDistinct(FlipbookIterator.scala:323)
at is.hail.annotations.OrderedRVIterator.leftJoinDistinct(OrderedRVIterator.scala:62)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$6.apply(KeyedRVD.scala:88)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:98)
at is.hail.rvd.KeyedRVD$$anonfun$orderedJoinDistinct$1.apply(KeyedRVD.scala:95)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$38.apply(ContextRDD.scala:481)
at is.hail.sparkextras.ContextRDD$$anonfun$czipPartitions$1$$anonfun$apply$38.apply(ContextRDD.scala:481)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$32$$anonfun$apply$33.apply(ContextRDD.scala:422)
at is.hail.sparkextras.ContextRDD$$anonfun$cmapPartitionsWithIndex$1$$anonfun$apply$32$$anonfun$apply$33.apply(ContextRDD.scala:422)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at is.hail.utils.package$.getIteratorSizeWithMaxN(package.scala:383)
at is.hail.sparkextras.ContextRDD$$anonfun$14.apply(ContextRDD.scala:559)
at is.hail.sparkextras.ContextRDD$$anonfun$14.apply(ContextRDD.scala:559)
at is.hail.sparkextras.ContextRDD$$anonfun$runJob$1.apply(ContextRDD.scala:589)
at is.hail.sparkextras.ContextRDD$$anonfun$runJob$1.apply(ContextRDD.scala:587)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
at is.hail.sparkextras.ContextRDD.runJob(ContextRDD.scala:585)
at is.hail.sparkextras.ContextRDD.head(ContextRDD.scala:559)
at is.hail.rvd.RVD.head(RVD.scala:447)
at is.hail.expr.ir.TableHead.execute(TableIR.scala:369)
at is.hail.expr.ir.TableMapRows.execute(TableIR.scala:800)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:747)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:87)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:59)
at is.hail.expr.ir.InterpretNonCompilable$$anonfun$5.apply(InterpretNonCompilable.scala:16)
at is.hail.expr.ir.InterpretNonCompilable$$anonfun$5.apply(InterpretNonCompilable.scala:16)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at is.hail.expr.ir.InterpretNonCompilable$.apply(InterpretNonCompilable.scala:16)
at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:35)
at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:35)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:20)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:35)
at is.hail.backend.spark.SparkBackend$.execute(SparkBackend.scala:75)
at is.hail.backend.spark.SparkBackend$.executeJSON(SparkBackend.scala:18)
at is.hail.backend.spark.SparkBackend.executeJSON(SparkBackend.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)

Hail version: 0.2.14-8dcb6722c72a
Error summary: HailException: nirvana command failed with non-zero exit status 1

On the Nirvana GitHub they mention that if you run out of memory you can try increasing the spark.driver.memory setting in the Spark conf files, but the VCF isn’t that big, so I didn’t expect memory to be the problem. I also couldn’t figure out where that is supposed to be set, because Spark seems to be loaded from the bundled .jar when running Hail locally. Has anyone had success running Nirvana locally?
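
The closest thing I found is the generic PySpark route of setting options in the environment before hl.init() runs; a rough sketch (the 8g figure is made up) looks like this, though I have no idea whether it’s the intended mechanism for Hail:

import os

# Spark options must be in place before hl.init() starts the local JVM;
# PYSPARK_SUBMIT_ARGS is the generic PySpark hook for that. The 8g value
# here is just an illustration.
os.environ['PYSPARK_SUBMIT_ARGS'] = '--driver-memory 8g pyspark-shell'

import hail as hl
hl.init()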

That code example looks rather old. In current Hail, Nirvana is invoked with something like:

import hail as hl
hl.init()
vcf = hl.import_vcf('data/1kg.vcf.bgz', min_partitions=8)
annotated = hl.nirvana(vcf, 'path/to/nirvana.properties')

See the docs here: https://hail.is/docs/0.2/methods/genetics.html#hail.methods.nirvana
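
If the command succeeds, the annotations land in a new row field (nirvana by default), so you can inspect them with something like:

annotated.rows().select('nirvana').show()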

I should also note that Nirvana compatibility isn’t tested as part of Hail’s continuous integration suite (it’s slow and nobody was using it…), so it’s very likely broken.

Oh right, I’m not sure why I copy-pasted that old line. I was actually using the current invocation, and it produced the error above.

Given that Nirvana isn’t well supported, though, should I switch to VEP for annotations instead? I’m not picky about which one to use. If so, can I run VEP locally, or does it have to run on Google Cloud?

VEP is certainly better-supported than Nirvana.

It’ll still be a challenge to run locally, though: there’s not really any documentation on how to install and configure it for Hail.
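
For reference, the Hail-side call mirrors hl.nirvana; the config path below is a placeholder, and the hard part is the local VEP installation it has to describe:

# Sketch only: assumes a working local VEP install described by the config file.
annotated = hl.vep(vcf, 'path/to/vep-config.json')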

We’re moving to a dockerized world, which will make this a bit more portable, but that world isn’t here yet.