Hail: VCFParseError

Hi,

I am currently running variant quality checks on a VCF file using Hail, and I am getting the error below. Kindly help me resolve this issue.

Hail version: 0.2.61-3c86d3ba497a
Error summary: VCFParseError: invalid character '|' in integer literal

Thanks in advance

This means your VCF is malformed. I bet the problem is an incorrect header: a field declared with Number=1 that is actually an array (Number=A, R, G, 2+, etc.) in the data.
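To illustrate the kind of mismatch described above, here is a minimal sketch (the helper is hypothetical, not part of Hail) that compares each INFO field's declared Number in the header against the value actually present on a data line:

```python
import re

def check_info_counts(header_lines, data_line):
    """Return the names of INFO fields declared Number=1 in the header
    whose value on the data line looks like an array (comma- or
    pipe-separated)."""
    declared = {}
    for line in header_lines:
        m = re.match(r'##INFO=<ID=([^,]+),Number=([^,]+)', line)
        if m:
            declared[m.group(1)] = m.group(2)

    info = data_line.split('\t')[7]  # INFO is the 8th VCF column
    bad = []
    for entry in info.split(';'):
        if '=' not in entry:
            continue  # flag-type fields carry no value
        key, value = entry.split('=', 1)
        if declared.get(key) == '1' and re.search(r'[,|]', value):
            bad.append(key)
    return bad

# Toy header plus a data line shaped like the failing record:
header = ['##INFO=<ID=AS_UNIQ_ALT_READ_COUNT,Number=1,Type=Integer,Description="x">']
line = 'chr1\t21955383\t.\tCGGGTGGG\tGGGGTGGG\t475.51\t.\tAC=12,4,14;AS_UNIQ_ALT_READ_COUNT=0|0|0'
print(check_info_counts(header, line))  # ['AS_UNIQ_ALT_READ_COUNT']
```

Note that a pipe-delimited value like `0|0|0` is suspect regardless of the declared Number, since the VCF spec uses commas, not pipes, to separate INFO array elements.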

Thanks for your reply. I ran variant QC on the per-chromosome VCF files and did not get this error. I then merged all chromosomes into one VCF file using GATK MergeVcfs, and the merged VCF throws this error.

I have performed this same exercise before and did not face this issue, but now I am getting this error.

Could you please help me fix it?

What's the full stack trace?

p1 = hl.plot.histogram(qc_table.gq_mean, range=(10,80), legend='Variant Mean GQ')
Traceback (most recent call last):
File "", line 1, in
File "", line 2, in histogram
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/hail/typecheck/check.py", line 614, in wrapper
return original_func(*args, **kwargs)
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/hail/plot/plots.py", line 401, in histogram
data = agg_f(aggregators.hist(data, start, end, bins))
File "", line 2, in aggregate
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/hail/typecheck/check.py", line 614, in wrapper
return original_func(*args, **kwargs)
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/hail/table.py", line 1178, in aggregate
return Env.backend().execute(agg_ir)
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/hail/backend/py4j_backend.py", line 98, in execute
raise e
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/hail/backend/py4j_backend.py", line 74, in execute
result = json.loads(self._jhc.backend().executeJSON(jir))
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in call
answer, self.gateway_client, self.target_id, self.name)
File "/gpfs/data/user/krithika/soft/miniconda3/envs/gatk_m/lib/python3.6/site-packages/hail/backend/py4j_backend.py", line 32, in deco
'Error summary: %s' % (deepest, full, hail.version, deepest), error_id) from None
hail.utils.java.FatalError: VCFParseError: invalid character '|' in integer literal

Java stack trace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 68 in stage 5.0 failed 1 times, most recent failure: Lost task 68.0 in stage 5.0 (TID 24619, localhost, executor driver): is.hail.utils.HailException: file:/gpfs/data/user/krithika/G9_G13/GenomeIndia_vcf/Trail/651_allchr_merge.vcf:offset 9127150159: error while parsing line
chr1 21955383 . CGGGTGGG GGGGTGGG,CGGGGTGGG,* 475.51 . AC=12,4,14;AF=0.013,4.255e-03,0.015;AN=940;AS_UNIQ_ALT_READ_COUNT=0|0|0;BaseQRankSum=-1.176e+00;DP=23940;ExcessHet=0.0053;FS=186.849;GQ_MEAN=11.31;GQ_STDDEV=22.92;InbreedingCoeff=0.0556;MBQ=0,0,0,0;MFRL=0,0,0,0;MLEAC=14,24,52;MLEAF=0.015,0.026,0.055;MMQ=60,60,60,60;MPOS=50,50,50;MQ=60.00;MQ0=0;MQRankSum=0.00;NCC=181;NCount=0;QD=1.23;ReadPosRankSum=-1.553e+00;SOR=9.230 GT:AD:AF:DP:F1R2:F2R1:GQ:PGT:PID:PL:PS ./.:35,0,0,0:.:35:0,0,0,0:0,0,0,0:.:.:.:0,0,0,0,0,0,0,0,0,0

at is.hail.utils.ErrorHandling$class.fatal(ErrorHandling.scala:15)
at is.hail.utils.package$.fatal(package.scala:77)
at is.hail.io.vcf.MatrixVCFReader$$anonfun$21$$anonfun$apply$10$$anonfun$apply$11.apply(LoadVCF.scala:1745)
at is.hail.io.vcf.MatrixVCFReader$$anonfun$21$$anonfun$apply$10$$anonfun$apply$11.apply(LoadVCF.scala:1734)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:464)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at is.hail.utils.richUtils.RichContextRDD$$anonfun$cleanupRegions$1$$anon$1.hasNext(RichContextRDD.scala:68)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1165)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Caused by: is.hail.io.vcf.VCFParseError: invalid character '|' in integer literal
at is.hail.io.vcf.VCFLine.parseError(LoadVCF.scala:59)
at is.hail.io.vcf.VCFLine.numericValue(LoadVCF.scala:63)
at is.hail.io.vcf.VCFLine.parseIntInInfoArray(LoadVCF.scala:810)
at is.hail.io.vcf.VCFLine.parseIntInfoArrayElement(LoadVCF.scala:831)
at is.hail.io.vcf.VCFLine.parseAddInfoArrayInt(LoadVCF.scala:855)
at is.hail.io.vcf.VCFLine.parseAddInfoField(LoadVCF.scala:933)
at is.hail.io.vcf.VCFLine.addInfoField(LoadVCF.scala:953)
at is.hail.io.vcf.VCFLine.parseAddInfo(LoadVCF.scala:986)
at is.hail.io.vcf.LoadVCF$.parseLine(LoadVCF.scala:1428)
at is.hail.io.vcf.LoadVCF$.parseLine(LoadVCF.scala:1304)
at is.hail.io.vcf.MatrixVCFReader$$anonfun$21$$anonfun$apply$10$$anonfun$apply$11.apply(LoadVCF.scala:1741)
… 42 more

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2158)
at is.hail.rvd.RVD.combine(RVD.scala:724)
at is.hail.expr.ir.Interpret$.run(Interpret.scala:938)
at is.hail.expr.ir.Interpret$.alreadyLowered(Interpret.scala:53)
at is.hail.expr.ir.InterpretNonCompilable$.interpretAndCoerce$1(InterpretNonCompilable.scala:16)
at is.hail.expr.ir.InterpretNonCompilable$.is$hail$expr$ir$InterpretNonCompilable$$rewrite$1(InterpretNonCompilable.scala:53)
at is.hail.expr.ir.InterpretNonCompilable$.apply(InterpretNonCompilable.scala:58)
at is.hail.expr.ir.lowering.InterpretNonCompilablePass$.transform(LoweringPass.scala:67)
at is.hail.expr.ir.lowering.LoweringPass$$anonfun$apply$3$$anonfun$1.apply(LoweringPass.scala:15)
at is.hail.expr.ir.lowering.LoweringPass$$anonfun$apply$3$$anonfun$1.apply(LoweringPass.scala:15)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass$$anonfun$apply$3.apply(LoweringPass.scala:15)
at is.hail.expr.ir.lowering.LoweringPass$$anonfun$apply$3.apply(LoweringPass.scala:13)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass$class.apply(LoweringPass.scala:13)
at is.hail.expr.ir.lowering.InterpretNonCompilablePass$.apply(LoweringPass.scala:62)
at is.hail.expr.ir.lowering.LoweringPipeline$$anonfun$apply$1.apply(LoweringPipeline.scala:14)
at is.hail.expr.ir.lowering.LoweringPipeline$$anonfun$apply$1.apply(LoweringPipeline.scala:12)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:12)
at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:28)
at is.hail.backend.spark.SparkBackend.is$hail$backend$spark$SparkBackend$$_execute(SparkBackend.scala:354)
at is.hail.backend.spark.SparkBackend$$anonfun$execute$1.apply(SparkBackend.scala:338)
at is.hail.backend.spark.SparkBackend$$anonfun$execute$1.apply(SparkBackend.scala:335)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:25)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:23)
at is.hail.utils.package$.using(package.scala:618)
at is.hail.annotations.Region$.scoped(Region.scala:18)
at is.hail.expr.ir.ExecuteContext$.scoped(ExecuteContext.scala:23)
at is.hail.backend.spark.SparkBackend.withExecuteContext(SparkBackend.scala:247)
at is.hail.backend.spark.SparkBackend.execute(SparkBackend.scala:335)
at is.hail.backend.spark.SparkBackend$$anonfun$7.apply(SparkBackend.scala:379)
at is.hail.backend.spark.SparkBackend$$anonfun$7.apply(SparkBackend.scala:377)
at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52)
at is.hail.backend.spark.SparkBackend.executeJSON(SparkBackend.scala:377)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)


Hail version: 0.2.61-3c86d3ba497a
Error summary: VCFParseError: invalid character '|' in integer literal

Hi @tpoterba, could you please tell me how to fix this issue?


We can improve the error message in the future to point to the specific problematic field, but for now you should run this VCF through something like the vcftools vcf-validator, which should tell you where the problem is.
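As a stopgap while the file is being fixed upstream, Hail's hl.import_vcf accepts a find_replace=(pattern, replacement) pair that rewrites each line before parsing, so one option is to strip the offending pipe-delimited field. The pattern below is a hypothetical sketch keyed to the AS_UNIQ_ALT_READ_COUNT field seen in the error; the substitution itself can be checked with plain re before passing it to Hail:

```python
import re

# Hypothetical pattern: drop the pipe-delimited AS_UNIQ_ALT_READ_COUNT
# entry from the INFO column. With Hail you would pass something like
# hl.import_vcf(path, find_replace=(PATTERN, '')) -- untested sketch,
# adjust the pattern to match your actual header and data.
PATTERN = r'AS_UNIQ_ALT_READ_COUNT=[0-9|]+;?'

info = 'AC=12,4,14;AF=0.013;AS_UNIQ_ALT_READ_COUNT=0|0|0;BaseQRankSum=-1.176e+00'
print(re.sub(PATTERN, '', info))
# AC=12,4,14;AF=0.013;BaseQRankSum=-1.176e+00
```

This only hides the malformed field rather than repairing it, so validating and fixing the merged VCF (as suggested above) is still the right long-term fix.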