hl.experimental.run_combiner() AssertionError

Hello,

While running Hail (version 0.2.47-d9e1f3a110c8) on Google Dataproc (Spark version 2.4.5), the method hl.experimental.run_combiner() failed with:

File "/tmp/fa5ffc019a184f189e94e356dd38d7b5/combiner.py", line 28, in <module>
    reference_genome='GRCh38')
File "/opt/conda/default/lib/python3.6/site-packages/hail/experimental/vcf_combiner/vcf_combiner.py", line 596, in run_combiner
    hl.experimental.write_matrix_tables(merge_mts, tmp, overwrite=True)
File "", line 2, in write_matrix_tables
File "/opt/conda/default/lib/python3.6/site-packages/hail/typecheck/check.py", line 614, in wrapper
    return original_func(*args, **kwargs)
File "/opt/conda/default/lib/python3.6/site-packages/hail/experimental/write_multiple.py", line 17, in write_matrix_tables
    Env.backend().execute(MatrixMultiWrite([mt._mir for mt in mts], writer))
File "/opt/conda/default/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 296, in execute
    result = json.loads(self._jhc.backend().executeJSON(jir))
File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/opt/conda/default/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 41, in deco
    'Error summary: %s' % (deepest, full, hail.version, deepest)) from None
hail.utils.java.FatalError: AssertionError: assertion failed

This happened while trying to merge ~4000 WES GVCFs.

Find attached the complete error log:
hail_combiner_error.log (165.2 KB)

and python script:
combiner.py.txt (2.7 KB)

I am not sure what happened, and would appreciate any help in figuring out the error.

Thank you,

Guillaume Noell
Sanger Institute - Human Genetics Informatics

Ack, this is a tabix index issue. I think we need to isolate the file that’s causing the problem. I’ll create a development build you can run that will print the file path in the error.


I’ll build a dev version when this goes in:

If you download and pip install this wheel, a cluster you create should produce an error that tells us which file is causing the problem:

gs://hail-common/hailctl/dataproc/tpoterba-dev/0.2.49-b1db2c323727/hail-0.2.49-py3-none-any.whl

Hi Tim,

Thank you for the wheel.
It gives:

Hail version: 0.2.49-b1db2c323727
Error summary: RuntimeException: error reading tabix-indexed file gs://interval-wes/gvcf_for_hail_jc/11669098.ACXX.paired158.09fa37a265.g.vcf.gz: i=0, curOff=12595964411980, expected=12595964411904
[Stage 46:===================================================>(1148 + 1) / 1150]

The index file for that VCF in the bucket does have the .tbi extension (I generated it with tabix -p vcf 11669098.ACXX.paired158.09fa37a265.g.vcf.gz).

Do you know how to debug this?

I have another thing to try. Here's a wheel that disables this assertion entirely; I suspect the assertion might be too strong:

gs://hail-common/hailctl/dataproc/tpoterba-dev/0.2.49-1cfe542db983/hail-0.2.49-py3-none-any.whl

If you run that and get a weird parse error or failure, we’ll push harder on the first problem.

With this wheel, the error is:

Hail version: 0.2.49-1cfe542db983
Error summary: RuntimeException: bad chromosome! chr15, chr16	31605	.	T	<NON_REF>	.	.	END=31626	GT:DP:GQ:MIN_DP:PL	0/0:2:6:2:0,6,75

[Stage 46:===================================================>(1148 + 1) / 1150]

Full log: error.log (213.5 KB)

Is this caused by a corrupted VCF or index?

OK, this is a tough one. Could you maybe try re-zipping and re-tabix-indexing that file? Something like the sketch below.
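For concreteness, here's a rough sketch of what I mean, assuming you copy the file down locally, decompress it, and have htslib's bgzip and tabix on your PATH (the local filename here is hypothetical):

import subprocess

# Decompressed GVCF, copied locally from the gs:// bucket.
vcf = '11669098.ACXX.paired158.09fa37a265.g.vcf'

# Re-compress with bgzip (plain gzip output won't work with tabix).
subprocess.run(['bgzip', '-f', vcf], check=True)

# Rebuild the .tbi index from the fresh bgzip file.
subprocess.run(['tabix', '-f', '-p', 'vcf', vcf + '.gz'], check=True)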

I re-zipped the file with bgzip and re-created the index with tabix, but it still gives a similar error:

reindexed.error.log (191.8 KB)

Is there maybe some Python Hail code I could run on each VCF+index, one by one in a loop, to make sure they are correctly formatted and not corrupted? I was imagining something like the sketch below.
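(A rough sketch of what I mean; I'm not sure a plain import exercises the tabix index the combiner uses, since hl.import_vcf parses the whole file. `inputs` is the same list of gs:// paths I pass to run_combiner.)

import hail as hl

hl.init(default_reference='GRCh38')

# Force a full parse of each GVCF to surface bgzip/VCF corruption.
for path in inputs:
    try:
        mt = hl.import_vcf(path, force_bgz=True, reference_genome='GRCh38')
        print(path, mt.count_rows(), 'rows OK')
    except Exception as e:
        print('problem with', path, ':', e)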

By the way, this run took about 45 minutes longer than the previous tries (6h30 vs. 5h45 for the last three attempts). Could that suggest the re-index/re-upload fixed the VCF it was choking on, and the combiner simply hit a second corrupted VCF/index later in the run?

Don't use this version; we want to get the file-specific error message here, not the parse error. Use the first wheel I posted in this thread, I think.

I'm sorry it's taking so long to replicate! You could try the following to restrict the run to just chr22:

contig = 'chr22'
chr22_interval = [hl.Interval(
    start=hl.Locus(contig=contig, position=1, reference_genome='GRCh38'),
    end=hl.Locus.parse(f'{contig}:END', reference_genome='GRCh38'))]

...
hl.experimental.run_combiner(... intervals=chr22_interval, ...)

Thanks Tim,

I fixed a second VCF/index and the combiner completed successfully!

I ran it on the whole genome as I was not able to subset to chr22; I got
hail.utils.java.FatalError: HailException: range bounds must be inclusive when running:

# run chr22 only for faster test:
contig = 'chr22'
chr22_interval = [hl.Interval(
    start=hl.Locus(contig=contig, position=1, reference_genome='GRCh38'),
    end=hl.Locus.parse(f'{contig}:END', reference_genome='GRCh38'))]
print("run combiner chr22")
hl.experimental.run_combiner(inputs,
                             intervals=chr22_interval,
                             out_file=output_file,
                             tmp_path=temp_bucket,
                             reference_genome='GRCh38')

What is the correct syntax to subset chr22?

Great news!!!

Ah, I messed up here; we just need to make the range inclusive:
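Something like this should do it (the same snippet as above, with the end bound made inclusive via Interval's includes_end flag):

contig = 'chr22'
chr22_interval = [hl.Interval(
    start=hl.Locus(contig=contig, position=1, reference_genome='GRCh38'),
    end=hl.Locus.parse(f'{contig}:END', reference_genome='GRCh38'),
    includes_end=True)]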
