VCF Combiner Error

I tried to use Hail's GVCF combiner to merge two GVCFs into a matrix table, but got an error.

Code:

import hail as hl

inputs = [
    'data/sample1.gvcf.gz',
    'data/sample2.gvcf.gz',
]

output_file = 'data/merged.sample.mt'  # output destination
temp_bucket = 'tmp/'  # bucket for storing intermediate files

hl.experimental.run_combiner(
    inputs,
    out_file=output_file,
    tmp_path=temp_bucket,
    reference_genome='GRCh37',
    use_genome_default_intervals=True,
)

Error summary:

Hail version: 0.2.64-1ef70187dc78
HailException: unphased_diploid_gt_index only supports ploidy == 2. Found 1.

Any suggestions?

This is an oversight. The VCF combiner internals use unphased_diploid_gt_index in a few places to check whether a call contains <NON_REF> (we set such calls to missing). That function only supports diploid calls, so any haploid call in the input raises the error you saw. In practice we don’t produce GVCFs with <NON_REF> calls, but others do.
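
To make the failure mode concrete, here is a rough sketch of how an unphased diploid genotype index is usually computed, following the genotype ordering in the VCF specification. This is not Hail's actual implementation; the function name diploid_gt_index and the tuple-of-allele-indices input are made up for illustration. The point is only that the formula needs exactly two alleles, so a haploid call has no defined index.

```python
def diploid_gt_index(alleles):
    """Index of an unphased diploid genotype in VCF genotype ordering.

    `alleles` is a tuple of allele indices, e.g. (0, 1) for a 0/1 call.
    Hypothetical helper for illustration only, not part of Hail.
    """
    if len(alleles) != 2:
        # Mirrors the spirit of the error in this thread: one allele
        # (a haploid call) cannot be mapped to a diploid genotype index.
        raise ValueError(
            f"diploid_gt_index only supports ploidy == 2. Found {len(alleles)}."
        )
    j, k = sorted(alleles)  # unphased, so order does not matter
    return k * (k + 1) // 2 + j


print(diploid_gt_index((0, 1)))  # 1
print(diploid_gt_index((1, 2)))  # 4

try:
    diploid_gt_index((1,))  # haploid call
except ValueError as err:
    print(err)
```

With three alleles (reference, one ALT, and <NON_REF> as allele 2), any genotype index of 3 or more involves <NON_REF>, which is the kind of check such an index makes easy.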

A workaround could be to rewrite the GVCFs so that all calls are diploid. I’ll be adding support for haploid calls, which should be out in the next release.
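
As a minimal sketch of that workaround, assuming plain tab-separated VCF data lines with GT listed in the FORMAT column, something like the following would turn a haploid GT such as 1 into 1/1. The helper names diploidize_gt and diploidize_line are hypothetical. Note the big caveat: only GT is rewritten. Fields whose length depends on ploidy, such as PL, are left untouched, and whether the combiner is happy with that combination is an assumption I have not verified.

```python
def diploidize_gt(gt):
    """Return a diploid GT string; haploid '1' becomes '1/1', '.' becomes './.'."""
    if "/" in gt or "|" in gt:
        return gt  # already has more than one allele, leave as-is
    return f"{gt}/{gt}"


def diploidize_line(line):
    """Rewrite haploid GT entries in one VCF line (hypothetical helper).

    Header lines and lines without sample columns are returned unchanged.
    Only GT is modified; PL and similar fields are NOT adjusted.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return line  # no FORMAT/sample columns
    fmt = fields[8].split(":")
    if "GT" not in fmt:
        return line
    gt_pos = fmt.index("GT")
    for i in range(9, len(fields)):
        parts = fields[i].split(":")
        if gt_pos < len(parts):
            parts[gt_pos] = diploidize_gt(parts[gt_pos])
        fields[i] = ":".join(parts)
    return "\t".join(fields) + newline


example = "X\t100\t.\tA\tG,<NON_REF>\t50\t.\t.\tGT:DP\t1:12\n"
print(diploidize_line(example), end="")
```

Applying it to a whole file, and recompressing and indexing the result in whatever form the combiner expects, is left out of the sketch.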

Thanks for reaching out!

Thanks for the reply, looking forward to the next release!