I don’t think this is a bug, but after merging two gVCFs with run_combiner, I obtain an output with null genotypes for some samples and loci:
locus<GRCh38> SAMP1:GT,END SAMP2:GT,END
chr1:10001 0/0,10004 0/0,10019
chr1:10005 0/0,10005 NA,NA
chr1:10006 0/0,10024 NA,NA
I think I understand why this is happening: run_combiner is splitting the loci where the samples have different genotypes. But I wonder whether there is a way to produce output in which every sample has a genotype at every locus (the following output was obtained with GATK CombineGVCFs):
||#CHROM|POS|INFO|SAMP1|SAMP2|
|---|---|---|---|---|---|
|0|chr1|10001|END=10004|./.:10:0:0,0,0|./.:29:0:0,0,0|
|1|chr1|10005|.|./.:14:3:0,3,45|./.:29:0:0,0,0|
|2|chr1|10006|END=10019|./.:23:6:0,6,90|./.:29:0:0,0,0|
|3|chr1|10020|.|./.:23:6:0,6,90|./.:39:3:0,3,45|
|4|chr1|10021|.|./.:23:6:0,6,90|./.:39:6:0,6,90|
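
To make the result I am after concrete, here is a minimal pure-Python sketch of the filling step, applied to the three Hail rows above. It is not Hail code, and it is not meant to describe how run_combiner or CombineGVCFs work internally. The function name `fill_from_reference_blocks` and the tuple layout are made up for illustration, and the sketch assumes that rows are sorted by position and that every non-missing entry carries an END.

```python
from typing import List, Optional, Tuple

# Hypothetical layout for illustration only (not a Hail data structure):
# an entry is (GT, END), or None where the combiner output shows NA.
Entry = Optional[Tuple[str, int]]
Row = Tuple[str, int, List[Entry]]  # (contig, position, one entry per sample)


def fill_from_reference_blocks(rows: List[Row]) -> List[Row]:
    """Fill missing entries from the last block of the same sample that
    still covers the locus. Assumes rows are sorted by contig and position."""
    n_samples = len(rows[0][2]) if rows else 0
    # Per sample: the most recent (contig, GT, END) seen so far.
    last_block: List[Optional[Tuple[str, str, int]]] = [None] * n_samples
    dense: List[Row] = []
    for contig, pos, entries in rows:
        filled: List[Entry] = []
        for i, entry in enumerate(entries):
            if entry is not None:
                gt, end = entry
                last_block[i] = (contig, gt, end)
                filled.append(entry)
                continue
            block = last_block[i]
            # Reuse the earlier block only if it is on the same contig
            # and its END reaches this position.
            if block is not None and block[0] == contig and block[2] >= pos:
                filled.append((block[1], block[2]))
            else:
                filled.append(None)
        dense.append((contig, pos, filled))
    return dense


# The three rows from the run_combiner output above (SAMP1, SAMP2).
sparse_rows: List[Row] = [
    ("chr1", 10001, [("0/0", 10004), ("0/0", 10019)]),
    ("chr1", 10005, [("0/0", 10005), None]),
    ("chr1", 10006, [("0/0", 10024), None]),
]

dense_rows = fill_from_reference_blocks(sparse_rows)
for contig, pos, entries in dense_rows:
    print(f"{contig}:{pos}", entries)
```

In this sketch SAMP2's block starting at chr1:10001 (END 10019) is carried forward to chr1:10005 and chr1:10006, so no entry is left missing.
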
Any help anyone could offer would be appreciated. Hail is very convenient in other ways, so I am hoping I can find a way around this problem.