Hi,
I’m trying to extract some SNPs from a set of Hail MatrixTable files and export them as PLINK bed/bim/fam files. It has worked fine for all the target autosomal SNPs, but when I try to extract a SNP on the X chromosome I get the following error:
Error summary: HailException: unphased_diploid_gt_index only supports ploidy == 2. Found |0.
I’ve found some help forum replies that seemed related and had some suggestions, including this post: “Error summary: HailException: Only support ploidy == 2 and unphased. Found 1|1”
I’ve attempted the “mt.annotate_entries(…)” solution recommended in that post, but I’m still getting the same error.
Any advice?
Thanks in advance!
Hi @pstraub,
export_plink only supports unphased diploid calls. The suggestion in that other post is the right idea, but it leaves haploid calls in place, whereas you need to recode them. I can’t advise on how to recode them, but you’ll want to do something like:
mt = mt.annotate_entries(GT=hl.case()
    .when(mt.GT.is_diploid(), mt.GT)
    .when(mt.GT.is_haploid(), hl.call(mt.GT[0], 0))
    .default(hl.null(hl.tcall))
)
Thank you! That code chunk did the trick.
One quick follow up:
The following code chunk will run without an error when haploid genotypes are present:
mt.annotate_entries(GT=hl.case()
    .when(mt.GT.is_diploid(), mt.GT)
    .when(mt.GT.is_haploid(), hl.call(mt.GT[0], 0))
    .default(hl.null(hl.tcall))
)
However, it always uses the reference allele as the second allele of a haploid genotype, which affects all of the males for something like the X chromosome. For males carrying the alternate allele on the X chromosome, this produces a ref/alt genotype when it should be either alt/alt or alt/missing (depending on how males are coded).
Instead, using something like this works:
mt = mt.annotate_entries(GT=hl.case()
    .when(mt.GT.is_diploid(), mt.GT)
    .when(mt.GT.is_haploid(), hl.call(mt.GT[0], mt.GT[0]))
    .default(hl.null(hl.tcall))
)
That’s it!
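For anyone comparing the two recodings, here is a small plain-Python sketch of what each does to a haploid call. It does not use Hail. Calls are modeled as tuples of allele indices (0 = ref, 1 = alt), and the helper names `pad_with_ref` and `duplicate_allele` are made up for illustration. The branches are meant to mirror the hl.case() logic above under those assumptions.

```python
from typing import Optional, Tuple

# A call is modeled as a tuple of allele indices (0 = ref, 1+ = alt).
# This is an illustrative stand-in, not Hail's Call type.
Call = Tuple[int, ...]


def pad_with_ref(gt: Optional[Call]) -> Optional[Call]:
    """First recoding: haploid calls get a reference allele appended."""
    if gt is None:
        return None
    if len(gt) == 2:
        return gt            # diploid calls pass through unchanged
    if len(gt) == 1:
        return (gt[0], 0)    # hemizygous alt becomes ref/alt
    return None              # any other ploidy is set to missing


def duplicate_allele(gt: Optional[Call]) -> Optional[Call]:
    """Second recoding: haploid calls are doubled into a homozygous call."""
    if gt is None:
        return None
    if len(gt) == 2:
        return gt
    if len(gt) == 1:
        return (gt[0], gt[0])  # hemizygous alt becomes alt/alt
    return None


male_ref: Call = (0,)
male_alt: Call = (1,)

print(pad_with_ref(male_ref), duplicate_allele(male_ref))
print(pad_with_ref(male_alt), duplicate_allele(male_alt))
```

Both recodings agree for a hemizygous reference call. They differ only for a hemizygous alternate call, which is exactly the case that showed up as a spurious heterozygote in the exported files.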