Hello all! If I have an entry of phased genotypes (GT), is it possible to split it into two different entries for each haplotype, within a matrixtable? Thank you!
Yes. mt.GT[0] and mt.GT[1] refer to the two alleles of a call, and if that call is phased these will refer to the two phased haplotypes.
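As a rough illustration of what indexing a phased call means, here is a plain-Python sketch (not Hail code). The helper name split_phased_gt is hypothetical, and it assumes a VCF-style diploid genotype string in which "|" marks a phased call:

```python
from typing import Tuple


def split_phased_gt(gt: str) -> Tuple[int, int]:
    """Split a VCF-style phased diploid genotype such as '0|1'
    into its two haplotype alleles (hypothetical helper, not part of Hail)."""
    if "|" not in gt:
        # An unphased call ('0/1') carries no haplotype order, so refuse it.
        raise ValueError(f"genotype {gt!r} is not phased")
    first, second = gt.split("|")
    return int(first), int(second)


# Index 0 is the allele on the first haplotype, index 1 the allele on the second.
hap0, hap1 = split_phased_gt("0|1")
print(hap0, hap1)  # 0 1
```

In Hail itself the call is already a structured value, so no string parsing is needed; indexing with [0] and [1] plays the role of this helper.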
Oh ok that makes sense! Is there a way to split it into two different fields?
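mt = mt.annotate_entries(fieldOne=mt.GT[0], fieldTwo=mt.GT[1])
You can change fieldOne and fieldTwo to any names you like. Note that annotate_entries returns a new MatrixTable rather than modifying mt in place, so assign the result.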
Thank you for your response @johnc1231. I have tried that command and got this response:
The GT is within the entries. Is there a way to resolve this?
You don’t need to call .entries() to access the entry fields. .entries() is an operation: it converts from a dense, efficient MatrixTable format to a huge, inefficient Table format. It’s useful in a very limited set of circumstances.
Try this instead:
new_mt = new_mt.annotate_entries(
    fieldOne=new_mt.mother_entry.PBT_GT[0],
    fieldTwo=new_mt.mother_entry.PBT_GT[1]
)
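To see why .entries() produces such a large table, here is a plain-Python sketch (again, not Hail code) of flattening a variants-by-samples grid into one record per entry. The variant and sample names are made up for illustration:

```python
# Hypothetical row keys (variants) and column keys (samples).
variants = ["chr1:100", "chr1:200", "chr1:300"]
samples = ["s1", "s2"]

# Dense layout: one list of genotypes per variant, in sample order.
dense = [
    ["0|1", "0|0"],
    ["1|1", "0|1"],
    ["0|0", "1|0"],
]

# Long layout: one record per (variant, sample) pair,
# so every record repeats its variant and sample keys.
long_rows = [
    {"variant": v, "sample": s, "GT": dense[i][j]}
    for i, v in enumerate(variants)
    for j, s in enumerate(samples)
]

print(len(long_rows))  # 6
```

The long form has one row per variant-sample pair, which is why it grows so quickly on a real dataset with many variants and samples.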
Oh okay, that makes sense! Sorry for this, a lot of Hail’s intricacies do not come naturally to me yet. Is there a way to export the haplotype information (plus the associated variant) to a CSV file so that I can work with it in Python?
Thank you sooo much for all your help.