Hi, I’m trying to split my MT generated by DRAGEN into a diallelic MT. I’m facing the issue that the AD field contains only a single entry for hom-ref samples, but non-hom-ref samples contain N entries where N is the number of alleles per site. Example:
| Locus | Alleles | Sample1 | Sample2 | Sample3 |
|---|---|---|---|---|
| chr1:10000 | ["A", "G", "C"] | 0/0:30 | 0/1:0,30,0 | 1/2:0,15,15 |
So what I would need is the following:
| Locus | Alleles | Sample1 | Sample2 | Sample3 |
|---|---|---|---|---|
| chr1:10000 | ["A", "G", "C"] | 0/0:30,0,0 | 0/1:0,30,0 | 1/2:0,15,15 |
I’ve tried to create a new AD field like this where each entry contains exactly N elements in the AD list:
mt_annot = mt.annotate_entries(AD = hl.if_else(mt.GT.is_hom_ref(), (mt.AD.append(0) for entry in range(mt.GT.n_alt_alleles())), mt.AD))
But I get the following error:
TypeError: 'Int32Expression' object cannot be interpreted as an integer
I’m struggling with the part where I need to add “0” as many times as there are n_alt_alleles. How could I achieve that?
As a general rule of thumb, Python loops and comprehensions almost never mix the way you want with Hail code. The way I would append n_alt_alleles zeroes is
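For context, the TypeError comes from Python's built-in `range()`, which needs a concrete integer, while `mt.GT.n_alt_alleles()` is a lazy Hail expression. Below is a minimal sketch of one way the padding could be written with Hail's own array functions; it is not necessarily the code the answer above had in mind. The helper names `pad_ad` and `pad_hom_ref_ad` are hypothetical, and the MatrixTable is assumed to have the usual `alleles`, `GT` and `AD` fields. Note that `mt.GT.n_alt_alleles()` evaluates to 0 for a hom-ref call, so the sketch sizes the padding from `hl.len(mt.alleles)` instead, which is an assumption about the intent.

```python
def pad_ad(ad, n_alleles):
    """Plain-Python reference: right-pad an AD list with zeros to n_alleles entries."""
    return list(ad) + [0] * (n_alleles - len(ad))


def pad_hom_ref_ad(mt):
    """Hail counterpart of pad_ad, applied only to hom-ref entries (hypothetical helper)."""
    # Imported here so the plain-Python reference above runs without Hail installed.
    import hail as hl

    # hl.range accepts a Hail expression, unlike Python's built-in range().
    n_missing = hl.len(mt.alleles) - hl.len(mt.AD)
    zeros = hl.range(n_missing).map(lambda _: hl.int32(0))
    return mt.annotate_entries(
        AD=hl.if_else(mt.GT.is_hom_ref(), mt.AD.extend(zeros), mt.AD)
    )


print(pad_ad([30], 3))        # hom-ref entry from the example -> [30, 0, 0]
print(pad_ad([0, 30, 0], 3))  # already full length, unchanged -> [0, 30, 0]
```

Usage would be something like `mt_annot = pad_hom_ref_ad(mt)`. The plain-Python function is only there to pin down the intended result per entry; the Hail function expresses the same padding without any Python-side loop.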