I’m trying to filter out variants with AF < 0.8 from a MatrixTable. I could of course use hl.variant_qc with mt.filter_rows to get this done, but the AF fields in my VCF only take the values 0, 0.5 and 1.0, which is not helpful for fine-grained filtering.
So I built a filter condition from the AD and DP values in the entry fields instead:
mt = mt.filter_entries(mt.AD[1]/mt.DP > 0.8)
However, when I look at the output, I see that NA has been assigned wherever the filter condition evaluated to False. See below:
mt.AD.show()
+---------------+------------+--------------+
| locus         | alleles    | 'test'.AD    |
+---------------+------------+--------------+
| locus<GRCh38> | array<str> | array<int32> |
+---------------+------------+--------------+
| chrX:22849    | ["A","G"]  | [0,22]       |
| chrX:26601    | ["G","T"]  | NA           |
| chrX:26883    | ["C","T"]  | [0,89]       |
| chrX:26987    | ["A","C"]  | NA           |
| chrX:27266    | ["C","G"]  | NA           |
+---------------+------------+--------------+
I’m not sure why I see NAs when I expected those variants to be removed from mt.
I also tried exporting mt as a VCF, and I still see the variants with AF < 0.8, but ./. now appears in their sample field.
I basically want to get rid of the variants that fail my filtering condition (mt.AD[1]/mt.DP > 0.8), but I’m not sure what I’m missing in my implementation.
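To make the difference concrete, here is a plain-Python sketch (not Hail code) of the behaviour I seem to be getting versus the behaviour I want. The records, the AD/DP numbers and the helper name passes are made up for illustration and are not taken from my VCF.

```python
# Toy stand-in for a single-sample table: one (locus, AD, DP) record per variant.
# All values here are hypothetical and only serve to illustrate the question.
variants = [
    ("chrX:22849", [0, 22], 22),
    ("chrX:26601", [10, 10], 20),
    ("chrX:26883", [0, 89], 89),
]


def passes(ad, dp, threshold=0.8):
    """Same condition as AD[1] / DP > 0.8, guarding against DP == 0."""
    return dp > 0 and ad[1] / dp > threshold


# What I seem to get: every row is kept, failing entries become missing (None).
entry_filtered = [
    (locus, ad if passes(ad, dp) else None) for locus, ad, dp in variants
]

# What I want: rows whose entry fails the condition are dropped entirely.
row_filtered = [(locus, ad) for locus, ad, dp in variants if passes(ad, dp)]
```

In the first list the failing variant is still present with a missing value, which matches the NA and ./. I am seeing; in the second list it is gone, which is the result I am after.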
Any thoughts would be appreciated!
Faizal