Are 0/0 genotypes counted as defined by hl.is_defined(mt.GT)? And are NA genotypes counted as missing by hl.is_missing(mt.GT)?


I’ve observed some "NA"s in the GT field, but the result of mt.aggregate_entries(hl.agg.count_where(hl.is_missing(mt.GT))) is 0. Is this the expected behavior? Which command should I use to count the "NA"s in the GT field of the MT?

Additionally, I’ve observed that mt.aggregate_entries(hl.agg.count_where(mt.GT.is_non_ref())) and mt.aggregate_entries(hl.agg.count_where(hl.is_defined(mt.GT))) return the same result. Is this the expected outcome? Does the call_rate in sample_qc include 0/0 genotypes in the numerator?

Thank you for your help!

The filter_entries documentation has a bit of information on how filtered entries are handled, which may be relevant to the NAs you are seeing. I am pretty sure 0/0 is called (as hom_ref), so it counts as defined, and NA is not called.