Transmission Disequilibrium Test help

Dear all,

I am trying to use the hl.transmission_disequilibrium_test() function to extract transmitted and untransmitted variants. As part of QC, I plan to apply some entry-level filters first:

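# Entry-level QC: keep only genotypes that pass every threshold.
mt = mt.filter_entries(
    (mt.AD[1] >= 3)      # at least 3 reads supporting the first alternate allele
    & (mt.DP >= 10)      # minimum read depth
    & (mt.DP < 1000)     # maximum read depth
    & (mt.GQ >= 20)      # minimum genotype quality
)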

Unfortunately, once I apply these filters, every row of the resulting TDT table is NA for t and u (and for chi_sq and p_value as well):

|locus|alleles|t|u|chi_sq|p_value|
|locus<GRCh37>|array<str>|int64|int64|float64|float64|
|1:861266|["G","C"]|NA|NA|NA|NA|
|1:861303|["G","C"]|NA|NA|NA|NA|
|1:865579|["C","T"]|NA|NA|NA|NA|

Without the entry-level filtering, t and u have integer values. Is there a recommended way to combine entry-level filtering with hl.transmission_disequilibrium_test()? Thank you!

As currently implemented, the test doesn’t support missing or filtered entry values, so filtered entries lead to the NA results you’re seeing. The implementation is rather short, but I’m not a geneticist and don’t know how it should be modified to handle missing data appropriately. You might try your hand at modifying it yourself.
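
For what it's worth, here is a minimal plain-Python sketch of one way the counting could treat missing data: skip any trio in which the child's or either parent's genotype is missing. This is not Hail's implementation and does not use the Hail API. The names trio_counts and tdt are made up for illustration, genotypes are encoded as alternate-allele counts (0, 1, 2, or None for a filtered entry), and only autosomal sites are covered. Dropping incomplete and Mendelian-inconsistent trios is an assumption here, and whether it is statistically appropriate for your data is something you would need to judge.

```python
import math
from typing import Iterable, Optional, Tuple

# Genotypes are alternate-allele counts: 0 (hom-ref), 1 (het), 2 (hom-var),
# or None when the entry was filtered out / is missing. (Illustrative encoding.)
Genotype = Optional[int]


def trio_counts(kid: Genotype, dad: Genotype, mom: Genotype) -> Optional[Tuple[int, int]]:
    """Return (t, u) for one autosomal trio, or None if the trio is skipped."""
    # Assumption: a trio with any missing genotype contributes nothing.
    if kid is None or dad is None or mom is None:
        return None
    n_het = int(dad == 1) + int(mom == 1)
    if n_het == 0:
        return None  # no heterozygous parent, so the trio is uninformative
    n_hom_var = int(dad == 2) + int(mom == 2)
    # Each hom-var parent must transmit one alternate allele; any remaining
    # alternate alleles in the child came from heterozygous parents.
    t = kid - n_hom_var
    if t < 0 or t > n_het:
        return None  # Mendelian-inconsistent trio, skipped in this sketch
    return t, n_het - t


def tdt(trios: Iterable[Tuple[Genotype, Genotype, Genotype]]):
    """Sum t and u over trios and compute the 1-df chi-squared statistic."""
    t_total = u_total = 0
    for kid, dad, mom in trios:
        counts = trio_counts(kid, dad, mom)
        if counts is None:
            continue
        t_total += counts[0]
        u_total += counts[1]
    if t_total + u_total == 0:
        return t_total, u_total, None, None  # no informative trios
    chi_sq = (t_total - u_total) ** 2 / (t_total + u_total)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return t_total, u_total, chi_sq, p_value


# (kid, dad, mom) trios at one variant; two of them have a filtered entry.
trios = [(1, 0, 1), (0, 1, 1), (None, 1, 1), (2, 1, 2), (1, 1, None), (1, 1, 0)]
t, u, chi_sq, p_value = tdt(trios)
print(t, u, chi_sq, p_value)
```

In this toy input the two trios with a None genotype are ignored and the remaining four contribute to t and u. Porting the same rule to Hail would mean changing how the test's own implementation handles missing entries, which is not attempted here.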