Hello, I’m trying to understand how Hail treats the missing calls.
With mt.GT.show I could see the missing calls as NA.
But when using hl.is_missing, they are still NA instead of True.
Same when using
My question is, why are these calls ‘ignored’?
Hey @akhattab,
Yeah, this is an unfortunate and confusing part of the Hail interface, particularly as it relates to
NA values are not missing but “filtered”. Filtered entries are produced with filter_entries. This is the entry analogue to row or column filtering, but unlike row or column filtering, we can’t simply remove the entire row or column (because there are other unfiltered entries in the same row or column).
You can convert filtered entries to missing entries with mt.unfilter_entries(). You can also compute statistics about the number of filtered entries.
There is no way to test for the filtered status of a particular field, because it’s not the field that is filtered; it is the entry of the matrix that is filtered.
- “Filtered” entries are like filtered rows or columns: we want to pretend that we didn’t even observe them. They’re not part of our dataset anymore.
- “Missing” values are data we measured but whose value we’re uncertain of. We want to consider them as part of our analysis, but as a separate third category: something we know exists but can’t observe directly.
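To make the distinction in the two points above concrete, here is a toy model in plain Python. It is not how Hail is implemented: the FILTERED sentinel and the call_rate helper are hypothetical names invented for illustration. Missing values (None) stay in the denominator, while filtered entries are dropped as if never observed.

```python
# Hypothetical sentinel marking an entry that was filtered out.
FILTERED = object()


def call_rate(entries):
    """Fraction of unfiltered entries that have a defined (non-missing) value."""
    # Filtered entries are treated as if they were never part of the dataset.
    observed = [e for e in entries if e is not FILTERED]
    if not observed:
        return None  # nothing left to compute a rate over
    # Missing entries (None) still count as observed, just not as defined.
    defined = [e for e in observed if e is not None]
    return len(defined) / len(observed)


row = ["0/1", None, FILTERED, "1/1"]
rate = call_rate(row)  # 2 defined out of 3 unfiltered entries
```

If the filtered entry were converted to a missing one instead, the same row would give 2 defined out of 4 entries.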
Great. Thank you so much!