Remove rows with "NA" in any sample

From a matrix table, how do I remove rows (variants) with a “NA” in any one of the samples? Thank you!

Hi @CuriousGeneticist!

We can take your sentence piece-by-piece from English to Hail!

From a matrix table,

This tells us that we’ll be using MatrixTable methods like annotate_rows, filter_rows, annotate_cols, etc. instead of the Table methods annotate and filter.

remove rows (variants) with

Removal of rows or columns in Hail is called “filtering”, so we want filter_rows:

mt = mt.filter_rows( some_condition )

a “NA” in any one of the samples

This has three parts: “a ‘NA’”, “any one of”, and “the samples”, which correspond to:

  1. hl.is_missing( some_thing_that_can_be_missing )
  2. hl.agg.any( some_other_condition )
  3. mt.GT, mt.AD, … (the various genotype fields which vary per-sample)

I will assume you want to look for missing genotypes (mt.GT).

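mt = mt.filter_rows(
    hl.agg.any(
        hl.is_missing(mt.GT)
    ),
    keep=False  # remove, rather than keep, the rows where the condition is true
)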
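To make the logic of the three pieces concrete, here is a minimal sketch in plain Python (not Hail itself). It assumes None plays the role of a missing genotype; the variant IDs and the helper names is_missing, any_missing and filter_rows are hypothetical stand-ins chosen to mirror the Hail names. The thing to notice is that a filter keeps the rows where the condition is true, so removing the rows that contain a missing genotype means passing keep=False (or negating the condition).

```python
# Plain-Python model of the row filter; this is not Hail code.
# Each row maps a hypothetical variant ID to its per-sample genotype calls,
# and None stands in for a missing ("NA") genotype.
rows = {
    "chr1:100": ["0/0", "0/1", "1/1"],
    "chr1:200": ["0/1", None, "0/0"],
    "chr1:300": [None, None, None],
    "chr1:400": ["1/1", "1/1", "0/1"],
}


def is_missing(value):
    """Analogue of hl.is_missing: true when the value is absent."""
    return value is None


def any_missing(genotypes):
    """Analogue of hl.agg.any(hl.is_missing(mt.GT)) evaluated for one row."""
    return any(is_missing(gt) for gt in genotypes)


def filter_rows(table, condition, keep=True):
    """Keep rows where the condition is true, or drop them when keep=False."""
    return {key: gts for key, gts in table.items() if condition(gts) == keep}


# Default behaviour: only the rows that contain a missing genotype survive.
kept_by_default = filter_rows(rows, any_missing)

# With keep=False: the rows that contain a missing genotype are removed.
removed_missing = filter_rows(rows, any_missing, keep=False)

print(sorted(kept_by_default))
print(sorted(removed_missing))
```

In this toy data, only the two fully called variants remain after the keep=False filter.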

An important caveat! Hail has both “missing” entry fields and “filtered entries”. Filtered entries are created by filter_entries. You never get filtered entries from import_vcf. You can check if you have any filtered entries by running this command:

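mt = mt.compute_entry_filter_stats()

By default this adds a row field, entry_stats_row, and a column field, entry_stats_col, each recording n_filtered, n_remaining, and fraction_filtered.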

Thank you so much for your response, Danking. I tried to run this code:
mt = mt.filter_rows(
    hl.agg.any(
        hl.is_missing(mt.GT)
    )
)

But the NA rows are still there. Does that mean that it is a “filtered entry”?

But the NA rows are still there

What is the query you are running to see the “NA rows”? Knowing that would help us answer.

I used
and I still saw the “NA” rows.

Can you copy and paste or screenshot an example row? Also, what is printed by my suggestion above for checking for filtered entries?

Actually, I realized that it was an issue with an upstream query. Thanks so much for your help!