What is the right way to query by RSID?

Good day,

I am testing Hail and am still very new to it. I have looked through the docs and came to the conclusion that I should be querying the matrix with the following request:

mtx.filter_rows(mtx.rsid=="rs11111111").show()

The idea is to find the row/entry based on RSID. The matrix has an rsid row field, and it contains the RSID with that number. Nonetheless, I am getting None. To check some of the IDs, I printed them via mtx.annotate_rows(), and that RSID is there.

Nonetheless, I’m getting ‘None’ when trying to filter rows. Why? What am I doing wrong?

Hey annalisasnow,

The example you wrote:

mtx.filter_rows(mtx.rsid=="rs11111111").show()

says “filter my matrix table to rsid “rs11111111”, then print it”. Are you seeing no print out?

Note that if you do: foo = mtx.filter_rows(mtx.rsid=="rs11111111").show(), then foo will be None since .show() doesn’t return a value, it just prints.

annotate_rows() doesn’t print, so I’m confused what you mean by using annotate_rows to print. It is used to add a new field. These are potentially helpful if you haven’t seen them: Hail | Cheat Sheets

I see ‘None’. That’s what I get when using the whole statement as:

print("\n\n\n RSID \n\n\n", mtx.filter_rows(mtx.rsid=="rs11111111").annotate_rows().show())

If I do no rows annotation:

print(mtx.filter_rows(mtx.rsid=="rs11111111").show())

I still get None

Yeah, you shouldn’t use print. print is a function that prints its input. .show() is a method that returns None but has the side effect of printing the table. You should just be doing:

mtx.filter_rows(mtx.rsid=="rs11111111").show()

no print at all.
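
To see why the extra None appears, here is a minimal sketch in plain Python that runs without Hail. FakeTable is a hypothetical stand-in written only for this illustration; it is not a Hail class. It just mimics a show() method that prints and returns nothing:

```python
class FakeTable:
    """Hypothetical stand-in for a table object; not part of Hail."""

    def __init__(self, rows):
        self.rows = rows

    def show(self):
        # Side effect only: print each row.
        # There is no return statement, so the call evaluates to None.
        for row in self.rows:
            print(row)


table = FakeTable([{"rsid": "rs11111111"}])

result = table.show()  # prints the row
print(result)          # prints None, because show() returned nothing
```

Wrapping .show() in print() behaves the same way: the table is printed first, then print() prints the None that .show() returned. If you want to keep working with the filtered matrix, assign the result of filter_rows(...) to a variable first and call .show() on that variable on a separate line.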

Thx, got it!