Look up single row in table or matrix table

Hi folks,

I have what must be a very basic question: how does one look up the information in a Hail Table or MatrixTable corresponding to a single column (sample) or row (variant)?

For example, in this MatrixTable, I’d like to see the information for entry “1000013” (or look up all the information corresponding to one of the columns):

Or, in this Table, I’d like to see the information for “chr1:17379:G:A” (or maybe see what all the gene_symbol entries are):

Based on this post, I wonder if what I’m trying to do is actually more difficult than I’d expect?

Best,
Jeremy

Hey @jbchang ,

Hail focuses on bulk processing, but if you want to access a particular column, filter to it:

mt.filter_cols(mt.s == "1000013").sample_qc.show()

or a particular row of a table:

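t.filter(
    # Assumes the table is on GRCh38; "chr1" is not a contig in Hail's default GRCh37.
    (t.locus == hl.locus("chr1", 17379, reference_genome="GRCh38"))
    & (t.alleles == ["G", "A"])
).csq.show()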

It’s important to note that these operations are only fast if mt comes directly from read_matrix_table or t comes directly from read_table. If you’re importing from a foreign format or running a lot of complex operations before the filter, all of that import and upstream work has to run before the filter can return anything!

For example, Hail doesn’t read VCF index files, so it will scan the entire VCF to find the single row you want. Hail MatrixTables, by contrast, have built-in indices, which Hail uses to quickly access the row in question.

Thank you, @danking !