Is there a way to load a long-format file into a Hail MatrixTable?

I have a file in the long format, that is

condition1.txt

GENE SAMPLE VALUE
A    S1     10.6
B    S2     20.3

I would like to load this into a Hail MatrixTable where GENE is the row field, SAMPLE is the column field, and VALUE is an entry field, with this structure:

----------------------------------------
Global fields:
    None
----------------------------------------
Column fields:
    'SAMPLE': str
----------------------------------------
Row fields:
    'GENE': str
----------------------------------------
Entry fields:
    'condition1': float64
----------------------------------------
Column key: ['SAMPLE']
Row key: ['GENE']
----------------------------------------

Is there a way to import this into a Hail MT?

Also, how would I name or rename the entry field and column field during import?

The docs don’t seem to cover this (Hail | Import / Export). I think I could use mt.rename() afterwards, but it seems like something you would want to do at import.

In addition, is it possible to add a new entry field to an existing MT?
So, if I had an additional file

condition2.txt

GENE SAMPLE VALUE
A    S1     6.2
B    S2     8.1

Could I load it as an additional entry field, so that I would have

----------------------------------------
Global fields:
    None
----------------------------------------
Column fields:
    'SAMPLE': str
----------------------------------------
Row fields:
    'GENE': str
----------------------------------------
Entry fields:
    'condition1': float64
    'condition2': float64
----------------------------------------
Column key: ['SAMPLE']
Row key: ['GENE']
----------------------------------------

Thanks!

Check out Table.to_matrix_table!
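
For anyone landing here later, a minimal sketch of how that could look for the condition1.txt file in the question. The helper name load_long_format is made up for illustration, and the r'\s+' delimiter is an assumption based on the space-aligned example (import_table expects tabs by default). As far as I know import_table has no rename option, so the rename happens on the Table right after import and before the pivot.

```python
def load_long_format(path, entry_name):
    """Load a GENE/SAMPLE/VALUE long-format file as a Hail MatrixTable.

    Hypothetical helper, not part of Hail.
    """
    # Imported inside the function so that defining it does not require Hail.
    import hail as hl

    # The delimiter is treated as a regex; r'\s+' is an assumption that fits
    # the space-aligned example file. VALUE is read as float64 instead of str.
    ht = hl.import_table(path, delimiter=r'\s+', types={'VALUE': hl.tfloat64})

    # Rename on the Table, before pivoting, so the entry field gets its
    # final name (e.g. 'condition1').
    ht = ht.rename({'VALUE': entry_name})

    # Fields that are not keys or row/column fields become entry fields.
    return ht.to_matrix_table(row_key=['GENE'], col_key=['SAMPLE'])
```

Calling load_long_format('condition1.txt', 'condition1') and then describe() on the result should give the structure shown in the question. To rename the column field as well, add it to the same rename mapping and pass the new name as col_key.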

How about the second part of the question, about how to combine or update or add a new entry to an existing MT? Is that something to_matrix_table can take care of? Or is this done through annotate_entries of some flavor?

You can use annotate_entries with Hail’s “join” syntax to combine two matrix tables:

In [10]: import hail as hl 
    ...: a = hl.utils.range_matrix_table(3, 3) 
    ...: a = a.annotate_entries(e1 = a.row_idx * a.col_idx) 
    ...: b = hl.utils.range_matrix_table(3, 3) 
    ...: b = b.annotate_entries(e2 = b.row_idx + b.col_idx) 
    ...:  
    ...: c = a.annotate_entries(**b[a.row_idx, a.col_idx]) 
    ...: c.show()                                                                                                                                              
+---------+-------+-------+-------+-------+-------+-------+
| row_idx |  0.e1 |  0.e2 |  1.e1 |  1.e2 |  2.e1 |  2.e2 |
+---------+-------+-------+-------+-------+-------+-------+
|   int32 | int32 | int32 | int32 | int32 | int32 | int32 |
+---------+-------+-------+-------+-------+-------+-------+
|       0 |     0 |     0 |     0 |     1 |     0 |     2 |
|       1 |     0 |     1 |     1 |     2 |     2 |     3 |
|       2 |     0 |     2 |     2 |     3 |     4 |     4 |
+---------+-------+-------+-------+-------+-------+-------+
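
Applied to the GENE/SAMPLE case from the question, the same join could be sketched roughly as below. The helper name add_entry_field is hypothetical, and it assumes both matrix tables share the same row key and column key (here GENE and SAMPLE). Indexing with mt.row_key and mt.col_key avoids hard-coding the key field names.

```python
def add_entry_field(mt, other, field):
    """Copy one entry field from `other` into `mt`, matching on row and column keys.

    Hypothetical helper, not part of Hail.
    """
    # other[row_key, col_key] looks up the entry of `other` that has the same
    # row key and column key as the current entry of `mt`.
    joined = other[mt.row_key, mt.col_key]

    # Add only the requested field under its existing name.
    return mt.annotate_entries(**{field: joined[field]})
```

With mt1 holding condition1 and mt2 holding condition2, add_entry_field(mt1, mt2, 'condition2') should give a matrix table with both entry fields. Gene/sample pairs that have no match in the second table should come back as missing for the new field.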
