Adding sample labels to a relationship matrix

JanErik · April 13, 2019, 8:36pm

I have created a Hail matrix table from a VCF with 591 samples, and I have had good success using hl.realized_relationship_matrix. It would be helpful, however, to have row and column labels on the resulting matrix. The solution I came up with was to convert the block matrix to a NumPy ndarray and then build a pandas DataFrame from it, using the list of samples as row and column names:

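import hail as hl
import pandas as pd

# Compute the RRM as a BlockMatrix, pull it into local memory, and label it with sample IDs
rrm = hl.realized_relationship_matrix(mt.GT)
rrm_npy = rrm.to_numpy()
samples = mt.s.collect()
rrm_panda = pd.DataFrame(rrm_npy, index=samples, columns=samples)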

My question: does this seem like a robust solution? What is unclear to me is whether the block matrix indices are guaranteed to match the order of the list returned by mt.s.collect().
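Whatever the ordering guarantee turns out to be, one cheap safeguard is to check that the label list has exactly one unique ID per matrix row before attaching it. Here is a minimal sketch in plain Python, with no Hail or pandas involved. The helper name label_matrix and the toy data are made up for illustration, and the check cannot detect labels that are complete but in the wrong order.

```python
def label_matrix(matrix, samples):
    """Attach sample IDs to a square matrix given as a list of rows.

    Hypothetical helper: raises if the labels cannot line up one-to-one
    with the matrix rows and columns.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix is not square")
    if len(samples) != n:
        raise ValueError(f"{len(samples)} labels for a {n} x {n} matrix")
    if len(set(samples)) != n:
        raise ValueError("sample IDs are not unique")
    # Position i in `samples` labels both row i and column i.
    return {
        s1: {s2: matrix[i][j] for j, s2 in enumerate(samples)}
        for i, s1 in enumerate(samples)
    }


# Toy 3 x 3 symmetric matrix, a stand-in for the converted block matrix.
toy_rrm = [
    [1.0, 0.2, -0.1],
    [0.2, 0.9, 0.3],
    [-0.1, 0.3, 1.1],
]
toy_samples = ["S1", "S2", "S3"]  # stand-in for the collected sample IDs

labeled = label_matrix(toy_rrm, toy_samples)
print(labeled["S2"]["S3"])  # 0.3
```

Duplicate sample IDs are worth catching in particular, since a lookup by label would otherwise be ambiguous.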

Kudos to the Hail team – it is awesome.

tpoterba · April 14, 2019, 12:39am

This is a topic that has come up before, I think. I suppose the answer will depend on what you want to do downstream. Your code looks fine, but it won’t scale, and interconverting between Hail objects and python objects can be very slow.

One of the natural things to do may be to convert it to a MatrixTable using BlockMatrix.to_matrix_table_row_major.

Once it is again a matrix table, you can put the keys back in:

rrm = hl.realized_relationship_matrix(mt.GT)
rrm_mt = rrm.to_matrix_table_row_major()

sample_ids = hl.literal(mt.s.collect())
rrm_mt = rrm_mt.key_rows_by(s1 = sample_ids[rrm_mt.row_idx])
rrm_mt = rrm_mt.key_cols_by(s2 = sample_ids[rrm_mt.col_idx])

JanErik · April 14, 2019, 2:59am

Cool.

To get it to work for me, I needed to cast the MatrixTable indices to int32:

rrm_mt = rrm_mt.key_rows_by(s1 = sample_ids[hl.int32(rrm_mt.row_idx)])
rrm_mt = rrm_mt.key_cols_by(s2 = sample_ids[hl.int32(rrm_mt.col_idx)])
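For anyone who wants to see what the lookup does without running Hail, here is a rough plain-Python analogue: each entry carries integer row and column indices, and both are swapped for sample IDs by indexing into the sample list. The toy data and the "value" field name are made up for illustration, and the sketch assumes the sample list is in the same order as the matrix.

```python
toy_samples = ["S1", "S2", "S3"]  # assumed to be in matrix order

# Toy long-format entries: one record per (row, column) cell.
toy_entries = [
    {"row_idx": 0, "col_idx": 1, "value": 0.2},
    {"row_idx": 1, "col_idx": 2, "value": 0.3},
    {"row_idx": 2, "col_idx": 0, "value": -0.1},
]

# Replace each integer index with the sample ID at that position.
relabeled = {
    (toy_samples[e["row_idx"]], toy_samples[e["col_idx"]]): e["value"]
    for e in toy_entries
}

print(relabeled[("S2", "S3")])  # 0.3
```

The result is keyed by a pair of sample IDs, which mirrors having s1 as the row key and s2 as the column key.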

tpoterba · April 14, 2019, 1:55pm

Ah, yes! That always comes up. It’s a bit annoying, but better than either doing an unsafe cast or an expensive check automatically.
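To make the tradeoff concrete, here is a small sketch in plain Python of the two automatic options for narrowing a wide integer to 32 bits. It illustrates the general idea only and is not a description of how Hail implements hl.int32; both function names are made up.

```python
import ctypes

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def unchecked_to_int32(x):
    """Keep only the low 32 bits, like a C-style narrowing cast."""
    return ctypes.c_int32(x).value


def checked_to_int32(x):
    """Refuse values that do not fit, at the cost of a test per value."""
    if not INT32_MIN <= x <= INT32_MAX:
        raise OverflowError(f"{x} does not fit in int32")
    return x


print(unchecked_to_int32(590))    # 590: a small index survives unchanged
print(unchecked_to_int32(2**31))  # -2147483648: silently wraps around
print(checked_to_int32(590))      # 590
```

With a few hundred samples every index fits comfortably, so an explicit cast is harmless here. The unchecked version only goes wrong for indices past 2**31 - 1, and the checked version pays for a comparison on every value.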
