I have a question about a strange result I got after densifying a matrix table. When I compared the on-disk sizes of the densified and sparse matrix tables, the densified matrix table was smaller than the sparse one (12,975 MB for the densified matrix table and 13,273 MB for the sparse matrix table). This is the command I used to generate the densified matrix table:
mt = hl.read_matrix_table(sparse_matrix_table)
mt = hl.experimental.densify(mt)
mt.write(mt_path)
No repartitioning was done on either of the matrix tables, and they both have 2,586 partitions (20 samples with 167M variants).
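In case it helps narrow this down, here is a minimal sketch of how the two sizes could be broken down by top-level component of each written directory. It assumes both tables sit on a local filesystem (not cloud storage), and the helper names `dir_size_bytes` and `size_by_component` are made up for illustration; they are not part of Hail.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def size_by_component(table_path):
    """Map each top-level entry of `table_path` to its size in bytes.

    Assumption: `table_path` is a local directory, such as one
    produced by `mt.write(...)` on a local filesystem.
    """
    sizes = {}
    for entry in sorted(os.listdir(table_path)):
        full = os.path.join(table_path, entry)
        if os.path.isdir(full):
            sizes[entry] = dir_size_bytes(full)
        else:
            sizes[entry] = os.path.getsize(full)
    return sizes
```

Calling `size_by_component(sparse_matrix_table)` and `size_by_component(mt_path)` and comparing the two dictionaries should show which part of the output accounts for the difference.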
Is there any reason why the sparse matrix table would be larger than the densified one? It seems counter-intuitive to me, but perhaps there’s some explanation I’m missing.