Change reference_genome of a MatrixTable or of parse_variant output

Hello,

I am trying to change the reference genome of a MatrixTable. Something like:

mt = hl.read_matrix_table(path)  # suppose it was saved with reference_genome=None
mt.set_reference_genome('GRCh37')

or, for a Table:

ht = hl.Table.from_pandas(var)
ht = ht.key_by(**hl.parse_variant(ht.chrom + hl.literal(':') + hl.str(ht.pos) + hl.literal(':') + ht.ref + hl.literal(':') + ht.alt))
ht.set_reference_genome(None)

That is, I would like to end up with reference_genome=None here.
Is this possible?

After that, I want to run:

ht = hl.MatrixTable.from_rows_table(ht)
result = mt.filter_rows(~hl.is_missing(ht.index_rows(mt['locus'], mt['alleles'])))

but I get an error saying that both loci must be of the same type.

Thanks in advance!

Best,

If the reference genome differs between the two MatrixTables:

result = mt.filter_rows(~hl.is_missing(ht.index_rows(hl.locus(mt.locus.contig, mt.locus.position), mt['alleles'])))

This works, but it seems to be slow.
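
For context on the error: in Hail the reference genome is part of the locus type, so a locus typed under one reference (or under none) cannot be used to index keys typed under another. Rebuilding the locus from its contig and position only changes the type tag; it does not lift coordinates between assemblies. A rough plain-Python sketch of that idea follows. It is not Hail code, and `Locus`, `lookup` and `retag` are hypothetical names made up for the illustration:

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Locus:
    contig: str
    position: int
    # Stands in for the part of the locus type that must agree on both sides.
    reference_genome: Optional[str]


def lookup(key, table_keys):
    """Return True if key is among table_keys; refuse mismatched references."""
    for other in table_keys:
        if other.reference_genome != key.reference_genome:
            raise TypeError(
                f"locus types differ: {key.reference_genome} vs {other.reference_genome}"
            )
    return key in table_keys


def retag(locus, reference_genome):
    """Rebuild the locus from contig and position under another reference."""
    return replace(locus, reference_genome=reference_genome)


table_keys = {Locus("1", 100, "GRCh37")}
query = Locus("1", 100, None)

# lookup(query, table_keys) would raise TypeError: the two sides disagree.
print(lookup(retag(query, "GRCh37"), table_keys))
```
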

I think that the syntax here could be causing problems in the Hail optimizer. There’s a simpler way to do this, which I believe will perform better:

result = mt.filter_rows(~hl.is_missing(ht.index(mt['locus'], mt['alleles'])))

or equivalent, but shorter:

result = mt.filter_rows(~hl.is_missing(ht.index(mt.row_key)))

or equivalent, but even shorter:

result = mt.semi_join_rows(ht)
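
For intuition, all three forms keep exactly those rows of mt whose row key (locus, alleles) also appears as a key of ht. A minimal plain-Python sketch of that semantics, assuming small in-memory data; it does not use Hail, and the function and variable names here are made up for the illustration:

```python
def semi_join_rows(mt_rows, ht_keys):
    """Keep the rows whose (locus, alleles) key is present in ht_keys."""
    wanted = set(ht_keys)
    return [
        row for row in mt_rows
        if (row["locus"], tuple(row["alleles"])) in wanted
    ]


# Toy stand-ins for the rows of mt and the keys of ht.
mt_rows = [
    {"locus": ("1", 100), "alleles": ["A", "T"]},
    {"locus": ("1", 200), "alleles": ["G", "C"]},
    {"locus": ("2", 300), "alleles": ["T", "TA"]},
]
ht_keys = [(("1", 100), ("A", "T")), (("2", 300), ("T", "TA"))]

kept = semi_join_rows(mt_rows, ht_keys)
print([row["locus"] for row in kept])
```

Only membership of the key matters; no fields from ht are carried over into the result.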