Extract sequence from reference genome

Is there a way to extract a slice of the Hail reference genome?

rg = hl.get_reference('GRCh38')
seq = rg.seq('chr19:44905796-44909393')

For SV calls, the ref and alt are often not the actual sequence but placeholders such as DEL, INS, or DUP, together with the SVLEN. It would be useful to be able to update the ref and alt with the actual sequence.

John

There is! It's a method on LocusExpression called sequence_context. This requires you to add a sequence to the reference genome first:

https://hail.is/docs/0.2/genetics/hail.genetics.ReferenceGenome.html#hail.genetics.ReferenceGenome.add_sequence

Then you can use sequence context:

https://hail.is/docs/0.2/hail.expr.LocusExpression.html#hail.expr.LocusExpression.sequence_context

Here it would look like:

seq = hl.eval(hl.locus('chr19', 44905796, reference_genome='GRCh38').sequence_context(before=0, after=...))
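Putting the two steps together, here is a rough sketch of how a region string like the one in the question could be turned into a sequence_context call. The helper names region_to_context_args and fetch_region, and the fasta_path / index_path arguments, are made up for illustration and are not part of Hail. It also assumes the region is 1-based and inclusive, so that after = end - start covers the whole range; check that against your own coordinates.

```python
import re


def region_to_context_args(region):
    """Split 'contig:start-end' (assumed 1-based, inclusive) into (contig, start, after)."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", region.replace(",", ""))
    if match is None:
        raise ValueError(f"expected 'contig:start-end', got {region!r}")
    contig = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if end < start:
        raise ValueError("end must not be smaller than start")
    # sequence_context returns the base at the locus plus `after` following bases,
    # so an inclusive range needs after = end - start.
    return contig, start, end - start


def fetch_region(region, fasta_path, index_path, genome="GRCh38"):
    """Hypothetical wrapper: load the FASTA once, then evaluate sequence_context."""
    import hail as hl  # imported here so the parser above works without Hail

    rg = hl.get_reference(genome)
    if not rg.has_sequence():
        # fasta_path and index_path are placeholders for your own FASTA and .fai files
        rg.add_sequence(fasta_path, index_path)
    contig, start, after = region_to_context_args(region)
    locus = hl.locus(contig, start, reference_genome=rg)
    return hl.eval(locus.sequence_context(before=0, after=after))
```

With your own FASTA and index paths filled in, fetch_region('chr19:44905796-44909393', fasta_path, index_path) should return the reference bases for that range as a Python string.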

Excellent! I will give it a try.