Left-alignment, normalization, splitting multiallelics

Hi!
Is there any functionality in Hail that would mimic the left-alignment as implemented in https://samtools.github.io/bcftools/bcftools.html#norm ?
If I understand correctly, it does not magically happen during hl.split_multi_hts?

That is correct. split_multi is closer to vt’s decompose, but no normalization happens. This is hard to do without a loop, which wasn’t available when these functions were implemented (not sure if the new loop functionality now would allow this? @tpoterba may have a better idea).
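
To illustrate why a loop is needed, here is a minimal pure-Python sketch of the usual left-align-and-trim procedure for a single biallelic variant. It is not Hail or bcftools code: the function name `left_align`, the 0-based `pos`, and passing the reference as a plain string are all assumptions made for the example.

```python
from typing import Tuple


def left_align(ref_seq: str, pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Left-align one biallelic variant against ref_seq.

    pos is a 0-based index into ref_seq (hypothetical convention for this sketch).
    """
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base, even if that empties one allele.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both alleles one base to the left.
        if not ref or not alt:
            if pos == 0:
                raise ValueError("cannot extend left of the start of the reference")
            pos -= 1
            base = ref_seq[pos]
            ref, alt = base + ref, base + alt
            changed = True
    # Finally trim shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# A CA deletion written at the right end of a CA repeat moves to the left end.
print(left_align("GGGCACACACAGGG", 8, "ACA", "A"))
```

The number of iterations depends on the length of the repeat the variant sits in, which is why this cannot be written as a fixed sequence of expression steps.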

Citing @tpoterba from another topic above.
So some normalisation does actually magically happen during split_multi_hts? Does anyone know: is there some room for ambiguity, or should split_multi_hts produce the same alleles as bcftools norm -m for any input collection of alleles?

Hail doesn’t re-align to the reference (bcftools might) so Hail’s implementation is as simple as truncating identical suffixes off the ref and alt (I think).
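
As a rough illustration of the suffix truncation described here, this is a plain-Python sketch rather than Hail's actual implementation. The helper names `trim_shared_suffix` and `split_and_trim` are made up for the example, and it assumes each allele must keep at least one base.

```python
from typing import List, Tuple


def trim_shared_suffix(ref: str, alt: str) -> Tuple[str, str]:
    """Drop trailing bases shared by ref and alt, keeping at least one base in each."""
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    return ref, alt


def split_and_trim(ref: str, alts: List[str]) -> List[Tuple[str, str]]:
    """Split a multiallelic site into biallelic (ref, alt) pairs and trim each pair."""
    return [trim_shared_suffix(ref, alt) for alt in alts]


# One multiallelic site with a deletion and an insertion.
print(split_and_trim("ACTT", ["ATT", "ACTTT"]))
```

Because nothing here looks at the reference sequence, a variant inside a repeat stays wherever the caller placed it, so the output can differ from a tool that also left-aligns.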
