Hey Hail team,
I have a question about checking overlapping intervals in a sparse MatrixTable. I want to use the number of bases with coverage in the sparse MT as a proxy for call rate, to avoid densification. I'm hoping to define call rate as n_called / total_bases, where n_called is the number of bases in reference blocks plus the number of non-ref sites per sample. total_bases in this case is just the sum of the interval lengths (the intervals are stored in a table).
I'd like to count only bases that fall within certain intervals toward n_called. However, a reference block can span multiple intervals. Is there a way to add an annotation giving the number of bases in each reference block that fall within the specified intervals?
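To make the definition concrete, here is a rough pure-Python sketch of the per-block count I'm after. This is not Hail code, just the arithmetic. The names bases_in_intervals and call_rate are made up for illustration, and it assumes a single contig, intervals that are half-open [start, end) and do not overlap each other, and reference blocks given as inclusive (start, END) pairs, with a non-ref site treated as a one-base block.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) intervals on a single contig,
# with no overlap between intervals.
Interval = Tuple[int, int]
# Assumption: blocks are inclusive (start, END); a non-ref site is (pos, pos).
Block = Tuple[int, int]


def bases_in_intervals(block_start: int, block_end: int,
                       intervals: List[Interval]) -> int:
    """Count the bases of one inclusive block that fall inside the intervals."""
    lo, hi = block_start, block_end + 1  # convert the block to half-open
    total = 0
    for start, end in intervals:
        # Overlap length of [lo, hi) with [start, end); zero if disjoint.
        total += max(0, min(hi, end) - max(lo, start))
    return total


def call_rate(blocks: List[Block], intervals: List[Interval]) -> float:
    """n_called / total_bases for one sample (hypothetical helper)."""
    total_bases = sum(end - start for start, end in intervals)
    n_called = sum(bases_in_intervals(s, e, intervals) for s, e in blocks)
    return n_called / total_bases


# Toy data: one block spanning two intervals, plus one non-ref site.
intervals = [(100, 200), (300, 400)]
blocks = [(150, 349), (380, 380)]
print(bases_in_intervals(150, 349, intervals))
print(call_rate(blocks, intervals))
```

The first block here straddles both intervals, which is exactly the case I'm not sure how to express as an annotation on the sparse MT.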