Filter genotypes based on other genotypes


One QC step I’ve started doing is excluding indel LOF calls from my analysis if they’re within 10bp of another indel in the same sample. I would love a way to do this filter step in Hail, but I don’t think I currently can.

Edit: I realized this is a genotype-level issue, not a variant-level one. It would be something like a filtergenotypes call, since I need to access sample information as well as variant information, but I also need to be able to compare genotypes to one another rather than filtering strictly on a single call’s properties.
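
To make the logic concrete, here is a minimal plain-Python sketch of the filter I have in mind. It is not Hail code and does not use any Hail API. The function name `flag_clustered_indels`, the tuple layout of `calls`, and the choices to measure distance between start positions and to treat 10bp as inclusive are all assumptions for illustration. It flags every indel call that sits near another indel call in the same sample; restricting the flagged set to LOF indels would be a separate step.

```python
from collections import defaultdict


def flag_clustered_indels(calls, window=10):
    """Return the (sample, chrom, pos) indel calls that lie within `window` bp
    of another indel call in the same sample.

    `calls` is an iterable of (sample, chrom, pos, ref, alt) tuples holding
    non-reference genotype calls only (hypothetical layout, not a Hail type).
    """
    # Group indel start positions by sample and chromosome.
    positions_by_key = defaultdict(list)
    for sample, chrom, pos, ref, alt in calls:
        if len(ref) != len(alt):  # crude indel test: allele lengths differ
            positions_by_key[(sample, chrom)].append(pos)

    flagged = set()
    for (sample, chrom), positions in positions_by_key.items():
        positions.sort()
        # After sorting, the nearest other indel is always an adjacent entry,
        # so comparing neighbours is enough to find every clustered call.
        for prev, cur in zip(positions, positions[1:]):
            if cur - prev <= window:
                flagged.add((sample, chrom, prev))
                flagged.add((sample, chrom, cur))
    return flagged


calls = [
    ("s1", "1", 100, "AT", "A"),   # indel, 5bp from the next s1 indel
    ("s1", "1", 105, "G", "GCC"),  # indel
    ("s1", "1", 103, "G", "T"),    # SNP, ignored
    ("s1", "1", 200, "C", "CT"),   # isolated indel
    ("s2", "1", 100, "AT", "A"),   # same site, but s2 has no nearby indel
]
flagged = flag_clustered_indels(calls)
```

Note that the same variant (position 100) is flagged in s1 but kept in s2, which is why this can't be expressed as a filter on the variant alone.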


Hi Kyle,

Just to understand the use case: you want to filter out indel LOF calls that are within 10bp of another indel call in the same sample, yes? I had a way to do the variant-level filter, but this is harder. Let me think about it.


Hi Cotton,

Yes, that’s right. When we were talking about it before, I neglected to specify that the other indel has to be in the same sample. I realized my error once I looked into how aggregating by intervals would work with this. It’s a genotype-level filter rather than a variant-level filter (although so many of these questionable calls are unique that it’s almost the same…).