A very useful metric for variant QC is the ABHet ratio. Variants with ABHet < 0.2 or > 0.8 are typically filtered out as likely false positives. More conservative cutoffs (0.25 and 0.75) are also used.
ABHet is a variant-level annotation that estimates whether biallelic variants match the expected allelic ratios. An ideal heterozygous variant will have a value close to 0.5 and an ideal homozygous variant will have a value close to 1.0. ABHet is calculated for a variant from all samples in the VCF that are not homozygous at the site.
ABHet = (# REF reads from heterozygous samples) / (# REF + ALT reads from heterozygous samples)
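To make the arithmetic concrete, here is a rough sketch in plain Python (not Hail). The per-sample layout (a genotype pair plus REF and ALT read counts) and the names `ab_het` and `passes_ab_het` are my own assumptions for illustration. So is the choice to keep variants that have no heterozygous samples, where the ratio is undefined.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical per-sample record: ((allele1, allele2), ref_reads, alt_reads)
Sample = Tuple[Tuple[int, int], int, int]


def ab_het(samples: Iterable[Sample]) -> Optional[float]:
    """REF reads / (REF + ALT reads), summed over heterozygous samples only."""
    ref_total = 0
    depth_total = 0
    for (a1, a2), ref_reads, alt_reads in samples:
        if a1 != a2:  # heterozygous call; homozygous samples are skipped
            ref_total += ref_reads
            depth_total += ref_reads + alt_reads
    if depth_total == 0:
        return None  # no het samples (or no reads): ABHet is undefined
    return ref_total / depth_total


def passes_ab_het(value: Optional[float], low: float = 0.2, high: float = 0.8) -> bool:
    """Keep variants whose ABHet lies within [low, high].

    Assumption: variants with an undefined ABHet are kept rather than dropped.
    """
    return value is None or low <= value <= high


# One biallelic site: two het samples, one hom-alt, one hom-ref.
site = [((0, 1), 12, 10), ((0, 1), 8, 14), ((1, 1), 0, 25), ((0, 0), 30, 0)]
value = ab_het(site)  # 20 / 44, about 0.45
keep = passes_ab_het(value)
```

For the more conservative cutoffs, call `passes_ab_het(value, low=0.25, high=0.75)`.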
What would be the best approach for calculating this with Hail? Could this be added to the variantQC method?