How to calculate hemizygous counts for sex chromosome variants

Hello!

I’m building a Hail pipeline to calculate allele frequency statistics by sex and ancestry, aiming to replicate the gnomAD-style table with:

  • Allele Count (AC)

  • Allele Number (AN)

  • Number of Homozygotes (n_hom_var)

  • Number of Hemizygotes (when applicable)

  • Allele Frequency (AF)

Using hl.agg.call_stats, I get AC, AN, AF, and the homozygote count (the homozygote_count field) directly, but Hail doesn’t seem to provide a built-in “hemizygous count” column for sex chromosome variants.

My understanding of the logic is:

  • PAR regions: diploid → n_hom_var works as usual.

  • Non-PAR regions: haploid in XY samples → we need to count any alternate allele in these samples as “hemizygous alternate.” XX samples remain diploid on chrX, so n_hom_var still applies to them.
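
To make the rule above concrete, here is a minimal plain-Python sketch of the counting logic for a single biallelic site. It is not Hail code and not gnomAD's implementation. The function name, the "XX"/"XY" karyotype labels, and the tuple-of-allele-indices genotype layout are all hypothetical choices for illustration. In Hail, the same conditions could probably be expressed with locus.in_x_nonpar() / locus.in_y_nonpar() and hl.agg.count_where, but that mapping is not shown or tested here.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical layout: allele indices per call, e.g. (0, 1) or (1,); None = missing.
Genotype = Optional[Tuple[int, ...]]


def count_hom_and_hemi(
    genotypes: Sequence[Genotype],
    karyotypes: Sequence[str],  # "XX" or "XY" per sample (assumed labels)
    in_nonpar: bool,            # True for chrX/chrY non-PAR sites, else False
) -> Tuple[int, int]:
    """Return (n_hom_alt, n_hemi_alt) for one biallelic site."""
    n_hom = 0
    n_hemi = 0
    for gt, karyotype in zip(genotypes, karyotypes):
        if gt is None:
            continue  # missing calls contribute to neither count
        has_alt = any(allele > 0 for allele in gt)
        if in_nonpar and karyotype == "XY":
            # Haploid context: any alternate allele counts as one hemizygote,
            # whether the caller emitted a haploid or a diploid call.
            if has_alt:
                n_hemi += 1
        elif len(gt) == 2 and all(allele > 0 for allele in gt):
            # Diploid context (autosomes, PAR, XX samples on chrX).
            n_hom += 1
    return n_hom, n_hemi


# Toy data: two XX samples and four XY samples.
calls = [(0, 1), (1, 1), (1,), (0,), (1, 1), None]
sexes = ["XX", "XX", "XY", "XY", "XY", "XY"]

print(count_hom_and_hemi(calls, sexes, in_nonpar=True))   # (1, 2)
print(count_hom_and_hemi(calls, sexes, in_nonpar=False))  # (2, 0)
```

One design point to decide: this sketch counts a heterozygous diploid call in an XY sample at a non-PAR site as a hemizygote, following the "any alternate allele" rule as written. Such calls are often artifacts, so a pipeline might prefer to treat them as missing instead.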

Does this approach make sense, or is there a more optimal way in Hail to compute hemizygous counts that I’m missing?

Thanks in advance!

Posting the answer I got on the gnomAD forum here in case it is useful for someone: “How to calculate hemizygous counts” (Browser category, gnomAD forum).