gnomAD: How are mutation rates calculated?

agastya · October 16, 2023, 6:29am

Hi everyone, I apologize if this question does not fall in the purview of this forum.
I was going through the gnomAD constraint calculation code (https://github.com/broadinstitute/gnomad_lof/blob/master/constraint_utils/constraint_basics.py line 318) and had some questions regarding the calculation of the mutation probabilities.
I have never used Hail, so I do not know the shape of the data inside genome_ht and context_ht.

My question is: how are the probabilities of a base mutation calculated? I find the supplementary documentation vague about this calculation.

I have narrowed the calculation down to 3 candidates, which use the following quantities:
A = no. of AAA>ATA mutations in the whole genome.
B = total no. of mutations in the AAA context in the whole genome, i.e. AAA>ATA + AAA>ACA + AAA>AGA mutations.
C = no. of times AAA occurs in the whole genome, counting overlapping occurrences. In the sequence AAAAA, AAA occurs 3 times.

Which of the following is the correct equation for the calculation of probability of AAA>ATA mutation?

A/B
A/C
A/(C*3)
Or am I completely wrong?
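
To make the three quantities concrete, here is a minimal Python sketch that computes A, B, C and the three candidate ratios on a toy example. The sequence, the list of observed variants, and the function names are all made up for illustration; this is not the gnomAD code and it does not say which ratio gnomAD actually uses.

```python
from collections import Counter


def count_context(seq, context):
    """Count overlapping occurrences of `context` in `seq` (quantity C)."""
    k = len(context)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == context)


def candidate_ratios(seq, observed, context, alt):
    """Compute A, B, C and the three candidate ratios for context>alt.

    `observed` is a hypothetical list of (trinucleotide context, alt middle base)
    pairs, one per observed variant. Assumes B and C are non-zero.
    """
    counts = Counter(observed)
    a = counts[(context, alt)]  # A: variants with this context and alt
    b = sum(n for (ctx, _), n in counts.items() if ctx == context)  # B: all variants in this context
    c = count_context(seq, context)  # C: occurrences of the context
    return {
        "A": a,
        "B": b,
        "C": c,
        "A/B": a / b,
        "A/C": a / c,
        "A/(C*3)": a / (c * 3),
    }


# Toy data, invented for this example only.
toy_sequence = "AAAAACAAAG"
toy_observed = [("AAA", "T"), ("AAA", "T"), ("AAA", "C"), ("CAA", "G")]

ratios = candidate_ratios(toy_sequence, toy_observed, "AAA", "T")
for name, value in ratios.items():
    print(name, value)
```

In this toy case A/B answers "given that an AAA site is mutated, how often is the change to T", A/C is the fraction of AAA sites that carry an A>T variant, and A/(C*3) divides the same count by all possible single-base changes across those sites.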

Also, what's the logic behind the correction factor used to calculate mu from these probabilities?
Thanks a lot

iris-garden · October 17, 2023, 5:24pm

hi, this forum is primarily for support for the hail library, so i think you would likely have more luck reaching out to the gnomad team directly over email at gnomad@broadinstitute.org. hope that helps!

agastya · October 19, 2023, 5:18am

Thanks, I’ll do that.

danking · November 9, 2023, 7:06pm

@agastya You might also try the brand new gnomAD forum: https://discuss.gnomad.broadinstitute.org
