Hi all,
I can't find a consensus in the literature on the term “internal frequency” in large-scale genomic studies. I'm new to this topic and would like to confirm that ‘internal’ allele frequency means the frequency computed within the population/cohort under study, as opposed to allele frequencies annotated from external sources/datasets such as ExAC, 1000Genomes, etc.
When I use the ‘variant_qc()’ method in Hail, I get AF among the QC metrics. From the source code (https://github.com/hail-is/hail/blob/b226e1f70f338dea953d58c8706ff42fd74f4992/src/main/scala/is/hail/methods/VariantQC.scala), I can see that the formula for AF is:
AF = (nHet + 2(nHomVar)) / 2(nCalled), where
nCalled = nHomRef + nHet + nHomVar
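To make sure I'm reading the formula correctly, here is a minimal plain-Python sketch of it. This is not Hail code: the function name `allele_frequency`, the example counts, and the choice to return NaN when no genotypes are called are my own assumptions for illustration.

```python
import math


def allele_frequency(n_hom_ref: int, n_het: int, n_hom_var: int) -> float:
    """Alternate allele frequency among called diploid genotypes.

    Implements AF = (nHet + 2 * nHomVar) / (2 * nCalled),
    with nCalled = nHomRef + nHet + nHomVar.
    """
    n_called = n_hom_ref + n_het + n_hom_var
    if n_called == 0:
        # Assumption: no called genotypes means AF is undefined.
        return math.nan
    # Each het carries one alternate allele, each hom-var carries two;
    # every called diploid sample contributes two alleles in total.
    return (n_het + 2 * n_hom_var) / (2 * n_called)


# Hypothetical cohort: 90 hom-ref, 8 het, 2 hom-var samples.
af = allele_frequency(n_hom_ref=90, n_het=8, n_hom_var=2)
print(af)
```

The only inputs are genotype counts from the samples in the dataset itself, and samples with missing calls are left out of the denominator.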
So, can this AF be interpreted as the ‘internal allele frequency’?
Thanks