Effect Allele in regressions


#1

If I understand correctly, in regressions the effect allele, for which beta is calculated, is the “alternate allele”. But which allele does Hail consider the alternate allele?
Is it the second allele (as in ds.alleles[1]) regardless of allele frequency, or is it the minor allele according to variant_qc.AF (which can be either the first or the second allele)?

+-----------+-----------+--------------+---------------+---------------------+
| locus     | alleles   | rsid         | variant_qc.AC | variant_qc.AF       |
+-----------+-----------+--------------+---------------+---------------------+
| locus     | array     | str          | array         | array               |
+-----------+-----------+--------------+---------------+---------------------+
| 1:768448  | ["G","A"] | "rs12562034" | [11726,1144]  | [9.11e-01,8.89e-02] |
| 1:1018704 | ["A","G"] | "rs9442372"  | [6043,6987]   | [4.64e-01,5.36e-01] |
+-----------+-----------+--------------+---------------+---------------------+

In the example above, are the effect alleles A and G, respectively, or A and A?

Related to that question, how would I create columns called EA and NEA (Effect Allele and Non-Effect Allele) that I could then use to match variants for meta-analysis?


#2

This depends on how you code the regression:

Model 1:

hl.linear_regression_rows(y=mt.pheno, x=mt.GT.n_alt_alleles())

In this model, we encode homref as 0, het as 1, and homvar as 2, so the effect allele is alleles[1] (for biallelic variants).

Model 2:

hl.linear_regression_rows(y=2-mt.pheno, x=mt.GT.n_alt_alleles())

In this model, alleles[0] is the effect allele.
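
For the EA/NEA question in #1, here is a minimal plain-Python sketch of the labelling that the two encodings imply, assuming biallelic variants. The function name `label_effect_alleles` and its `flipped` flag are hypothetical and not part of Hail; on a Hail results table the analogous step would presumably be an `annotate` call that copies `alleles[1]` and `alleles[0]` into new fields.

```python
def label_effect_alleles(alleles, flipped=False):
    """Return (EA, NEA) for a biallelic variant.

    flipped=False: x counts copies of alleles[1], so EA is alleles[1].
    flipped=True:  x counts copies of alleles[0], so EA is alleles[0].
    """
    if len(alleles) != 2:
        raise ValueError("expected a biallelic variant")
    first, second = alleles
    return (first, second) if flipped else (second, first)


# The two variants from the table in #1.
variants = {
    "rs12562034": ["G", "A"],
    "rs9442372": ["A", "G"],
}

labels = {rsid: label_effect_alleles(a) for rsid, a in variants.items()}
for rsid, (ea, nea) in labels.items():
    print(rsid, "EA =", ea, "NEA =", nea)
```

With the default encoding, the labels follow the position in the alleles array and not the allele frequency, so the second example variant gets its major allele as EA.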


#3

Oh, I see, thanks.
I’m guessing this also applies to dosages, so that in the following code alleles[1] is the effect allele:

gwas = hl.logistic_regression_rows(test='wald', pass_through=[ds.rsid], y=ds.pheno_case, x=hl.gp_dosage(ds.GP))


#4

yes, exactly.
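
To see why the dosage case behaves the same way, here is a plain-Python sketch of the expected-dosage calculation. It assumes GP holds the probabilities of (homref, het, homvar) in that order for a biallelic variant, which is how `hl.gp_dosage` is documented. The function `expected_alt_dosage` is a hypothetical stand-in, not Hail code.

```python
def expected_alt_dosage(gp):
    """Expected number of copies of alleles[1], given [P(homref), P(het), P(homvar)]."""
    if len(gp) != 3:
        raise ValueError("expected three genotype probabilities")
    p_homref, p_het, p_homvar = gp
    # A het carries one copy of alleles[1], a homvar carries two.
    return p_het + 2.0 * p_homvar


print(expected_alt_dosage([1.0, 0.0, 0.0]))  # certain homref
print(expected_alt_dosage([0.0, 0.0, 1.0]))  # certain homvar
print(expected_alt_dosage([0.1, 0.6, 0.3]))  # uncertain call
```

The dosage is a soft version of n_alt_alleles(): it still counts copies of alleles[1], so beta is still per copy of alleles[1].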


#5

@tpoterba, this is the natural way to flip the encoding:

hl.linear_regression_rows(y=mt.pheno, x = 2 - mt.GT.n_alt_alleles())

Also, users, don’t forget to include the intercept if you want one!

hl.linear_regression_rows(y=mt.pheno, x = 2 - mt.GT.n_alt_alleles(), covariates=[1])

Without an intercept, negating y flips the sign of the effect size of x, but negating and shifting y has a stranger influence. With an intercept, negating (and possibly shifting) y flips the sign of the effect size of x.
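
The point about the intercept can be checked on toy numbers. The sketch below is plain Python, not Hail: `slope` is a hypothetical helper that fits a single-predictor least-squares model with or without an intercept, and the genotype and phenotype values are made up for illustration.

```python
def slope(x, y, intercept=True):
    """Least-squares coefficient on x, with or without an intercept term."""
    if intercept:
        mx = sum(x) / len(x)
        my = sum(y) / len(y)
        num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        den = sum((xi - mx) ** 2 for xi in x)
    else:
        num = sum(xi * yi for xi, yi in zip(x, y))
        den = sum(xi * xi for xi in x)
    return num / den


# Made-up toy data: alternate-allele counts and a quantitative phenotype.
n_alt = [0, 0, 1, 1, 1, 2, 2, 0]
pheno = [1.0, 1.2, 1.9, 2.1, 2.0, 3.1, 2.9, 0.8]

flipped_x = [2 - g for g in n_alt]       # counts alleles[0] instead
neg_y = [-p for p in pheno]              # negated phenotype
shifted_neg_y = [2 - p for p in pheno]   # negated and shifted phenotype

for intercept in (True, False):
    print("intercept:", intercept)
    print("  original:", slope(n_alt, pheno, intercept))
    print("  2 - x:   ", slope(flipped_x, pheno, intercept))
    print("  -y:      ", slope(n_alt, neg_y, intercept))
    print("  2 - y:   ", slope(n_alt, shifted_neg_y, intercept))
```

With the intercept, all three transformations return the original coefficient with its sign flipped. Without it, only the plain negation of y does; the shifted versions give a coefficient of a different magnitude.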


#6

Oops, that’s totally what I meant; I put the 2 - on the wrong arg as I was typing.