I have been happily running linear regressions on entry data as such:
mt = mt.annotate_rows(GWAS=hl.agg.linreg(mt.pheno, [1.0, mt.entry1, mt.entry2, covariates]))
I now have several binary traits and am wondering if there’s a comparable way to run logistic regression on entries rather than rows?
Ack, sorry, I thought I’d responded. Unfortunately logistic regression doesn’t have a one-pass algorithm, so it doesn’t fit as naturally into the aggregator system. We’ll have a way to do this in the next 6-9 months, I think, but before then the rather inflexible interface of logistic_regression_rows is the only option.
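To make the one-pass point concrete, here is a minimal pure-Python sketch (not Hail code; the function names and toy data are made up for illustration). Least squares only needs a handful of running sums that can be accumulated in a single pass, which is the shape an aggregator wants. A logistic fit by Newton's method has to revisit every observation on each iteration, because the weights depend on the current coefficient estimates.

```python
import math


def ols_one_pass(xs, ys):
    """Fit y = a + b*x from running sums accumulated in a single pass."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in zip(xs, ys):  # one pass: only the running sums are kept
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


def logistic_newton(xs, ys, max_iter=25, tol=1e-10):
    """Fit logit(p) = a + b*x by Newton's method; returns (a, b, passes)."""
    a = b = 0.0
    passes = 0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):  # a full pass over the data per iteration
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)  # weight depends on the current (a, b)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        passes += 1
        # Solve the 2x2 Newton system H * delta = g by hand.
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b, passes


# Toy data: case fraction rises with x, and the classes are not separable.
xs = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
ys = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]

a_lin, b_lin = ols_one_pass(xs, ys)
a_log, b_log, passes = logistic_newton(xs, ys)
print("linear:", a_lin, b_lin)
print("logistic:", a_log, b_log, "passes over the data:", passes)
```

The linear fit touches each observation once, while the logistic fit needs several passes before the coefficients settle.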
OK, good to know. Thanks, and please keep me in the loop when this is added! It would be great to package this feature into the Tractor framework for case/control phenotypes.
Hey again! I am hoping to run this pipeline for a consortium working group, and we will have some binary phenotypes. Is there a fix for this yet, or do you have suggestions for workarounds?
Hi again! I have a workaround for this that I wanted to sanity check with you. I am thinking the best option for the short term would be to just transform the results from linreg on binary traits. The specific equation for this in the literature (e.g. https://www.nature.com/articles/ejhg2016150#Sec2) is effect_logistic = effect_linear / (intercept_linear * (1 - intercept_linear)), where intercept_linear stands in for the case fraction. I am currently including the intercept term explicitly in hl.agg.linreg, so I guess I could just take the beta from that?
Specifically, the regression is run with hl.agg.linreg(mt.TC, [1.0, mt.hapcounts0.x, mt.anc0dos.x, mt.anc1dos.x, covariates...]), so I am thinking I could just get the intercept with mt.TC.beta[0]? I could then transform the betas for ancestry 0 and 1 separately.
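For reference, a minimal sketch of that transformation in plain Python, assuming the first-order approximation beta_logistic ≈ beta_linear / (mu * (1 - mu)), with the product mu * (1 - mu) in the denominator and mu the case fraction. The function names and example numbers are hypothetical, and scaling the standard error by the same factor is part of the same approximation rather than something confirmed in this thread. One caveat: the fitted intercept equals the case fraction only when the other predictors are mean-centered, so computing mu directly from the phenotype may be safer than reading it from beta[0].

```python
def case_fraction(phenotypes):
    """Fraction of cases among non-missing 0/1 phenotype values."""
    observed = [p for p in phenotypes if p is not None]
    if not observed:
        raise ValueError("no non-missing phenotypes")
    return sum(observed) / len(observed)


def linear_to_logistic(beta_linear, se_linear, mu):
    """Approximate log-odds effect and SE from a linear-regression fit.

    Assumes beta_logistic ~= beta_linear / (mu * (1 - mu)), which is a
    first-order approximation and works best for small effects.
    """
    if not 0.0 < mu < 1.0:
        raise ValueError("case fraction must be strictly between 0 and 1")
    scale = mu * (1.0 - mu)
    return beta_linear / scale, se_linear / scale


# Hypothetical example values, not taken from any real analysis.
pheno = [0, 0, 1, 1, 0, 1, None, 0, 1]
mu = case_fraction(pheno)
beta_log, se_log = linear_to_logistic(0.25, 0.125, mu)
print(mu, beta_log, se_log)
```

The same function can be applied separately to the ancestry 0 and ancestry 1 betas, since the scale factor depends only on the case fraction.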