Hail curious potential user Q

jbloom · March 7, 2017, 7:05pm

The internals of all four logistic models are here, with some performance comparisons at the end of this post. The projection trick does not extend to logistic regression, and genotype sparsity doesn’t help much because the sample covariates are dense and enter every iteration alongside the genotypes. We also tried using a QR decomposition and triangular solve in each Newton iteration to avoid directly inverting the Hessian (Fisher information), but found this performs worse, likely because the number of covariates is tiny compared to the number of samples. If you really want to optimize single-core performance, check out the vectorization tricks in the TopCoder competition. In the end, logistic regression is run per variant and scales beautifully with cores, so it does not seem to be a pressing computational bottleneck for Hail (though we may circle back to make it more efficient in the future).
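To make the Newton iteration above concrete, here is a minimal NumPy sketch of fitting a logistic model for a single variant. It is not Hail's implementation: the function name `fit_logistic_newton`, its parameters, and the column layout of `X` (intercept, covariates, genotype) are assumptions for illustration only.

```python
import numpy as np


def fit_logistic_newton(X, y, max_iter=25, tol=1e-8):
    """Fit logistic regression by Newton's method (illustrative sketch).

    X: (n, k) dense design matrix, e.g. intercept, covariates, genotype.
    y: (n,) array of 0/1 phenotypes.
    Returns the (k,) coefficient vector.
    """
    n, k = X.shape
    beta = np.zeros(k)
    for _ in range(max_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        score = X.T @ (y - mu)                   # gradient of the log-likelihood
        w = mu * (1.0 - mu)                      # per-sample weights
        fisher = X.T @ (X * w[:, None])          # k x k Fisher information
        # Solve the small k x k system rather than forming an explicit inverse.
        delta = np.linalg.solve(fisher, score)
        beta = beta + delta
        if np.max(np.abs(delta)) < tol:
            break
    return beta
```

With n samples and k columns, forming the Fisher information costs on the order of n·k² per iteration, while the k×k solve is negligible when k is tiny. That is consistent with the observation above that a QR-based step brought no benefit, though this sketch does not reproduce that benchmark.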
