Logistic regression on remote servers

Stephen · October 14, 2020, 3:36am

Hi,
I’m trying to use Hail to run a logistic regression GWAS on a case-control (0 vs 1) phenotype. I have two questions:

1. What input phenotype does Hail expect: 0 vs 1 as tfloat64, or True vs False as tbool?
2. I would like to run Hail from .py scripts on my school’s remote Linux servers. Are there any mistakes in my code?

#!/s/bin/python3
import hail as hl
hl.init()

mt = hl.import_plink(bed='chr22.bed', bim='chr22.bim', fam='chr22.fam', quant_pheno=False)
covar = (hl.import_table('pheno_covariate.txt', types={'IID': hl.tstr, 'mypheno': hl.tbool}, impute=True).key_by('IID'))

mt = mt.annotate_cols(covar=covar[mt.s])

mt_logistic = hl.logistic_regression_rows(test='wald', y=[mt.covar.mypheno, mt.covar.mypheno], x=mt.GT.n_alt_alleles(),
    covariates=[1, mt.covar.IsMale, mt.covar.Year, mt.covar.IsAxiom, mt.covar.PC1, mt.covar.PC2,
                mt.covar.PC3, mt.covar.PC4, mt.covar.PC5, mt.covar.PC6, mt.covar.PC7,
                mt.covar.PC8, mt.covar.PC9, mt.covar.PC10])

mt_logistic.export('chr22_hail.txt')

tpoterba · October 14, 2020, 12:07pm

Either one is fine. The method technically expects float64s, but booleans are coercible to floats as 0.0 / 1.0.
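
To make the equivalence concrete, here is a minimal sketch. The first function is plain Python and only mirrors the 0.0 / 1.0 mapping described above. The second is a hypothetical helper (the names with_float_phenotype and pheno_float are made up for illustration) that uses hl.float64 to make the cast explicit on the covar struct from the script in the question. It is not required, since the coercion happens either way.

```python
def encode_case_control(label):
    """Return 1.0 for a case and 0.0 for a control (plain Python, no Hail needed)."""
    # `in (0, 1)` also matches False/True and 0.0/1.0, since they compare equal.
    if label in (0, 1):
        return float(label)
    raise ValueError(f"expected a 0/1 or True/False label, got {label!r}")


def with_float_phenotype(mt, field="mypheno"):
    """Hypothetical helper: add an explicit float64 copy of a boolean phenotype.

    Assumes `mt` has a column struct named `covar`, as in the script above.
    """
    import hail as hl  # imported lazily so the function above runs without Hail

    # hl.float64 converts a boolean expression to 0.0 / 1.0.
    return mt.annotate_cols(pheno_float=hl.float64(mt.covar[field]))
```

With that, y=mt.pheno_float and y=mt.covar.mypheno should describe the same phenotype to logistic_regression_rows.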

This looks right to me!
