# Rewrite R glm in hail

Hi hail team!

I have a very basic question. I’m trying to rewrite the following R code (from Kaitlin’s MPC code) using hail:

```r
glm(pop_v_path ~ obs_exp + mis_badness3 + obs_exp:mis_badness3 + polyphen2 + obs_exp:polyphen2,
    data=cleaned_joint_exac_clinvar.scores, family=binomial)
```

Is there an easy way to write this formula?

Thanks!

My R reading isn’t great: is this a linear regression? Or is this more complicated than that?

I think this is a logistic regression. I saw in the R docs that the `:` operator has a specific definition for `glm`:

```
A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second.
```

Do you know if there is an equivalent for this in Python?

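Ok, so I think `obs_exp:mis_badness3` in R is the same as `obs_exp * mis_badness3` in Python, i.e. the elementwise product of the two columns (assuming both are numeric). Note that `*` means something different inside an R formula: there, `obs_exp * mis_badness3` is shorthand for `obs_exp + mis_badness3 + obs_exp:mis_badness3`, which in Python terms is `obs_exp + mis_badness3 + obs_exp * mis_badness3`.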
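To make the translation concrete, here is a minimal plain-Python sketch of the predictor columns that the R formula would expand to for a single record. It assumes all three predictors are numeric (if `polyphen2` were a factor in R, `glm` would expand it into dummy columns instead), and `design_row` is a hypothetical helper name for illustration, not a Hail or R function.

```python
def design_row(obs_exp, mis_badness3, polyphen2):
    """Expand one record into the columns implied by the R formula.

    Assumes every predictor is numeric; `design_row` is a made-up helper.
    """
    return [
        1.0,                     # intercept (R adds this implicitly)
        obs_exp,
        mis_badness3,
        obs_exp * mis_badness3,  # obs_exp:mis_badness3
        polyphen2,
        obs_exp * polyphen2,     # obs_exp:polyphen2
    ]


row = design_row(obs_exp=2.0, mis_badness3=3.0, polyphen2=0.5)
print(row)
```

The logistic regression then fits one coefficient per column, so the interaction terms are just two extra product columns in the input.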


thank you!!

hello! circling back to this, is there a way to do a logistic regression in hail? I think these two docs pages are the most relevant: Hail | Aggregators and Hail | Statistics.

I’m hoping to run a logistic regression in an aggregation (ideally something like `hl.agg.logreg`), is that possible with the existing functionality? Maybe I’ve missed something in the docs?

I’d appreciate any tips – thanks in advance!

This isn’t possible right now. Fitting a logistic regression is a convex optimization problem, and there are no good options for doing this in a single pass over the data (which is what Hail aggregators require). We intend to support doing this on a table, but don’t have a timeline right now.
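To illustrate the single-pass point, here is a rough plain-Python sketch (not Hail code) that fits a logistic regression by gradient ascent on a made-up toy dataset. Each call to `gradient_pass` is one full sweep over the rows, which is roughly what one aggregation would give you, and the fit needs many such sweeps because every gradient depends on the coefficients from the previous sweep. The function names, step size, tolerance, and data are all assumptions chosen for illustration; R's `glm` uses a different iterative scheme (iteratively reweighted least squares), but it is still iterative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_pass(rows, beta):
    """One full pass over the data: mean gradient of the log-likelihood at beta."""
    grad = [0.0] * len(beta)
    for x, y in rows:
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        for j, xj in enumerate(x):
            grad[j] += (y - p) * xj
    return [g / len(rows) for g in grad]


def fit_logistic(rows, step=0.1, tol=1e-6, max_passes=100_000):
    """Gradient ascent; returns the coefficients and the number of data passes used."""
    beta = [0.0] * len(rows[0][0])
    for n_passes in range(1, max_passes + 1):
        grad = gradient_pass(rows, beta)  # needs the current beta
        if max(abs(g) for g in grad) < tol:
            break
        beta = [b + step * g for b, g in zip(beta, grad)]
    return beta, n_passes


# Toy data (made up): each row is ([intercept, x], y), deliberately not separable.
rows = [([1.0, float(x)], y) for x, y in zip(range(6), [0, 0, 1, 0, 1, 1])]
beta, n_passes = fit_logistic(rows)
print(f"coefficients: {beta}, data passes: {n_passes}")
```

A sum or a mean can be computed in one sweep because no row's contribution depends on the answer so far; here every sweep depends on the previous estimate, which is why this does not map onto a single aggregator call.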

gotcha, thank you for letting me know!