I think this is a logistic regression. I saw in the R docs that the : operator has a specific definition for glm:
A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second.
Do you know if there is an equivalent for this in Python?
Ok, so I think obs_exp:mis_badness3 in R is the same as obs_exp * mis_badness3 in Python. In an R formula, obs_exp * mis_badness3 would translate to the Python obs_exp + mis_badness3 + obs_exp * mis_badness3.
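To make that mapping concrete, here is a minimal NumPy sketch. It assumes both covariates are numeric (for factors, R expands the interaction into several dummy columns, so a single product would not be enough), and the toy values are made up purely for illustration.

```python
import numpy as np

# Hypothetical toy values for the two covariates named in this thread.
obs_exp = np.array([0.5, 1.0, 1.5, 2.0])
mis_badness3 = np.array([2.0, 0.0, 1.0, 3.0])

# R formula term `obs_exp:mis_badness3` (numeric covariates):
# a single column holding the elementwise product.
interaction = obs_exp * mis_badness3

# R formula term `obs_exp * mis_badness3`:
# both main effects plus the interaction, i.e. three columns.
design = np.column_stack([obs_exp, mis_badness3, interaction])

print(design)
```

So when covariates are passed as plain expressions rather than as a formula, the R `*` shorthand has to be spelled out as three separate covariates.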
Hello! Circling back to this: is there a way to do a logistic regression in Hail? I think these two docs pages are the most relevant: Hail | Aggregators and Hail | Statistics.
I’m hoping to run a logistic regression in an aggregation (ideally something like hl.agg.logreg), is that possible with the existing functionality? Maybe I’ve missed something in the docs?
This isn’t possible right now. Fitting a logistic regression is a convex optimization problem, and there are no good options for solving it in a single pass over the data (which is what Hail aggregators require). We intend to support doing this on a table, but don’t have a timeline right now.
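To illustrate the single-pass point, here is a rough sketch of one standard way to fit a logistic regression, Newton's method (also known as IRLS), in plain NumPy. This is not Hail code and not necessarily what Hail would use internally; the function name `fit_logistic_newton` and the simulated data are assumptions made for the example. The thing to notice is that every iteration recomputes the gradient and Hessian over all rows with the current coefficients, so the fit needs several passes over the data.

```python
import numpy as np

def fit_logistic_newton(X, y, max_iter=25, tol=1e-8):
    """Fit logistic regression by Newton's method; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    n_passes = 0
    for _ in range(max_iter):
        # One full pass over the data, using the current beta.
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        n_passes += 1
        if np.max(np.abs(step)) < tol:
            break
    return beta, n_passes

# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([-0.5, 1.2])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_hat, n_passes = fit_logistic_newton(X, y)
print(beta_hat, n_passes)
```

By contrast, ordinary least squares has a closed-form solution built from sums over rows, which is why linear regression fits the aggregator model and logistic regression does not.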