To give users more control, we’ve changed inclusion of an intercept from implicit to explicit in linear regression, logistic regression, and SKAT. Consider the linear regression model
y = b*x + b0 + b1*c1 + b2*c2 + e
where we are interested in the effect size b on x per row of a matrix table mt, and b0 represents an intercept.
BEFORE, the intercept was added implicitly, so you would write:
hl.linear_regression(y=mt.y, x=mt.x, covariates=[mt.c1, mt.c2]).
NOW, the intercept must be included explicitly if desired, so you should write:
hl.linear_regression(y=mt.y, x=mt.x, covariates=[1.0, mt.c1, mt.c2]).
Note that 1.0 is just a numeric expression (not special syntax) corresponding to a covariate that is 1.0 for every sample. This is equivalent to the model above, thought of as:
y = b*x + b0*1.0 + b1*c1 + b2*c2 + e
In sum: to get the same behavior as before, just add the covariate 1.0.
WARNING: The first command will still run but will give different results, since it now corresponds to the model without an intercept:
y = b*x + b1*c1 + b2*c2 + e.
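To see why dropping the column of ones changes the estimates, here is a minimal NumPy sketch (not the Hail API) comparing the two design matrices. The simulated data, coefficients, and variable names below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
c1 = rng.normal(size=n)
c2 = rng.normal(size=n)
# Simulate y with a nonzero intercept of 2.0.
y = 0.5 * x + 2.0 + 0.3 * c1 - 0.2 * c2 + rng.normal(scale=0.1, size=n)

# Design matrix WITH a column of ones, analogous to covariates=[1.0, mt.c1, mt.c2].
X_with = np.column_stack([x, np.ones(n), c1, c2])
b_with, *_ = np.linalg.lstsq(X_with, y, rcond=None)

# Design matrix WITHOUT the intercept column, analogous to covariates=[mt.c1, mt.c2].
X_without = np.column_stack([x, c1, c2])
b_without, *_ = np.linalg.lstsq(X_without, y, rcond=None)

print(b_with[0])     # estimate of b, close to the true 0.5
print(b_with[1])     # estimate of the intercept b0, close to 2.0
print(b_without[0])  # estimate of b from the no-intercept model
```

The no-intercept fit has nothing to absorb the constant 2.0, so its residuals and coefficient estimates differ from the intercept model's.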
As another example, simple linear regression
y = b*x + b0 + e
now corresponds to
hl.linear_regression(y=mt.y, x=mt.x, covariates=[1.0]).
We’ve also removed the empty default value for covariates, so to do even simpler linear regression (not even an intercept!)
y = b*x + e
explicitly write the empty list [] in
hl.linear_regression(y=mt.y, x=mt.x, covariates=[]).
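For intuition, covariates=[] fits a regression through the origin, which has a simple closed form. A sketch in plain NumPy (the data values here are made up for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Through-the-origin fit of y = b*x + e, i.e. no intercept column at all:
# the OLS solution reduces to sum(x*y) / sum(x*x).
b = np.sum(x * y) / np.sum(x * x)
print(b)  # 1.99
```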
These changes also make the regression interface more consistent with the linreg aggregator and the LinearMixedModel class.