I know a logistic mixed model is not possible, but is it possible to analyse case-control traits using a linear mixed model and then transform the effect size (like BOLT-LMM)?
When I try, I get the following error:
/databricks/python/lib/python3.7/site-packages/hail/stats/linear_mixed_model.py:492: RuntimeWarning: divide by zero encountered in log
neg_log_reml = (np.linalg.slogdet(xdx) - logdet_d + self._dof * np.log(sigma_sq)) / 2
/databricks/python/lib/python3.7/site-packages/scipy/optimize/optimize.py:1767: RuntimeWarning: invalid value encountered in double_scalars
r = (xf - nfc) * (fx - ffulc)
Exception: failed to fit log_gamma: optimum within 0.001 of upper bound.
The developer who wrote Hail’s linear mixed model is no longer on the team, but in response to a similar question on Zulip he wrote this:
log_gamma is the log of the ratio of variance explained by genetics to variance explained by “environment / noise”. Apparently in your case, the value of log_gamma that maximizes the likelihood of the data is within 0.001 of the lower bound of -8, that is, the ratio gamma is very near 0 (about e^-8). In other words, you’re getting an error because the model isn’t finding any genetic mixture component.
Seeing log_gamma = 0.001 is definitely wrong…that would correspond to gamma=1 and force a nearly equal contribution of genetic and environmental variance. Setting log_gamma to -8 will basically be the same as linear regression, just much slower. So if you’re truly in a situation where kinship isn’t explaining anything (beyond PCs you may be including as fixed effects), you should just use linear regression (equivalent to gamma = 0).
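
To make the numbers in that answer concrete, here is a rough sketch in plain Python (no Hail calls). It turns a log_gamma value into gamma and into the implied share of variance attributed to genetics, gamma / (1 + gamma), which follows directly from gamma being the ratio of genetic to environmental variance. It also shows the approximate conversion that the BOLT-LMM manual describes for case-control traits, dividing the linear-scale effect and its standard error by mu * (1 - mu), where mu is the case fraction. All function names here are made up for illustration and are not part of Hail or BOLT-LMM, and the conversion is only an approximation that is usually considered reasonable for small effects and case fractions that are not extreme.

```python
import math


def gamma_from_log_gamma(log_gamma: float) -> float:
    """gamma = genetic variance / environmental variance (hypothetical helper)."""
    return math.exp(log_gamma)


def genetic_variance_share(gamma: float) -> float:
    """Share of total variance attributed to genetics, gamma / (1 + gamma)."""
    return gamma / (1.0 + gamma)


def linear_effect_to_log_odds(beta: float, se: float, case_fraction: float):
    """Approximate log odds ratio and its standard error from a linear-scale
    effect on a 0/1 phenotype, using the beta / (mu * (1 - mu)) rule.

    Assumption: this is an approximation, not an exact logistic fit.
    """
    if not 0.0 < case_fraction < 1.0:
        raise ValueError("case_fraction must be strictly between 0 and 1")
    scale = case_fraction * (1.0 - case_fraction)
    return beta / scale, se / scale


if __name__ == "__main__":
    # The two log_gamma values mentioned in the quoted answer.
    for log_gamma in (-8.0, 0.001):
        gamma = gamma_from_log_gamma(log_gamma)
        share = genetic_variance_share(gamma)
        print(f"log_gamma={log_gamma}: gamma={gamma:.6f}, genetic share={share:.6f}")

    # Made-up example values: linear beta 0.01, SE 0.002, 20% cases.
    log_or, log_or_se = linear_effect_to_log_odds(0.01, 0.002, 0.2)
    print(f"approx log OR={log_or:.4f}, SE={log_or_se:.4f}, OR={math.exp(log_or):.4f}")
```

With log_gamma at -8 the genetic share is essentially zero, which is why the fit behaves like ordinary linear regression. If that is your situation, you could run plain linear regression on the 0/1 phenotype and apply the same rescaling to the resulting effect sizes.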