Hi @Stephen,
You probably need to increase the memory available to Spark, a library Hail uses.
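One common way to do that, as a rough sketch and assuming you run Hail in Spark local mode from a Python session, is to set the PYSPARK_SUBMIT_ARGS environment variable before Hail starts Spark. The 8g figure below is only an illustrative value, not a recommendation for your data; choose something that fits your machine.

```python
import os

# Assumed value for illustration only; pick a size that fits your machine.
memory = "8g"

# PySpark reads PYSPARK_SUBMIT_ARGS when it launches the JVM, so this must be
# set before Hail is initialized (before hl.init() or the first Hail query).
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    f"--driver-memory {memory} --executor-memory {memory} pyspark-shell"
)

# Afterwards, start Hail as usual:
# import hail as hl
# hl.init()
```

If Hail is already running in your session, restart Python (or the notebook kernel) first, since the setting is only picked up when the JVM is launched.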
Hail treats missing data as missing. If you execute
covar.filter(hl.is_missing(covar.mypheno)).show()
you'll see the missing data represented by the special value NA.
If you are curious how logistic_regression_rows deals with missing y-values, check out the docs for logistic_regression_rows, specifically the warning box.