Hail treats missing data as missing. If you execute,
covar.filter(hl.is_missing(covar.mypheno)).show()
you’ll see the missing data represented by a special value NA.
If you are curious how logistic_regression_rows deals with missing y-values, check out the docs for logistic_regression_rows, specifically the warning box.
Thank you for your reply. If I use Hail by running a .py file on a Linux server, where should I add this command? Is it right to add it as the first line of my .py file?
That code sets an environment variable in a shell like bash or zsh, so it does not go in your .py file. You have to run that line in your shell before you run Python, pyspark, or spark-submit.
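If you want to confirm that the variable actually reached Python, you can check for it at the top of your script, before hl.init(). This is only a minimal sketch using the standard library; the helper name submit_args_or_none is made up for illustration and is not part of Hail or PySpark.

```python
import os


def submit_args_or_none(environ=os.environ):
    """Return PYSPARK_SUBMIT_ARGS as this Python process sees it, or None if unset or blank."""
    value = environ.get("PYSPARK_SUBMIT_ARGS", "").strip()
    return value or None


if __name__ == "__main__":
    args = submit_args_or_none()
    if args is None:
        # The variable was not exported in the shell that launched this script.
        print("PYSPARK_SUBMIT_ARGS is not set; export it in your shell, then rerun.")
    else:
        print("PYSPARK_SUBMIT_ARGS =", args)
```

If this prints that the variable is not set, it was exported in a different shell session (or not exported at all) from the one that started Python.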
Thank you for your reply. Now I have another error:
Traceback (most recent call last):
  File "/z/Comp/logi/1.py", line 3, in
    hl.init()
  File "", line 2, in init
  File "/ua/zwang2547/.local/lib/python3.6/site-packages/hail/typecheck/check.py", line 614, in wrapper
    return original_func(*args, **kwargs)
  File "/ua/zwang2547/.local/lib/python3.6/site-packages/hail/context.py", line 231, in init
    skip_logging_configuration, optimizer_iterations)
  File "/ua/zwang2547/.local/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 165, in init
    pyspark.SparkContext._ensure_initialized(conf=conf)
  File "/ua/zwang2547/.local/lib/python3.6/site-packages/pyspark/context.py", line 316, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/ua/zwang2547/.local/lib/python3.6/site-packages/pyspark/java_gateway.py", line 46, in launch_gateway
    return _launch_gateway(conf)
  File "/ua/zwang2547/.local/lib/python3.6/site-packages/pyspark/java_gateway.py", line 108, in _launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
Look for a hail log file. There’s more detail there.
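If you are not sure where the log ended up, here is a rough sketch for finding the most recent one and printing its last lines. It assumes the default behavior as I understand it, where the log is written to the directory you launched Python from with a name like hail-*.log; if you passed log= to hl.init, look at that path instead. The function names newest_hail_log and tail are hypothetical helpers, not Hail functions.

```python
import glob
import os


def newest_hail_log(directory="."):
    """Return the most recently modified hail-*.log in `directory`, or None if there is none."""
    # Assumption: Hail's default log file name matches hail-*.log.
    paths = glob.glob(os.path.join(directory, "hail-*.log"))
    return max(paths, key=os.path.getmtime) if paths else None


def tail(path, n=40):
    """Return the last `n` lines of a text file as a list of strings."""
    with open(path, errors="replace") as f:
        return f.readlines()[-n:]


if __name__ == "__main__":
    log_path = newest_hail_log()
    if log_path is None:
        print("No hail-*.log found in", os.getcwd())
    else:
        print("Last lines of", log_path)
        print("".join(tail(log_path)))
```

Run it from the same directory you ran your script from. The Java-side error usually sits near the end of the file.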
This almost certainly means you have an error in PYSPARK_SUBMIT_ARGS. Make sure it’s set exactly as described in the other post, and make sure the machine has enough memory for what you request there.