Hailctl FatalError: SparkException

I started the hail cluster with the following command:

hailctl dataproc start art-cluster \
    --master-machine-type n1-standard-2 \
    --num-preemptible-workers 0 \
    --num-workers 2 \
    --worker-machine-type n1-standard-1 \
    --region us-east1 \
    --packages seaborn,matplotlib,imblearn,scikit-learn,plotly,graspy

Then I connected to notebook:

hailctl dataproc connect art-cluster --zone=us-east1-b notebook

After the next command:

pc_rel = hl.pc_relate(mt.GT,
                      min_individual_maf=0.05)

I got the error:

FatalError: SparkException: 
Bad data in pyspark.daemon's standard output. Invalid port number:
  1229870149 (0x494e5445)
Python command to execute the daemon was:
  /opt/conda/default/bin/python -m pyspark.daemon
Check that you don't have any unexpected modules or libraries in

Is it possible that the error is due to my preinstalled libraries?

I’ve never seen this before. It’s entirely possible that one of those packages is interfering with pyspark.
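Incidentally, the bogus "port number" in the traceback supports that theory: read as four big-endian bytes, 0x494e5445 is plain ASCII text, which suggests something wrote to pyspark.daemon's stdout where Spark expected a port. A quick check:

```python
# The "invalid port" from the error is really the first four bytes of
# text on the daemon's stdout, interpreted as a big-endian integer.
port = 1229870149  # 0x494e5445, copied from the traceback
print(port.to_bytes(4, "big").decode("ascii"))  # prints "INTE"
```

So one of the installed packages (or a dependency it pulled in) is likely printing something at import time, corrupting the handshake between Spark and the Python daemon.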

I’d try again without those libraries to see if it works, then add them back one at a time until you figure out which one is problematic. You might need to install the offending package with a pinned version that doesn’t conflict with pyspark.
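A rough sketch of that bisection, reusing the cluster name and flags from your start command (this tears the cluster down and re-creates it, so save any notebook state first):

```shell
# Tear down the existing cluster
# (add --region if your configured default region differs)
hailctl dataproc stop art-cluster

# Re-create it WITHOUT the extra --packages
hailctl dataproc start art-cluster \
    --master-machine-type n1-standard-2 \
    --num-workers 2 \
    --worker-machine-type n1-standard-1 \
    --region us-east1

# If pc_relate now succeeds, repeat, adding packages back a few at a
# time via --packages until the failure reappears.
```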