Set up spark.driver.memory in Hail 0.2 via hailctl

Hi,

I tried:

hailctl dataproc start hail-test \
--master-machine-type n1-highmem-8 \
--master-boot-disk-size 500 \
--num-workers 2 \
--worker-machine-type n1-highmem-16 \
--worker-boot-disk-size 500 \
--region europe-west1 \
--zone europe-west1-b \
--max-idle 60m \
--scopes cloud-platform \
--properties "spark:spark.driver.extraJavaOptions=-Xss4M,spark:spark.executor.extraJavaOptions=-Xss4M,spark:spark.driver.memory=100g,spark:spark.driver.maxResultSize=100g,spark:spark.task.maxFailures=20,spark:spark.kryoserializer.buffer.max=2g"

But I noticed that when the command was run, it actually generated spark.driver.memory=41g:

--properties=^|||^spark:spark.task.maxFailures=20|||spark:spark.driver.extraJavaOptions=-Xss4M|||spark:spark.executor.extraJavaOptions=-Xss4M|||spark:spark.speculation=true|||hdfs:dfs.replication=1|||dataproc:dataproc.logging.stackdriver.enable=false|||dataproc:dataproc.monitoring.stackdriver.enable=false|||spark:spark.driver.memory=41g|||spark:spark.driver.maxResultSize=100g|||spark:spark.kryoserializer.buffer.max=2g \
--initialization-actions=gs://hail-common/hailctl/dataproc/0.2.49/init_notebook.py \

However, my task aborted with error code 134, an out-of-memory error (with the 41g driver).

I noticed the Spark docs say I need to set the --driver-memory flag, but this didn't work via hailctl.

How can I set spark.driver.memory to 100g, and what is the limit on it?

Thanks for your time; any help is welcome.
Best.

Hello, @shuang

For some reason, I thought this was answered in a separate thread. Was it?

Otherwise, if I recall correctly, the --driver-memory setting can go up to 96G. If you have already tried that and it still errored out, I can look into this further for you.
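
In case it is useful, here is a rough sketch of how you could check what spark.driver.memory your cluster actually ended up with. It just reads the Dataproc cluster properties via the gcloud CLI, using the cluster name and region from your command:

# Sketch, not verified against your project: print the Spark properties
# Dataproc stored for this cluster, including spark:spark.driver.memory.
gcloud dataproc clusters describe hail-test \
  --region europe-west1 \
  --format="yaml(config.softwareConfig.properties)"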

Hi @kumarveerapen, sorry for my slow response, and thanks for your answer.

I realized the reason is that my master-machine-type was too small, which limited the available spark.driver.memory.
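
In case anyone else hits this: judging from the generated value (41g is roughly 80% of the 52 GB RAM on an n1-highmem-8), hailctl appears to derive spark.driver.memory from the master machine's memory, so the fix is simply to pick a bigger master rather than forcing the property by hand. A sketch of what I mean (the n1-highmem-32 master, with 208 GB RAM, is only an example; the rest is the command from my first post minus the explicit driver-memory property):

# Sketch: same cluster, larger master, letting hailctl derive a
# correspondingly larger spark.driver.memory instead of setting 100g by hand.
hailctl dataproc start hail-test \
  --master-machine-type n1-highmem-32 \
  --master-boot-disk-size 500 \
  --num-workers 2 \
  --worker-machine-type n1-highmem-16 \
  --worker-boot-disk-size 500 \
  --region europe-west1 \
  --zone europe-west1-b \
  --max-idle 60m \
  --scopes cloud-platform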

Great!