Running hail locally - number of cores

ag14774 · March 27, 2023, 4:23pm

It is currently unclear from the docs how to initialize Hail correctly to maximize performance. In my case I am running Hail locally on a large machine with 1TB of memory and 96 cores. 1) Does Hail automatically use all the available memory? 2) Does Hail automatically use all available cores? What do I need to adjust to make Hail use all resources? Are there any particular configurations to pass to hl.init()?

Thank you

tpoterba · March 27, 2023, 4:27pm

Hail running on Apache Spark in local mode will use all available cores by default, but you can load the Spark web UI (the URL is printed during initialization) and look at the Executors pane to confirm everything is getting used. To request 96 local cores explicitly, use hl.init(master='local[96]').

You should also make sure that you set memory with the PYSPARK_SUBMIT_ARGS environment variable.
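A minimal sketch of how the two settings above might be combined, not a definitive recipe. The `8g` value and the helper names `spark_submit_args` and `init_hail` are illustrative assumptions; pick a memory size that suits your machine. In local mode the executors run inside the driver JVM, so `--driver-memory` is the setting that matters, and the variable has to be set before Hail starts the JVM.

```python
import os


def spark_submit_args(driver_memory: str) -> str:
    # Hypothetical helper: builds the value for PYSPARK_SUBMIT_ARGS.
    # The trailing "pyspark-shell" token is required by PySpark.
    return f"--driver-memory {driver_memory} pyspark-shell"


# Must be set before hl.init() launches the JVM.
# "8g" is a placeholder, not a recommendation.
os.environ["PYSPARK_SUBMIT_ARGS"] = spark_submit_args("8g")


def init_hail(cores: int) -> None:
    # Hypothetical wrapper: explicitly request `cores` local cores.
    # Imported here so the environment variable is already in place.
    import hail as hl

    hl.init(master=f"local[{cores}]")
```

Calling `init_hail(96)` would then start Hail with 96 local cores, which you can confirm in the Executors pane of the Spark web UI.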

ag14774 · March 27, 2023, 4:49pm

Thanks for the reply!

In the “Environment” tab it also says this under “Resource Profiles”:

Executor Reqs:
	cores: [amount: 1]
Task Reqs:
	cpus: [amount: 1.0]

In the Executors pane though it shows all cores. Does that mean it is all ok and it is using all cores?

tpoterba · March 28, 2023, 10:14am

Should be, yes!

Spark sometimes has issues with resource contention when running with such a high degree of multithreading in a single Java process, so post here if anything else looks funky as you start running computations.
