Tell Hail to make different use of Spark processing


#1

Hello,

I would like to know if there is an option to tell Spark to run entirely on disk, or to use a different ratio of in-memory to on-disk processing. Is there any way to easily perform benchmarking?

Thanks,
Pedro


#2

Hi Pedro,

Spark configuration is pretty tough, and we’re not especially good at it (we’re also not sure who is). You can change the memory settings described here: https://spark.apache.org/docs/2.2.0/configuration.html
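For example, the unified memory-management settings in that page control how much of each executor's heap Spark devotes to in-memory execution and caching versus spilling to disk. A minimal sketch of a `spark-defaults.conf` fragment, with illustrative values you would tune for your own cluster:

```
# spark-defaults.conf -- illustrative values, tune for your cluster

# Total heap per executor
spark.executor.memory         8g

# Fraction of the heap used for execution and storage combined
# (the rest goes to user data structures and internal metadata;
# Spark 2.x default is 0.6)
spark.memory.fraction         0.6

# Portion of that fraction reserved for cached (storage) data,
# immune to eviction by execution (default 0.5)
spark.memory.storageFraction  0.5
```

The same properties can be passed per-job with `--conf key=value` on `spark-submit`. Note that Spark always spills to disk when these budgets are exceeded, so lowering `spark.memory.fraction` shifts work toward disk rather than switching it off entirely.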

Regarding benchmarking, we’re building a benchmarking suite, but it’s not yet at a point where it’s especially useful beyond our own development work.