Progress output when running on Jupyter notebook

tpoterba · September 27, 2022, 12:44pm

Yes, by default you’ll be running Spark with the configuration local[*], which will use all available CPUs.

I’d also recommend configuring memory as described here: Java Heap Space out of memory - #6 by danking

The default Spark memory settings might be throttling the computation a bit.
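
For reference, a minimal sketch of one common way to raise the Spark driver memory for a local Hail session in a notebook: set the `PYSPARK_SUBMIT_ARGS` environment variable before Hail starts the JVM. The `8g` value and the helper name `spark_submit_args` are assumptions for illustration; pick a value that fits your machine, and check the linked post for the exact settings recommended there.

```python
import os


def spark_submit_args(driver_memory: str = "8g") -> str:
    """Hypothetical helper: build a PYSPARK_SUBMIT_ARGS value.

    In local mode the computation runs inside the driver JVM, so the
    driver memory is the setting that matters. "8g" is only an
    illustrative default.
    """
    return f"--driver-memory {driver_memory} pyspark-shell"


# This must be set before Hail (and therefore the JVM) is initialized,
# so in a notebook it belongs in the first cell.
os.environ["PYSPARK_SUBMIT_ARGS"] = spark_submit_args("8g")

# Afterwards, initialize Hail as usual (requires Hail to be installed):
# import hail as hl
# hl.init()  # local mode defaults to local[*], i.e. all available CPUs
```

If the kernel has already started Hail, restart it before changing the variable; the setting is only read when the JVM launches.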
