Hail off-heap memory exceeded maximum threshold

I’m getting this error while running a pipeline (gnomad_qc/subpop_analysis.py at subpops · broadinstitute/gnomad_qc · GitHub):
Error summary: HailException: Hail off-heap memory exceeded maximum threshold: limit 3.70 GiB, allocated 3.70 GiB
Any suggestions on what I can try changing? Let me know if I should send a log.

Do you have the full stack trace?

We’ve been working through a similar issue with Wenhan and may have a solution.

Hi Hail Team!

I’m posting on this thread rather than starting a new one because I just got the same error using very similar code and the same MT. It failed at the densify step (gnomad_qc/compare_freq.py at ac4a9c2bc4dc0bcb38cd43a0e89fee801c9e6f45 · broadinstitute/gnomad_qc · GitHub).

Kristen said that her error was fixed by using Hail version 0.2.84-2817a9851cdf. However, that is the version I used for my test, and I used the same cluster config that she did. One difference is that I added a split multi to the code.

Any thoughts on what I can change? I can send the log over email or Slack.

Thank you!

Hi @jkgoodrich,
Could you email the log to hail-team?

Yup, done. Thank you!

Closing the loop here. We were able to run the pipeline successfully by passing the --off-heap-memory-fraction argument to hailctl dataproc start. I am continuing to investigate memory management in explodes (used by sparse_split_multi), since the split multi was the only difference between this failing pipeline and the one that Kristen ran successfully.
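
For anyone landing here later, below is a minimal sketch of how the cluster start command could be assembled with that flag. The cluster name "my-cluster" and the value 0.7 are placeholders of my own; the thread does not say which fraction was actually used, and the helper function and its range check are hypothetical conveniences, not part of hailctl.

```python
import shlex


def build_start_command(cluster_name, off_heap_fraction):
    """Build the argv for `hailctl dataproc start` with an off-heap fraction.

    The (0, 1) range check is a sanity check assumed for this sketch only.
    """
    if not 0.0 < off_heap_fraction < 1.0:
        raise ValueError("off_heap_fraction should be between 0 and 1")
    return [
        "hailctl", "dataproc", "start", cluster_name,
        "--off-heap-memory-fraction", str(off_heap_fraction),
    ]


# Placeholder values; pick a fraction that suits your workers' memory.
cmd = build_start_command("my-cluster", 0.7)
print(shlex.join(cmd))
```

The sketch only prints the command so it can be reviewed before running it in a shell. The right fraction probably depends on the machine type and the pipeline, so treat any specific number as something to tune.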