Densify running out of memory

chrisvittal · June 12, 2020, 6:10pm

We scraped everything using gcloud dataproc clusters diagnose and it seemed that results were being put in HDFS.

tpoterba · June 12, 2020, 6:17pm

oh, totally, it’s the aggs per partition: https://github.com/hail-is/hail/blob/10e525a56787dd00b6aaed62d856c1eabd3c0f3a/hail/src/main/scala/is/hail/expr/ir/TableIR.scala#L1666-L1674

tpoterba · June 12, 2020, 6:17pm

What was the cluster config? non_preempts, preempts?

chrisvittal · June 12, 2020, 6:19pm

It’s 300 workers, all with lots of disk, so we’re not running out of disk. I suggested fewer workers with more cores, and that failed too.

chrisvittal · June 26, 2020, 3:26pm

Update here to not forget. The resource we are running out of in the latest bug is definitely memory. I’ve tracked the change that introduced the regression to https://github.com/hail-is/hail/pull/8794 and we are continuing to investigate.

tpoterba · June 30, 2020, 1:01am

This PR fixes the problem:
