I’ve had a couple of issues when performing a lot of aggregations: the aggregate_intermediates directory in the location specified by tmp_dir when Hail is initialised fills up, and I get the following error:
2022-07-26 17:28:14 TaskSetManager: WARN: Lost task 874.0 in stage 336.0 (TID 529110) (192.168.252.56 executor 40): org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /aggregate_intermediates is exceeded: limit=1048576 items=1048576
If I go into my file system and manually clear this directory, I can run aggregations again.
Is there a way to ensure that this subdirectory of tmp_dir is cleared at intervals?
I have experienced the same issue from time to time, and I manually delete the directory to fix it. Is there a Spark option to increase the item limit on this directory? Also, is it possible to use a different path for each user, so that deleting aggregate_intermediates for one user does not affect other users’ jobs?
This is really a Hail issue – we need to be eagerly cleaning up files when they’re no longer necessary.
@AB.Hail - there’s no Spark option here, since it’s a Hail parameter. You can set the Hail temp dir on init with:

hl.init(...other args..., tmp_dir='...')

You can set this to a blob store path (a Google or S3 bucket, etc.) and that will work fine if you’re running on the cloud.
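To also address the per-user question above, one option is to build a tmp_dir path that includes the current username, so each user’s aggregate_intermediates lives under its own prefix. A minimal sketch (the bucket name `my-bucket` is a placeholder, not a real path; the `hl.init` call is shown commented out since it needs a running Hail installation):

```python
import getpass

# Per-user temporary directory; "my-bucket" is a hypothetical bucket name.
user = getpass.getuser()
tmp_dir = f'gs://my-bucket/hail-tmp/{user}'

# Pass it to Hail at initialisation:
# import hail as hl
# hl.init(tmp_dir=tmp_dir)
print(tmp_dir)
```

With this scheme, clearing one user’s directory leaves other users’ intermediates untouched.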
Thank you @AB.Hail and @tpoterba
I resorted to manually removing the contents of aggregate_intermediates, as I was running sample_qc so many times in one script that I was running out of space in my tmp_dir.
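Until Hail cleans these files up eagerly, that manual cleanup can be scripted between pipeline stages. A minimal sketch for a local-filesystem tmp_dir (for an HDFS path you would instead shell out to something like `hdfs dfs -rm -r`; the directory names below are illustrative only):

```python
import os
import shutil
import tempfile

def clear_dir_contents(path):
    """Remove everything inside `path` but keep the directory itself."""
    for entry in os.listdir(path):
        full = os.path.join(path, entry)
        if os.path.isdir(full):
            shutil.rmtree(full)
        else:
            os.remove(full)

# Demonstration on a throwaway directory standing in for
# <tmp_dir>/aggregate_intermediates:
demo = tempfile.mkdtemp()
inter = os.path.join(demo, 'aggregate_intermediates')
os.makedirs(inter)
open(os.path.join(inter, 'part-0'), 'w').close()

clear_dir_contents(inter)
print(os.listdir(inter))  # -> []
shutil.rmtree(demo)
```

Calling this between repeated sample_qc runs keeps the intermediates directory from hitting the HDFS item limit.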