I have a job that keeps running out of temp disk space, so I specified the tmp_dir and local_tmpdir arguments to hl.init() in order to use a disk with more space. I even print the value of hl.tmp_dir() to check that the value has been set.
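Roughly what I’m doing (the path is a placeholder for my actual scratch disk):

import hail as hl

temp_dir = "/path/with/space"  # placeholder: a disk with plenty of room
hl.init(tmp_dir=temp_dir, local_tmpdir=temp_dir)
print(hl.tmp_dir())  # prints the path above, so the setting was accepted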
However, no matter what I set tmp_dir to, my job keeps failing with messages like

Error summary: FileNotFoundException: /tmp/blockmgr-209b518a-3d50-424a-a9b8-7e1b6509da6c/17/shuffle_0_32640_0.data.d76ae3c3-b5ce-4a45-ad31-671207b8329a (No space left on device)

and indeed, when I check /tmp while the job is running, I can see that Hail is writing lots of data there. Is there a way to redirect all temporary files to a different location?
Many thanks!
I believe I can answer this question myself:
To redirect all temporary files to a particular location, you need to set tmp_dir, local_tmpdir, and also spark.local.dir:
import hail as hl

temp_dir = "/my/temp/dir"  # a directory on the disk with more space

hl.init(
    tmp_dir=temp_dir,        # Hail's temporary directory (network-visible in cluster setups)
    local_tmpdir=temp_dir,   # Hail's node-local temporary directory
    spark_conf={"spark.local.dir": temp_dir},  # Spark scratch space (shuffle and spill files)
)
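To double-check that Spark actually picked up the setting, you can read it back from the Spark context backing Hail (hl.spark_context() returns the pyspark SparkContext; getConf() is the standard pyspark way to inspect its configuration):

print(hl.spark_context().getConf().get("spark.local.dir"))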
Hi @hannes_brt,
Thanks for the detailed investigation. We’re discussing automatically setting spark.local.dir to local_tmpdir. That’s a pretty reasonable thing to do.
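In the meantime, a small wrapper can emulate that behavior. A minimal sketch, assuming you always want the three settings to agree (init_with_shared_tmp is a made-up name, not a Hail API):

import hail as hl

def init_with_shared_tmp(tmp_path, **kwargs):
    # Point Hail's tmp_dir and local_tmpdir, plus Spark's scratch
    # space (spark.local.dir), at the same directory.
    spark_conf = dict(kwargs.pop("spark_conf", None) or {})
    spark_conf.setdefault("spark.local.dir", tmp_path)
    hl.init(tmp_dir=tmp_path, local_tmpdir=tmp_path,
            spark_conf=spark_conf, **kwargs)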
Hi Dan, has this problem been solved in more recent versions?
Hi @Zhiwen-Owen-Jiang,
Unfortunately not - you still need to set spark.local.dir yourself.
I’ll make an effort to address this in either the upcoming release or the next.
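One caveat: per the Spark configuration docs, on cluster deployments spark.local.dir is overridden by environment variables set by the cluster manager (SPARK_LOCAL_DIRS on standalone, LOCAL_DIRS on YARN), so there you have to set the variable on the workers instead. In local mode, setting it in the driver environment before the JVM starts should work:

import os
os.environ["SPARK_LOCAL_DIRS"] = "/my/temp/dir"  # must happen before Spark's JVM starts

import hail as hl
hl.init()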
Hi @ehigham, thanks for the clarification. I have manually specified this in my config.