Hail doesn't respect `tmp_dir`

I have a job that keeps running out of temporary disk space, so I passed the `tmp_dir` and `local_tmpdir` arguments to `hl.init()` to point Hail at a disk with more space. I even print the value of `hl.tmp_dir()` to confirm that the setting has taken effect.
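
For reference, here is roughly what I'm doing (a minimal sketch; the path is a placeholder for the disk with more space):

```python
import hail as hl

temp_dir = "/path/with/more/space"  # placeholder path

hl.init(tmp_dir=temp_dir, local_tmpdir=temp_dir)
print(hl.tmp_dir())  # prints the path I set, so the argument is being picked up
```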

However, no matter what I set `tmp_dir` to, my job keeps failing with messages like `Error summary: FileNotFoundException: /tmp/blockmgr-209b518a-3d50-424a-a9b8-7e1b6509da6c/17/shuffle_0_32640_0.data.d76ae3c3-b5ce-4a45-ad31-671207b8329a (No space left on device)`. Indeed, when I check `/tmp` while the job is running, I can see that Hail is writing lots of data there.

Is there a way to redirect all temporary files to a different location?

Many thanks!

I believe I can answer this question myself:

To redirect all temporary files to a particular location, you need to set `tmp_dir`, `local_tmpdir`, and also `spark.local.dir`:

```python
import hail as hl

temp_dir = "/my/temp/dir"

hl.init(
    tmp_dir=temp_dir,                          # Hail's temp dir for intermediate files
    local_tmpdir=temp_dir,                     # node-local temp dir
    spark_conf={"spark.local.dir": temp_dir},  # Spark scratch space (shuffle files, block manager)
)
```
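
To sanity-check that the setting actually reached Spark, you can read the config back from the running session (a quick sketch; assumes the Spark backend):

```python
sc = hl.spark_context()
print(sc.getConf().get("spark.local.dir"))  # should print /my/temp/dir
print(hl.tmp_dir())                         # Hail-level temp dir
```

One caveat worth knowing: `spark.local.dir` must be set before the SparkContext is created, which is why passing it through `hl.init()`'s `spark_conf` works. On YARN or standalone clusters, the cluster manager's `LOCAL_DIRS` / `SPARK_LOCAL_DIRS` environment variables override this setting.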

Hi @hannes_brt ,

Thanks for the detailed investigation. We’re discussing automatically setting the Spark local dir to `local_tmpdir`. That’s a pretty reasonable thing to do.
