Empty matrix table with vcf_combiner.run_combiner

I'm new to Hail and am trying run_combiner with a small set of input files (<100) on a small Spark cluster. The job runs, but the combined matrix table that is written seems to be just a skeleton that doesn't contain the data (it's only a few hundred KB), and I get an error of the form

Error summary: FileNotFoundException: File out.mt/rows/rows/parts/part-0115-1-115-0-9ea728f3-255e-ea11-92d8-f9eca3fe3045 does not exist

when I try to read it back. Looking in $SPARK_WORKER_DIR on each node, I am able to find out.mt directories that contain gigabytes of data.

Is there something wrong with the Spark or Hail setup? I have:

  • $SPARK_WORKER_DIR & $SPARK_LOCAL_DIRS: local to each node
  • For the call to hl.init, tmp_dir should be globally visible, but local_tmpdir is local.
  • For the call to run_combiner, out_file and tmp_path should be globally visible.
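To make the path setup above concrete, here is a minimal sketch of how I am sanity-checking the paths before calling run_combiner. It rests on assumptions: the helper names (looks_globally_visible, check_combiner_paths) and the list of "shared" URI schemes are my own and not part of Hail, and a bare path such as out.mt is treated as suspect because it may resolve against each worker's local filesystem, which would fit the data turning up under $SPARK_WORKER_DIR. A scheme-less path on a shared mount such as NFS would be flagged too, even though it may be fine.

```python
from urllib.parse import urlparse

# Assumption: URI schemes that usually point at storage every node can see.
# Adjust for your cluster; this list is illustrative, not exhaustive.
SHARED_SCHEMES = {"hdfs", "gs", "s3", "s3a", "wasb", "wasbs", "abfs", "abfss"}


def looks_globally_visible(path: str) -> bool:
    """Return True if the path carries an explicit shared-storage scheme.

    Paths without a scheme (e.g. "out.mt") and file:// paths return False,
    since each node may resolve them on its own local disk.
    """
    return urlparse(path).scheme.lower() in SHARED_SCHEMES


def check_combiner_paths(out_file: str, tmp_path: str) -> list:
    """Return the names of the arguments that do not look globally visible."""
    candidates = {"out_file": out_file, "tmp_path": tmp_path}
    return [name for name, path in candidates.items()
            if not looks_globally_visible(path)]


def run_checked(sample_paths, out_file, tmp_path, tmp_dir, local_tmpdir):
    """Hypothetical wrapper: refuse to start if the output paths look node-local."""
    suspect = check_combiner_paths(out_file, tmp_path)
    if suspect:
        raise ValueError(f"These paths may be node-local: {suspect}")

    import hail as hl  # imported lazily so the checks above work without Hail

    # tmp_dir is meant to be globally visible; local_tmpdir is local to each node.
    hl.init(tmp_dir=tmp_dir, local_tmpdir=local_tmpdir)
    hl.experimental.run_combiner(sample_paths, out_file=out_file, tmp_path=tmp_path)
```

With this check, check_combiner_paths("out.mt", "hdfs://namenode/tmp/combiner") reports out_file as suspect, which is the case I suspect I am hitting.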