This means you don’t have enough local disk. I don’t know where Spark stores its shuffle intermediates, but it might not be `/tmp`. The real answer is: don’t use `repartition`; it is slow, uses a Spark shuffle, and is thus prone to failure. Use the `min_partitions` argument to `import_vcf`.
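
For example (the path and partition count below are placeholders, not from the original question), requesting the partitioning at import time avoids the shuffle entirely:

```python
import hail as hl

hl.init()

# Ask for at least 5000 partitions when reading the VCF, so a later
# shuffle-based repartition is unnecessary. Path and count are examples.
mt = hl.import_vcf('gs://my-bucket/my-data.vcf.bgz', min_partitions=5000)
```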
If you’re using preemptible workers, then the executors are lost because Google is preempting them and eventually giving you a replacement. Shuffle operations (e.g. `repartition`) will fail in this environment.
I see you’re trying to reduce the partition count; try `naive_coalesce` instead.
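
A minimal sketch (the target of 1000 partitions is an arbitrary example):

```python
# Collapse to at most 1000 partitions by merging adjacent ones;
# unlike repartition, this does not trigger a Spark shuffle.
mt = mt.naive_coalesce(1000)
```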
You can also change `min_block_size` on `hl.init` to force your blocks to be larger.
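
For instance (128 is an arbitrary example value; `min_block_size` is specified in megabytes):

```python
import hail as hl

# Larger file blocks mean fewer, larger partitions at import time.
hl.init(min_block_size=128)
```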