"No space left on device" error

This means you don’t have enough local disk. Spark writes its shuffle intermediates to the directories named by spark.local.dir, which defaults to /tmp but is often overridden by the cluster manager, so the disk that filled up might not be /tmp. The real answer is: don’t use repartition. It is slow and relies on a Spark shuffle, which makes it prone to failure. Use the min_partitions argument to import_vcf instead.
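Here is a rough sketch of the min_partitions route, where the partition count is set at import time so no shuffle is needed afterwards. The helper pick_min_partitions, the wrapper name, and the 128 MiB target are my own assumptions for illustration; only hl.import_vcf with its min_partitions argument comes from Hail.

```python
def pick_min_partitions(total_bytes: int,
                        target_partition_bytes: int = 128 * 1024**2) -> int:
    """Hypothetical helper: input size divided by a target partition size, rounded up."""
    if total_bytes <= 0 or target_partition_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division without floating point.
    return max(1, -(-total_bytes // target_partition_bytes))


def import_with_min_partitions(vcf_path: str, min_partitions: int):
    """Import a VCF with at least `min_partitions` partitions, avoiding repartition."""
    # Imported lazily so the helper above can be used without Hail installed.
    import hail as hl

    return hl.import_vcf(vcf_path, min_partitions=min_partitions)
```

You would call it as import_with_min_partitions(path, pick_min_partitions(size_in_bytes)). The value is a lower bound, so you may end up with more partitions than you asked for.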

If you’re using preemptible workers, the executors are lost because Google preempts the machines and only later gives you replacements. Any shuffle output stored on a preempted worker’s local disk is lost with it, so shuffle operations (e.g. repartition) will fail in this environment.

I see you’re trying to reduce the partition count. Try naive_coalesce instead: it merges adjacent partitions without a shuffle.
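A minimal sketch of that swap, assuming mt is a Hail MatrixTable (or Table). The function name coalesce_if_needed is hypothetical; n_partitions and naive_coalesce are the Hail methods it relies on.

```python
def coalesce_if_needed(mt, max_partitions: int):
    """Reduce the partition count without a shuffle.

    `mt` is assumed to be a Hail MatrixTable or Table, both of which
    provide n_partitions() and naive_coalesce(max_partitions).
    """
    if max_partitions < 1:
        raise ValueError("max_partitions must be at least 1")
    if mt.n_partitions() <= max_partitions:
        # Already small enough; leave the dataset untouched.
        return mt
    # Merges adjacent partitions only, so no data moves between workers.
    return mt.naive_coalesce(max_partitions)
```

Because naive_coalesce only glues neighbouring partitions together and does not rebalance, the result can be uneven if the original partitions were. That is usually an acceptable trade for skipping the shuffle.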

You can also raise min_block_size (given in MB) in hl.init to force your blocks to be larger, which gives you fewer partitions when the file is read.
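Roughly, that looks like the sketch below. The wrapper name and the 256 MB default are assumptions I picked for illustration, so tune the value to your data. hl.init has to run before any other Hail call in the session.

```python
def init_with_larger_blocks(min_block_size_mb: int = 256) -> None:
    """Start Hail with a larger minimum file block size (in MB)."""
    if min_block_size_mb < 1:
        raise ValueError("min_block_size_mb must be at least 1")
    # Imported lazily so this module loads even where Hail is absent.
    import hail as hl

    # Larger blocks mean fewer, bigger partitions when files are read.
    hl.init(min_block_size=min_block_size_mb)
```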