Hi Tim, thanks. For this task we plan to use Hail 0.1.
If you have any ideas about this, I would really appreciate your help. I found a similar question on the forum, but it has no answer yet.
With the same settings, processing just chr11–chr20 succeeds; that subset is around 1/3 of my whole dataset.
Then I tried to process the whole dataset. I increased the driver memory (`spark.driver.memory`) to 600g and set `spark.driver.maxResultSize=180g`, with `--master-machine-type n1-highmem-96` and `--worker-machine-type n1-highmem-32`, but it still reports an error at stage 2, after finishing half of the work (30000 of 60000 tasks).
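For reference, this is roughly how I create the cluster and pass those settings (a sketch: the cluster name, region, and worker count below are placeholders, not my actual values):

```shell
# Hypothetical Dataproc cluster creation with the Spark properties above.
# Cluster name, region, and --num-workers are placeholders.
gcloud dataproc clusters create my-hail-cluster \
    --region us-central1 \
    --master-machine-type n1-highmem-96 \
    --worker-machine-type n1-highmem-32 \
    --num-workers 20 \
    --properties "spark:spark.driver.memory=600g,spark:spark.driver.maxResultSize=180g"
```

The error appears with exactly these memory settings, so I am not sure whether the driver is still the bottleneck.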