In case you did not see my question in the reply to my other post, I am asking it again here.
I updated Hail to version 0.2.56 and tested the run_combiner() function with 100, 1k, and 10k gvcf files (average size: 6 GB each). The runs for 100 and 1k gvcfs completed successfully, after multiple retries of failed subtasks showing messages similar to those I reported before. The run time with the new version was also faster than with the previous Hail version. Thanks very much for your and your team's work.
The job for 10k gvcfs failed, however. The first round of 100 batches, each merging 100 gvcfs into a sparse MT, completed successfully, but the job failed at the start of the second round with the error message below. I would really appreciate any advice on how to resolve this issue.
– Caused by: java.io.IOException: All datanodes [DatanodeInfoWithStorage[*******,DISK]] are bad
I also found the 100 sparse MTs generated by the first round of the run_combiner() run in my temp storage.
Is it possible to combine these 100 MTs into 1 MT with any other Hail function?
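To clarify what I mean by "rounds": as I understand it, the combiner merges inputs hierarchically, batch by batch, until one MT remains. Here is a plain-Python sketch of that pattern (no Hail involved; `merge_pair` is just a stand-in for whatever step actually merges two sparse MTs, and each "MT" is modeled as a list of sample names):

```python
from functools import reduce

def merge_pair(a, b):
    # Stand-in for a real merge of two sparse MatrixTables;
    # here each "MT" is simply a list of sample names.
    return a + b

def combine_in_rounds(mts, batch_size=100):
    """Merge intermediates in rounds, batch_size at a time,
    mimicking the combiner's hierarchical strategy."""
    while len(mts) > 1:
        mts = [reduce(merge_pair, mts[i:i + batch_size])
               for i in range(0, len(mts), batch_size)]
    return mts[0]

# 10,000 inputs: round 1 yields 100 intermediates, round 2 yields 1.
gvcfs = [[f"s{i}"] for i in range(10000)]
final = combine_in_rounds(gvcfs)
```

So in my case, round 1 finished and left the 100 intermediates in temp storage; what I am looking for is a supported Hail way to run that final round 2 on them directly.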