Does Hail provide any command or interface for printing logs that show the current status or progress of a job? For instance, I am going to run Hail on the UKBB RAP for 500,000 samples, and I need to track what percentage of the work is done and how much is left. I would appreciate any suggestions. Thank you.
Hail uses Apache Spark for distributed computing. Apache Spark will give you a progress bar that shows you how many partitions remain in the current stage. However, in general, Spark cannot tell you how many more stages are left.
If you’re looking for detailed information about what’s happening in the cluster, the Apache Spark status page (the Spark web UI) is the right place to look. By default it is served on port 4040 of the node running the Spark driver, which is usually the master node of the cluster.
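If you would rather have progress as text than open the status page in a browser, the same server also exposes Spark's monitoring REST API under /api/v1. Here is a rough sketch, not something Hail provides: it assumes the UI is reachable at http://localhost:4040 (adjust the host and port for your cluster), and the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON response."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize_active_stages(stages):
    """Return (stage_id, done_tasks, total_tasks, percent) for each ACTIVE stage.

    `stages` is the list returned by the /stages endpoint of the REST API.
    """
    summary = []
    for stage in stages:
        if stage.get("status") != "ACTIVE":
            continue
        total = stage.get("numTasks", 0)
        done = stage.get("numCompleteTasks", 0)
        percent = 100.0 * done / total if total else 0.0
        summary.append((stage.get("stageId"), done, total, percent))
    return summary


def print_progress(base_url="http://localhost:4040"):
    """Print task-level progress for every active stage (base_url is an assumption)."""
    for app in fetch_json(f"{base_url}/api/v1/applications"):
        stages = fetch_json(f"{base_url}/api/v1/applications/{app['id']}/stages")
        for stage_id, done, total, percent in summarize_active_stages(stages):
            print(f"stage {stage_id}: {done}/{total} tasks ({percent:.1f}%)")
```

Calling print_progress() from a second session on the driver node while the pipeline runs prints one line per active stage. Like the progress bar, this only covers stages that are currently running, so it cannot tell you how many stages are still to come.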
Hail also writes everything it’s doing to a Hail log file. The location of that file is printed when Hail initializes. You can also set a specific location using the log argument of hl.init. Be aware that the log file is usually very large and hard for users to interpret.
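For example, here is a minimal sketch of pointing the log at a location you choose. The directory and the helper names are placeholders of my own; the only Hail call used is hl.init with its log argument.

```python
import os
from datetime import datetime


def make_log_path(log_dir, prefix="hail"):
    """Build a timestamped log file path (hypothetical helper, not part of Hail)."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return os.path.join(log_dir, f"{prefix}-{stamp}.log")


def init_hail_with_log(log_dir):
    """Initialize Hail so that it writes its log file under log_dir."""
    import hail as hl  # imported here so the path helper works without Hail installed

    log_path = make_log_path(log_dir)
    hl.init(log=log_path)
    return log_path
```

With the log in a known place, you can follow it from a terminal with something like tail -f while the job runs.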
Can you share more specifics about what you want to know about your pipeline?