I run a Hail pipeline and get a VEP error, but there is no error output from the VEP tool itself; Hail just says that VEP failed with exit code 2. Where is the real VEP error output written? And if it is not written anywhere, is there a way to find it, or to redirect VEP output to a new file? The error happens when mt.write() triggers VEP execution in the pipeline:
The question is related to another one that I asked here:
No, it’s not, and that’s the thing. I need to see the VEP error output while running the Hail pipeline, and it is not there. I can run VEP separately and it has very informative output in case of an error, but when it is run by the Hail pipeline nothing is shown except error code 2 and a message that VEP failed. It’s either a problem in Hail itself or in the Luigi pipeline that I am running, and I do not know which one, but the people who developed the pipeline are silent:
It all comes down to finding a way to debug this. The actual issue named in the title of the GitHub issue is not that relevant. The VEP output gets completely lost, and that is the main problem here.
Just read the other issue. That means that VEP isn’t printing any output when it errors. The next thing I would try is starting from a VEP command that works and iteratively changing options until you find the one that causes VEP to fail.
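A minimal sketch of that approach in Python, assuming the VEP invocation can be expressed as a list of arguments. The helper name `first_failing_option` is hypothetical, and the commands and options you pass in are whatever your own working VEP command uses; nothing here is part of Hail or VEP.

```python
import subprocess


def first_failing_option(base_cmd, extra_options):
    """Add option groups one at a time to a command that is known to work.

    base_cmd      -- list of arguments for a command that exits with 0
    extra_options -- list of option groups, each a list of arguments
    Returns (option_group, return_code, stderr) for the first group that
    makes the command fail, or None if every group succeeds.
    """
    cmd = list(base_cmd)
    for option in extra_options:
        candidate = cmd + list(option)
        # Capture stderr so that the tool's own error message is not lost.
        result = subprocess.run(candidate, capture_output=True, text=True)
        if result.returncode != 0:
            return option, result.returncode, result.stderr
        # The option is harmless, so keep it and try the next one on top.
        cmd = candidate
    return None
```

Each successful option is kept, so the command grows toward the failing configuration and the first option that breaks it is reported together with the captured stderr.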
Does the VEP configuration in the docs for hl.vep work for you?
Because I am now converting the pipeline to GRCh38, there is no version of the pipeline that actually works.
Yeah, so under "VEP Error output:" there is an empty line and nothing else is shown, as if the output was somehow lost.
Yeah, the very same VEP configuration does work when I run it locally in standalone mode. I just copy-paste the command, add the -i parameter to specify the input file, and it produces a correctly annotated file.
The version of Hail is 0.2.
Here I am also attaching the screenshot for how it looks in my case:
It fails without any output from VEP shown in the Hail log, same as in the previous comment. The output file does get created, so it is not a permissions problem with the output folder; otherwise I would expect the file not to be created at all. All VEP folders have permissions set to 777. So now I am stuck, and the main reason is that Hail swallows the VEP error output.
OK, I figured it out: the output does not go to the Hail log, but to the Spark work folder of the worker that executed the process. I looked at the stderr files in the respective Spark work folders and was able to find the specific VEP error there.
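For anyone else hitting this, here is a rough sketch of how the search could be scripted. It assumes a Spark standalone worker with the default work directory layout (`<work dir>/app-*/<executor id>/stderr`); the function name `find_stderr_matches` and the search keywords are my own choices, and the location of the work folder depends on your installation.

```python
from pathlib import Path


def find_stderr_matches(work_dir, needles=("vep", "error")):
    """Yield (path, line_number, line) for matching lines in executor stderr files.

    work_dir -- Spark worker work directory (assumed standalone layout:
                work_dir/app-*/<executor id>/stderr)
    needles  -- case-insensitive substrings to look for
    """
    lowered = [needle.lower() for needle in needles]
    for path in sorted(Path(work_dir).glob("app-*/*/stderr")):
        # errors="replace" avoids crashing on stray non-UTF-8 bytes in logs.
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                text = line.lower()
                if any(needle in text for needle in lowered):
                    yield path, number, line.rstrip("\n")
```

Calling `find_stderr_matches` with the path to the worker's work folder and printing each result gives the file, line number and text of every candidate line. This has to be run on each worker node, since every worker keeps its own work folder.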
Please do not use those JARs, they’re four months old. Hail development moves very quickly and we’ve made numerous improvements to Hail since then. We provide detailed instructions for the installation of Hail in a number of environments. If anything there is unclear, please ask questions here and we will improve the documentation.
Also, we recommend using hailctl dataproc to start and stop clusters. The cluster control scripts in that repository are not recommended.