Dear Hail team,
I am evaluating the use case for Hail at our Institute of Genetic Medicine here at Newcastle. As part of this, I am trying to replicate your GWAS (https://github.com/Nealelab/UK_Biobank_GWAS/tree/master/imputed-v2-gwas) and gather various performance and usage indicators.
I understand you performed this on Google Cloud Dataproc and, to keep things simple, I would like to do the same. Before I make a start, I would like to know about your experience:
How much did it cost (just the Google Cloud bill), and what Dataproc spec did you use to run this GWAS?
Apart from the very useful documentation on https://github.com/Nealelab, are there any other resources that would be useful?
Do you have any other tips or advice?