Hail utilities for gnomAD on a local cluster


I have a question about using Hail after variant calling on WGS datasets.

  1. I have millions of variants called from WGS datasets following the GATK Best Practices. We obtained per-chromosome summary statistics for them using Hail. Next, we want to identify the novel and common variants by checking them against gnomAD v3 with the gnomAD utilities for Hail.

  2. Could you please tell me whether we can run this check on our local cluster, without cloud services, using the Hail utilities for gnomAD?

Thanks in advance

You can certainly download the gnomAD data locally, and join that with your data. I think the question in particular is whether it’s possible to parameterize the gnomad data path, which would let you use the gnomad library to annotate, right?

@ch-kr do you know if this is possible right now?

Thanks for reaching out! We don’t have a way to parameterize the gnomAD data paths yet. Would the Cloud Storage Connector be helpful here? https://hail.is/docs/0.2/cloud/google_cloud.html#reading-from-google-cloud-storage

If you plan to use gnomad more than once, the cloud storage connector is likely going to be much more expensive than downloading gnomad once, maintaining your own copy, and annotating with gnomad manually (not using the gnomad library).