Submit script giving ModuleNotFoundError: No module named ‘gnomad’

I’m trying to run hailctl dataproc submit kmlt test.py, where test.py contains

from gnomad.resources.grch38.gnomad import DOWNSAMPLINGS

but I’m getting “ModuleNotFoundError: No module named ‘gnomad’”

I have the following for my HAIL_SCRIPTS in my bash_profile:
export HAIL_SCRIPTS=/Users/kristen/code/gnomad-constraint/gnomad_constraint:/Users/kristen/code/gnomad_methods/gnomad

Do you have any suggestions on how to resolve this error?

Hi @klarrich,

I think you need to pass gnomad to the --packages argument when you create the Dataproc cluster, so that it gets installed there, like so:

hailctl dataproc start <cluster-name> --packages gnomad

Hope that helps!

Thanks for your reply! To clarify, passing the repos to packages when starting the cluster still works fine, but I want the submit script to recognize changes made in the paths supplied to HAIL_SCRIPTS (which has worked in the past) for quicker iterations.

Sorry, I missed that when reading your question the first time!

What does the output of the following commands look like?

ls -R /Users/kristen/code/gnomad-constraint
ls -R /Users/kristen/code/gnomad_methods

Oh I see, I had lost an __init__.py file. Adding it back in works, thank you!
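
For anyone who lands here with the same error: instead of reading through the ls -R output by eye, a small script can flag directories that contain .py files but no __init__.py. This is only a rough sketch. The function name find_dirs_missing_init is made up for illustration and is not part of hailctl or gnomad, and it only inspects the local directory tree; it does not know anything about how hailctl packages the HAIL_SCRIPTS paths.

```python
import os
import sys


def find_dirs_missing_init(package_root):
    """Return directories under package_root that hold .py files but no __init__.py.

    Hypothetical helper for illustration only; not part of hailctl or gnomad.
    """
    missing = []
    for dirpath, dirnames, filenames in os.walk(package_root):
        # Do not descend into hidden or cache directories such as .git or __pycache__.
        dirnames[:] = [
            d for d in dirnames if not d.startswith(".") and d != "__pycache__"
        ]
        has_python = any(name.endswith(".py") for name in filenames)
        if has_python and "__init__.py" not in filenames:
            missing.append(dirpath)
    return sorted(missing)


if __name__ == "__main__":
    # Each command-line argument is treated as a package root to scan.
    for root in sys.argv[1:]:
        for path in find_dirs_missing_init(root):
            print(f"missing __init__.py: {path}")
```

Saved under any name and run with the two directories from HAIL_SCRIPTS above as arguments, it prints one line per candidate directory. A directory that only holds standalone scripts will also be listed, so treat the output as things to check rather than definite errors.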