Initialization scripts for Hail on AWS

Hi,

I am trying to add initialization scripts on AWS because I get an error when running your Hail 0.1 pipeline (I know it is not supported, but I still need it, since 0.2 is not yet fully released on your website):

ERROR: HailException: could not open file: /vep/vep-gcloud-grch37.properties (No such file or directory)

This happens when the ‘run_vep’ function is run:

vds = run_vep(vds, genome_version=args.genome_version, block_size=args.vep_block_size)

So I suppose this is because VEP was not installed. Are there any scripts I could use to set up the environment outside Google Cloud, or should I just install VEP (and probably other things) manually?

I tried just running

./gcloud_dataproc/vep_init/run_hail_vep85_GRCh37_vcf.sh

But it fails since it can’t find another file:

/vep/variant_effect_predictor/variant_effect_predictor.pl

I’m a bit confused. What pipeline do you mean? And what do you mean when you say 0.2 is not fully released on our website?

I mean this one: https://github.com/macarthur-lab/hail-elasticsearch-pipelines/tree/master/hail_scripts

It seems the pipeline is not yet complete for Hail 0.2. That is the only reason I am struggling with Hail 0.1: I need this pipeline.

Ah ok, the pipeline you linked is not actually maintained by us. It’s from another group that uses Hail, not the Hail team, so we don’t know what’s in it or how it works. That’s also not our website; ours is hail.is.

I don’t believe there are any VEP setup scripts for AWS. You’d have to install VEP on the machines yourself and provide all the configuration described in the 0.1 docs for VariantDataset.vep (if you are using 0.1): https://hail.is/docs/0.1/hail.VariantDataset.html#hail.VariantDataset.vep
Since 0.1 is deprecated, though, you’d have a much better experience with 0.2.
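
If it helps, here is a rough sketch of a check you could run on a node before starting the pipeline, to confirm the files it expects are actually there. The two paths are simply the ones from the error messages earlier in this thread. The helper name missing_paths is made up for illustration and is not part of Hail or of that pipeline.

```python
import os

# Paths copied from the error messages earlier in this thread (assumption:
# these are the locations the pipeline expects). Adjust them to wherever
# VEP and its properties file actually live on your machines.
REQUIRED_PATHS = [
    "/vep/vep-gcloud-grch37.properties",
    "/vep/variant_effect_predictor/variant_effect_predictor.pl",
]


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]


# Report what is missing; an empty list means every path was found.
for path in missing_paths(REQUIRED_PATHS):
    print("missing: " + path)
```

This only tells you whether the files are present. It does not check that VEP itself runs or that the properties file is valid.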

The run_vep function is defined within that particular project and is not part of Hail 0.1, so you’d have to figure out how it works if you wanted to use it as is. It may depend on Google Cloud; I don’t know, because our team didn’t write it.


Oh, I see, sorry about that. I’ll ask the other team.