VEP custom annotation

Hi all,

I’m new to Hail. Is it possible to annotate from custom files/sources when using the ‘vep’ function in Hail?

Thanks in advance!

I don’t think I completely understand what you’re looking to do. It’s not possible to run VEP with configurations / plugins other than those we expose in the method parameters (e.g. CSQ vs the full JSON schema). However, it’s absolutely possible to annotate from text files like .bed / interval files, VCFs, or arbitrary delimited text files.
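
For the interval-file case, a minimal sketch might look like the following. The function name, the output field name and the BED path are made up for illustration, and the reference genome is assumed to be GRCh37; `hl.import_bed`, `hl.is_defined` and `annotate_rows` are the Hail calls doing the work.

```python
def flag_bed_overlap(mt, bed_path, field_name, reference_genome="GRCh37"):
    """Add a boolean row field that is True where a variant falls in a BED interval.

    mt is assumed to be a Hail MatrixTable with the usual `locus` row field.
    """
    import hail as hl  # imported lazily: Hail is only needed when this runs

    # import_bed returns a Table keyed by interval.
    bed = hl.import_bed(bed_path, reference_genome=reference_genome)

    # Indexing an interval-keyed table with a locus looks up an overlapping
    # interval; the result is missing when the locus is outside every interval.
    return mt.annotate_rows(**{field_name: hl.is_defined(bed[mt.locus])})
```

Something like `mt = flag_bed_overlap(mt, "regions.bed", "in_region")` would then add an `in_region` row field. If the BED file has a fourth column, `bed[mt.locus].target` should give access to its value instead of a plain flag.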

Can you share what those custom files/sources look like?


I have annotated some VCF files with VEP using the script below:

set -e
set -u
set -o pipefail

cd ~/projects/wes_10k

# Annotate every bgzipped VCF in test_data/ with VEP plus custom sources
for inputfile in test_data/*.vcf.gz; do
  outfile=$(basename "$inputfile")
  echo "processing file: $inputfile ..."
  perl ./vep -i "$inputfile" -o ./vep_annotated/"$outfile.gz" \
       --fork 24 \
       --cache \
       --offline \
       --cache_version 90 \
       --custom ~/.vep/annotation/gonl.SV.r5.vcf.gz,GONL,vcf,exact,0,AF \
       --custom ~/.vep/annotation/anon-SweGen_STR_NSPHS_1000samples_freq_hg19.vcf.gz,SWEGEN,vcf,exact,0,AF \
       --custom ~/.vep/annotation/gnomad.exomes.r2.0.1.sites.noVEP.vcf.gz,GNOMAD,vcf,exact,0,AF \
       --custom ~/.vep/annotation/ExAC.0.3.GRCh37.vcf.gz,ExAC,vcf,exact,0,AF \
       --custom ~/.vep/annotation/conserved_sorted_cleaned.bed.gz,CONSERVED,bed,overlap,0 \
       --custom ~/.vep/annotation/ddd-unaffected-parental-maf-2017-03-18.bed.gz,DDD_PARENT,bed,overlap,0 \
       --custom ~/.vep/annotation/enhancer_sorted_cleaned.bed.gz,ENHANCER,bed,overlap,0 \
       --custom ~/.vep/annotation/ESP6500SI_21012013_MAF.bed.gz,ESP6500SI,bed,overlap,0 \
       --custom ~/.vep/annotation/heart_sorted_cleaned.bed.gz,ENH_HEART,bed,overlap,0 \
       --custom ~/.vep/annotation/segmentaldup_USCS_hg19_sorted_cleaned.bed.gz,SEGMENTAL_DUP,bed,overlap,0 \
       --custom ~/.vep/annotation/TRF_USCS_hg19_sorted_cleaned.bed.gz,TRF,bed,overlap,0 \
       --custom ~/.vep/annotation/UK10K_COHORT_20130116_AF.bed.gz,UK10K,bed,overlap,0 \
       --flag_pick --vcf --force_overwrite --compress_output bgzip
done

But I would also like to integrate this step into my Hail workflow. Can I run VEP in Hail using the same settings?


No, this won’t work :(

It’s hard to make VEP totally customizable in Hail, because we need to know the schema of the resulting annotations. VEP doesn’t make that easy to know or guess, so we’ve had to work it out by guess-and-check for the standard + LOFTEE setup (the one Hail supports).
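
To make the schema problem concrete: with `--vcf`, VEP packs each annotation into a pipe-delimited CSQ string, and the field order is only spelled out in the free-text Description of the header line. Here is a small pure-Python sketch of what parsing that involves. The header and values are shortened and made up, and `GONL_AF` is only an assumed name for a custom field.

```python
import re


def csq_fields(header_line):
    """Return the CSQ sub-field names listed after 'Format:' in the header line."""
    match = re.search(r'Format: ([^">]+)', header_line)
    if match is None:
        raise ValueError("no 'Format:' section found in CSQ header line")
    return match.group(1).strip().split("|")


def parse_csq(info_value, fields):
    """Split a CSQ INFO value into one dict per comma-separated entry."""
    records = []
    for entry in info_value.split(","):
        values = entry.split("|")
        records.append(dict(zip(fields, values)))
    return records


# Illustrative header and value only; a real VEP header lists many more fields.
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
          'annotations from Ensembl VEP. Format: Allele|Consequence|SYMBOL|GONL_AF">')
fields = csq_fields(header)
records = parse_csq("T|missense_variant|GENE1|0.01,T|intron_variant|GENE2|", fields)
print(records[0]["Consequence"], records[0]["GONL_AF"])
```

Every extra `--custom` source or plugin changes that field list, which is why the resulting schema is hard to predict ahead of time.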

It would certainly be possible to run this yourself on the sites files, and join with the multi-sample VCFs later.
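
A rough sketch of that join, assuming the command-line VEP output is a sites VCF written with `--vcf` and that both files use GRCh37. The paths, the function names and the `vep_site` field name are placeholders, not anything Hail defines.

```python
def attach_site_annotations(mt, sites_ht):
    """Copy the INFO struct of a VEP-annotated sites table onto matching rows of mt.

    Both inputs are assumed to be keyed by (locus, alleles), which is how
    hl.import_vcf keys its output.
    """
    # Indexing a table with the MatrixTable's row key is Hail's join idiom;
    # rows with no match in sites_ht get a missing value.
    return mt.annotate_rows(vep_site=sites_ht[mt.row_key].info)


def main():
    import hail as hl  # imported lazily: only needed when the pipeline runs

    # Hypothetical paths: the multi-sample VCF and the sites-only VCF that
    # was annotated outside Hail.
    mt = hl.import_vcf("cohort.vcf.gz", reference_genome="GRCh37", force_bgz=True)
    sites = hl.import_vcf("vep_annotated/sites.vcf.gz",
                          reference_genome="GRCh37", force_bgz=True).rows()

    mt = attach_site_annotations(mt, sites)
    mt.rows().select("vep_site").show(5)
```

Calling `main()` runs the whole thing. Because the join is on locus and alleles, both files need to represent multi-allelic sites the same way (both split or both unsplit), otherwise some rows will not match. The raw CSQ strings would then sit in `mt.vep_site.CSQ`, provided the sites VCF has a CSQ INFO field.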

I see, many thanks for your quick reply. Anyhow, Hail lets me parse/process these annotations really well later in the workflow!

Thanks again!