VEP command abruptly stops running

Hi,

I am having some trouble annotating variants using VEP. I am running hail locally and using the following commands:

MT = 'Analyses/scz_maf_filtered_rare.mt'
ht = hl.read_matrix_table(MT).rows()
ht_vep = hl.vep(ht, "vep_configuration_wonu.json")
ht_vep.write("annotations/vep_annotate_wonu.ht", overwrite=True)

My json configuration file below is based on the example here: https://hail.is/docs/0.2/methods/genetics.html?highlight=vep#hail.methods.vep with a different plugin and additional variables from the plugin in the json schema.

{
    "command": [
        "/media/veracrypt10/Analyses/ensembl-vep/vep",
        "--format", "vcf",
        "__OUTPUT_FORMAT_FLAG__",
        "--everything",
        "--allele_number",
        "--no_stats",
        "--cache", "--offline",
        "--port", "3337",
        "--minimal",
        "--assembly", "GRCh37",
        "--plugin", "dbNSFP,/media/veracrypt10/Analyses/ensembl-vep/dbNSFPv3.5a/dbNSFP_hg19.gz,SIFT_score,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_pred,LRT_score,LRT_pred,MutationTaster_score,MutationTaster_pred,MutationAssessor_score,MutationAssessor_pred,PROVEAN_score,PROVEAN_pred",
        "--dir_plugins", "/home/wonu/.vep/Plugins/",
        "--output_file", "STDOUT"
    ],
    "env": {
        "PERL5LIB": "/home/wonu/.vep/Plugins/dbNSFP.pm"
    },
    "vep_json_schema": "Struct{assembly_name:String,allele_string:String,ancestral:String,ensp:String,colocated_variants:Array[Struct{af:Float64,afr_af:Float64,amr_af:Float64,eas_af:Float64,eur_af:Float64,sas_af:Float64,aa_af:Float64,ea_af:Float64,gnomAD_AF:Float64,gnomAD_AFR_AF:Float64,gnomAD_AMR_AF:Float64,gnomAD_ASJ_AF:Float64,gnomAD_EAS_AF:Float64,gnomAD_FIN_AF:Float64,gnomAD_NFE_AF:Float64,gnomAD_OTH_AF:Float64,gnomAD_SAS_AF:Float64,MAX_AF:Float64,MAX_AF_POPS:Float64,aa_allele:String,aa_maf:Float64,afr_allele:String,afr_maf:Float64,allele_string:String,amr_allele:String,amr_maf:Float64,clin_sig:Array[String],end:Int32,eas_allele:String,eas_maf:Float64,ea_allele:String,ea_maf:Float64,eur_allele:String,eur_maf:Float64,exac_adj_allele:String,exac_adj_maf:Float64,exac_allele:String,exac_afr_allele:String,exac_afr_maf:Float64,exac_amr_allele:String,exac_amr_maf:Float64,exac_eas_allele:String,exac_eas_maf:Float64,exac_fin_allele:String,exac_fin_maf:Float64,exac_maf:Float64,exac_nfe_allele:String,exac_nfe_maf:Float64,exac_oth_allele:String,exac_oth_maf:Float64,exac_sas_allele:String,exac_sas_maf:Float64,id:String,minor_allele:String,minor_allele_freq:Float64,phenotype_or_disease:Int32,pubmed:Array[Int32],sas_allele:String,sas_maf:Float64,somatic:Int32,start:Int32,strand:Int32}],context:String,end:Int32,id:String,input:String,intergenic_consequences:Array[Struct{allele_num:Int32,consequence_terms:Array[String],impact:String,minimised:Int32,variant_allele:String}],most_severe_consequence:String,motif_feature_consequences:Array[Struct{allele_num:Int32,consequence_terms:Array[String],high_inf_pos:String,impact:String,minimised:Int32,motif_feature_id:String,motif_name:String,motif_pos:Int32,motif_score_change:Float64,strand:Int32,variant_allele:String}],regulatory_feature_consequences:Array[Struct{allele_num:Int32,biotype:String,consequence_terms:Array[String],impact:String,minimised:Int32,regulatory_feature_id:String,variant_allele:String}],seq_region_name:String,start:Int32,strand:Int32,transcript_consequences:Array[Struct{allele_num:Int32,amino_acids:String,biotype:String,canonical:Int32,ccds:String,cdna_start:Int32,cdna_end:Int32,cds_end:Int32,cds_start:Int32,codons:String,consequence_terms:Array[String],distance:Int32,domains:Array[Struct{db:String,name:String}],exon:String,gene_id:String,gene_pheno:Int32,symbol:String,symbol_source:String,hgnc_id:String,hgvsc:String,hgvsp:String,hgvs_offset:Int32,impact:String,intron:String,lof:String,lof_flags:String,lof_filter:String,lof_info:String,minimised:Int32,polyphen_prediction:String,polyphen_score:Float64,protein_end:Int32,protein_start:Int32,protein_id:String,sift_prediction:String,sift_score:Float64,strand:Int32,swissprot:String,transcript_id:String,trembl:String,uniparc:String,variant_allele:String,SIFT_score:Float64,SIFT_pred:String,Polyphen2_HDIV_score:Float64,Polyphen2_HDIV_pred:String,Polyphen2_HVAR_score:Float64,Polyphen2_HVAR_pred:String,LRT_score:Float64,LRT_pred:String,MutationTaster_score:Float64,MutationTaster_pred:String,MutationAssessor_score:Float64,MutationAssessor_pred:String,PROVEAN_score:Float64,PROVEAN_pred:String}],variant_class:String}"
}

The vep command seems to run for about 10 mins and then abruptly stops with no error message, and the resulting ht file is incomplete and unreadable as no metadata.json.gz file is produced. It also does not produce any output file. I am not sure what the problem is or how to proceed. I’m running hail version 0.2.44-6cfa355a1954

Thanks!

Hi @wonu, sorry you’re having trouble. Can you share the hail log file? That will have information necessary to diagnose the issue.

Hi, what’s the best way to do this? I’ve tried just uploading the file but it doesn’t seem to work.

If you can’t upload it, you can email it to hail@broadinstitute.org

Email sent. I think the file was too large to upload.

So I’m pretty sure that what’s happening here is that your json schema isn’t consistent with what the plugin is doing. Admittedly, it’s not really easy to add new VEP plugins in our current config file model. I am pasting a few lines of the log below:

2020-06-09 16:47:18 Hail: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field flags at <root>.transcript_consequences[element] for value JArray(List(JString(cds_start_NF)))
2020-06-09 16:47:18 Hail: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field mutationassessor_pred at <root>.transcript_consequences[element] for value JString(M)
2020-06-09 16:47:18 root: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field mutationassessor_pred at <root>.transcript_consequences[element] for value JString(M)
2020-06-09 16:47:18 Hail: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field lrt_pred at <root>.transcript_consequences[element] for value JString(U)
2020-06-09 16:47:18 Hail: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field mutationassessor_score at <root>.transcript_consequences[element] for value JDouble(2.215)
2020-06-09 16:47:18 root: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field mutationassessor_score at <root>.transcript_consequences[element] for value JDouble(2.215)
2020-06-09 16:47:18 Hail: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field polyphen2_hdiv_pred at <root>.transcript_consequences[element] for value JString(D)
2020-06-09 16:47:18 root: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field polyphen2_hdiv_pred at <root>.transcript_consequences[element] for value JString(D)
2020-06-09 16:47:18 Hail: WARN: struct{allele_num: int32, amino_acids: str, biotype: str, canonical: int32, ccds: str, cdna_start: int32, cdna_end: int32, cds_end: int32, cds_start: int32, codons: str, consequence_terms: array<str>, distance: int32, domains: array<struct{db: str, name: str}>, exon: str, gene_id: str, gene_pheno: int32, symbol: str, symbol_source: str, hgnc_id: str, hgvsc: str, hgvsp: str, hgvs_offset: int32, impact: str, intron: str, lof: str, lof_flags: str, lof_filter: str, lof_info: str, minimised: int32, polyphen_prediction: str, polyphen_score: float64, protein_end: int32, protein_start: int32, protein_id: str, sift_prediction: str, sift_score: float64, strand: int32, swissprot: str, transcript_id: str, trembl: str, uniparc: str, variant_allele: str, SIFT_score: float64, SIFT_pred: str, Polyphen2_HDIV_score: float64, Polyphen2_HDIV_pred: str, Polyphen2_HVAR_score: float64, Polyphen2_HVAR_pred: str, LRT_score: float64, LRT_pred: str, MutationTaster_score: float64, MutationTaster_pred: str, MutationAssessor_score: float64, MutationAssessor_pred: str, PROVEAN_score: float64, PROVEAN_pred: str} has no field mutationtaster_score at <root>.transcript_consequences[element] for value JDouble(0.999949)

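If you look at the end of each line, it’s complaining that, for example, the JSON schema has no field called polyphen2_hvar_pred. That’s because you called it Polyphen2_HVAR_pred in the schema, while the name in the VEP output is all lowercase. A few fields, such as flags, are missing from the schema altogether and would need to be added (from the log, flags looks like an array of strings). I think if you keep changing names in your schema to squash these warnings in the logs, that’ll fix your problem.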
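Since the log repeats the same warning many times, it may help to collapse it into a short list of missing fields before editing the schema. Below is a minimal sketch in plain Python (no Hail required); the function name `missing_fields` and the abbreviated sample lines are illustrative assumptions, and the pattern only targets the "has no field ... at ... for value" wording seen in the log excerpt above.

```python
import re
from collections import Counter

# Matches the tail of warnings such as:
#   "... has no field lrt_pred at <root>.transcript_consequences[element] for value JString(U)"
NO_FIELD = re.compile(r"has no field (\S+) at (\S+) for value")


def missing_fields(log_lines):
    """Count (path, field) pairs that the log reports as absent from the schema."""
    counts = Counter()
    for line in log_lines:
        match = NO_FIELD.search(line)
        if match:
            field, path = match.groups()
            counts[(path, field)] += 1
    return counts


# Abbreviated copies of the warnings above; the long struct{...} text is elided here.
sample = [
    "2020-06-09 16:47:18 Hail: WARN: struct{...} has no field lrt_pred at <root>.transcript_consequences[element] for value JString(U)",
    "2020-06-09 16:47:18 Hail: WARN: struct{...} has no field flags at <root>.transcript_consequences[element] for value JArray(List(JString(cds_start_NF)))",
    "2020-06-09 16:47:18 root: WARN: struct{...} has no field lrt_pred at <root>.transcript_consequences[element] for value JString(U)",
]

for (path, field), n in sorted(missing_fields(sample).items()):
    print(f"{path}: {field} ({n} warnings)")
```

To run it on a real log, pass an open file instead of the sample list, for example `missing_fields(open("hail.log"))`, where `hail.log` stands in for wherever your log file actually lives.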

Thank you! I’ll try that.

Hi again,

I’ve been able to fix the more straightforward warnings, but one of them is as follows:

2020-06-24 15:26:46 Hail: WARN: Can’t convert JSON value JObject(List((C,JObject(List((gnomad_fin,JInt(0)), (gnomad_nfe,JDouble(1.453E-4)), (gnomad_asj,JDouble(2.167E-4)), (gnomad_oth,JInt(0)), (gnomad,JDouble(1.24E-4)), (gnomad_eas,JInt(0)), (gnomad_sas,JDouble(3.702E-4)), (gnomad_afr,JInt(0)), (gnomad_amr,JInt(0))))))) to type array<struct{gnomad: float64, gnomad_afr: float64, gnomad_amr: float64, gnomad_asj: float64, gnomad_eas: float64, gnomad_fin: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_sas: float64}> at .colocated_variants[element].frequencies.

I’m not familiar with JSON formatting, so I was wondering if you could assist me with the correct structure for this?

The complete configuration is below:
{
    "command": [
        "/media/veracrypt10/Analyses/ensembl-vep/vep",
        "--format", "vcf",
        "__OUTPUT_FORMAT_FLAG__",
        "--everything",
        "--allele_number",
        "--no_stats",
        "--cache", "--offline",
        "--port", "3337",
        "--minimal",
        "--assembly", "GRCh37",
        "--plugin", "dbNSFP,/media/veracrypt10/Analyses/ensembl-vep/dbNSFPv3.5a/dbNSFP_hg19.gz,SIFT_score,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_pred,LRT_score,LRT_pred,MutationTaster_score,MutationTaster_pred,MutationAssessor_score,MutationAssessor_pred,PROVEAN_score,PROVEAN_pred",
        "--dir_plugins", "/home/wonu/.vep/Plugins/",
        "--output_file", "STDOUT"
    ],
    "env": {
        "PERL5LIB": "/home/wonu/.vep/Plugins/dbNSFP.pm"
    },
    "vep_json_schema": "Struct{assembly_name:String,allele_string:String,ancestral:String,ensp:String,minimised:Int32,colocated_variants:Array[Struct{af:Float64,afr_af:Float64,amr_af:Float64,eas_af:Float64,eur_af:Float64,sas_af:Float64,aa_af:Float64,ea_af:Float64,gnomAD_AF:Float64,gnomAD_AFR_AF:Float64,gnomAD_AMR_AF:Float64,gnomAD_ASJ_AF:Float64,gnomAD_EAS_AF:Float64,gnomAD_FIN_AF:Float64,gnomAD_NFE_AF:Float64,gnomAD_OTH_AF:Float64,gnomAD_SAS_AF:Float64,MAX_AF:Float64,MAX_AF_POPS:Float64,aa_allele:String,aa_maf:Float64,afr_allele:String,afr_maf:Float64,allele_string:String,amr_allele:String,amr_maf:Float64,frequencies:Array[Struct{gnomad:Float64,gnomad_afr:Float64,gnomad_amr:Float64,gnomad_asj:Float64,gnomad_eas:Float64,gnomad_fin:Float64,gnomad_nfe:Float64,gnomad_oth:Float64,gnomad_sas:Float64}],clin_sig:Array[String],end:Int32,eas_allele:String,eas_maf:Float64,ea_allele:String,ea_maf:Float64,eur_allele:String,eur_maf:Float64,exac_adj_allele:String,exac_adj_maf:Float64,exac_allele:String,exac_afr_allele:String,exac_afr_maf:Float64,exac_amr_allele:String,exac_amr_maf:Float64,exac_eas_allele:String,exac_eas_maf:Float64,exac_fin_allele:String,exac_fin_maf:Float64,exac_maf:Float64,exac_nfe_allele:String,exac_nfe_maf:Float64,exac_oth_allele:String,exac_oth_maf:Float64,exac_sas_allele:String,exac_sas_maf:Float64,id:String,minor_allele:String,minor_allele_freq:Float64,phenotype_or_disease:Int32,pubmed:Array[Int32],sas_allele:String,sas_maf:Float64,somatic:Int32,start:Int32,strand:Int32,seq_region_name:String}],context:String,end:Int32,id:String,input:String,intergenic_consequences:Array[Struct{allele_num:Int32,consequence_terms:Array[String],impact:String,minimised:Int32,variant_allele:String}],most_severe_consequence:String,motif_feature_consequences:Array[Struct{allele_num:Int32,consequence_terms:Array[String],high_inf_pos:String,impact:String,minimised:Int32,motif_feature_id:String,motif_name:String,motif_pos:Int32,motif_score_change:Float64,strand:Int32,variant_allele:String}],regulatory_feature_consequences:Array[Struct{allele_num:Int32,biotype:String,consequence_terms:Array[String],impact:String,minimised:Int32,regulatory_feature_id:String,variant_allele:String}],seq_region_name:String,start:Int32,strand:Int32,transcript_consequences:Array[Struct{allele_num:Int32,amino_acids:String,biotype:String,canonical:Int32,ccds:String,cdna_start:Int32,cdna_end:Int32,cds_end:Int32,cds_start:Int32,codons:String,consequence_terms:Array[String],distance:Int32,domains:Array[Struct{db:String,name:String}],exon:String,gene_id:String,gene_pheno:Int32,gene_symbol:String,gene_symbol_source:String,hgnc_id:String,hgvsc:String,hgvsp:String,hgvs_offset:Int32,impact:String,intron:String,lof:String,lof_flags:String,lof_filter:String,lof_info:String,minimised:Int32,polyphen_prediction:String,polyphen_score:Float64,protein_end:Int32,protein_start:Int32,protein_id:String,sift_prediction:String,strand:Int32,swissprot:String,transcript_id:String,trembl:String,uniparc:String,variant_allele:String,sift_score:Float64,sift_pred:String,polyphen2_hdiv_score:Float64,polyphen2_hdiv_pred:String,polyphen2_hvar_score:Float64,polyphen2_hvar_pred:String,lrt_score:Float64,lrt_pred:String,mutationtaster_score:Float64,mutationtaster_pred:String,mutationassessor_score:Float64,mutationassessor_pred:String,provean_score:Float64,provean_pred:String}],variant_class:String}"
}

Sorry this is so hard to use. If I’m reading this correctly, the JSON that’s coming out of VEP looks like:

JObject(
    List(
        (C, JObject(
                List(
                    (gnomad_fin,JInt(0)), 
                    (gnomad_nfe,JDouble(1.453E-4)), 
                    (gnomad_asj,JDouble(2.167E-4)), 
                    (gnomad_oth,JInt(0)), 
                    (gnomad,JDouble(1.24E-4)), 
                    (gnomad_eas,JInt(0)), 
                    (gnomad_sas,JDouble(3.702E-4)), 
                    (gnomad_afr,JInt(0)), 
                    (gnomad_amr,JInt(0))
                )
            )
        )
    )
) 

So to translate, JObject is equivalent to Struct. So this suggests that the type of colocated_variants.frequencies is

Struct{C:Struct{gnomad:Float64,gnomad_afr:Float64,gnomad_amr:Float64,gnomad_asj:Float64,gnomad_eas:Float64,gnomad_fin:Float64,gnomad_nfe:Float64,gnomad_oth:Float64,gnomad_sas:Float64}}

You currently have that type as an Array[Struct{.......}], and I think it should be a Struct{C: Struct{...}}
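The same translation can be done mechanically for other warnings of this kind. The sketch below is plain Python, not part of Hail: the helper `hail_schema` is a hypothetical name, and it only covers objects, arrays, strings and numbers. It renders a sample JSON value in the Struct{...}/Array[...] notation used in vep_json_schema. The `int_type` parameter is an assumption added for cases like this one, where a frequency of exactly 0 shows up as an integer (JInt(0)) even though the field is really a float.

```python
import json


def hail_schema(value, int_type="Int32"):
    """Render a parsed JSON value in the Struct{...}/Array[...] schema notation."""
    if isinstance(value, dict):
        # JObject -> Struct, with one name:type entry per key (sorted for stable output).
        fields = ",".join(
            f"{key}:{hail_schema(val, int_type)}" for key, val in sorted(value.items())
        )
        return f"Struct{{{fields}}}"
    if isinstance(value, list):
        # JArray -> Array; assumes a non-empty array whose elements share one type.
        if not value:
            raise ValueError("cannot infer an element type from an empty array")
        return f"Array[{hail_schema(value[0], int_type)}]"
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, int):
        return int_type
    if isinstance(value, float):
        return "Float64"
    if isinstance(value, str):
        return "String"
    raise TypeError(f"unsupported JSON value: {value!r}")


# The frequencies value from the warning above, written back out as JSON.
frequencies = json.loads(
    '{"C": {"gnomad_fin": 0, "gnomad_nfe": 1.453e-4, "gnomad_asj": 2.167e-4,'
    ' "gnomad_oth": 0, "gnomad": 1.24e-4, "gnomad_eas": 0,'
    ' "gnomad_sas": 3.702e-4, "gnomad_afr": 0, "gnomad_amr": 0}}'
)

# Treat integers as Float64 here because these fields are allele frequencies.
print(hail_schema(frequencies, int_type="Float64"))
```

A type inferred from one sample is only a starting point: a field that is absent or zero in that sample can still be typed wrongly, so it is worth checking the result against a few more lines of VEP output.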

EDIT: Got it wrong the first time, changed it around.

No problem! Your suggestion seems to have worked. Thank you!