VEP: vep_json_schema specification for colocated_variants.frequencies

Hi,
I have a problem defining the VEP JSON schema for population frequencies in colocated variants. The problem is that the struct field name consists of the REF sequence, which can be anything. I covered A, C, T, and G, but of course this is not enough:

2022-04-14 00:37:11 Hail: WARN: struct{C: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, T: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, G: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, A: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64} has no field CAC at <root>.colocated_variants[element].frequencies for value JObject(List((eur,JDouble(0.3171)), (sas,JDouble(0.3374)), (eas,JDouble(0.3631)), (amr,JDouble(0.3963)), (afr,JDouble(0.4251))))
2022-04-14 00:37:20 Hail: WARN: struct{C: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, T: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, G: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, A: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64} has no field TGTGTGTGTGTGT at <root>.colocated_variants[element].frequencies for value JObject(List((amr,JDouble(0.6182)), (eur,JDouble(0.5964)), (sas,JDouble(0.6861)), (afr,JDouble(0.6392)), (eas,JDouble(0.6567))))
2022-04-14 00:37:20 Hail: WARN: struct{C: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, T: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, G: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, A: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64} has no field TGTGTGTGTGTGTGTGT at <root>.colocated_variants[element].frequencies for value JObject(List((eas,JDouble(0.6567)), (amr,JDouble(0.6182)), (eur,JDouble(0.5964)), (sas,JDouble(0.6861)), (afr,JDouble(0.6392))))
2022-04-14 00:37:20 Hail: WARN: struct{C: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, T: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, G: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, A: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64} has no field CCA at <root>.colocated_variants[element].frequencies for value JObject(List((amr,JDouble(0.5648)), (eur,JDouble(0.3608)), (sas,JDouble(0.3742)), (afr,JDouble(0.913)), (eas,JDouble(0.5149))))
2022-04-14 00:37:22 Hail: WARN: struct{C: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, T: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, G: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, A: struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}, sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64} has no field GGAGGT at <root>.colocated_variants[element].frequencies for value JObject(List((afr,JDouble(0.6921)), (eur,JDouble(0.9911)), (sas,JDouble(0.9939)), (amr,JDouble(0.964)), (eas,JDouble(0.9931))))

My schema for colocated_variants:

colocated_variants: Array[Struct {
    allele_string: String,
    start: Int32,
    strand: Int32,
    seq_region_name: String,
    id: String,
    end: Int32,
    phenotype_or_disease: Int32,
    somatic: Int32,
    minor_allele_freq: Float64,
    minor_allele: String,
    minimised: Int32,
    frequencies: Struct{
            C:Struct{
                sas: Float64,
                eur: Float64,
                amr: Float64,
                afr: Float64,
                eas: Float64,
                aa: Float64,
                ea: Float64,
                gnomad_eas: Float64,
                gnomad_sas: Float64,
                gnomad_fin: Float64,
                gnomad_afr: Float64,
                gnomad: Float64,
                gnomad_amr: Float64,
                gnomad_nfe: Float64,
                gnomad_oth: Float64,
                gnomad_asj: Float64
            },
            T:Struct{
                sas: Float64,
                eur: Float64,
                amr: Float64,
                afr: Float64,
                eas: Float64,
                aa: Float64,
                ea: Float64,
                gnomad_eas: Float64,
                gnomad_sas: Float64,
                gnomad_fin: Float64,
                gnomad_afr: Float64,
                gnomad: Float64,
                gnomad_amr: Float64,
                gnomad_nfe: Float64,
                gnomad_oth: Float64,
                gnomad_asj: Float64
            },
            G:Struct{
                sas: Float64,
                eur: Float64,
                amr: Float64,
                afr: Float64,
                eas: Float64,
                aa: Float64,
                ea: Float64,
                gnomad_eas: Float64,
                gnomad_sas: Float64,
                gnomad_fin: Float64,
                gnomad_afr: Float64,
                gnomad: Float64,
                gnomad_amr: Float64,
                gnomad_nfe: Float64,
                gnomad_oth: Float64,
                gnomad_asj: Float64
            },
            A:Struct{
                sas: Float64,
                eur: Float64,
                amr: Float64,
                afr: Float64,
                eas: Float64,
                aa: Float64,
                ea: Float64,
                gnomad_eas: Float64,
                gnomad_sas: Float64,
                gnomad_fin: Float64,
                gnomad_afr: Float64,
                gnomad: Float64,
                gnomad_amr: Float64,
                gnomad_nfe: Float64,
                gnomad_oth: Float64,
                gnomad_asj: Float64
            },
            sas: Float64,
            eur: Float64,
            amr: Float64,
            afr: Float64,
            eas: Float64,
            aa: Float64,
            ea: Float64,
            gnomad_eas: Float64,
            gnomad_sas: Float64,
            gnomad_fin: Float64,
            gnomad_afr: Float64,
            gnomad: Float64,
            gnomad_amr: Float64,
            gnomad_nfe: Float64,
            gnomad_oth: Float64,
            gnomad_asj: Float64
        }
    }]

Thank you,
Radim

Complete VEP config file: vep_config.json.txt (1.3 KB)

You’re definitely correct that this can’t be a struct if the full set of fields isn’t known ahead of time. I think this will work as a dictionary:

dict<str, struct{sas: float64, eur: float64, amr: float64, afr: float64, eas: float64, aa: float64, ea: float64, gnomad_eas: float64, gnomad_sas: float64, gnomad_fin: float64, gnomad_afr: float64, gnomad: float64, gnomad_amr: float64, gnomad_nfe: float64, gnomad_oth: float64, gnomad_asj: float64}>
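To make the difference concrete, here is a rough plain-Python sketch of the two shapes. It does not call Hail and is not how Hail validates JSON internally; the helper names `unknown_struct_fields` and `fits_dict_of_struct` are made up for illustration. The sample record is the `CAC` one from the warnings above.

```python
import json

# Population keys taken from the schema in this thread.
POPULATIONS = [
    "sas", "eur", "amr", "afr", "eas", "aa", "ea",
    "gnomad_eas", "gnomad_sas", "gnomad_fin", "gnomad_afr", "gnomad",
    "gnomad_amr", "gnomad_nfe", "gnomad_oth", "gnomad_asj",
]


def unknown_struct_fields(obj, allowed_fields):
    """Return the keys of obj that a fixed struct with allowed_fields cannot hold."""
    return sorted(set(obj) - set(allowed_fields))


def fits_dict_of_struct(obj, value_fields):
    """True if every value only uses keys from value_fields; outer keys are unrestricted."""
    return all(not unknown_struct_fields(v, value_fields) for v in obj.values())


# One of the records from the warnings above, written back as JSON.
frequencies = json.loads(
    '{"CAC": {"eur": 0.3171, "sas": 0.3374, "eas": 0.3631, "amr": 0.3963, "afr": 0.4251}}'
)

# Struct with one field per single-base allele: the key CAC has nowhere to go.
print(unknown_struct_fields(frequencies, ["A", "C", "G", "T"] + POPULATIONS))
# Dict keyed by allele string: any key is accepted, only the values are typed.
print(fits_dict_of_struct(frequencies, POPULATIONS))
```

The first call reports `CAC` as a key the struct has no field for, which mirrors the "has no field CAC" warning. The second passes because a dict only constrains the value type, not the set of keys.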

Hmm, I can use a dictionary :-). It works. Thank you. This syntax works for me:

frequencies: Dict[String, Struct{sas: Float64,eur: Float64,amr: Float64,afr: Float64,eas: Float64,aa: Float64,ea: Float64,gnomad_eas: Float64,gnomad_sas: Float64,gnomad_fin: Float64,gnomad_afr: Float64,gnomad: Float64,gnomad_amr: Float64,gnomad_nfe: Float64,gnomad_oth: Float64,gnomad_asj: Float64}]
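In case it helps anyone who wants to avoid typing the population list by hand, the line above can be generated with a few lines of plain Python. This is only a sketch; `frequencies_schema` is a hypothetical helper name, and the population list is simply copied from the schema in this thread, so adjust it to whatever your VEP cache actually reports.

```python
# Population keys copied from the schema in this thread.
POPULATIONS = [
    "sas", "eur", "amr", "afr", "eas", "aa", "ea",
    "gnomad_eas", "gnomad_sas", "gnomad_fin", "gnomad_afr", "gnomad",
    "gnomad_amr", "gnomad_nfe", "gnomad_oth", "gnomad_asj",
]


def frequencies_schema(populations):
    """Build the 'frequencies' entry as a dict from allele string to a struct of Float64 fields."""
    fields = ",".join(f"{pop}: Float64" for pop in populations)
    # Doubled braces are literal braces inside an f-string.
    return f"frequencies: Dict[String, Struct{{{fields}}}]"


schema_line = frequencies_schema(POPULATIONS)
print(schema_line)
```

The printed line can be pasted into the `colocated_variants` struct in place of the original `frequencies: Struct{...}` entry.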

oh, yes, thanks! We have multiple type representations :scream:
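For anyone tripping over the two spellings, here is a small plain-Python sketch that rewrites the lowercase form into the capitalised form that worked in the config above. The helper `to_config_syntax` is hypothetical and not part of Hail. The keyword table only covers the type names that appear in this thread, and it assumes no field is named after a type keyword unless it is directly followed by a colon.

```python
import re

# Keyword pairs taken from the two spellings shown in this thread, plus
# array/int32, which appear in the schema above. Not an exhaustive list.
KEYWORDS = {
    "dict": "Dict",
    "array": "Array",
    "struct": "Struct",
    "str": "String",
    "float64": "Float64",
    "int32": "Int32",
}


def to_config_syntax(type_str):
    """Rewrite e.g. 'dict<str, struct{x: float64}>' as 'Dict[String, Struct{x: Float64}]'."""
    # A keyword directly followed by ':' is a field name, so leave it alone.
    pattern = r"\b(" + "|".join(KEYWORDS) + r")\b(?!\s*:)"
    converted = re.sub(pattern, lambda m: KEYWORDS[m.group(1)], type_str)
    # Angle brackets become square brackets in the capitalised form.
    return converted.replace("<", "[").replace(">", "]")


print(to_config_syntax("dict<str, struct{sas: float64, eur: float64}>"))
```

Applied to the full `dict<str, struct{...}>` type suggested earlier, this yields the same type as the `Dict[String, Struct{...}]` line, just with a space after each comma.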