Hi, @danking @tpoterba
I would like to follow up on the issue we had before. Unfortunately, we are still a long way from upgrading Hail because of infrastructure limitations, so we want to double-check whether we missed anything. We wondered whether the cause was that the data didn't go through VQSR, but if I just import the VCF and export it as an mt file, it works. The only problem is that after we join the reference data (the code I highlighted above), it takes forever to finish. We have processed a lot of WGS data without this problem, even though this dataset contains fewer than 700 variants. Do you think there might be something special in the VEP annotation? Thanks!
I have pasted two variants as examples after VEP annotation, in case that helps.
chr2 201209484 rs17860405 A G 157158 . AC=2;AF=0.067;AN=30;AS_BaseQRankSum=4.75;AS_FS=0;AS_InbreedingCoeff=-0.0714;AS_MQ=60;AS_MQRankSum=0.15;AS_QD=12.78;AS_ReadPosRankSum=0.85;AS_SOR=0.697;BaseQRankSum=5.2;DB;DP=32928;ExcessHet=3.1627;FS=0;InbreedingCoeff=-0.0714;MLEAC=2;MLEAF=0.067;MQ=60;MQRankSum=0.469;QD=12.78;ReadPosRankSum=1.47;SOR=0.689;CSQ=G|missense_variant|MODERATE|CASP10|ENSG00000003400|Transcript|ENST00000272879|protein_coding|9/10||ENST00000272879.9:c.1337A>G|ENSP00000272879.5:p.Tyr446Cys|1521|1337|446|Y/C|tAt/tGt|rs17860405&CM060890|1||1||SNV|1|HGNC|HGNC:1500||||2||CCDS2338.1|ENSP00000272879|Q92851.219||UPI000004466C|Q92851-1|1|tolerated(0.09)|possibly_damaging(0.677)|Gene3D:3.40.50.1460&Pfam:PF00656&PROSITE_profiles:PS50207&PANTHER:PTHR10454&PANTHER:PTHR10454:SF26&SMART:SM00115&Superfamily:SSF52129&CDD:cd00032|||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|missense_variant|MODERATE|CASP10|ENSG00000003400|Transcript|ENST00000286186|protein_coding|9/10||ENST00000286186.11:c.1337A>G|ENSP00000286186.6:p.Tyr446Cys|1512|1337|446|Y/C|tAt/tGt|rs17860405&CM060890|1||1||SNV|1|HGNC|HGNC:1500|YES|NM_032977.4||1|P2|CCDS2340.1|ENSP00000286186|Q92851.219|A0A0S2Z3Z5.30|UPI0000074732|Q92851-4|1|tolerated(0.14)|benign(0.414)|Gene3D:3.40.50.1460&Pfam:PF00656&PROSITE_profiles:PS50207&PANTHER:PTHR10454&PANTHER:PTHR10454:SF26&SMART:SM00115&Superfamily:SSF52129&CDD:cd00032|||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|missense_variant|MODERATE|CASP10|ENSG00000003400|Transcript|ENST00000313728|protein_coding|7/8||ENST00000313728.11:c.1136A>G|ENSP00000314599.7:p.Tyr379Cys|1260|1136|379|Y/C|tAt/tGt|rs17860405&CM060890|1||1||SNV|1|HGNC|HGNC:1500||||1||CCDS56159
.1|ENSP00000314599|Q92851.219||UPI0000421EE8|Q92851-6|1|tolerated(0.13)|possibly_damaging(0.482)|Gene3D:3.40.50.1460&Pfam:PF00656&PROSITE_profiles:PS50207&PANTHER:PTHR10454&PANTHER:PTHR10454:SF26&SMART:SM00115&Superfamily:SSF52129&CDD:cd00032|||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|missense_variant|MODERATE|CASP10|ENSG00000003400|Transcript|ENST00000346817|protein_coding|7/8||ENST00000346817.9:c.1208A>G|ENSP00000237865.7:p.Tyr403Cys|1355|1208|403|Y/C|tAt/tGt|rs17860405&CM060890|1||1||SNV|1|HGNC|HGNC:1500||||5|A2|CCDS2339.1|ENSP00000237865|Q92851.219|A0A0S2Z3G5.39|UPI000013CA28|Q92851-2|1|tolerated(0.16)|benign(0.411)|Gene3D:3.40.50.1460&Pfam:PF00656&PROSITE_profiles:PS50207&PANTHER:PTHR10454&PANTHER:PTHR10454:SF26&SMART:SM00115&Superfamily:SSF52129&CDD:cd00032|||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|3_prime_UTR_variant|MODIFIER|CASP10|ENSG00000003400|Transcript|ENST00000360132|protein_coding|8/9||ENST00000360132.7:c.*423A>G||1663|||||rs17860405&CM060890|1||1||SNV|1|HGNC|HGNC:1500||||5|||ENSP00000353250|Q92851.219||UPI000002ABA4|Q92851-3|1||||||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|downstream_gene_variant|MODIFIER|MTND5P25|ENSG00000227348|Transcript|ENST00000430499|unprocessed_pseudogene||||||||||rs17860405&CM060890|1|2799|-1||SNV|1|HGNC|HGNC:42287|YES|||||||||||||||||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|do
wnstream_gene_variant|MODIFIER|CASP10|ENSG00000003400|Transcript|ENST00000438843|nonsense_mediated_decay||||||||||rs17860405&CM060890|1|1324|1||SNV|1|HGNC|HGNC:1500||||2|||ENSP00000401914||B4E3T5.91|UPI0000E07CFD||1||||||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|downstream_gene_variant|MODIFIER|MTND4P23|ENSG00000225796|Transcript|ENST00000447723|unprocessed_pseudogene||||||||||rs17860405&CM060890|1|3814|-1||SNV|1|HGNC|HGNC:42210|YES|||||||||||||||||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|missense_variant|MODERATE|CASP10|ENSG00000003400|Transcript|ENST00000448480|protein_coding|7/8||ENST00000448480.1:c.1208A>G|ENSP00000396835.1:p.Tyr403Cys|1329|1208|403|Y/C|tAt/tGt|rs17860405&CM060890|1||1||SNV|1|HGNC|HGNC:1500||||1||CCDS56160.1|ENSP00000396835|Q92851.219||UPI0000367D6F|Q92851-5|1|tolerated(0.14)|benign(0.231)|Gene3D:3.40.50.1460&Pfam:PF00656&PROSITE_profiles:PS50207&PANTHER:PTHR10454&PANTHER:PTHR10454:SF26&SMART:SM00115&Superfamily:SSF52129&CDD:cd00032|||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|downstream_gene_variant|MODIFIER|CASP10|ENSG00000003400|Transcript|ENST00000460140|retained_intron||||||||||rs17860405&CM060890|1|3234|1||SNV|1|HGNC|HGNC:1500||||1||||||||1||||||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|non_coding_transcript_exon_variant|MODIFIER|CASP10|ENSG00000003400|Transcript|ENST00000492363|processed_transcr
ipt|7/8||ENST00000492363.5:n.1245A>G||1245|||||rs17860405&CM060890|1||1||SNV|1|HGNC|HGNC:1500||||2||||||||1||||||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631|||||||||,G|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00001043178|promoter||||||||||rs17860405&CM060890|1||||SNV|1||||||||||||||||||||0.0128|0.0008|0.0231|0|0.0417|0.0051|0.007036|0.03744|0.02961|0.006645|0.0189|0.01234|0.0001088|0.06514|0.04129|0.03525|0.00778|0.06514|gnomAD_FIN|benign||1&1|20301287&22056502&16446975&31249631||||||||| GT:AD:DP:GQ:PL 0/0:1749,0:1749:99:0,120,1800 0/0:1442,0:1442:99:0,120,1800 0/1:3017,2917:5944:99:71491,0,70354 0/0:1503,0:1503:99:0,120,1800 0/0:1396,0:1396:99:0,120,1800 0/0:1699,0:1699:99:0,120,1800 0/0:1420,0:1420:99:0,120,1800 0/1:3209,3150:6389:99:85687,0,83797 0/0:2219,0:2219:99:0,120,1800 0/0:1296,0:1296:99:0,120,1800 0/0:1645,0:1645:99:0,120,1800 0/0:1534,0:1534:99:0,120,1800 0/0:1289,0:1289:99:0,120,1800 0/0:1483,0:1483:99:0,120,1800 0/0:1526,0:1526:99:0,120,1800
chr4 112430666 rs61747381 G A 181587 . AC=2;AF=0.067;AN=30;AS_BaseQRankSum=.;AS_FS=0;AS_InbreedingCoeff=1;AS_MQ=60;AS_MQRankSum=.;AS_QD=30.77;AS_ReadPosRankSum=.;AS_SOR=0.7;DB;DP=33574;ExcessHet=0.0755;FS=0;InbreedingCoeff=1;MLEAC=2;MLEAF=0.067;MQ=60;QD=30.77;SOR=0.699;CSQ=A|synonymous_variant|LOW|ALPK1|ENSG00000073331|Transcript|ENST00000177648|protein_coding|11/16||ENST00000177648.13:c.1119G>A|ENSP00000177648.9:p.Gly373%3D|1319|1119|373|G|ggG/ggA|rs61747381|1||1||SNV|1|HGNC|HGNC:20917||||1|P2|CCDS3697.1|ENSP00000177648|Q96QP1.147||UPI000045725F|Q96QP1-1|1|||PDB-ENSP_mappings:5z2c.A&PDB-ENSP_mappings:5z2c.B&PDB-ENSP_mappings:5z2c.C&PDB-ENSP_mappings:5z2c.D&PDB-ENSP_mappings:5z2c.E&PDB-ENSP_mappings:5z2c.F&PDB-ENSP_mappings:5z2c.G&PDB-ENSP_mappings:5z2c.H&PDB-ENSP_mappings:5z2c.I&PANTHER:PTHR46747|||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|synonymous_variant|LOW|ALPK1|ENSG00000073331|Transcript|ENST00000458497|protein_coding|12/17||ENST00000458497.6:c.1119G>A|ENSP00000398048.1:p.Gly373%3D|1501|1119|373|G|ggG/ggA|rs61747381|1||1||SNV|1|HGNC|HGNC:20917||||5|P2|CCDS3697.1|ENSP00000398048|Q96QP1.147||UPI000045725F|Q96QP1-1|1|||PDB-ENSP_mappings:5z2c.A&PDB-ENSP_mappings:5z2c.B&PDB-ENSP_mappings:5z2c.C&PDB-ENSP_mappings:5z2c.D&PDB-ENSP_mappings:5z2c.E&PDB-ENSP_mappings:5z2c.F&PDB-ENSP_mappings:5z2c.G&PDB-ENSP_mappings:5z2c.H&PDB-ENSP_mappings:5z2c.I&PANTHER:PTHR46747|||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|synonymous_variant|LOW|ALPK1|ENSG00000073331|Transcript|ENST00000504176|protein_coding|10/15||ENST00000504176.6:c.885G>A|ENSP00000426044.2:p.Gly295%3D|1191|885|295|G|ggG/ggA|rs61747381|1||1||SNV|1|HGNC|HGNC:20917||||2|A2|CCDS58923.1|ENSP00000426044|Q96QP1.147||UPI00020657A2|Q96QP1-2|1|||PANTHER:PTHR46747|||0.0383|0.0023|0.1081|0.
001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|non_coding_transcript_exon_variant|MODIFIER|ALPK1|ENSG00000073331|Transcript|ENST00000504745|retained_intron|7/12||ENST00000504745.1:n.1607G>A||1607|||||rs61747381|1||1||SNV|1|HGNC|HGNC:20917||||2||||||||1||||||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|intron_variant&NMD_transcript_variant|MODIFIER|ALPK1|ENSG00000073331|Transcript|ENST00000505127|nonsense_mediated_decay||10/14|ENST00000505127.5:c.900+1413G>A|||||||rs61747381|1||1||SNV|1|HGNC|HGNC:20917||||2|||ENSP00000425559||B3KUH8.69|UPI00003E6011||1||||||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|downstream_gene_variant|MODIFIER|ALPK1|ENSG00000073331|Transcript|ENST00000508589|processed_transcript||||||||||rs61747381|1|3017|1||SNV|1|HGNC|HGNC:20917||||3||||||||1||||||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|downstream_gene_variant|MODIFIER|ALPK1|ENSG00000073331|Transcript|ENST00000509209|retained_intron||||||||||rs61747381|1|4703|1||SNV|1|HGNC|HGNC:20917||||2||||||||1||||||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|3_prime_UTR_variant&NMD_transcript_variant|MODIFIER|ALPK1|ENSG00000073331|Transcript|ENST00000509722|nonsense_mediated_decay|10/15||ENST00000509722.5:c.*562G>A||1154|||||rs61747381|1||1||SNV|1|HGNC|HGNC:20917||||2|||ENSP00000424492||D6RB29.49|UPI0001D3B73A||1||||||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.189
9|gnomAD_AMR|||||||||||||,A|downstream_gene_variant|MODIFIER|ALPK1|ENSG00000073331|Transcript|ENST00000512847|retained_intron||||||||||rs61747381|1|2555|1||SNV|1|HGNC|HGNC:20917||||3||||||||1||||||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|downstream_gene_variant|MODIFIER|ALPK1|ENSG00000073331|Transcript|ENST00000515330|nonsense_mediated_decay||||||||||rs61747381|1|4973|1||SNV|1|HGNC|HGNC:20917||||2|||ENSP00000423978||B4E0R2.73|UPI00017A8368||1||||||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR|||||||||||||,A|synonymous_variant|LOW|ALPK1|ENSG00000073331|Transcript|ENST00000650871|protein_coding|11/16||ENST00000650871.1:c.1119G>A|ENSP00000498374.1:p.Gly373%3D|1372|1119|373|G|ggG/ggA|rs61747381|1||1||SNV|1|HGNC|HGNC:20917|YES|NM_025144.4|||P2|CCDS3697.1|ENSP00000498374|Q96QP1.147||UPI000045725F|Q96QP1-1|1|||PDB-ENSP_mappings:5z2c.A&PDB-ENSP_mappings:5z2c.B&PDB-ENSP_mappings:5z2c.C&PDB-ENSP_mappings:5z2c.D&PDB-ENSP_mappings:5z2c.E&PDB-ENSP_mappings:5z2c.F&PDB-ENSP_mappings:5z2c.G&PDB-ENSP_mappings:5z2c.H&PDB-ENSP_mappings:5z2c.I&PANTHER:PTHR46747|||0.0383|0.0023|0.1081|0.001|0.0447|0.0695|0.01158|0.05547|0.06771|0.01003|0.1899|0.0278|0.0008157|0.03022|0.05631|0.06042|0.08382|0.1899|gnomAD_AMR||||||||||||| GT:AD:DP:GQ:PL 0/0:2080,0:2080:99:0,120,1800 0/0:1878,0:1878:99:0,120,1800 0/0:1753,0:1753:99:0,120,1800 1/1:0,5901:5920:99:181613,17695,0 0/0:1617,0:1617:99:0,120,1800 0/0:2007,0:2007:99:0,120,1800 0/0:1602,0:1602:99:0,120,1800 0/0:2344,0:2344:99:0,120,1800 0/0:3030,0:3030:99:0,120,1800 0/0:1765,0:1765:99:0,120,1800 0/0:1938,0:1938:99:0,120,1800 0/0:1952,0:1952:99:0,120,1800 0/0:1676,0:1676:99:0,120,1800 0/0:1925,0:1925:99:0,120,1800 0/0:1958,0:1958:99:0,120,1800
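
To check whether the VEP annotation is unusually large for these variants, one option is to count the CSQ entries per record outside of Hail. The following is only a minimal sketch in plain Python: the helper names `parse_info` and `csq_summary` are hypothetical, and it assumes each record is a single tab-separated VCF line (the records pasted above were wrapped and space-separated when posted).

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a dict; flag keys such as DB map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def csq_summary(vcf_line: str) -> dict:
    """Summarise the size of the VEP CSQ annotation for one VCF data line.

    Assumes tab-separated columns with INFO in the 8th column, as in standard VCF.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    info = parse_info(cols[7])
    csq = info.get("CSQ", "")
    # VEP separates per-transcript entries with "," and fields with "|".
    entries = csq.split(",") if csq else []
    return {
        "locus": f"{cols[0]}:{cols[1]}",
        "n_entries": len(entries),
        "fields_per_entry": sorted({len(e.split("|")) for e in entries}),
        "csq_chars": len(csq),
    }
```

Running `csq_summary` over every non-header line of the VCF and comparing `n_entries` and `csq_chars` against a WGS callset that joins quickly would show whether the annotation size really differs. If `fields_per_entry` contains more than one value, the CSQ entries are not uniform, which may be worth mentioning as well.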