Hello Hail Dev team,
I have variant data laid out like this:
CHR POS REF ALT INFO FORMAT SAMPLE1 SAMPLE2 ... SAMPLEn
chr1 1100 C T AF=0.3,GQ=20,... GT,AF,DP... 0/1,0.5,... 0/1,0.5,... 0/1,0.5,...
chr1 1101 G T AF=...,GQ=... GT,AF,DP... 0/1,0.5,... 0/1,0.5,... 0/1,0.5,...
chr1 1102 A T AF=...,GQ=... GT,AF,DP... 0/1,0.5,... 0/1,0.5,... 0/1,0.5,...
chr1 1103 C G AF=...,GQ=... GT,AF,DP... 0/1,0.5,... 0/1,0.5,... 0/1,0.5,...
chr1 1104 C T AF=...,GQ=... GT,AF,DP... 0/1,0.5,... 0/1,0.5,... 0/1,0.5,...
chr1 1105 C T AF=...,GQ=... GT,AF,DP... 0/1,0.5,... 0/1,0.5,... 0/1,0.5,...
chr1 1106 C T AF=...,GQ=... GT,AF,DP... 0/1,0.5,... 0/1,0.5,... 0/1,0.5,...
...
I want to select all variants for a specific sample, build a structured document from them, and export it to an Elasticsearch database. The data should look like the example below:
{
"sample_ID": "SAMPLE1",
"AF": 0.5,
"GQ": 10,
...
"variant_filter": [
{
"locus": {
"contig": "chr1",
"position": 1100
},
"alleles": [
"C",
"T"
],
"variant_class": "SNV",
"consequences": ["intron_variant", ...],
"population_allele_freq": 0.3,
"population_genotype_quality": 20,
},
{
"locus": {
"contig": "chr2",
"position": 1101
},
"alleles": [
"G",
"T"
],
"variant_class": "indel",
"consequences": ["downstream_gene_variant", ...],
"population_allele_freq": 0.125,
"population_genotype_quality": 50,
},
...
]
}
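To make the target shape concrete, here is a rough plain-Python sketch of the reshaping. It is not Hail code and uses no Hail API. It assumes the space-separated layout from the table above, with comma-separated INFO and FORMAT fields, and treats a variant as belonging to a sample when the sample's GT has a non-reference allele. The sample rows, the DP values, and the helper names (`parse_info`, `is_non_ref`, `sample_document`) are made up for illustration. Fields such as "variant_class" and "consequences" are left out because they would come from an annotation step that is not shown here.

```python
import json

# Hypothetical rows in the same layout as the table above (values are illustrative).
HEADER = "CHR POS REF ALT INFO FORMAT SAMPLE1 SAMPLE2"
ROWS = [
    "chr1 1100 C T AF=0.3,GQ=20 GT,AF,DP 0/1,0.5,30 0/0,0.0,25",
    "chr1 1101 G T AF=0.125,GQ=50 GT,AF,DP 0/0,0.0,28 0/1,0.5,31",
]


def parse_info(text):
    """Turn 'AF=0.3,GQ=20' into {'AF': 0.3, 'GQ': 20.0}."""
    return {k: float(v) for k, v in (item.split("=") for item in text.split(","))}


def is_non_ref(gt):
    """True if the genotype string carries at least one non-reference allele."""
    return any(a not in ("0", ".") for a in gt.replace("|", "/").split("/"))


def sample_document(header, rows, sample_id):
    """Collect every variant where `sample_id` is non-reference into one document."""
    columns = header.split()
    sample_col = columns.index(sample_id)
    variants = []
    for row in rows:
        fields = row.split()
        # The first six columns describe the variant itself.
        record = dict(zip(columns[:6], fields[:6]))
        # Pair FORMAT keys with this sample's values, e.g. {'GT': '0/1', ...}.
        call = dict(zip(record["FORMAT"].split(","), fields[sample_col].split(",")))
        if not is_non_ref(call["GT"]):
            continue  # the sample does not carry this variant
        info = parse_info(record["INFO"])
        variants.append({
            "locus": {"contig": record["CHR"], "position": int(record["POS"])},
            "alleles": [record["REF"], record["ALT"]],
            "population_allele_freq": info["AF"],
            "population_genotype_quality": info["GQ"],
        })
    return {"sample_ID": sample_id, "variant_filter": variants}


doc = sample_document(HEADER, ROWS, "SAMPLE1")
print(json.dumps(doc, indent=2))
```

Each resulting dictionary would become one Elasticsearch document per sample, with the variants nested under "variant_filter".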
Does Hail support an easy way to do this?