Hello everyone,
Congratulations, and I'm looking forward to the 0.2 release.
I have been working on extracting variant information from GWAS data, and I thought of extracting data by rsID. I think it would be helpful, since each variant has an rsID, which makes mapping very easy. I think it would also solve the transcripts problem: a single variant can have many transcripts, but when extraction is keyed on rsID, each variant has its own rsID to extract by.
Thank you
Hi Akhil,
Can you give an example of the input and output you'd like to see, i.e. the variant annotations / BED file that should be produced?
Sorry for the late reply, Tim. Basically it's like this: if we have chromosome regions along with rsIDs, we can extract the genes with the corresponding SNP information based on the rsIDs. (I don't know whether what I described is correct.)
Input:
| chrom | start | end | snps | P_VALUE | disease |
|---|---|---|---|---|---|
| chr1 | 203155882 | 203155882 | rs4950928 | 1.00E-13 | YKL-40 levels |
| chr13 | 40350912 | 40350912 | rs7993214 | 2.00E-06 | Psoriasis |
| chr15 | 78806023 | 78806023 | rs8034191 | 3.00E-18 | Lung cancer |
| chr1 | 159680868 | 159680868 | rs2808630 | 7.00E-06 | Lung cancer |
| chr3 | 190350461 | 190350461 | rs7626795 | 8.00E-06 | Lung cancer |
The output should look like this, in BED file format:
rsid chr TSS TTS genes_present_in_that_region
And, if possible, could we extract the variants near transcription start sites using rsIDs?
I don't know whether it's possible, but it's just an idea.
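To make the requested output concrete, here is a minimal plain-Python sketch (not Hail code) that produces rows in the `rsid chr TSS TTS gene` layout. The two hits are copied from the input table; the gene table, its coordinates, the names `GENE_A`/`GENE_B`/`GENE_C`, and the helper names `genes_for_hits` and `format_rows` are all hypothetical and only there for illustration. A real run would load gene coordinates from an annotation file.

```python
# GWAS hits copied from the input table: (chrom, position, rsid).
HITS = [
    ("chr1", 203155882, "rs4950928"),
    ("chr13", 40350912, "rs7993214"),
]

# HYPOTHETICAL gene annotations: (chrom, TSS, TTS, name).
# Coordinates and names are made up for this sketch.
GENES = [
    ("chr1", 203150000, 203160000, "GENE_A"),
    ("chr1", 203155000, 203170000, "GENE_B"),
    ("chr13", 40000000, 40100000, "GENE_C"),
]


def genes_for_hits(hits, genes, flank=0):
    """Return (rsid, chrom, TSS, TTS, gene) for every gene whose span,
    extended by `flank` bases on each side, contains the SNP position."""
    rows = []
    for chrom, pos, rsid in hits:
        for g_chrom, tss, tts, name in genes:
            # On the minus strand the TSS is larger than the TTS.
            lo, hi = min(tss, tts), max(tss, tts)
            if g_chrom == chrom and lo - flank <= pos <= hi + flank:
                rows.append((rsid, chrom, tss, tts, name))
    return rows


def format_rows(rows):
    """Render rows as tab-separated lines in the requested column order."""
    return ["\t".join(str(field) for field in row) for row in rows]


for line in format_rows(genes_for_hits(HITS, GENES)):
    print(line)
```

Note that standard BED files start with the chromosome, start and end columns, so a tool that expects strict BED may need the columns reordered. Passing a non-zero `flank` also picks up genes that lie near, rather than over, the SNP.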
Ah, so you need to take the information in the table and find variants nearby? E.g. the chr1:203155882 line should find variants on chr1 near 203155882.
That’s something that totally should be possible in Hail, but I’m not sure how to do it right now.
Let’s explore when 0.2 is out!
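In the meantime, the "find variants near a position" idea can be sketched outside Hail in plain Python. This is only an illustration of the lookup, not how Hail does or would do it: the function names `build_index` and `variants_near`, the window size, and the two `rsHYPO` entries are hypothetical, while the other two variants come from the table above.

```python
from bisect import bisect_left, bisect_right


def build_index(variants):
    """Group (chrom, pos, rsid) variants by chromosome, sorted by position.

    Returns {chrom: (positions, rsids)} with two parallel sorted lists.
    """
    grouped = {}
    for chrom, pos, rsid in variants:
        grouped.setdefault(chrom, []).append((pos, rsid))
    index = {}
    for chrom, entries in grouped.items():
        entries.sort()
        index[chrom] = ([p for p, _ in entries], [r for _, r in entries])
    return index


def variants_near(index, chrom, pos, window):
    """Return rsIDs on `chrom` within `window` bases of `pos` (inclusive)."""
    if chrom not in index:
        return []
    positions, rsids = index[chrom]
    lo = bisect_left(positions, pos - window)
    hi = bisect_right(positions, pos + window)
    return rsids[lo:hi]


VARIANTS = [
    ("chr1", 203155882, "rs4950928"),  # from the table
    ("chr1", 159680868, "rs2808630"),  # from the table
    ("chr1", 203155900, "rsHYPO1"),    # hypothetical neighbour
    ("chr1", 203160000, "rsHYPO2"),    # hypothetical, outside a 1 kb window
]

index = build_index(VARIANTS)
print(variants_near(index, "chr1", 203155882, window=1000))
```

Sorting once per chromosome and using binary search keeps each lookup cheap even when the variant list is large.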
Yes, that's exactly the information I need. So I thought I'd ask: why not add that kind of functionality to Hail?