Extraction of BED files based on rsIDs


Hello everyone,
Congratulations, and I am looking forward to the 0.2 release.

I have been working on extracting variant information from GWAS data, and I thought of extracting data by rsID. I think it would be helpful because each variant has an rsID, which makes mapping very easy. I think it would also solve the transcripts problem (one variant can have many transcripts, but when extracting by rsID each has its own rsID, which helps with extraction).

Thank you


Hi Akhil,
Can you give an example of the input and output you’d like to see? That is, the variant annotations / BED file that should get produced.


Sorry for the late reply, Tim. Basically it's like this: if we have chromosome regions along with rsIDs, we can extract the genes with the corresponding SNP information based on the rsIDs. (I don't know whether what I described is correct.)


chrom  start      end        snps       P_VALUE   disease
chr1   203155882  203155882  rs4950928  1.00E-13  YKL-40 levels
chr13  40350912   40350912   rs7993214  2.00E-06  Psoriasis
chr15  78806023   78806023   rs8034191  3.00E-18  Lung cancer
chr1   159680868  159680868  rs2808630  7.00E-06  Lung cancer
chr3   190350461  190350461  rs7626795  8.00E-06  Lung cancer
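
To make the input concrete, here is a minimal sketch in plain Python (not Hail) that reads the table above into records keyed by rsID. The `GwasHit` type and the `parse_hits` helper are names made up for this illustration. The disease column contains spaces, so each line is split into at most six fields.

```python
from typing import NamedTuple


class GwasHit(NamedTuple):
    """One row of the example table (hypothetical record type)."""
    chrom: str
    start: int
    end: int
    rsid: str
    p_value: float
    disease: str


TABLE = """\
chrom start end snps P_VALUE disease
chr1 203155882 203155882 rs4950928 1.00E-13 YKL-40 levels
chr13 40350912 40350912 rs7993214 2.00E-06 Psoriasis
chr15 78806023 78806023 rs8034191 3.00E-18 Lung cancer
chr1 159680868 159680868 rs2808630 7.00E-06 Lung cancer
chr3 190350461 190350461 rs7626795 8.00E-06 Lung cancer
"""


def parse_hits(text):
    """Return a dict mapping rsID -> GwasHit, skipping the header line."""
    hits = {}
    for line in text.strip().splitlines()[1:]:
        # maxsplit=5 keeps multi-word disease names in one field.
        chrom, start, end, rsid, p_value, disease = line.split(None, 5)
        hits[rsid] = GwasHit(chrom, int(start), int(end), rsid,
                             float(p_value), disease)
    return hits


hits_by_rsid = parse_hits(TABLE)
```

With the rows keyed this way, `hits_by_rsid["rs8034191"]` gives the chromosome, position, p-value and disease for that SNP in one lookup.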

The output should look like this, in BED file format:

rsid chr TSS TTS genes_present_in_that_region

Also, if possible, could we extract the variants near transcription start sites based on rsIDs?
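
As a rough illustration of the requested output, the sketch below (plain Python, not Hail) emits one `rsid chr TSS TTS gene` row for every gene whose span contains a SNP. The gene names and coordinates in `GENES` are placeholders invented for this example, not real annotations, and `genes_for_snps` is a hypothetical helper.

```python
# Hypothetical gene annotations: (chrom, TSS, TTS, gene name).
# Names and coordinates are placeholders, not real annotations.
GENES = [
    ("chr1", 203148000, 203156000, "GENE_A"),
    ("chr1", 203160000, 203150000, "GENE_B"),  # minus strand: TSS > TTS
    ("chr13", 40000000, 40100000, "GENE_C"),
]

# Two SNPs taken from the example table: (rsid, chrom, position).
SNPS = [
    ("rs4950928", "chr1", 203155882),
    ("rs7993214", "chr13", 40350912),
]


def genes_for_snps(snps, genes):
    """Return (rsid, chrom, TSS, TTS, gene) for each gene containing a SNP."""
    rows = []
    for rsid, chrom, pos in snps:
        for gene_chrom, tss, tts, name in genes:
            # Order the endpoints so minus-strand genes are handled too.
            low, high = min(tss, tts), max(tss, tts)
            if gene_chrom == chrom and low <= pos <= high:
                rows.append((rsid, chrom, tss, tts, name))
    return rows


rows = genes_for_snps(SNPS, GENES)
for row in rows:
    print("\t".join(str(field) for field in row))
```

Note that strict BED puts chrom, start and end in the first three columns and uses 0-based, half-open coordinates; the column order here simply follows the layout asked for in the post. The nested loop is fine for a handful of SNPs but would need an interval index for genome-wide data.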

I don't know whether it's possible or not; it's just an idea.


Ah, so you need to take the information in the table and find variants nearby? e.g. the chr1:203155882 line should find variants on chr1 nearby 203155882.
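
Outside of Hail, the "find variants nearby" idea can be sketched in plain Python as a window lookup over positions sorted per chromosome. Everything here is an assumption for illustration: the `VARIANTS` catalogue uses placeholder IDs (`rsA`, `rsB`, `rsC`) and made-up positions next to two SNPs from the table, the helper names are hypothetical, and the 10 kb default window is arbitrary.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical variant catalogue: (chrom, position, rsid).
# rsA, rsB and rsC are placeholder IDs with made-up positions.
VARIANTS = [
    ("chr1", 203150000, "rsA"),
    ("chr1", 203155882, "rs4950928"),
    ("chr1", 203160500, "rsB"),
    ("chr1", 203900000, "rsC"),
    ("chr3", 190350461, "rs7626795"),
]


def build_index(variants):
    """Group variants by chromosome and sort each group by position."""
    by_chrom = defaultdict(list)
    for chrom, pos, rsid in variants:
        by_chrom[chrom].append((pos, rsid))
    for entries in by_chrom.values():
        entries.sort()
    return by_chrom


def variants_near(index, chrom, pos, window=10_000):
    """Return rsIDs on `chrom` within `window` bases of `pos` (inclusive)."""
    entries = index.get(chrom, [])
    positions = [p for p, _ in entries]
    low = bisect_left(positions, pos - window)
    high = bisect_right(positions, pos + window)
    return [rsid for _, rsid in entries[low:high]]


index = build_index(VARIANTS)
nearby = variants_near(index, "chr1", 203155882)
```

For the chr1:203155882 line this returns the variants between 203145882 and 203165882, including the query SNP itself. The binary search keeps each lookup cheap once the positions are sorted.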

That’s something that totally should be possible in Hail, but I’m not sure how to do it right now.

Let’s explore when 0.2 is out!


Yes, that's exactly the information I need. So I thought I'd ask: why not include that kind of functionality in Hail?