Compare two variant datasets

Hi all,

I am new to Hail. I couldn’t find a function for comparing two different VCF files in Hail. I am interested in comparing two datasets of biallelic variants only and finding which variants are common to both and which are novel.

Kindly help me regarding this.

Thanks in advance

Hi Krithika,

Thank you for your interest in using Hail! We are always glad to see new users take up the tool we work on to make genetic analyses easier.

To get started on your scientific question, I’d strongly suggest heading to hail.is and then working through our documentation at https://hail.is/docs/0.2/

This will teach you how to install Hail in your computing environment of choice, import your VCF files as matrix tables, annotate the variants with allele frequencies and functional annotations, and finally answer your question about common and novel variants in your files.

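To find which variants are shared between your files and which appear in only one of them, you can use an outer join on the variant keys (https://hail.is/docs/0.2/tutorials/06-joins.html). Variants that match on both sides of the join are the common ones; variants with no match on one side are unique to the other file.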
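
To make the outer-join idea concrete, here is a minimal plain-Python sketch of the logic. It does not use Hail itself and is not how Hail implements joins. It assumes each variant is identified by contig, position, reference allele and alternate allele, which mirrors the `locus` and `alleles` row keys Hail gives a matrix table imported from a VCF. The function names and the tiny in-memory VCF lines are hypothetical and only for illustration. In Hail, the same comparison would presumably be done by joining the two row tables (`mt.rows()`) with `Table.join(..., how='outer')`.

```python
from typing import Dict, Iterable, Set, Tuple

# (contig, position, ref, alt): mirrors Hail's `locus` + `alleles` row key.
Variant = Tuple[str, int, str, str]


def biallelic_variants(vcf_lines: Iterable[str]) -> Set[Variant]:
    """Collect variant keys from VCF text, keeping biallelic sites only."""
    variants: Set[Variant] = set()
    for line in vcf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        contig, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if "," in alt or alt == ".":
            continue  # multiallelic site or no alternate allele
        variants.add((contig, pos, ref, alt))
    return variants


def outer_join(left: Set[Variant], right: Set[Variant]) -> Dict[Variant, Tuple[bool, bool]]:
    """Outer join on the variant key: every variant from either side,
    flagged with (present in left, present in right)."""
    return {v: (v in left, v in right) for v in left | right}


# Hypothetical toy data standing in for two VCF files.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
dataset_a = [
    header,
    "1\t100\t.\tA\tG\t.\tPASS\t.",
    "1\t200\t.\tC\tT\t.\tPASS\t.",
    "1\t300\t.\tG\tA,C\t.\tPASS\t.",  # multiallelic, dropped by the filter
]
dataset_b = [
    header,
    "1\t100\t.\tA\tG\t.\tPASS\t.",
    "1\t400\t.\tT\tC\t.\tPASS\t.",
]

joined = outer_join(biallelic_variants(dataset_a), biallelic_variants(dataset_b))

shared = sorted(v for v, (in_a, in_b) in joined.items() if in_a and in_b)
only_a = sorted(v for v, (in_a, in_b) in joined.items() if in_a and not in_b)
only_b = sorted(v for v, (in_a, in_b) in joined.items() if in_b and not in_a)

print("shared:", shared)
print("only in A:", only_a)
print("only in B:", only_b)
```

Which side counts as "novel" depends on which dataset you treat as the reference: if dataset A is the known set, the variants found only in B are the novel ones.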