I have a “keep” file listing the rsIDs that I want to retain in my VCF files. What is the best way to filter the VCFs down to those rsIDs? Here is what I have so far:
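import hail as hl

rsidTable = hl.import_table(ids)  # this is my keep list
vcfs = hl.hadoop_ls(vcfloc)
for loc in vcfs:  # VCFs are split by chromosome
    name = loc['path'].split("/")[-1]  # bare file name, for labelling only
    # import from the full path; the bare file name alone may not resolve
    vcf = hl.import_vcf(loc['path'], reference_genome=None)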
Eventually, I’d like to consolidate the per-chromosome VCFs into a single file. Any help with any part of this is much appreciated, but right now I’m mainly stuck on how to subset the VCFs by rsID.
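
For reference, a minimal sketch of one way this might be done in Hail, not a definitive answer. It assumes the keep file is a tab-delimited table with a header column named `rsid` (a hypothetical name; adjust to the actual header), that all per-chromosome VCFs share the same samples and header so they can be imported together, and that `out_path` is a hypothetical output location. The function name `subset_vcfs_by_rsid` is also made up for illustration.

```python
def subset_vcfs_by_rsid(ids, vcfloc, out_path):
    """Keep only variants whose rsID appears in the keep file.

    ids      -- path to the keep file (assumed to have a header column 'rsid')
    vcfloc   -- directory holding the per-chromosome VCFs
    out_path -- hypothetical output path for the combined MatrixTable
    """
    # Imported here so this module can be loaded without Hail installed.
    import hail as hl

    # Key the keep list by rsID so it can be used as a lookup table.
    keep = hl.import_table(ids, key='rsid')

    # Collect the full paths of the files in the VCF directory.
    paths = [f['path'] for f in hl.hadoop_ls(vcfloc) if not f['is_dir']]

    # Passing a list of paths imports all chromosomes as one MatrixTable.
    mt = hl.import_vcf(paths, reference_genome=None)

    # Keep a row only if its rsid has a matching entry in the keep table.
    mt = mt.filter_rows(hl.is_defined(keep[mt.rsid]))

    # Write the combined, filtered dataset in Hail's native format.
    mt.write(out_path, overwrite=True)
    return mt
```

Importing the list of paths in one call handles the consolidation across chromosomes, so no explicit loop or union step is needed in this sketch. Exporting the result back to VCF is a further step not shown here.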