This is my first time using Hail, so I’m not sure what’s going on. I ran into a problem while trying to annotate a VCF file with gnomAD data.
gnomAD_raw = hl.read_table('/Users/irenaeuschan/Documents/Irenaeus/Scripts/gnomAD/gnomad.exomes.r2.1.1.sites.ht')
vcf = hl.import_vcf(vcf_file, reference_genome='GRCh37')
vcf.write(base_name + ".mt", overwrite=True)
mt = hl.read_matrix_table(base_name + ".mt")
mt = mt.distinct_by_row()
t = mt.make_table(separator='_')
ht_annotate = t.annotate(gnomad_age_dist = gnomAD_raw[t.locus, t.alleles]['age_hist_het']['bin_freq'][age_index])
ht_annotate.export("output/annotated.tsv")
The problem is at the annotate step. My VCF file contains only 15 rows (I’m testing with a small dataset for now), but the job has been stuck on the annotation stage for an hour, which doesn’t make sense for such a small input. Furthermore, once the annotate finally finishes, the export takes even longer.
Is there a reason this is happening, or am I missing something in how I’ve written the program?