My data is on GRCh38 and I want to filter it down to a set of SNPs. I am looking for something like this:
intervals = [hl.parse_locus(x, reference_genome='GRCh38') for x in ['4:144585304', '15:78575140']]
mt_hits = hl.filter_intervals(mt, intervals, keep = True)
This gives the error: TypeError: filter_intervals: parameter 'intervals': expected expression of type array<interval>, found list: [<LocusExpression of type locus>, <LocusExpression of type locus>]
intervals = [hl.parse_locus(x, reference_genome='GRCh38') for x in ['4:144585304', '15:78575140']]
mt_hits = mt.filter_rows(hl.literal(intervals).contains(mt.locus))
This will have the same performance as filter_intervals, but is much easier to use. See here:
Thank you. I am now using Hail version 0.2.15-652d93ae3419, but I get this error:
Error summary: HailException: Invalid locus '4:144585304' found. Contig '4' is not in the reference genome 'GRCh38'
I have also tried other loci that I am sure should be in GRCh38, but all give the same error.
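A likely cause, assuming Hail's built-in GRCh38 reference is in use: its contigs are named with a "chr" prefix ("chr4", "chr15"), whereas GRCh37 uses bare names ("4", "15"). If so, the locus strings need the prefix before they are parsed. Here is a minimal sketch; add_chr_prefix is a hypothetical helper written for illustration and is not part of Hail.

```python
def add_chr_prefix(locus: str) -> str:
    """Prefix the contig of a 'contig:position' string with 'chr' if it is missing.

    Hypothetical helper for illustration; it is not part of Hail.
    """
    contig, sep, position = locus.partition(":")
    if not sep or not contig or not position:
        raise ValueError(f"expected 'contig:position', got {locus!r}")
    if contig.startswith("chr"):
        return locus  # already uses GRCh38-style contig naming
    return f"chr{contig}:{position}"


grch38_loci = [add_chr_prefix(x) for x in ["4:144585304", "15:78575140"]]
print(grch38_loci)

# These strings would then take the place of the original ones, e.g.
#   [hl.parse_locus(x, reference_genome='GRCh38') for x in grch38_loci]
```

The helper only handles plain "contig:position" strings and does not rename the mitochondrial contig, so check the contig names in your own dataset before relying on it.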