I’m new to Hail, so I might have some very basic questions.
Currently, I am doing genotype QC and want to exclude chromosome Y for female participants. However, all I have been able to do is either filter the matrix table to female participants only,
mt_female = mt.filter_cols(mt.pheno.is_female)
or use hl.filter_intervals to remove chrY for both male and female participants,
intervals = [hl.parse_locus_interval(x, reference_genome='GRCh38') for x in ['chrY']]
mt_filtered = hl.filter_intervals(mt, intervals, keep = False)
Is it possible to remove chromosome Y from female participants only?
Hail represents genetic data in the MatrixTable, which is a structured matrix of fields. The cheat sheet will be very helpful for seeing some visual representations of transformations.
“Removing chromosome Y from female participants” is not a clearly defined operation on a matrix, because it means removing a block of rows (variants) for only some of the columns (samples), leaving something that isn’t actually a matrix.
Instead, you might want to filter the entries of female participants on chromosome Y. This means removing entries from the matrix, leaving a matrix that looks like Swiss cheese, with holes where the filtered entries used to be.
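As a rough sketch of that entry-level filter, something like the following should work. It assumes the column field mt.pheno.is_female from the question and a GRCh38-style contig name ('chrY'; on GRCh37 this would be 'Y'). The helper name drop_female_chrY_entries and its contig parameter are made up for illustration; the only Hail call used is MatrixTable.filter_entries with keep=False.

```python
def drop_female_chrY_entries(mt, contig='chrY'):
    """Filter out entries that belong to female samples on the given contig.

    `mt` is expected to be a Hail MatrixTable keyed by locus, with a boolean
    column field `mt.pheno.is_female` (an assumption taken from the question).
    """
    # True only for entries where the sample is female AND the variant is on chrY.
    female_on_y = mt.pheno.is_female & (mt.locus.contig == contig)
    # keep=False removes the matching entries; all rows and columns stay in place.
    return mt.filter_entries(female_on_y, keep=False)

# Usage, given an existing MatrixTable `mt`:
# mt_qc = drop_female_chrY_entries(mt)
```

The rows and columns are untouched here, so male samples keep their chrY genotypes and the table keeps its shape. Only the female/chrY entries are filtered out.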