Hello,
I’m trying to group a callset with two samples by variant type and by the genotype of each sample, and then count the number of variants in each category. To that end, I tried to annotate the rows with the genotype of the first sample: both_hcr_hl.annotate_cols( gt0 = both_hcr_hl.GT[0] )
I am getting the following error:
2023-06-18 11:44:06 Hail: ERROR: scope violation: ‘MatrixTable.annotate_cols: field ‘gt0’’ expects an expression indexed by [‘column’]
Found indices [‘row’, ‘column’], with unexpected indices [‘row’]. Invalid fields:
‘GT’ (indices [‘row’, ‘column’])
‘MatrixTable.annotate_cols: field ‘gt0’’ supports aggregation over axes [‘row’], so these fields may appear inside an aggregator function.
EDIT: oops! I missed your reply. Glad you found something that works!
GT is an entry field. That means there’s a value of GT for each sample at each variant. When you index GT, as in GT[0], you’re asking for the first allele of the genotype, not for the genotype of the first sample.
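To make the row / column / entry distinction concrete, here is a rough plain-Python sketch. It does not use Hail. The variant strings, the sample names, the dict layout and the variant_type helper are all made up for illustration. It only mimics the idea that an entry field has one value per (variant, sample) pair, and shows the kind of count the question is after.

```python
from collections import Counter

# Hypothetical toy data: two variants (rows) and two samples (columns).
variants = ["1:100:A:T", "1:200:C:CT"]
samples = ["s0", "s1"]

# An entry field has one value per (variant, sample) pair.
# Each genotype is modeled as a tuple of allele indices, e.g. (0, 1) for 0/1.
GT = {
    ("1:100:A:T", "s0"): (0, 1),
    ("1:100:A:T", "s1"): (1, 1),
    ("1:200:C:CT", "s0"): (0, 0),
    ("1:200:C:CT", "s1"): (0, 1),
}

# Indexing one genotype with [0] yields its first allele, not the first sample.
first_allele = GT[("1:100:A:T", "s0")][0]

# A per-sample (column) value has no row index, so it has to summarize
# all of that sample's rows, for example by counting its genotypes.
per_sample_counts = {s: Counter(GT[(v, s)] for v in variants) for s in samples}

# A per-variant (row) value can hold the first sample's genotype,
# because it is computed once for each row.
gt0_per_variant = {v: GT[(v, samples[0])] for v in variants}


def variant_type(variant):
    """Hypothetical helper: classify by ref/alt length only."""
    ref, alt = variant.split(":")[2:]
    return "SNP" if len(ref) == 1 and len(alt) == 1 else "indel"


# Count variants per (variant type, genotype of s0, genotype of s1).
category_counts = Counter(
    (variant_type(v), GT[(v, "s0")], GT[(v, "s1")]) for v in variants
)
```

In this toy model, gt0_per_variant is the per-row value the question was trying to build, and per_sample_counts is the kind of per-column value that the error message says must come from aggregating over rows.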
Hail is designed for datasets too large to fit in memory, so it doesn’t let you easily do things like get the genotype from a specific variant for every sample.
You can group the variants into variant types with something like: