Hi, I started looking into Hail this week and I am really amazed at how fast it is.
I want to calculate the MAC for all of my sites (> 100 million), but I am struggling with this.
I have an INFO/AC entry, which is an array with as many entries as there are ALT alleles.
I tried to get the max value like this, but no success:
What failed about this? I’d expect this to work fine.
Separately, what is the definition of “MAC” at a multiallelic site? Is it the allele count of the alt allele with the most observations? The fewest? The sum of all alts?
TypeError: max: parameter 'expr': expected expression of type int32 or int64 or float32 or float64, found <ArrayNumericExpression of type array<int32>>
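There are two max functions in Hail: hl.max and hl.agg.max. hl.max is an expression-level function that takes the maximum value of an array, while hl.agg.max is an aggregator that takes the max value along some aggregated axis (for example, across all samples at a row). That is why the error above complains about receiving an array<int32>: the aggregator expects a single numeric value per record, not an array. If you wanted to take the max value of some entry field (like GQ) per row, using hl.agg.max is correct, but INFO/AC is already an array on each row, so here we want hl.max.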