Since there’s no .median() function yet, I’ve got the following hack to calculate the median of an aggregator (here, the median GQ across all carriers):
va.median_gq =
let sorted_vals = gs.filter(g => g.isCalledNonRef && !isMissing(g.gq)).map(g => g.gq).collect().sort() in
if (sorted_vals.size == 0) NA: Double
else if (sorted_vals.size %% 2 == 1) sorted_vals[(sorted_vals.size/2).toInt]
else (sorted_vals[(sorted_vals.size/2).toInt] + sorted_vals[(sorted_vals.size/2).toInt - 1])/2.0
It’s pretty slow, though, since it collects and sorts every GQ value for each variant. Happy to hear any other solutions!
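One idea that avoids the sort, sketched in plain Python rather than in Hail's expression language: GQ is a small non-negative integer, so you can count how many times each value occurs and walk the counts until you reach the middle rank. This assumes every GQ falls in 0..99 (the usual cap, but check your data); `median_from_counts` and `max_value` are names made up for this illustration, not Hail functions. I haven't worked out how to express this inside an annotation, so treat it as a sketch of the approach only.

```python
def median_from_counts(values, max_value=99):
    """Median of small non-negative ints via counting, with no sort.

    Assumes every non-missing value lies in 0..max_value (hypothetical cap).
    Returns None when there are no non-missing values.
    """
    counts = [0] * (max_value + 1)
    n = 0
    for v in values:
        if v is None:  # skip missing GQ, like !isMissing(g.gq)
            continue
        counts[v] += 1
        n += 1
    if n == 0:
        return None

    # 0-based ranks of the middle element(s); they are equal when n is odd.
    lo_rank = (n - 1) // 2
    hi_rank = n // 2

    def value_at(rank):
        # Walk the cumulative counts until the given rank is covered.
        seen = 0
        for value, c in enumerate(counts):
            seen += c
            if seen > rank:
                return value

    return (value_at(lo_rank) + value_at(hi_rank)) / 2.0


print(median_from_counts([20, 99, 30]))      # odd number of values
print(median_from_counts([10, 20, 30, 40]))  # even number of values
```

The cost is one pass over the genotypes plus a walk over at most 100 counts, instead of an O(n log n) sort per variant.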