We need online median calculation in Hail. Currently, the closest you can get to a median is to compute a histogram of your data using
hist and then find the bin that contains the median (that is, take the weighted median of the bins, with each bin weighted by its count).
It would be nice if there were an
Aggregable[Numeric].median() aggregator. What are some strategies for computing this over RDDs?
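
For reference, here is a rough sketch of the histogram workaround described above, written in plain Python rather than against Hail's API. The function name median_bin and its edges/counts arguments are made up for illustration; it assumes you have already pulled the bin edges and integer per-bin counts out of the histogram result yourself.

```python
from bisect import bisect_left
from itertools import accumulate


def median_bin(edges, counts):
    """Return the (lower, upper) edges of the bin holding the median.

    edges  -- n + 1 ascending bin edges (hypothetical input layout)
    counts -- n non-negative integer counts, one per bin

    For an even number of observations this picks the bin of the
    lower median, since the two middle values may sit in different bins.
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("expected one more edge than counts")
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")

    # Running total of observations up to and including each bin.
    cumulative = list(accumulate(counts))

    # First bin whose running total reaches half of all observations.
    i = bisect_left(cumulative, total / 2)
    return edges[i], edges[i + 1]


if __name__ == "__main__":
    print(median_bin([0, 10, 20, 30, 40], [1, 2, 5, 2]))
```

This only narrows the median down to a bin, so the precision is limited by the bin width, which is why a real median() aggregator would still be useful.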