This is probably simple, but I have spent way too long on it. I have a table and would like to “de-duplicate” its rows, keeping only one row per group in id2: the row with the maximum value of val. I can manage it with a gnarly group_by() -> hl.agg.max() -> join(), but there must be a neater way. Essentially, I would like to turn this table:
id1  id2  val  more_columns
---------------------------
a    x    0.1  ...
b    x    0.3  ...
c    x    0.4  ...
d    y    0.2  ...
e    y    0.9  ...
f    y    0.5  ...
Into this:
id1  id2  val  more_columns
---------------------------
c    x    0.4  ...
e    y    0.9  ...
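
To pin down exactly what I mean, here is the logic as a plain-Python sketch. This is not Hail code, just the semantics I am after. The function name keep_max_per_group is made up for illustration, more_columns is left out, and keeping the first row on ties is an assumption (my real data may or may not have ties).

```python
# Plain-Python illustration of the desired result (NOT Hail code).
# The rows mirror the example table; more_columns is omitted for brevity.
rows = [
    {"id1": "a", "id2": "x", "val": 0.1},
    {"id1": "b", "id2": "x", "val": 0.3},
    {"id1": "c", "id2": "x", "val": 0.4},
    {"id1": "d", "id2": "y", "val": 0.2},
    {"id1": "e", "id2": "y", "val": 0.9},
    {"id1": "f", "id2": "y", "val": 0.5},
]


def keep_max_per_group(rows, key="id2", by="val"):
    """Return one full row per distinct `key`, the one with the largest `by`."""
    best = {}
    for row in rows:
        group = row[key]
        # Strict ">" keeps the first row seen when values tie (an assumption).
        if group not in best or row[by] > best[group][by]:
            best[group] = row
    return list(best.values())


deduped = keep_max_per_group(rows)
for row in deduped:
    print(row["id1"], row["id2"], row["val"])
```

The important part is that the whole winning row survives (id1 and every other column), not just the maximum of val per group, which is why a plain hl.agg.max() alone is not enough and I end up joining back.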