I have an entry field in a matrix table with two string values and I want to count them with the hl.agg.acounter but cannot figure out how to do this for each entry in the array.
ht = hl.utils.range_table(10)
ht = ht.annotate(arr = hl.range(0, 2).map(lambda _: hl.if_else(hl.rand_bool(0.5),'REF','PTV')))
ht.show()
idx. arr
int32 array<str>
0 ["REF","PTV"]
1 ["PTV","REF"]
2 ["REF","REF"]
3 ["PTV","PTV"]
4 ["PTV","PTV"]
5 ["REF","REF"]
6 ["PTV","REF"]
7 ["REF","REF"]
8 ["REF","REF"]
9 ["PTV","PTV"]
I would like to be able to count each occurance of âREFâ and âPTVâ with hl.agg.counter or something so I could get this:
{'PTV': 9, 'REF': 11}
I get close with something like this:
ht.aggregate(hl.agg.array_agg(lambda x: hl.agg.counter(x),ht.arr))
[{'PTV': 5, 'REF': 5}, {'PTV': 4, 'REF': 6}]
But then I need to run through that list and add each keys value etcâŚ
Or this, but then I need to know the actual field values, which I donât
ht.aggregate(
hl.dict([
('REF', hl.agg.sum(ht.arr.filter(lambda x: x == 'REF').size())),
('PTV', hl.agg.sum(ht.arr.filter(lambda x: x == 'PTV').size()))
])
)
{'PTV': 9, 'REF': 11}
Is there an easier way to do this?