I am loading the annotation dataset and trying to annotate each row of the main dataset with it:
nisc = hl.read_table('file:///NISC.ht')
res = hl.struct(**{'AC': nisc[mt.row_key].info.AC[mt.a_index-1],
                   'AF': nisc[mt.row_key].info.AF[mt.a_index-1]})
When I trigger the actual computation of the struct, it fails:
res.collect()
Hail version: 0.2.63-cb767a7507c8
Error summary: HailException: array index out of bounds: index=1, length=1
----------
Python traceback:
File "<ipython-input-7-b20bb2a7b3af>", line 1, in <module>
res = hl.struct(**{'AC': nisc[mt.row_key].info.AC[mt.a_index-1],
How can I see which row fails here, and why? I probably need to identify the row in NISC that causes the issue.
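One possible way to locate the offending rows is sketched below. The error text index=1, length=1 corresponds to a_index - 1 = 1, that is, a row with a_index equal to 2 joined to an AC array holding a single element. The helper name find_out_of_bounds_rows and the fields nisc_found and nisc_n_ac are made up for illustration. The sketch assumes that mt carries a_index as a row field (as the snippet above implies) and that nisc is keyed compatibly with mt.row_key; it has not been run against these datasets.

```python
def find_out_of_bounds_rows(mt, nisc):
    """Return a Hail Table of the rows of `mt` whose a_index does not fit
    into the AC array joined from `nisc` (hypothetical helper)."""
    # Imported here so that defining the helper does not itself need Hail loaded.
    import hail as hl

    # Join NISC onto the rows of mt; the result is missing where NISC has no such key.
    joined = nisc[mt.row_key]

    # Store the join status and the AC length as row fields (illustrative names).
    mt = mt.annotate_rows(
        nisc_found=hl.is_defined(joined),
        nisc_n_ac=hl.len(joined.info.AC),
    )

    # Keep only the fields needed for inspection; the row key is retained.
    rows = mt.rows().select('a_index', 'nisc_found', 'nisc_n_ac')

    # Index a_index - 1 exists only when a_index <= len(AC),
    # so the rows that would raise are those with a_index > len(AC).
    # Rows absent from NISC have a missing length and are dropped by filter.
    return rows.filter(rows.a_index > rows.nisc_n_ac)
```

Calling bad = find_out_of_bounds_rows(mt, nisc) and then bad.show() or bad.count() would list or count the keys at which the indexing fails, which can then be looked up in nisc directly.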
When I take another, non-problematic dataset, cidr, and run the same functions on it, the output is the same (same variant positions). That is wrong, because cidr has no such issue: it runs fine.
I have also counted the allegedly problematic rows of mt: with cidr (which in reality has none) there are 11034, while with nisc there are 4689. So the query mt.filter_rows(hl.len(nisc[mt.row_key].info.AC) >= mt.a_index - 1) does not seem to return correct results.
I verified that in these cases AC contains one item and a_index is 1, so these are not problematic rows.
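A possible explanation for the confusing counts, offered as a sketch rather than a confirmed diagnosis: filter_rows keeps the rows for which the condition is true, so the query above returns rows that pass the check rather than the failing ones. In addition, the condition len(AC) >= a_index - 1 is looser by one than what the indexing needs, since index a_index - 1 exists only when a_index - 1 < len(AC). The plain-Python comparison below uses two illustrative function names, original_predicate and index_is_valid, and needs no Hail.

```python
def original_predicate(length, a_index):
    """The condition used in the filter_rows query above."""
    return length >= a_index - 1


def index_is_valid(length, a_index):
    """True when the 0-based index a_index - 1 exists in an array of `length` items."""
    return a_index - 1 < length  # equivalently: a_index <= length


# (length of AC, a_index) pairs; (1, 2) is the case reported in the error message.
cases = [(1, 1), (1, 2), (2, 2), (2, 3)]

for length, a_index in cases:
    print(
        f"len={length} a_index={a_index} "
        f"original={original_predicate(length, a_index)} "
        f"valid={index_is_valid(length, a_index)}"
    )
```

For the pair (1, 2) the original condition is true while the index is out of bounds, so such a row would pass the filter and still raise on AC[a_index - 1]. Rows with one AC item and a_index of 1 satisfy both conditions, which is consistent with them appearing in the filtered result without being problematic.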