TypeError: struct() got multiple values for keyword argument 'AC' when updating matrix table with annotate_rows

Dear hail team,

I hope this question has not already been asked; at least, I couldn’t find an answer. I am trying to get familiar with Hail by doing the kind of filtering and simple operations I usually do on VCFs with tools like bcftools.
I am working with Hail version 0.2.57-582b2e31b8bd.
I split my VCF from multiallelic to biallelic with:
data_tmp_bi = hl.split_multi_hts(data_tmp)
Then I want to update the allele counts the same way as what I saw in your documentation:
data_tmp_bi = data_tmp_bi.annotate_rows(info = hl.struct(AC=data_tmp_bi.info.AC[data_tmp_bi.a_index - 1],**data_tmp_bi.info))
but I get this error:

  File "<ipython-input-29-3595a23add68>", line 1, in <module>
    data_tmp_bi = data_tmp_bi.annotate_rows(info = hl.struct(AC=data_tmp_bi.info.AC[data_tmp_bi.a_index - 1],**data_tmp_bi.info))

TypeError: struct() got multiple values for keyword argument 'AC'

The workaround I have been using is to create a temporary info field called ‘AC2’, drop the ‘AC’ field, recreate ‘AC’ from ‘AC2’ with annotate_rows, and finally drop ‘AC2’, which is rather long:

data_tmp_bi = data_tmp_bi.annotate_rows(info = hl.struct(AC2=data_tmp_bi.info.AC[data_tmp_bi.a_index - 1],**data_tmp_bi.info))
data_tmp_bi = data_tmp_bi.annotate_rows(info=data_tmp_bi.info.drop('AC'))
data_tmp_bi = data_tmp_bi.annotate_rows(info = hl.struct(AC=data_tmp_bi.info.AC2, **data_tmp_bi.info))
data_tmp_bi = data_tmp_bi.annotate_rows(info=data_tmp_bi.info.drop('AC2'))

On top of this, I also filter out some samples with filter_cols, after which I want to update fields like the allele count again. That means using the same workaround as above; otherwise I get the same error.

Do you have an idea of what the problem might be? Or a better way of doing this than what I am using?

So the problem here is that data_tmp_bi.info is a struct that already contains AC, and **data_tmp_bi.info passes every one of its fields as a keyword argument. So in this line:

data_tmp_bi = data_tmp_bi.annotate_rows(info = hl.struct(AC=data_tmp_bi.info.AC[data_tmp_bi.a_index - 1],**data_tmp_bi.info))

you’re saying "Make a struct called info, with a field called AC, and all of the old fields from data_tmp_bi.info". Since one of the old fields is AC, you’re defining AC twice, which isn’t allowed.
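The same collision can be reproduced in plain Python without Hail. This is only a minimal sketch: the `struct` function below is a hypothetical stand-in for any function that takes keyword arguments (it is not Hail's implementation), and the `info` dict with its values is made up for illustration.

```python
def struct(**fields):
    # Hypothetical stand-in for a keyword-only constructor such as hl.struct.
    # It is NOT Hail's code; it just collects its keyword arguments.
    return dict(fields)

# Made-up example data: an "info" mapping that already has an AC field.
info = {"AC": [3, 1], "AN": 10}

message = None
try:
    # AC is passed explicitly AND again through **info, so Python rejects the call
    # before struct() even runs.
    struct(AC=info["AC"][0], **info)
except TypeError as err:
    message = str(err)

print(message)
```

The error comes from Python's own call machinery, which is why it shows up as a TypeError rather than as a Hail error.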

If your goal is “Keep the rest of the info struct the same, just change the AC field”, the easiest way to do that is probably to do:

data_tmp_bi = data_tmp_bi.annotate_rows(info = data_tmp_bi.info.annotate(AC=data_tmp_bi.info.AC[data_tmp_bi.a_index - 1]))

That says "The new info field is determined by taking the old info field and overwriting AC".
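As a rough mental model of "take the old struct and overwrite one field", here is a plain-Python analogy using dicts. The `annotate` helper is hypothetical and only mimics the copy-then-overwrite behaviour described here; it is not how Hail implements struct annotation, and the data values are invented.

```python
def annotate(struct_fields, **updates):
    # Hypothetical helper, an analogy only: copy the old fields, then
    # overwrite (or add) the ones named in `updates`.
    new_fields = dict(struct_fields)
    new_fields.update(updates)
    return new_fields

# Made-up example: a multiallelic AC list and the 1-based index of the
# alternate allele kept after splitting.
info = {"AC": [3, 1], "AN": 10}
a_index = 2

new_info = annotate(info, AC=info["AC"][a_index - 1])
print(new_info)
```

Because AC is named only once in the call, there is no duplicate keyword; the old value is simply replaced and every other field is carried over unchanged.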

Alright I get the problem, thank you very much, it works perfectly this way!
However, in the split_multi_hts documentation (https://hail.is/docs/0.2/methods/genetics.html#hail.methods.split_multi_hts) what is written is the solution I tried, which is why I got confused.

Thanks for bringing that to my attention, let me go look into that and see why that’s there. Theoretically, when the website gets generated all of our examples in the documentation should get tested, so not sure what’s going on there.

For some reason, this example in our documents was marked to not be tested. You’re right that it was incorrect, and it should be resolved next time we release a version of hail and update our site. Fix here: