Mark
June 18, 2020, 4:44pm
I tried option number 2 but got the following error after the second line:
mt = mt.filter_rows(variants_to_include_set.contains(mt.locus))
TypeError: 'SetExpression.contains' expects 'item' to be the same type as its elements
set element type: 'struct{locus: locus}'
type of arg 'item': 'locus'
Any idea what I’m doing wrong?
Hey @Mark,
Sorry you’re running into this problem.
This is because the elements of your set are structs wrapping a locus, while mt.locus is a bare locus. Hail distinguishes between a struct containing a locus:
hl.struct(locus=hl.locus(...))
and a locus:
hl.locus(...)
You can fix this by wrapping the locus in a matching struct before the lookup:
mt = mt.filter_rows(variants_to_include_set.contains(hl.struct(locus=mt.locus)))
Alternatively, you can create a set that doesn’t have the unnecessary struct wrapper. Tim’s example in the other thread would look like this:
variants_to_include_set = hl.literal(set(variants_to_include.key.locus.collect()))
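If it helps to see the mismatch outside of Hail, here is a minimal plain-Python sketch of the same idea. It is only an analogy: Locus and KeyStruct are hypothetical stand-in classes, not Hail types, and plain Python quietly returns False where Hail raises the TypeError shown above.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; these are NOT Hail's types.
@dataclass(frozen=True)
class Locus:
    contig: str
    position: int


@dataclass(frozen=True)
class KeyStruct:
    locus: Locus  # a struct with a single "locus" field


# A set whose elements are structs wrapping a locus, like the collected keys.
wrapped = {KeyStruct(Locus("1", 100)), KeyStruct(Locus("2", 200))}

query = Locus("1", 100)

# A bare locus never equals a struct that contains it, so the lookup misses.
bare_hit = query in wrapped

# Wrapping the query in the same struct shape makes the element types match.
wrapped_hit = KeyStruct(query) in wrapped

print(bare_hit, wrapped_hit)
```

The second lookup mirrors the hl.struct(locus=mt.locus) fix: the item being looked up has to have the same shape as the set's elements.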
Mark
June 18, 2020, 6:04pm
Gotcha. Thanks! I couldn’t figure out how to get rid of the struct wrapper.
For completeness, let me note that you can also fix the set itself.
Any set or array that contains structs can “project” one field of the struct like this:
variants_to_include_set = variants_to_include_set.locus
You could also use map:
variants_to_include_set = variants_to_include_set.map(lambda a_struct: a_struct.locus)
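As a rough plain-Python analogy for the two approaches above (not Hail code; Locus and KeyStruct are hypothetical stand-in classes), projecting a field and mapping a function over the set should produce the same unwrapped set:

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; these are NOT Hail's types.
@dataclass(frozen=True)
class Locus:
    contig: str
    position: int


@dataclass(frozen=True)
class KeyStruct:
    locus: Locus


structs = {KeyStruct(Locus("1", 100)), KeyStruct(Locus("2", 200))}

# "Projection": pull the locus field out of every element.
projected = {s.locus for s in structs}

# The same result written as an explicit map over the elements.
mapped = set(map(lambda a_struct: a_struct.locus, structs))

# After unwrapping, a bare locus can be looked up directly.
found = Locus("1", 100) in projected

print(projected == mapped, found)
```

Either way, the set ends up holding bare loci, so the original contains(mt.locus) style of lookup has matching types.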