Hi,
I have a Python function that expects a string and performs some transformation on it.
def myfunction(str):
    return str.encode("utf-8").hex()
I would like to apply this function to one field of my Hail table, for each row:
ht = ht.annotate(
    hex = myfunction(ht.s)
)
But I get an error:
AttributeError: 'StringExpression' object has no attribute 'encode'
Is there a way to apply a function to each row? How do I “cast” my StringExpression into the string of the current row?
You’re asking about how to use user-defined functions in Hail. Unfortunately, this isn’t something we support: Hail’s backend is not running Python at all, so it’s not possible to call myfunction from our backend. All of the functionality exposed on Hail expressions is implemented natively in Hail’s compiler and execution engine.
It would be pretty easy for us to expose this particular example natively (encode a string to binary, then convert the binary to a hex string). Would that be helpful?
I see,
I found a way to do my transformation:
- Select the field I need to transform
- Export it to Pandas
- Apply my function there, since I am now in plain Python
- Import the result back as a Hail table from Pandas
- Stitch the new field back into my Hail table
That is good enough for my use case.
Transforming a single Hail table field does not require much compute.
I can then use the power of Hail for the more complicated genotype-level handling.
Thanks
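
For anyone landing here later, the steps listed above could be sketched roughly as follows. This is only a sketch and has not been run against any particular Hail version. The helper names to_hex and annotate_hex and the output field name "hex" are made up for illustration. It assumes the field is a string field that is small enough to fit in memory as a Pandas DataFrame, and it relies on Table.key_by, Table.select, Table.to_pandas, Table.from_pandas and a keyed join.

```python
def to_hex(s: str) -> str:
    """Hex-encode the UTF-8 bytes of a plain Python string."""
    return s.encode("utf-8").hex()


def annotate_hex(ht, field="s", out="hex"):
    """Add a hex-encoded copy of `field` to `ht` via a Pandas round trip.

    Hypothetical helper: `field` is assumed to be a string field of `ht`.
    """
    # Imported lazily so that to_hex stays usable without Hail installed.
    import hail as hl

    # 1. Select only the field to transform. The key is dropped first so the
    #    field can be selected whether or not it is part of the key.
    small = ht.key_by().select(field)

    # 2. Export to Pandas; keep one row per distinct, non-missing value.
    df = small.to_pandas()
    df = df.dropna(subset=[field]).drop_duplicates(subset=[field])

    # 3. Plain Python from here on, so the function can be applied directly.
    df[out] = df[field].map(to_hex)

    # 4. Import back into Hail, keyed by the original field.
    hex_ht = hl.Table.from_pandas(df, key=field)

    # 5. Stitch the new field back on with a keyed join. Rows whose `field`
    #    is missing get a missing value for `out`.
    return ht.annotate(**{out: hex_ht[ht[field]][out]})
```

With the table from the question this would be called as ht = annotate_hex(ht, "s"). Everything between to_pandas and from_pandas runs on the local machine, so this only makes sense when the selected field is modest in size.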