Is there an easy way to make a struct from a list of columns in a Hail table?

I would like to group some columns of a table into a struct for better organization…

I have this

phenotypes = ['Diabetes']
covariates = ['BMI', 'SEX', 'Ethnicity']

And want to go from this

to this:

I know how to do this MANUALLY, like this:

hta = ht.annotate(phenotypes = hl.struct(Diabetes = ht.Diabetes), covariates = hl.struct(BMI = ht.BMI, Ethnicity = ht.Ethnicity, SEX = ht.SEX))
hta = hta.drop('Diabetes','BMI','Ethnicity','SEX')

But I feel there is an easier and more automatable way (where I can use the name of the list as the annotation and the columns from the ht in the list as the values).


Yep! The trick is to do the following:

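# Map each column name in the list to the corresponding field expression on ht
cov_dict = {name: ht[name] for name in covariates}
pheno_dict = {name: ht[name] for name in phenotypes}
# Unpack each dict as keyword arguments to build one struct per group
hta = ht.annotate(covariates=hl.struct(**cov_dict), phenotypes=hl.struct(**pheno_dict))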

You can also avoid the separate drop by using transmute instead of annotate.
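
If the groupings are likely to change, the same pattern could be wrapped in a small helper. This is only a sketch: `group_fields` and its `make_struct` parameter are names made up here, not part of Hail, and the sample values are invented. The helper only assumes the table supports `table[name]` lookup, so the runnable demo uses a plain dict as a stand-in and shows the Hail call in a comment.

```python
def group_fields(table, groups, make_struct):
    """Return {group_name: struct of the listed columns}.

    Hypothetical helper (not a Hail function). `table` is anything that
    supports table[name]; `make_struct` builds a struct from keyword
    arguments (hl.struct when used with Hail).
    """
    return {
        group: make_struct(**{name: table[name] for name in names})
        for group, names in groups.items()
    }


groups = {
    'phenotypes': ['Diabetes'],
    'covariates': ['BMI', 'SEX', 'Ethnicity'],
}

# With Hail (assuming `ht` has these row fields), this would be:
#   hta = ht.transmute(**group_fields(ht, groups, hl.struct))

# Stand-in demo: a plain dict plays the role of one row, and `dict`
# plays the role of hl.struct, so the sketch runs without Hail.
row = {'Diabetes': True, 'BMI': 27.5, 'SEX': 'F', 'Ethnicity': 'A'}
grouped = group_fields(row, groups, dict)
```

Moving a column from one group to the other then only means editing `groups` and re-running the call.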

Excellent! Is transmute “cheaper” to run, or is Hail simply doing the drop itself, making it just a convenience function? I can imagine a user “changing their mind” about phenotypes vs. covariates for a GWAS, so this regrouping may happen more than once. If transmute is much more efficient, that would be great to know.

In any case, works like a charm!

It's purely a convenience: transmute is implemented as an annotate followed by a drop.