Is there an easy way to make a struct from list of columns in hail table?

thondeboer · November 16, 2021, 9:43am

I would like to group some columns of a table into a struct for better organization…

I have this

phenotypes = ['Diabetes']
covariates = ['BMI', 'SEX', 'Ethnicity']

And want to go from this

ht.show()

to this:

hta.show()

I know how to do this MANUALLY, like this:

hta = ht.annotate(phenotypes = hl.struct(Diabetes = ht.Diabetes), covariates = hl.struct(BMI = ht.BMI, Ethnicity = ht.Ethnicity, SEX = ht.SEX))
hta = hta.drop('Diabetes','BMI','Ethnicity','SEX')
hta.show()

But I feel there is an easier and more automatable way, where I can use the name of the list as the annotation and the columns from the ht in the list as the values…

Thanks!

johnc1231 · November 16, 2021, 1:24pm

Yep! The trick is to do the following:

cov_dict = { name : ht[name] for name in covariates}
pheno_dict = { name : ht[name] for name in phenotypes}
hta = ht.annotate( covariates = hl.struct(**cov_dict), phenotypes = hl.struct(**pheno_dict))

You can also avoid the drop if you do transmute instead of annotate.
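For reference, here is a minimal sketch of the transmute variant, wrapped in a helper so the groups can be changed easily. The function names field_exprs and group_fields are made up for this illustration and are not part of Hail; the sketch assumes ht is a Hail Table whose top-level fields include every name in the lists.

```python
def field_exprs(table, names):
    """Map each field name to the expression table[name], keeping list order."""
    return {name: table[name] for name in names}


def group_fields(ht, groups):
    """Move the listed top-level fields of a Hail Table into one struct per group.

    groups maps each new struct field name to a list of existing field names,
    e.g. {'phenotypes': ['Diabetes'], 'covariates': ['BMI', 'SEX', 'Ethnicity']}.
    """
    import hail as hl  # imported here so field_exprs works without Hail installed

    # Build one struct expression per group from the listed fields.
    structs = {
        group: hl.struct(**field_exprs(ht, names))
        for group, names in groups.items()
    }
    # transmute adds the new struct fields and drops the top-level fields
    # referenced in their expressions, so no separate drop() call is needed.
    return ht.transmute(**structs)
```

With the lists from the question, the call would be hta = group_fields(ht, {'phenotypes': phenotypes, 'covariates': covariates}). Moving a column between groups then only means editing the lists and calling the helper again on the original ht.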

thondeboer · November 16, 2021, 5:13pm

Excellent! Is transmute “cheaper” to run, or is hail simply doing a drop itself and is it just a convenience function? I can imagine that a user would like to “change their mind” about phenotypes vs covariates for GWAS etc. so it may happen more than once, so if it is much more efficient to do transmute, that would be great to know.

In any case, works like a charm!

tpoterba · November 16, 2021, 5:38pm

Totally convenience. transmute is implemented as annotate/drop.
