Hello,
I’d like to ask what would be the best way to randomly permute the columns of a Table or a MatrixTable? Is there a way similar to running df.columns = new_list as in pandas?
Thanks,
You can permute the columns of a matrix table with choose_cols.
Tables have fields, not columns, in Hail speak. We don’t yet have a nice way to turn a table into a matrix table row by row. You could create a single array field from many fields in the table, and then annotate_entries on a matrix table with the same row keys using the column index as array index. Alternatively you could export the table to TSV and import_matrix_table.
Just to clarify, choose_cols will permute both the entry and column-indexed fields. I’m not sure if this is specifically what you’re trying to do.
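For reference, a minimal sketch of a random column permutation with choose_cols. The helper names (random_col_order, permute_cols) and the seed parameter are made up for illustration; mt is assumed to be an existing MatrixTable. As noted above, entries and column fields move together, so this only reorders the columns.

```python
import random


def random_col_order(n_cols, seed=None):
    """Return the column indices 0..n_cols-1 in a random order."""
    rng = random.Random(seed)
    order = list(range(n_cols))
    rng.shuffle(order)
    return order


def permute_cols(mt, seed=None):
    """Reorder the columns of a MatrixTable (hypothetical helper).

    choose_cols takes a list of column indices; entry fields and
    column-indexed fields are reordered together.
    """
    return mt.choose_cols(random_col_order(mt.count_cols(), seed))
```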
I’m trying to obtain a set of null stats for linear_regression by permuting the genotypes. But since choose_cols permutes both the entry and the column-indexed fields together, I’m guessing it won’t give me permuted results after all?
Agreed. Could you just annotate with a bunch of permutations of the phenotype, leaving the genotypes in place?
That’ll be way better. Extract the phenotypes to a Python list, shuffle it, wrap it in hl.literal, and annotate using the column index.
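A rough sketch of those steps, assuming the phenotype lives in a column field called pheno with no missing values. The field name, the helper names, and the perm_phenos output field are all placeholders, not anything from Hail itself. Each column ends up with an array holding one value per permutation, while the genotypes stay in place.

```python
import random


def shuffled_copies(values, n_perms, seed=None):
    """Return n_perms independently shuffled copies of values."""
    rng = random.Random(seed)
    perms = []
    for _ in range(n_perms):
        copy = list(values)
        rng.shuffle(copy)
        perms.append(copy)
    return perms


def annotate_permuted_phenotypes(mt, pheno_field="pheno", n_perms=10, seed=None):
    """Add a column field perm_phenos: an array of n_perms shuffled phenotypes.

    Hypothetical helper; mt is assumed to be a MatrixTable with a numeric
    column field named by pheno_field.
    """
    import hail as hl  # imported here so the shuffling helper works without Hail

    phenos = mt[pheno_field].collect()  # one value per column
    perms = shuffled_copies(phenos, n_perms, seed)
    # Transpose so that per_col[i] holds every permuted value for column i.
    per_col = [list(vals) for vals in zip(*perms)]
    mt = mt.add_col_index()  # adds an int64 col_idx field
    return mt.annotate_cols(perm_phenos=hl.literal(per_col)[hl.int32(mt.col_idx)])
```

The permuted values could then be passed to the regression as separate responses, for example y=[mt.perm_phenos[i] for i in range(n_perms)] in hl.linear_regression_rows, assuming a Hail version that accepts a list for y.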