[Breaking change] New changes to select, annotate, key_by interface for 0.2



This PR will change the way keys on Tables and MatrixTables are dealt with in certain functions. (The doc links will be updated once this pull request lands.)

key_by(*exprs, **named_exprs) (and key_rows_by and key_cols_by)

is the only method that can modify key fields. The interface is identical to the current select interface: unnamed exprs must be field references (not necessarily top-level), while any other expression can be used as long as a name is provided for it. Former key fields that are not part of the new key are retained as value fields.
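The retention rule can be illustrated with a toy model in plain Python. This is a sketch, not Hail code: the `key_by` function and the `(key, values)` schema tuple below are hypothetical stand-ins, the strings passed as named arguments only represent expressions, and only top-level field names are handled.

```python
# Toy model of the key_by rule described above; this is NOT Hail code.
# A "schema" is a pair: (key field names, value field names).

def key_by(schema, *fields, **named):
    """Return a new schema keyed by `fields` plus the names in `named`.

    Unnamed arguments must name existing fields; named arguments
    introduce new fields (standing in for arbitrary expressions).
    """
    old_key, old_values = schema
    existing = old_key + old_values
    for f in fields:
        if f not in existing:
            raise KeyError(f"unnamed key must reference an existing field: {f}")
    new_key = list(fields) + list(named)
    # Former key fields that are not reused are retained as value fields.
    new_values = [f for f in existing if f not in new_key]
    return new_key, new_values


schema = (["locus", "alleles"], ["rsid", "qual"])
rekeyed = key_by(schema, "rsid", contig="locus.contig")
print(rekeyed)
```

In this model, locus and alleles move from the key to the value fields after rekeying, rather than disappearing.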

partition_rows_by(partition_key, *exprs, **named_exprs)

lets you specify a partition key from the key fields in the new matrix table. It is identical to key_rows_by, except that its first argument is a list of key fields to use as the partition key. This interface is still subject to change.

annotate(**named_exprs) (and annotate_rows and annotate_cols)

An attempt to annotate over a key field will cause an error. If you actually want to annotate over a key field, use key_by directly (if preserving the field as a key), or specify a new key and then call annotate.
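As a rough illustration of this restriction, here is a toy model in plain Python. It is not Hail code: `annotate` and the `(key, values)` schema tuple are hypothetical, and the exception type is an assumption, since the text only says that an error is raised.

```python
# Toy model of the annotate rule described above; this is NOT Hail code.
# A "schema" is a pair: (key field names, value field names).

def annotate(schema, **named):
    """Add or overwrite value fields; refuse to touch key fields."""
    key, values = schema
    clash = [name for name in named if name in key]
    if clash:
        # Assumption: the real error type and message may differ.
        raise ValueError(f"cannot annotate over key field(s): {clash}")
    # Existing value fields are overwritten in place; new ones are appended.
    new_values = values + [name for name in named if name not in values]
    return key, new_values


schema = (["locus", "alleles"], ["qual"])

# Overwriting a value field and adding a new one are both allowed.
annotated = annotate(schema, qual="qual * 2", af="ac / an")
print(annotated)

# Annotating over a key field is rejected.
try:
    annotate(schema, locus="some_expr")
except ValueError as err:
    print("rejected:", err)
```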

select(*exprs, **named_exprs) (and select_rows and select_cols)

likewise deals only with value fields. All key fields are automatically preserved; for example, if locus and alleles are already row keys in a MatrixTable, mt.select_rows() drops all the other row fields.

In order to drop a key field, remove it from the key first with key_by.
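The key-preserving behavior of select can be sketched with a toy model in plain Python. This is not Hail code: `select` and the `(key, values)` schema tuple are hypothetical. Rejecting a key name passed to select is an assumption of the sketch, as the text does not say how that case is handled.

```python
# Toy model of the select rule described above; this is NOT Hail code.
# A "schema" is a pair: (key field names, value field names).

def select(schema, *fields, **named):
    """Keep all key fields, plus only the requested value fields."""
    key, values = schema
    for f in fields:
        if f not in values:
            # Assumption: this sketch accepts value field names only.
            raise KeyError(f"not a value field: {f}")
    new_values = list(fields) + [n for n in named if n not in fields]
    return key, new_values


schema = (["locus", "alleles"], ["rsid", "qual"])

# With no arguments, every value field is dropped and the key survives.
print(select(schema))

# Selecting one value field and one named expression keeps the key too.
print(select(schema, "qual", af="ac / an"))
```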


Restrictions on what can be overwritten with transmute are identical to the restrictions on annotate. The main change is that referencing a key field is allowed but will not cause that field to be dropped. Referenced value fields will continue to be dropped.
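A toy model in plain Python may make the difference between key and value references concrete. This is not Hail code: `transmute` and the `(key, values)` schema tuple are hypothetical, and each new field is given as the list of fields its expression references, which stands in for real expression analysis.

```python
# Toy model of the transmute rule described above; this is NOT Hail code.
# A "schema" is a pair: (key field names, value field names).

def transmute(schema, **named_refs):
    """Add new fields and drop the value fields they reference.

    `named_refs` maps each new field name to the list of field names
    its expression references.
    """
    key, values = schema
    clash = [name for name in named_refs if name in key]
    if clash:
        # Same restriction as annotate; the error type is an assumption.
        raise ValueError(f"cannot transmute over key field(s): {clash}")
    referenced = {f for refs in named_refs.values() for f in refs}
    # Referenced value fields are dropped; key fields are never dropped.
    kept = [f for f in values if f not in referenced and f not in named_refs]
    return key, kept + list(named_refs)


schema = (["locus", "alleles"], ["ref_count", "alt_count", "qual"])
result = transmute(
    schema,
    total=["ref_count", "alt_count"],  # references two value fields
    contig=["locus"],                  # references a key field
)
print(result)
```

In this model, ref_count and alt_count are dropped because they are referenced value fields, while locus remains in the key even though contig references it.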

Log of breaking changes in 0.2 beta