Log of breaking changes in 0.2 beta


#1

This thread documents breaking changes as we work toward stabilizing the development branch (i.e., master, 0.2 beta) as Hail 0.2 proper.


#2

Removed as_array parameter from PCA

pca and hwe_normalized_pca no longer take an as_array parameter. They now always return scores and loadings as arrays (formerly the as_array=True option).

See the overview tutorial for example usage in GWAS, where PC1 becomes scores[0].
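
As a rough sketch of what the array form looks like in practice, the helper below runs PCA and stores the scores on the columns. The function name add_pc_scores is hypothetical. The sketch assumes a MatrixTable with a call field GT and a column key field s, and that pca is called on an expression alone and returns an (eigenvalues, scores, loadings) triple. Check the signature against your build.

```python
def add_pc_scores(mt, k=10):
    """Annotate the columns of `mt` with PCA scores stored as an array.

    Assumptions (not from the post): `mt` has a call field `GT` and a
    column key field `s`; `add_pc_scores` is a hypothetical helper name.
    """
    import hail as hl  # imported lazily so the module loads without Hail

    # No as_array flag: scores and loadings always come back as arrays.
    eigenvalues, scores, loadings = hl.pca(mt.GT.n_alt_alleles(), k=k)

    # `scores` is a table keyed by sample; its `scores` field is an array
    # with one entry per principal component.
    mt = mt.annotate_cols(scores=scores[mt.s].scores)
    return mt, eigenvalues
```

After this, PC1 is mt.scores[0], PC2 is mt.scores[1], and so on, which is the form that regression covariates would use.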


#3

Removed dataset parameter from eight methods

All methods that took a dataset and at least one required expression on that dataset no longer take a dataset parameter at all (the dataset is implicitly the source of the expression):

grm
linear_regression
logistic_regression
linear_mixed_regression
pc_relate
pca
rrm
skat
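
To make the pattern concrete, here is a minimal sketch using logistic_regression from the list above. The wrapper name run_logistic_regression is hypothetical, and the keyword names (test, y, x, covariates) are my reading of the beta signature, so treat them as assumptions. The point is only that the call receives expressions and no dataset.

```python
def run_logistic_regression(mt, case_status, covariates=()):
    """Call logistic_regression without a dataset argument.

    Assumptions (not from the post): `mt` has a call field `GT`,
    `case_status` and `covariates` are expressions on `mt`, and the
    keyword names below match your build of Hail.
    """
    import hail as hl  # imported lazily so the module loads without Hail

    # Formerly the dataset was passed as a leading argument. Now the
    # dataset is implied by the expressions themselves.
    return hl.logistic_regression(
        test='wald',
        y=case_status,
        x=mt.GT.n_alt_alleles(),
        covariates=list(covariates),
    )
```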



#4

Changed ys to y and schema in linear regression

The parameter ys on linear_regression is now y. Consistent with the other regression methods, when y is a single expression the linreg fields all have type float64.

When y is a list of expressions (even a list of one expression) the behavior is the same as before: the five y-dependent linreg fields have type array[float64].

The field n_complete_samples is now just n.

See the overview tutorial for example usage of the case where y is an expression. In particular, linear_regression_results.linreg.p_value[0].collect() becomes linear_regression_results.linreg.p_value.collect(), with no [0] index.
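
The two cases side by side, as a hedged sketch: the helper name gwas_p_values is hypothetical, and the x and covariates keywords, along with the default linreg root, are assumptions about the beta signature rather than something stated in this post.

```python
def gwas_p_values(mt, phenotype, covariates=()):
    """Return the p-value expression for both forms of `y`.

    Assumptions (not from the post): `mt` has a call field `GT`, and
    `phenotype` and `covariates` are expressions on `mt`.
    """
    import hail as hl  # imported lazily so the module loads without Hail

    x = mt.GT.n_alt_alleles()

    # y is a single expression: the linreg fields are float64.
    single = hl.linear_regression(y=phenotype, x=x, covariates=list(covariates))
    p_scalar = single.linreg.p_value

    # y is a list (even of length one): the y-dependent fields are
    # array[float64], so index to pick a phenotype.
    multi = hl.linear_regression(y=[phenotype], x=x, covariates=list(covariates))
    p_first = multi.linreg.p_value[0]

    return p_scalar, p_first
```

Calling .collect() on either returned expression should give one p-value per variant for the same phenotype.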


#5

See:


#6

Oops. See:


#7

ld_prune now takes a CallExpression instead of a matrix table. The new signature is ld_prune(call_expr, r2=0.2, window=1000000, memory_per_core=256).

See: https://github.com/hail-is/hail/pull/3518


#8

ld_prune no longer requires unphased genotypes (though it still makes no use of phasing information). The parameter window has also been renamed to bp_window_size.

See: https://github.com/hail-is/hail/pull/3575


#9

While we’re at it, ld_prune now returns a Table with just the fields ('locus', 'alleles'), containing the set of independent variants at that threshold (previously it returned the MatrixTable filtered to that set).
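
Since the result is now a Table of variants, getting the old filtered MatrixTable back takes one extra step. A minimal sketch follows. The helper name prune_by_ld is hypothetical, and it assumes a call field GT and that the returned table can be indexed by the row key of the matrix table.

```python
def prune_by_ld(mt, r2=0.2, bp_window_size=1000000):
    """Filter `mt` to the variants kept by ld_prune.

    Assumptions (not from the post): `mt` has a call field `GT`, and the
    returned table is keyed so that it can be indexed with `mt.row_key`.
    """
    import hail as hl  # imported lazily so the module loads without Hail

    # ld_prune takes the call expression, not the matrix table, and the
    # window parameter is named bp_window_size.
    pruned = hl.ld_prune(mt.GT, r2=r2, bp_window_size=bp_window_size)

    # Keep only rows whose (locus, alleles) key appears in the pruned table.
    return mt.filter_rows(hl.is_defined(pruned[mt.row_key]))
```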