Log of breaking changes in 0.2 beta

jbloom · April 5, 2018, 3:42pm

This thread documents breaking changes as we work toward stabilizing the development branch (i.e., master, 0.2 beta) as Hail 0.2 proper.

jbloom · April 5, 2018, 3:49pm

Removed as_array parameter from PCA

pca and hwe_normalized_pca no longer take an as_array parameter. They now always return scores and loadings as arrays (formerly the as_array=True option).

See the overview tutorial for example usage in GWAS, where PC1 becomes scores[0].

https://github.com/hail-is/hail/pull/3280
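
For anyone updating old code, here is a minimal sketch of the index shift in plain Python rather than Hail. It assumes the old struct-style scores used field names PC1, PC2, ... (inferred from "PC1 becomes scores[0]" above); the helper struct_to_array is hypothetical and not part of Hail.

```python
def struct_to_array(old_scores):
    """Convert {'PC1': ..., 'PC2': ...} into a list ordered by component number."""
    k = len(old_scores)
    # Component numbers are 1-based in the assumed old field names,
    # while the array form is 0-based.
    return [old_scores[f"PC{i}"] for i in range(1, k + 1)]


old_row = {"PC1": 0.12, "PC2": -0.40, "PC3": 0.05}  # made-up values
scores = struct_to_array(old_row)
first_pc = scores[0]  # what used to be read as PC1
```

The only thing to watch for is the off-by-one: PCk is now scores[k - 1].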

jbloom · April 5, 2018, 3:56pm

Removed dataset parameter from eight methods

All methods that took a dataset and at least one required expression on that dataset no longer take a dataset parameter at all (the dataset is implicitly the source of the expression):

grm
linear_regression
logistic_regression
linear_mixed_regression
pc_relate
pca
rrm
skat

https://github.com/hail-is/hail/pull/3211
https://github.com/hail-is/hail/pull/3262
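
To illustrate why the dataset argument became redundant, here is a toy model in plain Python. The classes Dataset and Expression and the functions mean_old and mean_new are hypothetical stand-ins, not Hail's API; the point is only that an expression can carry a reference to the dataset it came from.

```python
# Toy model only: these classes are hypothetical and are not Hail's API.
class Dataset:
    def __init__(self, columns):
        self.columns = columns  # dict: field name -> list of values

    def __getitem__(self, name):
        # Every expression remembers the dataset it was built from.
        return Expression(source=self, name=name)


class Expression:
    def __init__(self, source, name):
        self.source = source
        self.name = name

    def values(self):
        return self.source.columns[self.name]


def mean_old(dataset, expr):
    # Old style: the caller passes both the dataset and an expression on it,
    # so the function has to check that the two agree.
    if expr.source is not dataset:
        raise ValueError("expression does not come from this dataset")
    return sum(expr.values()) / len(expr.values())


def mean_new(expr):
    # New style: the dataset is recovered from the expression itself.
    return sum(expr.values()) / len(expr.values())


ds = Dataset({"height": [1.5, 1.75, 2.0]})
old_result = mean_old(ds, ds["height"])
new_result = mean_new(ds["height"])
```

Dropping the parameter also removes the possibility of passing a dataset that does not match the expression.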

jbloom · April 5, 2018, 4:12pm

Changed ys to y and schema in linear regression

The parameter ys on linear_regression is now y. When y is a single expression, the linreg fields all have type float64, consistent with the other regression methods.

When y is a list of expressions (even a list of one expression), the behavior is the same as before: the five y-dependent linreg fields have type array[float64].

The field n_complete_samples is now just n.

See the overview tutorial for example usage of the case where y is a single expression. In particular, linear_regression_results.linreg.p_value[0].collect() becomes linear_regression_results.linreg.p_value.collect(), without the [0].

https://github.com/hail-is/hail/pull/3295
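
The "single expression in, scalar out; list in, array out" convention can be sketched in plain Python. This is a toy stand-in, not Hail's implementation: Phenotype, slope, and toy_linreg are hypothetical names, and the result key "slope" is illustrative rather than an actual linreg field.

```python
from dataclasses import dataclass


@dataclass
class Phenotype:
    values: list  # one response value per sample


def slope(y, x):
    """Least-squares slope of y on x, with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def toy_linreg(y, x):
    if isinstance(y, list):
        # A list of phenotypes (even of length one) gives array-valued results.
        return {"slope": [slope(p.values, x) for p in y]}
    # A single phenotype gives scalar results.
    return {"slope": slope(y.values, x)}


x = [0.0, 1.0, 2.0, 3.0]
pheno = Phenotype([1.0, 3.0, 5.0, 7.0])

single = toy_linreg(pheno, x)     # scalar result, no [0] needed
as_list = toy_linreg([pheno], x)  # array result, still indexed with [0]
```

With a single phenotype the result is read directly, while the one-element list form still needs [0].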

wang · May 9, 2018, 5:26pm

See:

tpoterba · May 9, 2018, 5:31pm

Oops. See:

Meredith_Accum · May 11, 2018, 10:52pm

ld_prune has changed to take a CallExpression instead of a matrix table. The new signature is ld_prune(call_expr, r2=0.2, window=1000000, memory_per_core=256).

See: https://github.com/hail-is/hail/pull/3518

jbloom · May 14, 2018, 8:20pm

ld_prune no longer requires unphased genotypes (though it still makes no use of phasing information). And the parameter window has been renamed bp_window_size.

See: https://github.com/hail-is/hail/pull/3575

konradjk · May 14, 2018, 8:33pm

While we’re at it, it also now returns a Table with just ('locus', 'alleles'), containing the set of independent variants at that threshold (previously it returned the MatrixTable filtered to that set).
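
If you still want the filtered dataset, the idea is to keep only the rows whose key appears in the returned table. Below is a plain-Python sketch of that filter using toy data, not Hail objects. In Hail itself this would typically be a keyed filter along the lines of mt.filter_rows(hl.is_defined(pruned[mt.row_key])), but that line is an assumption about the API rather than something stated in this thread.

```python
# Toy stand-ins: plain Python data, not Hail objects. Values are made up.
rows = [
    {"locus": "1:1000", "alleles": ("A", "T"), "af": 0.10},
    {"locus": "1:1050", "alleles": ("G", "C"), "af": 0.11},
    {"locus": "2:500", "alleles": ("C", "CT"), "af": 0.30},
]

# Shape of the new return value: only the keys of the retained variants.
pruned_keys = {
    ("1:1000", ("A", "T")),
    ("2:500", ("C", "CT")),
}

# Recover the old behavior by keeping rows whose key is in the pruned set.
filtered_rows = [
    row for row in rows
    if (row["locus"], row["alleles"]) in pruned_keys
]
```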

cseed · June 18, 2018, 6:15pm

See:

danking · July 12, 2018, 12:46am

see:

jbloom · July 18, 2018, 1:25am

see:

jbloom · July 31, 2018, 1:09pm

see:

jbloom · July 31, 2018, 1:10pm

see:

jbloom · July 31, 2018, 10:29pm

see:

jbloom · August 6, 2018, 1:44pm

see:

konradjk · August 13, 2018, 10:07am

Minor breaking change: hl.min_rep() now returns a struct with fields locus (a LocusExpression) and alleles (an ArrayExpression of type str). This makes it much easier to apply min_rep and re-key in one step, as in:

mt = mt.key_rows_by(**hl.min_rep(mt.locus, mt.alleles))
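
The reason the one-liner above works is that a result with named fields can be unpacked with ** into keyword arguments. Here is a simplified plain-Python sketch of that pattern. toy_min_rep is not Hail's implementation (it ignores star alleles and works on a bare integer position), and rekey is a hypothetical stand-in for a method that takes named key fields.

```python
def toy_min_rep(position, alleles):
    """Trim bases shared by all alleles: suffix first, then prefix."""
    alleles = list(alleles)
    # Trim a shared trailing base while every allele keeps at least one base.
    while min(len(a) for a in alleles) > 1 and len({a[-1] for a in alleles}) == 1:
        alleles = [a[:-1] for a in alleles]
    # Trim a shared leading base, advancing the position each time.
    while min(len(a) for a in alleles) > 1 and len({a[0] for a in alleles}) == 1:
        alleles = [a[1:] for a in alleles]
        position += 1
    # Named fields, so the result can be unpacked with **.
    return {"locus": position, "alleles": alleles}


def rekey(locus, alleles):
    # Hypothetical stand-in for a method that accepts key fields by name.
    return (locus, tuple(alleles))


trimmed = toy_min_rep(100, ["CTAG", "CTG"])
key = rekey(**trimmed)
unchanged = toy_min_rep(5, ["A", "G"])
```

Here CTAG/CTG at position 100 reduces to TA/T at position 101, and an already-minimal SNP is left alone.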

wang · August 15, 2018, 6:30pm

minor change:

the parameter names of hl.rand_unif(min, max) are changing to lower and upper.

https://github.com/hail-is/hail/pull/4145
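
For scripts that pass these arguments by keyword, a small rename table can help when updating call sites. This is a hypothetical migration aid in plain Python, not something Hail provides; the table only lists the parameter renames mentioned in this thread.

```python
# Parameter renames mentioned in this thread, keyed by function name.
# Note: ys -> y also changes result types when y is a single expression.
RENAMES = {
    "linear_regression": {"ys": "y"},
    "ld_prune": {"window": "bp_window_size"},
    "rand_unif": {"min": "lower", "max": "upper"},
}


def migrate_kwargs(function_name, kwargs):
    """Return a copy of kwargs with old parameter names replaced."""
    renames = RENAMES.get(function_name, {})
    migrated = {}
    for name, value in kwargs.items():
        new_name = renames.get(name, name)
        if new_name in migrated:
            # Guard against a call that mixes the old and new spelling.
            raise ValueError(f"both old and new names given for {new_name!r}")
        migrated[new_name] = value
    return migrated


unif_kwargs = migrate_kwargs("rand_unif", {"min": 0.0, "max": 1.0})
prune_kwargs = migrate_kwargs("ld_prune", {"r2": 0.2, "window": 1000000})
```

Positional calls such as hl.rand_unif(0.0, 1.0) are unaffected by a keyword rename.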
