Hail 0.1 is live!

tpoterba · May 15, 2017, 5:25pm

Hail is 0.1! What does that mean?

0.1 is Hail’s first stable release. “Stable release” can connote a lot of things, so here’s what we mean:

We will not be changing any interfaces in this version. Thanks to all of you for your patience as we broke your Hail scripts every week!
We will support this version by fixing bugs and answering questions about its interfaces until it is deprecated. Deprecation will probably happen 3 months after the next stable version (0.2) is released.
We will continue to add features (without changing existing interfaces!) to this version for now. This will probably stop in a month or so, as we shift to building features against changing development code for stable release in 0.2.
We will continue automatically deploying new builds to the gs://hail-common bucket. We may change how this is done in the next few days, though, so keep an eye on this forum!

Changes in 0.1

We’ve been saving up a pile of changes in the last month or so, in order not to break your scripts every few days. These changes all appear at once in the new 0.1 build. Here they are:

KeyTables: your new best friend for annotation and filtering

Reworked annotate_samples_table and annotate_variants_table. These are essentially the functionality of the old annotate_variants_keytable and annotate_samples_keytable, with more. For example, annotate_variants_table can be used to annotate variants by intervals, variants, or loci.
Many annotation methods have disappeared! These can be implemented in terms of annotate_samples_table and annotate_variants_table. Some examples of removed methods:
- annotate_samples_fam
- annotate_variants_intervals
- annotate_variants_bed
Added KeyTable.import_fam, KeyTable.import_interval_list, and KeyTable.import_bed. These should be used with the new annotate methods.
Reworked HailContext's [import_table](https://hail.is/hail/hail.HailContext.html#hail.HailContext.import_table) (renamed from import_keytable). The TextTableConfig object is gone, and all the parameters are now simply arguments to this method. Additionally, fields in tables imported with no_header=True will be named "f0"…"fN" instead of "_0"…"_N".
Added filter_variants_table and filter_samples_table. These take KeyTables with key Variant, Locus, or Interval (for variants) or String (for samples). Filtering from interval lists should go through this method. Don’t worry – we’ve kept the efficiency demonstrated in this blog post!

Pedigree as a first-class object

We’ve added first-class Python objects for Pedigree and Trio. Read fam files with the static read method.
The mendel_errors and tdt methods now take a pedigree as an argument, instead of a fam file.

Regression

Removed counts from LMM regression variant annotations. See the new schema in the docs here.
Added dosage option to logreg.

GRM

The grm method now returns a KinshipMatrix object, which has new methods to export to all of the old GRM PLINK formats. grm no longer takes a format argument.

Dosage

We used the term “dosage” in a very confusing (and incorrect) way. g.dosage() in the expression language is now the expected number of alternate alleles given the genotype probabilities. g.gp() is the linear-scaled genotype probabilities from which dosage is computed. This does not involve changes to the regression interfaces, though using g.dosage() for a variant as a covariate will now be correct.
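To make the new definition concrete, here is a minimal plain-Python sketch of the arithmetic (this is not Hail code). It assumes a biallelic variant and that the probabilities are ordered hom-ref, het, hom-alt; the function name dosage_from_gp is made up for illustration.

```python
def dosage_from_gp(gp):
    """Expected number of alternate alleles for one biallelic genotype.

    gp is assumed to be [P(hom-ref), P(het), P(hom-alt)] on a linear scale.
    This is an illustration of the arithmetic only, not Hail's implementation.
    """
    if len(gp) != 3:
        raise ValueError("expected three genotype probabilities (biallelic)")
    p_hom_ref, p_het, p_hom_alt = gp
    # 0 alt alleles * P(hom-ref) + 1 * P(het) + 2 * P(hom-alt)
    return 0.0 * p_hom_ref + 1.0 * p_het + 2.0 * p_hom_alt


# A confident heterozygous call carries one expected alternate allele.
het_dosage = dosage_from_gp([0.0, 1.0, 0.0])
# An uncertain call lands between the hard-call values.
uncertain_dosage = dosage_from_gp([0.1, 0.6, 0.3])
```

Under these assumptions the result always falls between 0 and 2, which is what you want when using it as a covariate.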

Count

We changed the count method. It no longer takes a genotypes boolean parameter or returns a dict: instead, it returns a tuple of (number of samples, number of variants).
We added the summarize method, which is an excellent way to get a broad sense of a dataset’s contents.

Filter variants with Python objects

Added filter_variants_list, which takes a list of Python Variant objects. This method is very fast (does pushdown, see here).


`concordance` and `mendel_errors` return key tables

concordance and mendel_errors now both return KeyTable objects. concordance used to return the global statistics as a 2d list, and two variant datasets. mendel_errors used to write four files. The docs explain these things clearly.

TakeBy

The takeBy aggregator now takes elements from smallest to largest (used to be largest to smallest, but this was not clearly documented).
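If the new ordering is hard to picture, here is a rough plain-Python analogy (not the Hail expression language). The helper take_by is hypothetical; it only mimics "keep the n elements with the smallest key, in ascending order".

```python
import heapq


def take_by(items, key, n):
    """Return the n items with the smallest key, smallest first.

    A Python analogy for the new ordering; not how Hail implements takeBy.
    """
    return heapq.nsmallest(n, items, key=key)


depths = [30, 5, 18, 42, 11]

# New behavior: smallest to largest.
smallest_two = take_by(depths, lambda d: d, 2)

# In this analogy, negating a numeric key recovers the old
# largest-to-smallest ordering.
largest_two = take_by(depths, lambda d: -d, 2)
```

If an existing script relied on the old largest-first ordering, its results will change silently, so it is worth checking any takeBy calls.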

Method renames

HailContext.import_keytable => import_table
HailContext.read_keytable => read_table
VariantDataset.annotate_global_py => annotate_global
VariantDataset.from_keytable => from_table
VariantDataset.variants_keytable => variants_table
VariantDataset.samples_keytable => samples_table
VariantDataset.genotypes_keytable => genotypes_table
VariantDataset.filter_variants_intervals => filter_intervals (some functionality moved to filter_variants_table)
KeyTable.column_names => columns
KeyTable.num_rows => count
Plus: Arguments are renamed in many functions (“code” is replaced with “expr” most places).
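If you have a lot of old scripts, something like the following plain-Python sketch could help find calls that need updating. It only knows the renames listed above; the helper name find_renamed_calls is made up, it does not touch renamed arguments, and it reports rather than rewrites because some functionality of filter_variants_intervals moved elsewhere.

```python
import re

# Old method name -> new method name, taken from the list above.
RENAMES = {
    "import_keytable": "import_table",
    "read_keytable": "read_table",
    "annotate_global_py": "annotate_global",
    "from_keytable": "from_table",
    "variants_keytable": "variants_table",
    "samples_keytable": "samples_table",
    "genotypes_keytable": "genotypes_table",
    "filter_variants_intervals": "filter_intervals",
    "column_names": "columns",
    "num_rows": "count",
}

# Match ".old_name" as a whole attribute, so that e.g.
# ".annotate_variants_keytable" is not mistaken for ".variants_keytable".
_PATTERN = re.compile(r"\.(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def find_renamed_calls(source):
    """Return (line number, old name, new name) for each renamed method used."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


old_script = (
    "kt = hc.import_keytable('samples.tsv')\n"
    "n = kt.num_rows()\n"
    "vds = vds.annotate_variants_keytable(kt)\n"
)
hits = find_renamed_calls(old_script)
```

Note that the third line of the example script is not reported: annotate_variants_keytable was reworked rather than renamed, so it needs a manual look.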

Removed functionality

aggregate_intervals: this method is easy to implement in a few lines of KeyTable methods now.
annotate_global_table: rarely used functionality mostly replaced by KeyTable support. It is still possible to put tables into global annotations with KeyTable.collect() and VariantDataset.annotate_global.
annotate_global_list: same as above.
IntervalTree: this object is totally obviated by the deep support for Interval-keyed KeyTables.
