Hail command stalls or "no space left on device" with VEP annotation

djw · June 1, 2022, 4:19pm

Hi all,

I am trying to work with a VEP-annotated version of the Genebass variant-level summary statistics. When I do, Hail either stalls (the progress bar stops) or fails with FatalError: IOException: No space left on device, even though I am not writing any file to storage, as you can see from the code below.

Does this sound like insufficient memory/RAM, or some other issue? If it is memory, what would be good parameters for setting up a VM with hailctl for this case? My current configuration is the hailctl default (hailctl dataproc start cluster_name).

Here is example code which generates the error (it runs fine if you exclude the annotate_rows command):

import hail as hl

# Load Genebass variants
genebass_variant = hl.read_matrix_table('path_to_genebass_variants')
genebass_variant = genebass_variant.key_rows_by(genebass_variant.markerID)

# Load the VEP annotations and key them by markerID
vep_ht = hl.read_table("path_to_genebass_vep_hailtable")
vep_ht = vep_ht.key_by("markerID")

# Filter variants and annotate them with VEP
genebass_variant = genebass_variant.filter_rows(genebass_variant.annotation == "missense")
genebass_variant = genebass_variant.annotate_rows(vep = vep_ht[genebass_variant.markerID].vep)
genebass_variant = genebass_variant.filter_rows(genebass_variant.gene == "PCSK9")
genebass_variant.entries().show(10)

Thank you! -Dan

danking · June 2, 2022, 2:00pm

It’s hard to know exactly what the problem is without the full stack trace, but my guess is that Spark is using HDFS to re-order your data in the key_by. You can avoid that by explicitly initializing Hail and specifying a temporary directory:

hl.init(tmp_dir='gs://my-bucket/tmp')

danking · June 2, 2022, 2:01pm

Also, I recommend against using entries(). That is an inefficient representation of the entries of a matrix table. If you want to look at the entries of a matrix table, just show the matrix table itself:

genebass_variant.show()
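
For reference, here is a minimal sketch of how the two suggestions above might fit into the original script. It is not a tested fix for this dataset: the function name show_pcsk9_missense and its parameters are hypothetical, the paths and the bucket are placeholders, and the field names (markerID, annotation, gene, vep) are taken from the code in the first post.

```python
def show_pcsk9_missense(variants_path, vep_path, tmp_dir, n_rows=10):
    """Annotate missense PCSK9 variants with VEP and preview the matrix table.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Deferred import so that defining this function does not require Hail.
    import hail as hl

    # Initialize Hail explicitly and point its temporary files at a bucket
    # (placeholder such as 'gs://my-bucket/tmp') rather than cluster disks.
    hl.init(tmp_dir=tmp_dir)

    # Load the Genebass variants and key the rows by markerID.
    mt = hl.read_matrix_table(variants_path)
    mt = mt.key_rows_by(mt.markerID)

    # Load the VEP annotations and key them the same way.
    vep_ht = hl.read_table(vep_path)
    vep_ht = vep_ht.key_by("markerID")

    # Same filter / annotate / filter sequence as in the original post.
    mt = mt.filter_rows(mt.annotation == "missense")
    mt = mt.annotate_rows(vep=vep_ht[mt.markerID].vep)
    mt = mt.filter_rows(mt.gene == "PCSK9")

    # Show the matrix table itself instead of calling entries().
    mt.show(n_rows=n_rows)
    return mt
```

You would call it as show_pcsk9_missense('path_to_genebass_variants', 'path_to_genebass_vep_hailtable', 'gs://my-bucket/tmp'), substituting your own paths. Whether this resolves the stall depends on the actual cause, which the full stack trace would confirm.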
