Table.export issue

Hi there,

I have been trying to export a table with the following:


and it is taking quite a while (over an hour so far). Is there a reason for the hang? It usually then exits without exporting the file.

The lag also occurs with:

Any tips would be appreciated.


Hail is lazy, which means that an entire pipeline (reads / filters / annotates / methods / export) will be executed when you call table.export(). It’s probably something else that’s slow, not the export. What’s the full pipeline?
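To make the laziness point concrete, here is a minimal pure-Python sketch (not Hail's actual implementation, just an analogy) of how a lazily evaluated pipeline defers all work until the terminal action:

```python
# Not Hail itself -- a toy illustration of lazy evaluation: each transform
# only records work to do, and nothing runs until a terminal action
# (here, `export`) forces the whole pipeline.

class LazyTable:
    def __init__(self, rows, ops=None):
        self._rows = rows
        self._ops = ops or []  # deferred transformations

    def filter(self, pred):
        # Returns a new pipeline; no rows are touched yet.
        return LazyTable(self._rows, self._ops + [("filter", pred)])

    def annotate(self, fn):
        return LazyTable(self._rows, self._ops + [("map", fn)])

    def export(self):
        # Only now does every queued filter/annotate actually execute,
        # which is why a "slow export" is usually a slow upstream step.
        out = iter(self._rows)
        for kind, fn in self._ops:
            out = filter(fn, out) if kind == "filter" else map(fn, out)
        return list(out)

t = LazyTable(range(10)).filter(lambda x: x % 2 == 0).annotate(lambda x: x * 10)
print(t.export())  # all the work happens on this call
```

The same principle applies to Hail: timing `table.export()` is really timing the entire pipeline behind it.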

Is it resource heavy? I was using more nodes prior to this and never had any issues. Currently I am using only 2 CPU nodes on a cluster with at least 30G of RAM.

The pipeline is a simple seg_by_carrier and filtering one which I’ve used before, but changed slightly.

The most notable change was annotating using the gnomad mt instead of a custom gnomad entry:

g = hl.read_table('/path/to/')
mt = mt.annotate_rows(
    gnomad=hl.struct(
        nfe=g[mt.row_key].freq[2],
        popmax=g[mt.row_key].popmax,
        split=g[mt.row_key].was_split,
        filters=g[mt.row_key].filters,
    )
)



and then the usual:


rare_mt = mt.filter_rows(mt.gnomad.nfe.AF < 0.001, keep=True)
rare_mt = rare_mt.filter_rows(rare_mt.impact != "LOW")
rare_mt = rare_mt.filter_rows(rare_mt.impact != "MODIFIER")

Would this cause a lag when showing or exporting a table?

Thanks in advance

I think the slow part here is reading/joining the gnomAD table. That is a serious chunk of data - the genomes are hundreds of GBs if I remember correctly. How big is your mt input? It must be much smaller.