ClosedChannelException: null hail 0.2.56

Hi Hail team!

I am getting an error running the code below with Hail 0.2.56, but it ran without error in 0.2.34. Log attached.

import hail as hl

from gnomad.resources import MatrixTableResource
from gnomad.resources.grch38 import telomeres_and_centromeres
from gnomad.utils.sparse_mt import impute_sex_ploidy
from gnomad_qc.v3.resources.sample_qc import hard_filtered_samples
from gnomad_qc.v3.resources.meta import meta

hl.init(log='/hail.log', default_reference='GRCh38')


def get_gnomad_v3_mt(
        key_by_locus_and_alleles: bool = False,
) -> hl.MatrixTable:
    mt = gnomad_v3_genotypes.mt()
    if key_by_locus_and_alleles:
        mt = hl.MatrixTable(hl.ir.MatrixKeyRowsBy(mt._mir, ['locus', 'alleles'], is_sorted=True))
        
    return mt


# V3 genotype data
gnomad_v3_genotypes = MatrixTableResource("gs://gnomad/raw/hail-0.2/mt/genomes_v3/gnomad_genomes_v3.repartitioned.mt")

mt = get_gnomad_v3_mt()
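# Keep only the samples listed in the duplicate 1KG table, then impute sex ploidy,
# excluding telomere/centromere intervals.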
renamed_1kg = hl.import_table('gs://gnomad-tmp/duplicate_1kg.txt').key_by('s')
mt = mt.filter_cols(hl.is_defined(renamed_1kg[mt.col_key]))
ht = impute_sex_ploidy(
    mt,
    excluded_calling_intervals=telomeres_and_centromeres.ht()
)
ht = ht.checkpoint('gs://gnomad-tmp/sex_ploidy_duplicate_1kg.ht', overwrite=True)

Thank you in advance for your help!

hail.log (259.2 KB)

Looks like the ClosedChannelException is just masking the real error in the log:

2020-09-03 16:52:09 TaskSetManager: WARN: Lost task 118.1 in stage 0.0 (TID 134, jg3-w-0.c.maclab-ukbb.internal, executor 1): htsjdk.samtools.SAMException: Unable to load chr20(59410137, 59414233) from /tmp/fasta-reader-FcVGkbAEQi5MCYD6x7hi1B.fasta
	at htsjdk.samtools.reference.AbstractIndexedFastaSequenceFile.getSubsequenceAt(AbstractIndexedFastaSequenceFile.java:207)
	at htsjdk.samtools.reference.IndexedFastaSequenceFile.getSubsequenceAt(IndexedFastaSequenceFile.java:49)
	at is.hail.io.reference.FASTAReader.getSequence(FASTAReader.scala:73)
	at is.hail.io.reference.FASTAReader.fillBlock(FASTAReader.scala:83)
	at is.hail.io.reference.FASTAReader.readBlock(FASTAReader.scala:93)
	at is.hail.io.reference.FASTAReader.readBlock(FASTAReader.scala:99)
	at is.hail.io.reference.FASTAReader.lookupGlobalPos(FASTAReader.scala:138)
	at is.hail.io.reference.FASTAReader.lookup(FASTAReader.scala:110)
	at is.hail.variant.ReferenceGenome.getSequence(ReferenceGenome.scala:357)
	at __C24Compiled.__m42getReferenceSequenceFromValidLocus(Unknown Source)
	at __C24Compiled.apply(Unknown Source)
	at is.hail.expr.ir.TableFilter$$anonfun$execute$2.apply(TableIR.scala:946)
	at is.hail.expr.ir.TableFilter$$anonfun$execute$2.apply(TableIR.scala:946)
	at is.hail.expr.ir.TableValue$$anonfun$3.apply(TableValue.scala:65)
	at is.hail.expr.ir.TableValue$$anonfun$3.apply(TableValue.scala:65)
	at is.hail.rvd.RVD$$anonfun$17$$anonfun$apply$2.apply$mcZJ$sp(RVD.scala:605)
	at is.hail.rvd.RVD$$anonfun$17$$anonfun$apply$2.apply(RVD.scala:604)
	at is.hail.rvd.RVD$$anonfun$17$$anonfun$apply$2.apply(RVD.scala:604)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:464)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
	at is.hail.rvd.RVDPartitionInfo$$anonfun$apply$1.apply(RVDPartitionInfo.scala:66)
	at is.hail.rvd.RVDPartitionInfo$$anonfun$apply$1.apply(RVDPartitionInfo.scala:38)
	at is.hail.utils.package$.using(package.scala:609)
	at is.hail.rvd.RVDPartitionInfo$.apply(RVDPartitionInfo.scala:38)
	at is.hail.rvd.RVD$$anonfun$32.apply(RVD.scala:1223)
	at is.hail.rvd.RVD$$anonfun$32.apply(RVD.scala:1221)
	at is.hail.sparkextras.ContextRDD$$anonfun$crunJobWithIndex$1.apply(ContextRDD.scala:232)
	at is.hail.sparkextras.ContextRDD$$anonfun$crunJobWithIndex$1.apply(ContextRDD.scala:230)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:123)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Thank you for finding the real error, Tim! So does that mean there is an error loading the reference sequence? Do you have any suggestions for what I could change to fix it?

I’m not totally sure. It does look like all those errors came from the same fasta block read.

Something you could do to help debug is run the following to see if the error replicates:

chr20 = mt.filter_rows(mt.locus.contig == 'chr20').rows().select()
chr20.annotate(context=chr20.locus.sequence_context())._force_count()
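(sequence_context() looks up the reference base at each locus, so this exercises the same FASTA lookup that appears in the stack trace above.)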

With that I get this error: TypeError: Reference genome 'GRCh38' does not have a sequence loaded. Use 'add_sequence' to load the sequence from a FASTA file.
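A quick check (assuming has_sequence is the right call here) shows the built-in GRCh38 reference has no sequence attached until one is loaded:

hl.get_reference('GRCh38').has_sequence()  # False before add_sequence is called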

If I change it a little, like this, I just get a number (60337758) and no error:

from gnomad.utils.reference_genome import get_reference_genome

chr20 = mt.filter_rows(mt.locus.contig == 'chr20').rows().select()
ref = get_reference_genome(chr20.locus, add_sequence=True)

chr20 = chr20.key_by(
    locus=hl.locus(contig=chr20.locus.contig, pos=chr20.locus.position, reference_genome=ref)
)
chr20.annotate(context=chr20.locus.sequence_context())._force_count()

But maybe there is a different (and likely better) way to add the reference genome sequence that I should try.

Above your chr20 = ... line, add:

hl.get_reference('GRCh38')\
  .add_sequence('gs://hail-common/references/Homo_sapiens_assembly38.fasta.gz')

Thank you, Tim! That is much cleaner. It still gives no error, just the number 60337758.
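For reference, the combined check now looks roughly like this:

import hail as hl

hl.get_reference('GRCh38')\
  .add_sequence('gs://hail-common/references/Homo_sapiens_assembly38.fasta.gz')

chr20 = mt.filter_rows(mt.locus.contig == 'chr20').rows().select()
chr20 = chr20.annotate(context=chr20.locus.sequence_context())
print(chr20._force_count())  # prints 60337758, no error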

Hmmm… weird. Maybe try without the filter?

No error, count is 2861561184

I now see what’s going on in impute_sex_ploidy; I’ll see if I can replicate it.

Thank you so much Tim!

We’ve opened https://github.com/hail-is/hail/pull/9427 to fix another issue we found when working on this.

Unfortunately we haven’t had any luck replicating your exact issue, and I’m not sure my PR will fix it.