Hello Hail team,
I am running into an error when running count after filtering on either columns or rows.
mt = get_gnomad_data('genomes', adj=True, release_annotations=True, split=True)
mt = hl.filter_intervals(mt, hl.experimental.get_gene_intervals(gene_symbols=['AP4B1', 'AP4E1', 'AP4M1', 'AP4S1']))
mt.rows().count()
will throw the error
FatalError: IllegalArgumentException: requirement failed
Java stack trace:
java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:212)
at is.hail.expr.ir.TableValue.<init>(TableValue.scala:47)
at is.hail.expr.ir.TableNativeZippedReader.apply(TableIR.scala:245)
at is.hail.expr.ir.TableRead.execute(TableIR.scala:295)
at is.hail.expr.ir.TableFilterIntervals.execute(TableIR.scala:1714)
at is.hail.expr.ir.Interpret$$anonfun$apply$2.apply$mcJ$sp(Interpret.scala:730)
at is.hail.expr.ir.Interpret$$anonfun$apply$2.apply(Interpret.scala:730)
at is.hail.expr.ir.Interpret$$anonfun$apply$2.apply(Interpret.scala:730)
at scala.Option.getOrElse(Option.scala:121)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:730)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:89)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:59)
at is.hail.expr.ir.InterpretNonCompilable$$anonfun$7.apply(InterpretNonCompilable.scala:19)
at is.hail.expr.ir.InterpretNonCompilable$$anonfun$7.apply(InterpretNonCompilable.scala:19)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at is.hail.expr.ir.InterpretNonCompilable$.apply(InterpretNonCompilable.scala:19)
at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:37)
at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:37)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:24)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:37)
at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:55)
at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:55)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:8)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:7)
at is.hail.utils.package$.using(package.scala:596)
at is.hail.annotations.Region$.scoped(Region.scala:18)
at is.hail.expr.ir.ExecuteContext$.scoped(ExecuteContext.scala:7)
at is.hail.backend.Backend.execute(Backend.scala:55)
at is.hail.backend.Backend.executeJSON(Backend.scala:61)
at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Hail version: 0.2.24-9cd88d97bedd
Error summary: IllegalArgumentException: requirement failed
mt.count() will run if I do not filter to the gene intervals. However, if I filter to gnomAD release samples using mt = mt.filter_cols(mt.meta.release), I see the same error. I am using version 0.2.24-9cd88d97bedd.
can we have the log file?
Sorry…I cannot for the life of me figure out how to attach a file on here. Google has failed me. How should I send it?
This is perhaps non-intuitive or just aggressively intuitive, but you should be able to drag and drop? I just fixed the file extensions to permit .log files.
foo.log (9 Bytes)
Yup now I can. Before it would only accept image extensions. Thank you.
gnomad_clinvar.log (403.8 KB)
aha:
(TableRead Table{global:Struct{},key:,row:Struct{s:String}} False “{"name":"TableNativeReader","path":"gs://gnomad/hardcalls/hail-0.2/mt/exomes/gnomad.exomes.mt/cols","_spec":{"name":"TableSpec","file_version":65536,"hail_version":"devel-a23032101373","references_rel_path":"…/references","table_type":"Table{global:Struct{},key:[s],row:Struct{s:String}}","components":{"globals":{"name":"RVDComponentSpec","rel_path":"…/globals/rows"},"rows":{"name":"RVDComponentSpec","rel_path":"rows"},"partition_counts":{"name":"PartitionCountsComponentSpec","counts":[164332]}}}}”))))))
This is a super old file. I bet it has the required-globals problem.
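For anyone wanting to check how old a file is without reading the whole log, here is a minimal sketch that pulls the version and type fields out of a metadata.json.gz. It assumes the file has been copied locally and that it is gzip-compressed JSON with the field names seen in the TableSpec quoted above; the function names are made up for illustration, and treating a "devel-" prefix as a sign of an old file is only a heuristic.

```python
import gzip
import json


def summarize_spec(metadata_path):
    """Return the version and type fields from a local metadata.json.gz file."""
    with gzip.open(metadata_path, "rt", encoding="utf-8") as handle:
        spec = json.load(handle)
    # Field names follow the TableSpec JSON quoted in the thread; .get() keeps
    # the sketch tolerant of specs that lack one of them.
    return {
        "hail_version": spec.get("hail_version"),
        "file_version": spec.get("file_version"),
        "table_type": spec.get("table_type"),
    }


def looks_like_devel_build(summary):
    """Heuristic (an assumption): a 'devel-' version string marks an old file."""
    version = summary.get("hail_version") or ""
    return version.startswith("devel-")
```

Calling summarize_spec on the cols metadata from the log above would be expected to report hail_version devel-a23032101373, which is what prompted the "super old file" remark.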
I’m fairly confident I back-patched (read: manually edited the metadata.json.gz) all the gnomAD files, so unless there’s another issue, I would think it should work.
Maybe try a different file (the exomes, or non-split hardcalls) to double check, maybe I missed one, but this one’s a pretty big workhorse, so I’d be surprised. Running mt = mt.select_globals() can also work to check if that’s the issue.
I fixed the error message here to give us more information. Can you try running on latest master?
(if you need help building from source, let me know)
mt = get_gnomad_data('exomes', adj=True, release_annotations=True, release_samples=True, split=True)
mt = mt.select_globals()
mt.count()
Fails with the same error.
I attempted to build from source but received this error when running ./install-gcs-connector.sh
ERROR: (gcloud.iam.service-accounts.keys.create) RESOURCE_EXHAUSTED: Maximum number of keys on account reached.
'@type': type.googleapis.com/google.rpc.RetryInfo
retryDelay: 86401s
I haven’t done this before, so it is likely that I did it incorrectly. However, when I initialize Hail locally in the hail/hail directory, I am running version 0.2.24-e3e63a2f9856. I think it’s just that I can’t connect to GCS?
This code works for me with my home-spun 9c44fc9e7c2b (or maybe 22f6defd17d, who knows with my rig anymore; anyway, relatively recent), so I don’t think it’s the old requiredness problem.
@mwilson I mean try on the cloud with a custom build. To do this, run the following from hail/hail:
HAILCTL_BUCKET_BASE="gs://a-bucket-you-can-write-to" make install-hailctl
With or without mt = mt.select_globals(), I now see this error when running the code above. Would the full log be helpful?
FatalError: RuntimeException: globals mismatch:
typ: Struct{}
val: +Struct{}
Java stack trace:
java.lang.RuntimeException: globals mismatch:
typ: Struct{}
val: +Struct{}
at is.hail.expr.ir.TableValue.<init>(TableValue.scala:50)
at is.hail.expr.ir.TableNativeReader.apply(TableIR.scala:181)
at is.hail.expr.ir.TableRead.execute(TableIR.scala:295)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:748)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:89)
at is.hail.expr.ir.Interpret$.apply(Interpret.scala:59)
at is.hail.expr.ir.InterpretNonCompilable$$anonfun$5.apply(InterpretNonCompilable.scala:16)
at is.hail.expr.ir.InterpretNonCompilable$$anonfun$5.apply(InterpretNonCompilable.scala:16)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at is.hail.expr.ir.InterpretNonCompilable$.apply(InterpretNonCompilable.scala:16)
at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:37)
at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:37)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:24)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:37)
at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:57)
at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:57)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:8)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:7)
at is.hail.utils.package$.using(package.scala:596)
at is.hail.annotations.Region$.scoped(Region.scala:18)
at is.hail.expr.ir.ExecuteContext$.scoped(ExecuteContext.scala:7)
at is.hail.backend.Backend.execute(Backend.scala:57)
at is.hail.backend.Backend.executeJSON(Backend.scala:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Hail version: 0.2.24-e3e63a2f9856
Error summary: RuntimeException: globals mismatch:
typ: Struct{}
val: +Struct{}
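The two type strings in this error differ only by the leading +, which, as the discussion of requiredness in this thread suggests, marks a type as required. A rough sketch for spotting that kind of mismatch when comparing type strings by hand is below. It is plain string handling, not Hail's own type parser, and it assumes a + directly in front of a type name is the only place requiredness shows up; both function names are hypothetical.

```python
import re

# Assumption: a '+' immediately before a type name marks that type as required.
_REQUIRED_MARK = re.compile(r"\+(?=[A-Za-z])")


def strip_requiredness(type_str):
    """Remove every required marker from a type string."""
    return _REQUIRED_MARK.sub("", type_str)


def differs_only_in_requiredness(typ, val):
    """True when the strings are unequal but match once '+' markers are removed."""
    return typ != val and strip_requiredness(typ) == strip_requiredness(val)


# The pair reported in the error message above.
print(differs_only_in_requiredness("Struct{}", "+Struct{}"))
```

If this returns True for the typ and val lines of a globals mismatch, the underlying types agree and only the requiredness recorded in the file metadata is out of step.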
@konradjk you didn’t get all the +s
I think there must be some buried in other metadata.json.gz that weren’t used then, but are now.
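To hunt for the remaining ones, something like the following sketch could list every metadata.json.gz under a directory that still mentions a required struct. It assumes the dataset (or at least its metadata files) has been copied to a local directory, and it does a plain substring search for "+Struct" rather than parsing the spec, so treat hits as candidates to inspect, not as a definitive list. The function name and the needle default are illustrative.

```python
import gzip
import os


def find_required_markers(root, needle="+Struct"):
    """Yield paths of metadata.json.gz files under root whose text contains needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if "metadata.json.gz" not in filenames:
            continue
        path = os.path.join(dirpath, "metadata.json.gz")
        # Read the decompressed JSON as text; a substring check is enough here.
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            if needle in handle.read():
                yield path
```

Running list(find_required_markers("/path/to/local/copy")) would give the files worth opening by hand; the path here is a placeholder.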
ah that makes sense. i think i only had to do the overall metadata.json.gz one before, but maybe now we need to do more. sigh.
I can fix this in Hail though
oh that’d be great. the previous fix was pretty nerve-wracking
The code above now runs, i.e. filtering the columns only, but when I run
mt = mt.filter_rows(hl.agg.any(mt.GT.is_non_ref()))
mt.count()
I am seeing the same globals mismatch error.
the log file might help here, then