I’m almost done preparing my MatrixTable of genotypes for linear regression, and I’m trying to attach expression values for ~1700 genes (the genes on chromosome 1, for example) using the ** trick:
FatalError: RuntimeException: Method code too large!
Java stack trace:
java.lang.RuntimeException: Method code too large!
When I try the same style of annotation with a smaller gene set (~200) or my covariate set (~50), it seems to work fine. Under the hood, is there an explicit limit on the number of phenotypes (or the total length of their names)? Is there any way I can avoid this error and still make annotate_cols work?
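The reason to avoid ** is that it expands into a huge list of keyword arguments in Python. That is useful in many cases, but with a schema as large as yours it becomes very inefficient, and it is much harder for our compiler to work with. The "Method code too large!" message itself comes from the JVM's 64 KB limit on the bytecode of a single method, which the code generated for that many fields exceeds.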
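To make the keyword-expansion point concrete, here is a small plain-Python sketch with no Hail involved. The function `fake_annotate` and the gene names are hypothetical stand-ins; the sketch only counts how many top-level fields each call style hands to the callee.

```python
def fake_annotate(**fields):
    """Hypothetical stand-in for an annotate-style call.

    Returns the names of the top-level fields it was given.
    """
    return list(fields)


# Hypothetical expression values for 1700 genes (one sample's worth).
gene_values = {f"GENE{i}": float(i) for i in range(1700)}

# `**` expands the dict into 1700 separate keyword arguments.
expanded = fake_annotate(**gene_values)

# Passing the whole mapping under one name yields a single top-level field.
nested = fake_annotate(gene_data=gene_values)

print(len(expanded), len(nested))  # 1700 1
```

The nested form keeps the same values reachable by name, but the callee sees one field instead of 1700.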
FatalError: RuntimeException: Method code too large!
Java stack trace:
java.lang.RuntimeException: Method code too large!
at is.hail.relocated.org.objectweb.asm.MethodWriter.a(Unknown Source)
at is.hail.relocated.org.objectweb.asm.ClassWriter.toByteArray(Unknown Source)
at is.hail.asm4s.FunctionBuilder.classAsBytes(FunctionBuilder.scala:293)
at is.hail.asm4s.FunctionBuilder.result(FunctionBuilder.scala:325)
at is.hail.expr.CM.runWithDelayedValues(CM.scala:80)
at is.hail.expr.Parser$.is$hail$expr$Parser$$evalNoTypeCheck(Parser.scala:60)
at is.hail.expr.Parser$.eval(Parser.scala:73)
at is.hail.expr.Parser$.parseExpr(Parser.scala:88)
at is.hail.variant.MatrixTable.selectCols(MatrixTable.scala:989)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)
Hail version: devel-907a817
Error summary: RuntimeException: Method code too large!
I tried a couple of gene-set sizes with the newest version of hail as well. It looks like 1000 genes is too many and 800 is fine, but I haven’t tested further.
I also tried using select() with the list of genes, but it gave the same error. What would be the alternative here?
It seems this method works. I see that the rows are incorporated into a struct named gene_data, so I need to modify the linear_regression call like this:
hl.linear_regression([analysis_set.gene_data[g] for g in chrom_gene_list], analysis_set.AC)
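The list comprehension above relies on the fields of a struct being addressable by name with square brackets. A plain-Python sketch of that pattern follows, using a dict as a hypothetical stand-in for the gene_data struct; the gene names and values are invented for illustration.

```python
# Hypothetical stand-in for the nested `gene_data` struct: field name -> value.
gene_data = {"GENE_A": 1.5, "GENE_B": 0.2, "GENE_C": 3.1}

# Hypothetical subset of genes to use as regression responses.
chrom_gene_list = ["GENE_A", "GENE_C"]

# One lookup per gene name, preserving the order of chrom_gene_list.
ys = [gene_data[g] for g in chrom_gene_list]
print(ys)  # [1.5, 3.1]
```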
Thank you! I think I understand how the syntax works a little better now.