Sorry for all the questions lately. I’m trying to run logistic regression and I’m getting the following error:
… vds_gwas = (vds_gAD_sCR_sGQ_vGQ
… .filter_variants_expr('va.qc.AF > 0.01 && va.qc.AF < 0.99')
… .annotate_samples_vds(vds_pca, code='sa.pca = vds.pca')
… .logreg(test='wald', y='sa.pheno.CCStatus',
… covariates=['sa.pca.PC1', 'sa.pca.PC2', 'sa.pca.PC3', 'sa.pca.PC4', 'sa.pca.PC5', 'sa.pca.PC6', 'sa.pca.PC7', 'sa.pca.PC8', 'sa.pca.PC9', 'sa.pca.PC10', 'sa.pheno.age', 'sa.pheno.sex']))
hail: info: Running wald logreg on 982 samples with 13 covariates including intercept…
Traceback (most recent call last):
File "", line 6, in
File "", line 2, in logreg
File "/work-zfs/darking1/software/src/hail/python/hail/java.py", line 107, in handle_py4j
hail.java.FatalError: NoSuchMethodError: breeze.linalg.DenseVector$.canSetD()Lbreeze/generic/UFunc$InPlaceImpl2;
I saw there was another post (#1419) but I couldn’t find a fix that worked for me in that post. Any suggestions?
Just to confirm background info, this is on Google Cloud, right?
No. This is on an HPC cluster.
Ah, alright. Same setup as Nate had when he got the same error. It looks like we never actually found the problem in #1419; it just started working after a recompile. Could you try that and report back? Sorry it’s broken!
And by recompiling you mean reinstalling the software? Sorry, new to all of this.
Yeah, try pulling the latest master and running
gradle clean shadowJar
Still getting the same error.
OK, we’ll look into it. I’ll flag down Dan and Jon (who started addressing it last time) when they’re free.
Ok great. So I did run:
./gradlew test -Dspark.version=2.1.0
2 of the tests failed:
Gradle suite > Gradle test > is.hail.io.LoadBgenSuite.testBgenImportRandom FAILED
java.io.IOException at LoadBgenSuite.scala:137
Caused by: java.io.IOException at LoadBgenSuite.scala:137
Gradle suite > Gradle test > is.hail.stats.LogisticRegressionModelSuite.covariatesVsInterceptOnlyR FAILED
java.io.FileNotFoundException at LogisticRegressionModelSuite.scala:155
Running test: Test method covariatesVsInterceptOnlyTest(is.hail.stats.LogisticRegressionModelSuite)
Those two tests are failing because you don’t have the R and Plink test dependencies installed and on your path. See Running the Tests at the bottom of Getting Started.
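A quick way to check the "on your path" part is sketched below. The executable names `Rscript` and `plink` are assumptions about what the test suite invokes, not something confirmed in this thread; adjust them to whatever Getting Started lists.

```python
import shutil

def find_on_path(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

if __name__ == "__main__":
    # "Rscript" and "plink" are assumed executable names for the R and Plink
    # test dependencies; change them if your installation uses other names.
    for name, path in find_on_path(["Rscript", "plink"]).items():
        print(name, "->", path if path else "NOT FOUND on PATH")
```

Run it from the same shell (and the same node) where you run the tests, since the path can differ between login and compute nodes.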
Do you get your original error if you increase the Java stack size as described here?
Do you get your original error using Spark 2.0.2?
Following on with Jon’s commentary, we can more effectively debug if we have specific information about your system:
- What operating system do the nodes of the HPC cluster run? If you can get on a node this command will give us the info we need:
- From the machine where you submit or invoke hail, can you post the output of:
- Can you post the exact command you used to compile Hail? The minimal command for Spark 2.0.2 is listed below. I’m particularly interested in whether you specified -Dspark.version and what it was set to.
Sorry again that you’re having so much trouble. Hopefully we can pin down exactly what Hail is tripping on and clean that up for you and for others in the future!
So I’ve been using Spark 2.1.0, but I’m in the process of installing Spark 2.0.2 to see if I get the same error using that version.
Linux login-node03 2.6.32-573.22.1.el6.x86_64 #1 SMP Wed Mar 23 03:35:39 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
I’ve been using ./gradlew clean shadowJar
Once I get Spark 2.0.2 working, I’ll let you know whether I get the same error.
Alright, I think I know the source of this error. A Hail JAR is specific to the version of Spark it was built for. Since you are running Spark 2.1.0, you need to build with this invocation:
./gradlew clean shadowJar -Dspark.version=2.1.0
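To keep the build and the runtime in sync, the value for -Dspark.version can be taken from the Spark installation that will actually run Hail. The following is a minimal sketch, not part of Hail: both helper names are hypothetical, and it assumes you paste in version text such as the banner printed by `spark-submit --version`.

```python
import re

def parse_spark_version(text):
    """Return the first x.y.z version that follows the word 'version', or None."""
    match = re.search(r"version\s+(\d+\.\d+\.\d+)", text)
    return match.group(1) if match else None

def build_command(spark_version):
    """Format the build invocation from this thread for a given Spark version."""
    if not re.fullmatch(r"\d+\.\d+\.\d+", spark_version):
        raise ValueError("expected a version like 2.1.0, got %r" % spark_version)
    return "./gradlew clean shadowJar -Dspark.version=" + spark_version

if __name__ == "__main__":
    # Stand-in for text copied from `spark-submit --version` (format assumed).
    banner = "version 2.1.0"
    print(build_command(parse_spark_version(banner)))
```

If the printed command differs from the one you compiled with, rebuild before retrying logreg.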
Good news! I got logistic regression to work. I had to also install the R packages required. Thanks for your help!