When running Logistic Regression Rows I get a Py4JNetworkError

Like some other topics, I have also received errors relating to py4j.protocol.Py4JNetworkError: Answer from Java side is empty when trying to run:

result_ht = hl.logistic_regression_rows()

I am not sure whether their solutions apply to me. Should I build Hail from https://hail.is/docs/0.2/getting_started_developing.html#building-hail?

Thanks!

I’d try this: Undefined symbol: cblas_dgemm

You might also simply not have the BLAS libraries installed. Verify that you’ve followed all the steps under installing Hail for Linux.
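If you're unsure whether BLAS/LAPACK are present on a machine, a quick sanity check from Python is possible with only the standard library. This is just a sketch: find_library asks the system linker, so a None result means the library isn't on the default search path (it could still exist in a non-standard location).

```python
from ctypes.util import find_library

def blas_status():
    """Report which BLAS/LAPACK shared libraries the system linker can find."""
    return {name: find_library(name) for name in ("blas", "lapack", "openblas")}

if __name__ == "__main__":
    for name, path in blas_status().items():
        print(f"{name}: {path if path else 'NOT FOUND'}")
```

If all three come back NOT FOUND, installing the packages listed in the Hail Linux instructions (or asking your administrator to) is the next step.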

Hi @danking ,

If I would like to install Hail on remote linux servers (cluster), I do not have the permission to run

apt-get install -y \
    openjdk-8-jre-headless \
    g++ \
    python3.6 python3-pip \
    libopenblas-base liblapack3

What should I do if I want to install these packages for Hail?

You’ll need to ask the system administrator or IT department to install these packages for you. Every compute cluster has a different system for doing this. Some are called “dot kits”.

Hi Dan,

I have confirmed with the IT department and we have all those packages installed. Do you know what could be other problems/solutions for this?

Thanks,
-Fengyi

Hey @fengyi ,

You probably need to explicitly tell Spark (a library that Hail uses) where BLAS and LAPACK are. We explain how to do that in this thread: Undefined symbol: cblas_dgemm.


Hi Dan,

I followed the explanation in the solution you provided, but the error occurred again. Can you help me take a look at what else could go wrong? I have attached my code below.

I used:

hl.init(spark_conf={
    "spark.executor.extraClassPath": "/usr/lib64/libblas.so.3.4.2.so:/usr/lib64/liblapack.so.3.4.2"})

mt = hl.import_plink(bed=' ', bim=' ', fam='/ ', quant_pheno=True, types={'quant_pheno': hl.tfloat64})

mt = mt.annotate_cols(is_case=
    hl.case()
      .when(mt.quant_pheno == -9, False)
      .when(mt.quant_pheno == 1, True)
      .or_missing())

mt = mt.drop('quant_pheno')

covar = (hl.import_table('covar.txt', types={'IID': hl.tstr}, impute=True).key_by('IID'))

mt = mt.annotate_cols(covar=covar[mt.s])

result_ht = hl.logistic_regression_rows(test='wald',
    y=mt.is_case,
    x=mt.GT.n_alt_alleles(),
    covariates=[1, mt.covar.X1, mt.covar.X2])  # this line is where the code has an error

There are a few versions of liblapack.so and libblas.so, so I just used the most recent one, I think.

Please let me know what else I can provide to clarify my problem!

Thanks!
-Fengyi

I am helping Fengyi try to get Hail to work. I've gotten past the BLAS errors and have run into some other failure. It's not clear from the error message what exactly Spark is unhappy about. Are there settings I can use to get more verbose error messages from the Spark/Java side?

2020-10-13 14:21:11 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
Setting default log level to "WARN". 
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 
2020-10-13 14:21:12 WARN Hail:37 - This Hail JAR was compiled for Spark 2.4.5, running with Spark 2.4.1. 
  Compatibility is not guaranteed. 
Running on Apache Spark version 2.4.1 
SparkUI available at http://nebula-3:4040 
Welcome to 
     __ __ <>__ 
    / /_/ /__ __/ / 
   / __ / _ `/ / / 
  /_/ /_/\_,_/_/_/ version 0.2.57-582b2e31b8bd 
LOGGING: writing to /ua/annis/hail/hail-20201013-1421-0.2.57-582b2e31b8bd.log 
2020-10-13 14:21:17 Hail: WARN: Hail has already been initialized. If this call was intended to change configuration, close the session with hl.stop() first. 
Traceback (most recent call last): 
  File "orig.py", line 8, in <module> 
    hl.init() 
  File "<decorator-gen-1758>", line 2, in init 
  File "/s/pkg/linux64/python/3.6.5/lib/python3.6/site-packages/hail/typecheck/check.py", line 614, in wrapper 
    return __original_func(*args_, **kwargs_) 
  File "/s/pkg/linux64/python/3.6.5/lib/python3.6/site-packages/hail/context.py", line 231, in init 
    skip_logging_configuration, optimizer_iterations) 
  File "/s/pkg/linux64/python/3.6.5/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 194, in __init__ 
    jsc, app_name, master, local, True, min_block_size, tmpdir, local_tmpdir) 
  File "/s/pkg/linux64/python/3.6.5/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in __call__ 
    answer, self.gateway_client, self.target_id, self.name) 
  File "/s/pkg/linux64/python/3.6.5/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 42, in deco 
    'Error summary: %s' % (deepest, full, hail.__version__, deepest)) from None 
hail.utils.java.FatalError: IllegalArgumentException: requirement failed 

Java stack trace: 
java.lang.IllegalArgumentException: requirement failed 
        at scala.Predef$.require(Predef.scala:212) 
        at is.hail.backend.spark.SparkBackend$.apply(SparkBackend.scala:191) 
        at is.hail.backend.spark.SparkBackend.apply(SparkBackend.scala) 
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
        at java.lang.reflect.Method.invoke(Method.java:498) 
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) 
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 
        at py4j.Gateway.invoke(Gateway.java:282) 
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) 
        at py4j.commands.CallCommand.execute(CallCommand.java:79) 
        at py4j.GatewayConnection.run(GatewayConnection.java:238) 
        at java.lang.Thread.run(Thread.java:748) 



Hail version: 0.2.57-582b2e31b8bd 
Error summary: IllegalArgumentException: requirement failed

Hi @wsa,

Sorry for the huge delay, I forgot about this thread.

It looks like you call hl.init twice. Can you share orig.py? What happens on lines 1 to 8?

EDIT: There is also more debugging information in the hail log file.

The Hail logs don't give any more information than the on-screen message does.

The first 8 lines of orig.py, omitting the actual path names:

#!/s/bin/python3.6

import hail as hl

hl.stop()
hl.init(spark_conf={"spark.executor.extraClassPath": "/usr/lib64/libopenblas.so:/usr/lib64/liblapack.so"})

hl.init()
mt = hl.import_plink(bed='.bed',bim='.bim',fam='.fam',quant_pheno=True)

You shouldn’t have that second hl.init(). You can only initialize Hail once.
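One defensive way to avoid this in a script is a generic init-once guard. This is plain Python, not a Hail API; it just ensures the wrapped initializer runs at most once per process:

```python
_initialized = False

def init_once(init_fn, **kwargs):
    """Call init_fn at most once per process; later calls become no-ops.

    Returns True if init_fn was actually called, False if it was skipped.
    """
    global _initialized
    if _initialized:
        return False
    init_fn(**kwargs)
    _initialized = True
    return True
```

With this guard, init_once(hl.init, spark_conf={...}) performs the real initialization the first time and silently skips any later calls, so a stray duplicate can't trigger the "already been initialized" failure.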

I can’t believe I missed that. I’ve been staring at the code for too long.

Now we’ve moved on to new errors, but I suspect this is a problem in the data files rather than in Hail itself:

2020-10-29 15:17:10 SparkContext: INFO: Created broadcast 0 from broadcast at SparkBackend.scala:233
2020-10-29 15:17:10 Hail: WARN: Interpreting value '-9' as a valid quantitative phenotype, which differs from default PLINK behavior. Use missing='-9' to interpret '-9' as a missing value.
2020-10-29 15:17:10 Hail: WARN: Interpreting value '-9' as a valid quantitative phenotype, which differs from default PLINK behavior. Use missing='-9' to interpret '-9' as a missing value.

So these are warnings, not errors. You just have to decide what you intended. If you want -9 to mean "missing", which I'm guessing you do, you can pass missing='-9' as an option to import_plink.
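For context, PLINK's convention is that -9 in the phenotype column marks a missing value. Passing missing='-9' tells import_plink to apply that convention; the effect on each phenotype field is roughly this (a plain-Python sketch of the mapping, not Hail code):

```python
def parse_quant_pheno(raw, missing="-9"):
    """Parse a PLINK quantitative phenotype field.

    The designated missing code becomes None; everything else is
    interpreted as a float.
    """
    if raw == missing:
        return None
    return float(raw)
```

Without missing='-9', Hail reads -9 as a literal phenotype value of -9.0, which is why it emits the warning above.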