Hail Py4JError while calling z:is.hail.backend.spark.SparkBackend.executeJSON

I am having issues running Hail 0.2. These are the commands I ran in an ipython shell:

import hail as hl
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
mt.count()

I am getting this error “Py4JError: An error occurred while calling z:is.hail.backend.spark.SparkBackend.executeJSON” while running the third step “mt.count()”.

It would be a great help if someone can help me in resolving this issue.

Thank you.

A few questions:

  1. What is the full stack trace? Both Python and Java.
  2. How did you install Hail? pip, compiled your own, etc.
  3. Related, what system are you running on? Mac laptop, Linux laptop, cluster, etc.

Dear tpoterba,

Thank you for responding. I tried installing Hail using pip on a laptop running CentOS 7. I have Spark 2.4.1 installed in standalone mode on my laptop.

Actually, I removed all the packages installed using pip and built a fresh jar from source. The jar built successfully for Spark version 2.4.1.

Right now I am able to load the Hail library successfully at the pyspark prompt, but when I initialize Hail using hl.init(), I get the following error:

Using Python version 3.6.6 (default, Mar 29 2019 00:03:27)
SparkSession available as ‘spark’.
>>> import hail as hl
>>> hl.init()
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 256, in init
    default_reference, idempotent, global_seed, _backend)
  File "", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 97, in init
    min_block_size, branching_factor, tmp_dir)
TypeError: 'JavaPackage' object is not callable

This error means that the Hail jar isn’t on your classpath. Try something like this:

export PYSPARK_SUBMIT_ARGS="--conf spark.driver.extraClassPath=/Users/tpoterba/hail/hail/build/libs/hail-all-spark.jar --conf spark.executor.extraClassPath=/Users/tpoterba/hail/hail/build/libs/hail-all-spark.jar --driver-memory 8G pyspark-shell"
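Before fighting with classpath flags, it's worth a quick sanity check that the jar path in those --conf settings actually exists. A minimal sketch, where the JAR path is a placeholder you substitute with your own build output:

```shell
# Verify the Hail jar referenced in PYSPARK_SUBMIT_ARGS actually exists.
# JAR is a placeholder -- point it at your own build/libs directory.
JAR="$HOME/hail/hail/build/libs/hail-all-spark.jar"
if [ -f "$JAR" ]; then
    echo "found: $JAR"
else
    echo "missing: $JAR -- expect \"'JavaPackage' object is not callable\"" >&2
fi
```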

I am still getting the same error. These are my current .bashrc entries:

export PYSPARK_PYTHON=python3

export SPARK_HOME=/home/aby/spark

export PATH=$PATH:$SPARK_HOME/bin

export HAIL_HOME=/home/aby/CBR-IISC/Hail/hail/hail

export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$HAIL_HOME/build/distributions/hail-python.zip"

export PYTHONPATH="$PYTHONPATH:$SPARK_HOME/python"

export PYTHONPATH="$PYTHONPATH:$HAIL_HOME/python:$SPARK_HOME/python:`echo $SPARK_HOME/python/lib/py4j*-src.zip`"

export PYSPARK_SUBMIT_ARGS="\
  --jars $HAIL_HOME/build/libs/hail-all-spark.jar \
  --conf spark.driver.extraClassPath="$HAIL_HOME/build/libs/hail-all-spark.jar" \
  --conf spark.executor.extraClassPath="$HAIL_HOME/build/libs/hail-all-spark.jar" \
  pyspark-shell"

I am also having this problem. It seems Java stops after a Python command: when using jps to see Java processes, the SparkSubmit process disappears after executing mt.count().

$ ipython
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
Type ‘copyright’, ‘credits’ or ‘license’ for more information
IPython 7.4.0 – An enhanced Interactive Python. Type ‘?’ for help.

In [1]: import hail

In [2]: import hail as hl

In [3]: hl.init()
using hail jar at /home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/hail-all-spark.jar
Setting default log level to “WARN”.
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Running on Apache Spark version 2.2.0
SparkUI available at http://10.10.0.175:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.12-9409c0635781
LOGGING: writing to /home/wmr/hail-20190416-2051-0.2.12-9409c0635781.log

In [4]: !jps
52916 Jps
52661 SparkSubmit

In [5]: mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
2019-04-16 20:52:17 Hail: INFO: balding_nichols_model: generating genotypes for 3 populations, 50 samples, and 100 variants…

In [6]: !jps
52661 SparkSubmit
53206 Jps

In [7]: mt.count()
ERROR: dlopen("/tmp/libhail5615322307661804588.so"): /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libhail5615322307661804588.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail5615322307661804588.so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libhail5615322307661804588.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail5615322307661804588.so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libhail5615322307661804588.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at is.hail.nativecode.NativeCode.<clinit>(NativeCode.java:25)
at is.hail.nativecode.NativeBase.<init>(NativeBase.scala:22)
at is.hail.annotations.Region.<init>(Region.scala:27)
at is.hail.annotations.Region$.apply(Region.scala:10)
at is.hail.annotations.Region$.scoped(Region.scala:13)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:81)
at is.hail.backend.spark.SparkBackend$.execute(SparkBackend.scala:49)
at is.hail.backend.spark.SparkBackend$.executeJSON(SparkBackend.scala:16)
at is.hail.backend.spark.SparkBackend.executeJSON(SparkBackend.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File “/home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py”, line 1035, in send_command
raise Py4JNetworkError(“Answer from Java side is empty”)
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py”, line 883, in send_command
response = connection.send_command(command)
File “/home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py”, line 1040, in send_command
“Error while receiving”, e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving

Py4JError Traceback (most recent call last)
in
----> 1 mt.count()

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/matrixtable.py in count(self)
2369 Number of rows, number of cols.
2370 “”"
-> 2371 return (self.count_rows(), self.count_cols())
2372
2373 @typecheck_method(output=str,

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/matrixtable.py in count_rows(self)
2329
2330 return Env.backend().execute(
-> 2331 TableCount(MatrixRowsTable(self._mir)))
2332
2333 def _force_count_rows(self):

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/backend/backend.py in execute(self, ir)
91 return ir.typ._from_json(
92 Env.hail().backend.spark.SparkBackend.executeJSON(
---> 93 self._to_java_ir(ir)))
94
95 def value_type(self, ir):

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py in call(self, *args)
1131 answer = self.gateway_client.send_command(command)
1132 return_value = get_return_value(
-> 1133 answer, self.gateway_client, self.target_id, self.name)
1134
1135 for temp_arg in temp_args:

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/utils/java.py in deco(*args, **kwargs)
213 import pyspark
214 try:
--> 215 return f(*args, **kwargs)
216 except py4j.protocol.Py4JJavaError as e:
217 s = e.java_exception.toString()

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
325 raise Py4JError(
326 “An error occurred while calling {0}{1}{2}”.
--> 327 format(target_id, ".", name))
328 else:
329 type = answer[1]

Py4JError: An error occurred while calling z:is.hail.backend.spark.SparkBackend.executeJSON

In [8]: !jps
53408 Jps

In [9]:

@Jerry I think your issue is different. Your machine needs GLIBC_2.14 or later. What does ldd --version return?
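For reference, a quick way to check this on a glibc system. This sketch assumes the usual x86_64 library path and that strings from binutils is installed:

```shell
# Print the glibc version, then test whether the GLIBC_2.14 symbol version
# required by Hail's libhail*.so is exported by the system C library.
ldd --version | head -n 1
if strings /lib64/libc.so.6 2>/dev/null | grep -qx 'GLIBC_2.14'; then
    echo "GLIBC_2.14: present"
else
    echo "GLIBC_2.14: not found in /lib64/libc.so.6"
fi
```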

@aby I never had success putting the PYSPARK_SUBMIT_ARGS bit in .bashrc or .zshrc files; I have it in .profile. Other than that, try an OS restart?

Dear Tpoterba,
My version is ldd (GNU libc) 2.12.

When I use export LD_LIBRARY_PATH=/opt/glibc-2.14/lib before running ipython, it shows

(hail) [wmr@huanglab ~]$ export LD_LIBRARY_PATH=/opt/glibc-2.14/lib
(hail) [wmr@huanglab ~]$ ipython
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import hail
Segmentation fault (core dumped)

This segfault isn't coming from Hail, since Hail doesn't load any C libraries before hl.init() is called. It's probably coming from CPython or one of your dependencies, like NumPy.
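One way to narrow down a segfault like this is to import the usual compiled suspects one at a time, each in a fresh interpreter, so a crash in one module doesn't mask the others. The module list here is a guess at common culprits, not a known-bad set:

```shell
# Import each compiled dependency in its own subprocess; a segfault only
# kills that subprocess, so the loop reports which module crashes.
for mod in numpy pandas scipy py4j hail; do
    if python -c "import $mod" >/dev/null 2>&1; then
        echo "$mod: ok"
    else
        echo "$mod: FAILED (crash or not installed)"
    fi
done
```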

Dear Tpoterba,

All problems went away after switching to the latest version of Ubuntu Server.

Kind regards,
Jerry

Hi,

I tried OS restart, but still the problem is persisting.

Can you share the output of:

pip show hail
pip show pyspark
which pip
which python
echo $PYSPARK_SUBMIT_ARGS
echo $PYTHONPATH

If you're using the pip-installed version of Hail, you do not need to compile from source or install Spark manually. Just run this:

python -m pip install -U hail
unset HAIL_HOME
unset PYSPARK_SUBMIT_ARGS
python -c 'import hail as hl; hl.init(); hl.balding_nichols_model(3,100,100)._force_count_rows()'

I expect that will succeed.
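As a side check, it's worth confirming that pip and python resolve to the same interpreter, since a mismatched pair is a common cause of installed-but-not-importable packages. A sketch using python3; substitute whichever interpreter you actually run Hail with:

```shell
# Show which interpreter `python3` is, and which interpreter its pip serves.
python3 -c 'import sys; print(sys.executable)'
python3 -m pip --version 2>/dev/null || echo "no pip module for this interpreter"
# If `which pip` prints a path under a different prefix, bare `pip install`
# puts packages where this interpreter will never see them.
which pip 2>/dev/null || echo "no pip on PATH"
```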

Hi danking,

While trying the steps you suggested, I am getting the following error:

LOGGING: writing to /root/hail-20190422-0953-0.2.12-2571917c39c6.log
2019-04-22 09:53:11 Hail: INFO: balding_nichols_model: generating genotypes for 3 populations, 100 samples, and 100 variants…
ERROR: dlopen("/tmp/libhail497944133450230796.so"): /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail497944133450230796.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail497944133450230796.so: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail497944133450230796.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail497944133450230796.so: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail497944133450230796.so)

This is the output you asked for:

pip show hail
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won’t be maintained after that date. A future version of pip will drop support for Python 2.7.
Name: hail
Version: 0.2
Summary: A library for scalable biological data analysis.
Home-page: https://www.hail.is
Author: Hail Team
Author-email: hail@broadinstitute.org
License: MIT
Location: /usr/lib/python2.7/site-packages
Requires:
Required-by:
[root@localhost ~]# pip show pyspark
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won’t be maintained after that date. A future version of pip will drop support for Python 2.7.
Name: pyspark
Version: 2.4.1
Summary: Apache Spark Python API
Home-page: https://github.com/apache/spark/tree/master/python
Author: Spark Developers
Author-email: dev@spark.apache.org
License: http://www.apache.org/licenses/LICENSE-2.0
Location: /home/aby/spark/python
Requires: py4j
Required-by:

[root@localhost ~]# which pip
/usr/bin/pip

[root@localhost ~]# which python
alias python='python3'
/usr/bin/python3

[root@localhost ~]# echo $PYSPARK_SUBMIT_ARGS
--jars /home/aby/CBR-IISC/Hail/hail/hail/build/libs/hail-all-spark.jar --conf spark.driver.extraClassPath="/home/aby/CBR-IISC/Hail/hail/hail/build/libs/hail-all-spark.jar" --conf spark.executor.extraClassPath="/home/aby/CBR-IISC/Hail/hail/hail/build/libs/hail-all-spark.jar" pyspark-shell

[root@localhost ~]# echo $PYTHONPATH
/home/aby/CBR-IISC/Hail/hail/hail/build/distributions/hail-python.zip:/home/aby/spark/python:/home/aby/CBR-IISC/Hail/hail/hail/python:/home/aby/spark/python:/home/aby/spark/python/lib/py4j-0.10.7-src.zip:/home/aby/CBR-IISC/Hail/hail/hail/build/distributions/hail-python.zip:/home/aby/spark/python:/home/aby/CBR-IISC/Hail/hail/hail/python:/home/aby/spark/python:/home/aby/spark/python/lib/py4j-0.10.7-src.zip:/home/aby/CBR-IISC/Hail/hail/hail/build/distributions/hail-python.zip:/home/aby/spark/python:/home/aby/CBR-IISC/Hail/hail/hail/python:/home/aby/spark/python:/home/aby/spark/python/lib/py4j-0.10.7-src.zip

But I still want to work with the compiled jar; it would be really great if you could tell me how to solve the following error:

hl.init()
Traceback (most recent call last):
  File "", line 1, in 
  File "</usr/local/lib/python3.6/site-packages/decorator.py:decorator-gen-996>", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 256, in init
    default_reference, idempotent, global_seed, _backend)
  File "</usr/local/lib/python3.6/site-packages/decorator.py:decorator-gen-994>", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 97, in init
    min_block_size, branching_factor, tmp_dir)
TypeError: 'JavaPackage' object is not callable

  1. It appears that your pip binary is a Python 2.7 pip not a Python 3.6 or later pip. I strongly recommend always using python -m pip install to install packages because it ensures that you are using the pip associated with the Python binary you expect.
  2. Above the line that starts “LOGGING: writing to”, are several lines that indicate whether you have a pip installed version of Hail or a custom compiled version of Hail. Please re-run that command and paste the full output.
  3. Your version of the C++ standard library is incompatible with the version of Hail you have. Your system must have at least GCC 4.9 installed in order to have access to CXXABI_1.3.8. What is the output of gcc --version?
  4. Please include the hail log file generated by the last error message you posted. The location of the log file is printed on the line that starts “LOGGING: writing to”.
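To check item 3 concretely, you can ask the dynamic linker which libstdc++ is active and list the ABI tags it exports (CXXABI_1.3.8 first shipped with GCC 4.9). This sketch assumes ldconfig and binutils' strings are available:

```shell
# Locate the libstdc++ the dynamic linker will use, then list its newest
# CXXABI tags; CXXABI_1.3.8 requires the libstdc++ shipped with GCC 4.9+.
LIB=$(ldconfig -p 2>/dev/null | awk '/libstdc\+\+\.so\.6 /{print $NF; exit}')
echo "libstdc++: ${LIB:-not found}"
if [ -n "$LIB" ]; then
    strings "$LIB" | grep '^CXXABI_1\.' | sort -V | tail -n 3
fi
gcc --version 2>/dev/null | head -n 1
```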

Hi Danking,

You are right, it was a C++ and pip dependency issue. I upgraded my GCC to 5.4 and switched to the pip for Python 3.6. Now Hail is working fine.

Thank you for the guidance and time.


Great news! Good luck with your future hail endeavors!


I have a similar error in a Jupyter notebook.
After running

mt_pruned.count()
(1509, 1069)
PI_HAT_table = hl.identity_by_descent(mt_pruned, maf=mt_pruned['MAF'], min=0.2, max=0.9)
PI_HAT_table.ibd.show(5)

Py4JError: An error occurred while calling o1.backend

And at the same time, I have an error in the console:

[Stage 84:=====================================================> (69 + 3) / 72]
ERROR: dlopen("/tmp/libhail1143076235271955295.so"): /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)

I use Hail via conda on a server where I do not have sudo rights.

My .bashrc:

export HAIL_HOME=$(/nfs/home/rskitchenko/anaconda2/envs/hail/bin/pip show hail | grep Location | awk -F' ' '{print $2 "/hail"}')
export SPARK_CLASSPATH=$HAIL_HOME/hail-all-spark.jar
export PYSPARK_SUBMIT_ARGS="--conf spark.driver.extraClassPath=$SPARK_CLASSPATH --conf spark.executor.extraClassPath=$SPARK_CLASSPATH --driver-memory 20G pyspark-shell"

There are a few issues here.

Explaining the Error Message

The error message indicates that the server does not have a recent version of the C++ standard library installed. You need to install either GCC 5.0 or LLVM 3.4.

ERROR: dlopen("/tmp/libhail1143076235271955295.so"): /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)

Given that this issue occurred, the server likely also has an old version of the C standard library. Fixing that problem usually means upgrading the whole operating system. The server needs version 2.14 or later of the GNU C Library (glibc). You can find a list of the glibc version included in various GNU/Linux distributions at Wikipedia.

Fixing Your .bashrc

We recommend against configuring Hail with a .bashrc like yours. To install Hail, you only need to run: pip install hail. To use Hail you only need to run: python or ipython. If you want ipython to use 20 Gigabytes of memory, you can set PYSPARK_SUBMIT_ARGS as follows:

export PYSPARK_SUBMIT_ARGS="--driver-memory 20G pyspark-shell"

In particular, you do not need to define HAIL_HOME or SPARK_CLASSPATH, and you should not set the extraClassPath.

If you are trying to use Hail on a Spark cluster of multiple servers, then you will need to use a different installation method. The Hail docs have more information about how to install Hail.

Addressing the Error

You stated that you do not have sudo privileges on this machine. I doubt your system administrator will update to a GNU/Linux distribution released since 2012. If you have access to Docker, I recommend using a Docker image: danking00/hail:0.2.26:

docker run -it danking00/hail:0.2.26

The Dockerfile that generated this image is hosted in GitHub.

If you don’t have Docker, your laptop might be a reasonable choice. Many modern laptops have at least 16 GB of RAM and several cores. We support GNU/Linux and Mac OS X. If you have a Windows laptop, you could install Docker and use it as mentioned above.

If you have access to Google Cloud Platform or Amazon Web Services, we have tutorials for using Hail on the Cloud.

Finally, it's possible your system administrator can create a dot-kit that enables you to use a recent version of the C Standard Library and the C++ Standard Library. As I mentioned above, you'll need a kernel built for glibc version 2.14 or later, and you'll need GCC 5.0 or later or LLVM 3.4 or later.


I’m sorry there is no simple solution to this issue.