Hail Py4JError while calling z:is.hail.backend.spark.SparkBackend.executeJSON

Dear Tpoterba,

All problems went away after switching to the latest version of Ubuntu Server.

Kind regards,
Jerry

Hi,

I tried restarting the OS, but the problem persists.

Can you share the output of:

pip show hail
pip show pyspark
which pip
which python
echo $PYSPARK_SUBMIT_ARGS
echo $PYTHONPATH

If you’re using the pip-installed version of Hail, you do not need to compile from source or install Spark manually. Just run this:

python -m pip install -U hail
unset HAIL_HOME
unset PYSPARK_SUBMIT_ARGS
python -c 'import hail as hl; hl.init(); hl.balding_nichols_model(3,100,100)._force_count_rows()'

I expect that will succeed.

Hi danking,

While trying the steps you suggested, I get the following error:

LOGGING: writing to /root/hail-20190422-0953-0.2.12-2571917c39c6.log
2019-04-22 09:53:11 Hail: INFO: balding_nichols_model: generating genotypes for 3 populations, 100 samples, and 100 variants…
ERROR: dlopen("/tmp/libhail497944133450230796.so"): /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail497944133450230796.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail497944133450230796.so: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail497944133450230796.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail497944133450230796.so: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail497944133450230796.so)

Here is the output you asked for:

pip show hail
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Name: hail
Version: 0.2
Summary: A library for scalable biological data analysis.
Home-page: https://www.hail.is
Author: Hail Team
Author-email: hail@broadinstitute.org
License: MIT
Location: /usr/lib/python2.7/site-packages
Requires:
Required-by:
[root@localhost ~]# pip show pyspark
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Name: pyspark
Version: 2.4.1
Summary: Apache Spark Python API
Home-page: https://github.com/apache/spark/tree/master/python
Author: Spark Developers
Author-email: dev@spark.apache.org
License: http://www.apache.org/licenses/LICENSE-2.0
Location: /home/aby/spark/python
Requires: py4j
Required-by:

[root@localhost ~]# which pip
/usr/bin/pip

[root@localhost ~]# which python
alias python='python3'
/usr/bin/python3

[root@localhost ~]# echo $PYSPARK_SUBMIT_ARGS
--jars /home/aby/CBR-IISC/Hail/hail/hail/build/libs/hail-all-spark.jar --conf spark.driver.extraClassPath="/home/aby/CBR-IISC/Hail/hail/hail/build/libs/hail-all-spark.jar" --conf spark.executor.extraClassPath="/home/aby/CBR-IISC/Hail/hail/hail/build/libs/hail-all-spark.jar" pyspark-shell

[root@localhost ~]# echo $PYTHONPATH
/home/aby/CBR-IISC/Hail/hail/hail/build/distributions/hail-python.zip:/home/aby/spark/python:/home/aby/CBR-IISC/Hail/hail/hail/python:/home/aby/spark/python:/home/aby/spark/python/lib/py4j-0.10.7-src.zip:/home/aby/CBR-IISC/Hail/hail/hail/build/distributions/hail-python.zip:/home/aby/spark/python:/home/aby/CBR-IISC/Hail/hail/hail/python:/home/aby/spark/python:/home/aby/spark/python/lib/py4j-0.10.7-src.zip:/home/aby/CBR-IISC/Hail/hail/hail/build/distributions/hail-python.zip:/home/aby/spark/python:/home/aby/CBR-IISC/Hail/hail/hail/python:/home/aby/spark/python:/home/aby/spark/python/lib/py4j-0.10.7-src.zip

But I still want to work with the compiled jar, so it would be really great if you could tell me how to solve the following error:

hl.init()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "</usr/local/lib/python3.6/site-packages/decorator.py:decorator-gen-996>", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 256, in init
    default_reference, idempotent, global_seed, _backend)
  File "</usr/local/lib/python3.6/site-packages/decorator.py:decorator-gen-994>", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 97, in init
    min_block_size, branching_factor, tmp_dir)
TypeError: 'JavaPackage' object is not callable

  1. It appears that your pip binary is a Python 2.7 pip, not a Python 3.6 or later pip. I strongly recommend always using python -m pip install to install packages, because it ensures that you are using the pip associated with the Python binary you expect.
  2. Above the line that starts with “LOGGING: writing to” are several lines that indicate whether you have a pip-installed version of Hail or a custom-compiled version of Hail. Please re-run that command and paste the full output.
  3. Your version of the C++ standard library is incompatible with the version of Hail you have. Your system must have at least GCC 4.9 installed in order to have access to CXXABI_1.3.8 (see the check sketched after this list). What is the output of gcc --version?
  4. Please include the hail log file generated by the last error message you posted. The location of the log file is printed on the line that starts “LOGGING: writing to”.
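
For point 3, a quick check might look like this (a sketch: it assumes the strings utility from GNU binutils is installed, and the libstdc++ path varies by distribution, e.g. /lib64/libstdc++.so.6 on RedHat-family systems, /usr/lib/x86_64-linux-gnu/libstdc++.so.6 on Debian-family systems):

# CXXABI_1.3.8 ships with GCC 4.9 and later.
gcc --version

# List the CXXABI versions exported by the runtime libstdc++;
# CXXABI_1.3.8 must appear for Hail's native library to load.
strings /lib64/libstdc++.so.6 | grep CXXABI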

Hi Danking,

You are right, it was a C++ and pip dependency issue. I upgraded GCC to 5.4 and switched to the pip for Python 3.6. Now Hail is working fine.

Thank you for the guidance and time.


Great news! Good luck with your future hail endeavors!


I have a similar error in a Jupyter notebook.
After running

mt_pruned.count()
(1509, 1069)
PI_HAT_table = hl.identity_by_descent(mt_pruned, maf=mt_pruned['MAF'], min=0.2, max=0.9)
PI_HAT_table.ibd.show(5)

Py4JError: An error occurred while calling o1.backend

And at the same time, I get an error in the console:

[Stage 84:=====================================================> (69 + 3) / 72]
ERROR: dlopen("/tmp/libhail1143076235271955295.so"): /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)

I use Hail via conda on a server where I do not have sudo rights.

My .bashrc:

export HAIL_HOME=$(/nfs/home/rskitchenko/anaconda2/envs/hail/bin/pip show hail | grep Location | awk -F' ' '{print $2 "/hail"}')
export SPARK_CLASSPATH=$HAIL_HOME/hail-all-spark.jar
export PYSPARK_SUBMIT_ARGS="--conf spark.driver.extraClassPath=$SPARK_CLASSPATH --conf spark.executor.extraClassPath=$SPARK_CLASSPATH --driver-memory 20G pyspark-shell"

There are a few issues here.

Explaining the Error Message

The error message indicates that the server does not have a recent enough version of the C++ standard library installed. You need GCC 5.0 or later, or LLVM 3.4 or later.

ERROR: dlopen("/tmp/libhail1143076235271955295.so"): /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail1143076235271955295.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail1143076235271955295.so)

Given that this issue occurred, the server likely also has an old version of the C standard library. Fixing that problem usually requires updating the kernel. The server needs version 2.14 or later of the GNU C Library (glibc). You can find a list of the glibc versions included in various GNU/Linux distributions on Wikipedia.
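
For reference, a hedged way to check the installed glibc version on a glibc-based system:

# Either command prints the glibc version.
ldd --version
getconf GNU_LIBC_VERSION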

Fixing Your .bashrc

We recommend against configuring Hail with a .bashrc like yours. To install Hail, you only need to run pip install hail. To use Hail, you only need to run python or ipython. If you want ipython to use 20 gigabytes of memory, you can set PYSPARK_SUBMIT_ARGS as follows:

export PYSPARK_SUBMIT_ARGS="--driver-memory 20G pyspark-shell"

In particular, you do not need to define HAIL_HOME or SPARK_CLASSPATH, and you should not set extraClassPath.
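
Putting that together, a minimal clean setup might look like the sketch below (it just combines the steps recommended earlier in this thread; unset the old variables in your current shell, or remove them from .bashrc and open a new shell):

# Drop the custom-build configuration.
unset HAIL_HOME SPARK_CLASSPATH

# Install or upgrade Hail against the Python you actually use.
python3 -m pip install -U hail

# Optional: give the Spark driver 20 gigabytes of memory.
export PYSPARK_SUBMIT_ARGS="--driver-memory 20G pyspark-shell"

# Smoke test.
python3 -c 'import hail as hl; hl.init(); hl.balding_nichols_model(3, 100, 100)._force_count_rows()'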

If you are trying to use Hail on a Spark cluster of multiple servers, then you will need to use a different installation method. The Hail docs have more information about how to install Hail.

Addressing the Error

You stated that you do not have sudo privileges on this machine, and I doubt your system administrator will upgrade to a GNU/Linux distribution released since 2012. If you have access to Docker, I recommend using a Docker image: danking00/hail:0.2.26:

docker run -it danking00/hail:0.2.26

The Dockerfile that generated this image is hosted on GitHub.

If you don’t have Docker, your laptop might be a reasonable choice. Many modern laptops have at least 16 GB of RAM and several cores. We support GNU/Linux and Mac OS X. If you have a Windows laptop, you could install Docker and use it as mentioned above.

If you have access to Google Cloud Platform or Amazon Web Services, we have tutorials for using Hail on the Cloud.

Finally, it’s possible your system administrator can create a dot-kit that enables you to use a recent version of the C Standard Library and the C++ Standard Library. As I mentioned above, you’ll need a kernel built for glibc version 2.14 or later, and you’ll need GCC 5.0 or later or LLVM 3.4 or later.
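
Since you already use conda, one more user-space workaround may be worth a try: pull a newer C++ runtime from conda-forge rather than asking for a system-wide upgrade. A sketch (libstdcxx-ng is the conda-forge package name for the GCC runtime library; pointing LD_LIBRARY_PATH at the environment's lib directory is what lets dlopen find it):

# Install a recent libstdc++ into the active conda environment.
conda install -c conda-forge libstdcxx-ng

# Make the dynamic loader prefer the environment's libraries.
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Verify that CXXABI_1.3.8 is now visible.
strings "$CONDA_PREFIX/lib/libstdc++.so.6" | grep CXXABI_1.3.8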


I’m sorry there is no simple solution to this issue.

Thanks for the quick response!

  1. The Docker solution does not suit me, because root privileges can extend beyond the container, and my system administrator would be against it.
    https://forums.docker.com/t/how-can-we-deny-docker-developers-root-privileges-from-their-containers/78251

I made some changes according to your suggestions:

  1. I commented out the lines you pointed to:

# export HAIL_HOME=…
# export SPARK_CLASSPATH=…

  2. And added a new line:

PYSPARK_SUBMIT_ARGS="--driver-memory 20G pyspark-shell"

  3. I got a gcc-5 installation from the system administrator:

which gcc-5

/usr/bin/gcc-5

  4. Then I downloaded gcc-5 into my hail conda environment and added it to $LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/nfs/home/rskitchenko/anaconda2/envs/hail/bin/x86_64-unknown-linux-gnu-gcc-5.2.0
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/nfs/home/rskitchenko/anaconda2/envs/hail/gcc/share/gcc-5.2.0
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/nfs/home/rskitchenko/anaconda2/envs/hail/share/gcc-5.2.0

  5. I set my default gcc:

export CC=/usr/bin/gcc-5

But after sourcing my .bashrc, I still have the old gcc:

gcc -v

gcc version 4.8.5 (Ubuntu 4.8.5-2ubuntu1~14.04.1)

  6. Now my .bashrc:

export CC=/usr/bin/gcc-5
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/nfs/home/rskitchenko/anaconda2/envs/hail/bin/x86_64-unknown-linux-gnu-gcc-5.2.0
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/nfs/home/rskitchenko/anaconda2/envs/hail/gcc/share/gcc-5.2.0
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/nfs/home/rskitchenko/anaconda2/envs/hail/share/gcc-5.2.0
export PYSPARK_SUBMIT_ARGS="--driver-memory 20G pyspark-shell"

I already installed Hail with pip install hail, but I still get the old libstdc++.so.
What should I do next? How do I reinstall Hail or Anaconda with the gcc-5 libstdc++.so, etc.?

Additional information:

ERROR: dlopen("/tmp/libhail8343362620706188710.so"): /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail8343362620706188710.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail8343362620706188710.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail8343362620706188710.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail8343362620706188710.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail8343362620706188710.so)

I checked it, but…:

rskitchenko@horse:~$ strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep CXXABI_1.3.8
CXXABI_1.3.8

All of your worker nodes also need the same version of the C and C++ standard libraries. Can you check that CXXABI_1.3.8 is present on every machine? This only applies if you have a cluster of machines. Are you working on a single machine?
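
If it is a cluster, a sketch for checking every node (this assumes passwordless ssh and a hypothetical workers.txt file listing one hostname per line; adjust the libstdc++ path for your distribution):

# Print 1 for each worker whose libstdc++ exports CXXABI_1.3.8, 0 otherwise.
while read -r host; do
  printf '%s: ' "$host"
  ssh "$host" "strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep -c CXXABI_1.3.8"
done < workers.txt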

I don’t think it is possible to get this message:

ERROR: dlopen("/tmp/libhail8343362620706188710.so"): /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail8343362620706188710.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail8343362620706188710.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail8343362620706188710.so)

while also getting a result from this:

strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep CXXABI_1.3.8

Is it possible the error message is coming from a different machine or the same machine in an environment with a different file system?

Are you working on a single machine?

I work on a cluster.

Is it possible the error message is coming from a different machine or the same machine in an environment with a different file system?

I don’t know how to check that.

GOOD NEWS! I did it! I changed to a different machine in the cluster, reinstalled the dependencies, and now my problem is solved in python and in ipython. But in a jupyter notebook it still doesn’t work :scream:

python and ipython console: (screenshot showing success; not preserved)

jupyter notebook:

Py4JError: An error occurred while calling o1.backend

with the same error in the console:

ERROR: dlopen("/tmp/libhail8343362620706188710.so"): /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail8343362620706188710.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail8343362620706188710.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /tmp/libhail8343362620706188710.so)

Do you have any ideas?

  1. jupyter and python are on PATH:

(hail) rskitchenko@sphinx:~/.conda/envs/hail/bin$ which jupyter
/nfs/home/rskitchenko/.conda/envs/hail/bin/jupyter
(hail) rskitchenko@sphinx:~/.conda/envs/hail/bin$ which python
/nfs/home/rskitchenko/.conda/envs/hail/bin/python

  2. The interpreters in the ipython console and the jupyter notebook are the same, which is logical. I thought there was no difference between ipython and jupyter.

In [1]: import sys
In [2]: sys.executable
Out[2]: '/nfs/home/rskitchenko/.conda/envs/hail/bin/python'

Can you attach the hail log file? It should appear in the same directory in which you started Jupyter.
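
It may also be worth checking which kernelspec Jupyter is using: a kernelspec can point at a different interpreter than your shell, and its env block can override variables like LD_LIBRARY_PATH or PYSPARK_SUBMIT_ARGS. A sketch (the kernel name python3 is the default; yours may differ):

# List the kernelspecs Jupyter knows about and where they live.
jupyter kernelspec list

# Inspect the kernel definition; check the argv entry and any "env" block.
cat "$(jupyter kernelspec list | awk '/python3/ {print $2}')/kernel.json"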

It seems impossible :disappointed_relieved: (screenshot of the forum’s upload-limit message)
However, I can send it somewhere else.

Just email it to hail@broadinstitute.org


Apologies. I’ve raised the limits for new users. You should be able to attach it now.

The log indicates that Spark is using local executors, not a cluster of executors. Is this your intention? I suspect that machine is the one with an out-of-date version of the standard library.
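
One way to confirm which master Spark is using is to print it after initializing Hail; this sketch uses hl.spark_context(), which returns the underlying pyspark SparkContext (a master of local[*] means local executors):

# Prints e.g. "local[*]" for local mode, or "spark://..." / "yarn" for a cluster.
python3 -c 'import hail as hl; hl.init(); print(hl.spark_context().master)'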