Hail Py4JError while calling z:is.hail.backend.spark.SparkBackend.executeJSON

#1

I am having issues running Hail 0.2. These are the commands I ran in an ipython shell:

import hail as hl
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
mt.count()

I am getting the error "Py4JError: An error occurred while calling z:is.hail.backend.spark.SparkBackend.executeJSON" while running the third step, mt.count().

It would be a great help if someone can help me in resolving this issue.

Thank you.


#2

A few questions:

  1. What is the full stack trace? Both Python and Java.
  2. How did you install Hail? pip, compiled your own, etc.
  3. Related, what system are you running on? Mac laptop, Linux laptop, cluster, etc.

#3

Dear tpoterba,

Thank you for responding. I tried installing Hail using pip on a laptop running CentOS 7. I have Spark 2.4.1 installed in standalone mode on my laptop.

I then removed all the packages installed using pip and built a fresh jar from source. The jar built successfully for Spark version 2.4.1.

Right now I am able to import the Hail library successfully in the pyspark prompt, but when I initialize Hail using hl.init(), I get the following error:

Using Python version 3.6.6 (default, Mar 29 2019 00:03:27)
SparkSession available as 'spark'.
>>> import hail as hl
>>> hl.init()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 256, in init
    default_reference, idempotent, global_seed, _backend)
  File "", line 2, in init
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/typecheck/check.py", line 561, in wrapper
    return original_func(*args, **kwargs)
  File "/home/aby/CBR-IISC/Hail/hail/hail/python/hail/context.py", line 97, in init
    min_block_size, branching_factor, tmp_dir)
TypeError: 'JavaPackage' object is not callable


#4

This error means that the Hail jar isn’t on your classpath. Try something like this:

export PYSPARK_SUBMIT_ARGS="--conf spark.driver.extraClassPath=/Users/tpoterba/hail/hail/build/libs/hail-all-spark.jar --conf spark.executor.extraClassPath=/Users/tpoterba/hail/hail/build/libs/hail-all-spark.jar --driver-memory 8G pyspark-shell"
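If it's hard to tell whether the shell variable is taking effect, the same configuration can be set from inside Python before pyspark is first imported. This is only a sketch; the jar path below is an example and must point at your own build output:

```python
import os

# Illustrative path; substitute the location of your own hail-all-spark.jar.
jar = os.path.expanduser("~/hail/hail/build/libs/hail-all-spark.jar")

# PYSPARK_SUBMIT_ARGS must be set before pyspark is first imported,
# because the JVM is launched with these flags at import/init time.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    f"--conf spark.driver.extraClassPath={jar} "
    f"--conf spark.executor.extraClassPath={jar} "
    "--driver-memory 8G pyspark-shell"
)

print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

Running this in the same process, before `import hail`, has the same effect as exporting the variable in the shell.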

#5

I am still getting the same error. These are my current .bashrc entries:

export PYSPARK_PYTHON=python3

export SPARK_HOME=/home/aby/spark

export PATH=$PATH:$SPARK_HOME/bin

export HAIL_HOME=/home/aby/CBR-IISC/Hail/hail/hail

export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$HAIL_HOME/build/distributions/hail-python.zip"

export PYTHONPATH="$PYTHONPATH:$SPARK_HOME/python"

export PYTHONPATH="$PYTHONPATH:$HAIL_HOME/python:$SPARK_HOME/python:`echo $SPARK_HOME/python/lib/py4j*-src.zip`"

export PYSPARK_SUBMIT_ARGS="\
  --jars $HAIL_HOME/build/libs/hail-all-spark.jar \
  --conf spark.driver.extraClassPath=$HAIL_HOME/build/libs/hail-all-spark.jar \
  --conf spark.executor.extraClassPath=$HAIL_HOME/build/libs/hail-all-spark.jar \
  pyspark-shell"


#6

I am also having this problem. It seems Java stops after a Python command: when using jps to list Java processes, the SparkSubmit process disappears after executing mt.count().

$ ipython
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import hail

In [2]: import hail as hl

In [3]: hl.init()
using hail jar at /home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/hail-all-spark.jar
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Running on Apache Spark version 2.2.0
SparkUI available at http://10.10.0.175:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.12-9409c0635781
LOGGING: writing to /home/wmr/hail-20190416-2051-0.2.12-9409c0635781.log

In [4]: !jps
52916 Jps
52661 SparkSubmit

In [5]: mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
2019-04-16 20:52:17 Hail: INFO: balding_nichols_model: generating genotypes for 3 populations, 50 samples, and 100 variants…

In [6]: !jps
52661 SparkSubmit
53206 Jps

In [7]: mt.count()
ERROR: dlopen("/tmp/libhail5615322307661804588.so"): /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libhail5615322307661804588.so)
FATAL: caught exception java.lang.UnsatisfiedLinkError: /tmp/libhail5615322307661804588.so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libhail5615322307661804588.so)
java.lang.UnsatisfiedLinkError: /tmp/libhail5615322307661804588.so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libhail5615322307661804588.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at is.hail.nativecode.NativeCode.<clinit>(NativeCode.java:25)
at is.hail.nativecode.NativeBase.<init>(NativeBase.scala:22)
at is.hail.annotations.Region.<init>(Region.scala:27)
at is.hail.annotations.Region$.apply(Region.scala:10)
at is.hail.annotations.Region$.scoped(Region.scala:13)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:81)
at is.hail.backend.spark.SparkBackend$.execute(SparkBackend.scala:49)
at is.hail.backend.spark.SparkBackend$.executeJSON(SparkBackend.scala:16)
at is.hail.backend.spark.SparkBackend.executeJSON(SparkBackend.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py", line 1035, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py", line 883, in send_command
    response = connection.send_command(command)
  File "/home/wmr/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py", line 1040, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving

Py4JError                                 Traceback (most recent call last)
<ipython-input-7-...> in <module>
----> 1 mt.count()

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/matrixtable.py in count(self)
2369 Number of rows, number of cols.
2370 """
-> 2371 return (self.count_rows(), self.count_cols())
2372
2373 @typecheck_method(output=str,

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/matrixtable.py in count_rows(self)
2329
2330 return Env.backend().execute(
-> 2331 TableCount(MatrixRowsTable(self._mir)))
2332
2333 def _force_count_rows(self):

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/backend/backend.py in execute(self, ir)
91 return ir.typ._from_json(
92 Env.hail().backend.spark.SparkBackend.executeJSON(
---> 93 self._to_java_ir(ir)))
94
95 def value_type(self, ir):

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
1131 answer = self.gateway_client.send_command(command)
1132 return_value = get_return_value(
-> 1133 answer, self.gateway_client, self.target_id, self.name)
1134
1135 for temp_arg in temp_args:

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/hail/utils/java.py in deco(*args, **kwargs)
213 import pyspark
214 try:
--> 215 return f(*args, **kwargs)
216 except py4j.protocol.Py4JJavaError as e:
217 s = e.java_exception.toString()

~/hail/anaconda3/envs/hail/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
325 raise Py4JError(
326 "An error occurred while calling {0}{1}{2}".
--> 327 format(target_id, ".", name))
328 else:
329 type = answer[1]

Py4JError: An error occurred while calling z:is.hail.backend.spark.SparkBackend.executeJSON

In [8]: !jps
53408 Jps

In [9]:


#7

@Jerry I think your issue is different. Your machine needs GLIBC_2.14 or later. What does ldd --version return?
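As a stdlib-only alternative to ldd, Python's platform module can report the glibc version the interpreter itself is linked against (it returns empty strings on non-glibc systems):

```python
import platform

# platform.libc_ver() inspects the running interpreter binary and reports
# the C library it links against; on glibc systems the second element is
# the version string, e.g. "2.17" on CentOS 7.
lib, version = platform.libc_ver()
print(lib, version)
```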


#8

@aby I never had success putting the PYSPARK_SUBMIT_ARGS bit in .bashrc or .zshrc files; I have it in .profile. Other than that, try an OS restart?


#9

Dear Tpoterba,
My version is ldd (GNU libc) 2.12.

When I use export LD_LIBRARY_PATH=/opt/glibc-2.14/lib before running ipython, it shows

(hail) [wmr@huanglab ~]$ export LD_LIBRARY_PATH=/opt/glibc-2.14/lib
(hail) [wmr@huanglab ~]$ ipython
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import hail
Segmentation fault (core dumped)


#10

This segfault isn't coming from Hail, since Hail doesn't use any C libraries before hl.init() is called. It's probably coming from CPython or one of the dependencies, like NumPy.
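One way to narrow down which import is crashing is the stdlib faulthandler module, which prints the Python stack when the process receives SIGSEGV. A minimal sketch (the module names to try at the end are examples):

```python
import faulthandler

# With the handler enabled, a segfault in a C extension dumps the Python
# traceback instead of dying with just "Segmentation fault (core dumped)".
faulthandler.enable()
print(faulthandler.is_enabled())  # prints True

# Equivalently, from the command line, try the suspect imports one at a time:
#   python -X faulthandler -c "import numpy"
#   python -X faulthandler -c "import hail"
```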


#11

Dear Tpoterba,

All problems went away after switching to the latest version of Ubuntu Server.

Kind regards,
Jerry


#12

Hi,

I tried an OS restart, but the problem persists.


#13

Can you share the output of:

pip show hail
pip show pyspark
which pip
which python
echo $PYSPARK_SUBMIT_ARGS
echo $PYTHONPATH

If you’re using the pip installed version of Hail, you do not need to compile from source or install spark manually. Just run this:

python -m pip install -U hail
unset HAIL_HOME
unset PYSPARK_SUBMIT_ARGS
python -c 'import hail as hl; hl.init(); hl.balding_nichols_model(3,100,100)._force_count_rows()'

I expect that will succeed.
