Py4JJavaError: An error occurred while calling z:is.hail.backend.spark.SparkBackend.apply

Hi,

I am reinstalling Hail in another conda environment on the cluster. However, I ran into this issue as soon as I ran:

import hail as hl 
hl.init()

Here is the console output:
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.31.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import hail as hl

In [2]: hl.init()
2023-01-12 12:11:00.668 WARN  NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-2-20bea8c9bdc1> in <module>
----> 1 hl.init()

<decorator-gen-1866> in init(sc, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmp_dir, default_reference, idempotent, global_seed, spark_conf, skip_logging_configuration, local_tmpdir, _optimizer_iterations, backend, driver_cores, driver_memory, worker_cores, worker_memory, gcs_requester_pays_configuration)

~/.conda/envs/myenv/lib/python3.7/site-packages/hail/typecheck/check.py in wrapper(__original_func, *args, **kwargs)
    575     def wrapper(__original_func, *args, **kwargs):
    576         args_, kwargs_ = check_all(__original_func, args, kwargs, checkers, is_method=is_method)
--> 577         return __original_func(*args_, **kwargs_)
    578 
    579     return wrapper

~/.conda/envs/myenv/lib/python3.7/site-packages/hail/context.py in init(sc, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmp_dir, default_reference, idempotent, global_seed, spark_conf, skip_logging_configuration, local_tmpdir, _optimizer_iterations, backend, driver_cores, driver_memory, worker_cores, worker_memory, gcs_requester_pays_configuration)
    360             global_seed=global_seed,
    361             skip_logging_configuration=skip_logging_configuration,
--> 362             gcs_requester_pays_configuration=gcs_requester_pays_configuration
    363         )
    364     if backend == 'local':

<decorator-gen-1868> in init_spark(sc, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmp_dir, default_reference, idempotent, global_seed, spark_conf, skip_logging_configuration, local_tmpdir, _optimizer_iterations, gcs_requester_pays_configuration)

~/.conda/envs/myenv/lib/python3.7/site-packages/hail/typecheck/check.py in wrapper(__original_func, *args, **kwargs)
    575     def wrapper(__original_func, *args, **kwargs):
    576         args_, kwargs_ = check_all(__original_func, args, kwargs, checkers, is_method=is_method)
--> 577         return __original_func(*args_, **kwargs_)
    578 
    579     return wrapper

~/.conda/envs/myenv/lib/python3.7/site-packages/hail/context.py in init_spark(sc, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmp_dir, default_reference, idempotent, global_seed, spark_conf, skip_logging_configuration, local_tmpdir, _optimizer_iterations, gcs_requester_pays_configuration)
    427         skip_logging_configuration, optimizer_iterations,
    428         gcs_requester_pays_project=gcs_requester_pays_project,
--> 429         gcs_requester_pays_buckets=gcs_requester_pays_buckets
    430     )
    431     if not backend.fs.exists(tmpdir):

~/.conda/envs/myenv/lib/python3.7/site-packages/hail/backend/spark_backend.py in __init__(self, idempotent, sc, spark_conf, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmpdir, local_tmpdir, skip_logging_configuration, optimizer_iterations, gcs_requester_pays_project, gcs_requester_pays_buckets)
    188             self._jbackend = hail_package.backend.spark.SparkBackend.apply(
    189                 jsc, app_name, master, local, True, min_block_size, tmpdir, local_tmpdir,
--> 190                 gcs_requester_pays_project, gcs_requester_pays_buckets)
    191             self._jhc = hail_package.HailContext.apply(
    192                 self._jbackend, log, True, append, branching_factor, skip_logging_configuration, optimizer_iterations)

~/.conda/envs/myenv/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
   1307         for temp_arg in temp_args:

~/.conda/envs/myenv/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling z:is.hail.backend.spark.SparkBackend.apply.
: is.hail.utils.HailException: This Hail JAR was compiled for Spark 3.1.1, cannot run with Spark 3.2.1.
  The major and minor versions must agree, though the patch version can differ.
	at is.hail.utils.ErrorHandling.fatal(ErrorHandling.scala:17)
	at is.hail.utils.ErrorHandling.fatal$(ErrorHandling.scala:17)
	at is.hail.utils.package$.fatal(package.scala:78)
	at is.hail.backend.spark.SparkBackend$.checkSparkCompatibility(SparkBackend.scala:90)
	at is.hail.backend.spark.SparkBackend$.createSparkConf(SparkBackend.scala:99)
	at is.hail.backend.spark.SparkBackend$.configureAndCreateSparkContext(SparkBackend.scala:148)
	at is.hail.backend.spark.SparkBackend$.apply(SparkBackend.scala:230)
	at is.hail.backend.spark.SparkBackend.apply(SparkBackend.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
	at java.lang.Thread.run(Thread.java:745)

Are there any suggestions for solving this problem? Thanks!

I can run:

CLASSPATH= python3 <<EOF
import pyspark
from pyspark.sql import SQLContext

sc = pyspark.SparkContext()
sqlContext = SQLContext(sc) 
sample = sqlContext.createDataFrame(
    [
        ('qwe', 23),
        ('rty',34),
        ('yui',56),
        ],
    ['abc', 'def'])
sample.show()
EOF
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/01/12 13:05:09 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/01/12 13:05:10 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
23/01/12 13:05:10 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
/gpfs/fs1/home/luoy/.conda/envs/myenv/lib/python3.7/site-packages/pyspark/sql/context.py:79: FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.
  FutureWarning
+---+---+                                                                       
|abc|def|
+---+---+
|qwe| 23|
|rty| 34|
|yui| 56|
+---+---+
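
For reference, a quick way to confirm which Spark version the conda environment actually picks up (assuming pyspark was installed into the environment itself):

python3 -c "import pyspark; print(pyspark.__version__)"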

The error message:

is.hail.utils.HailException: This Hail JAR was compiled for Spark 3.1.1, cannot run with Spark 3.2.1.

means that the installed Spark (3.2.1) is too new for the Hail JAR, which was built against Spark 3.1.1; the major and minor versions must agree. You can either (rough sketches of both options below):

  1. Switch to an older 3.1.x version of Spark.
  2. Compile Hail from source for the newer version of Spark.
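
For option 1, a minimal sketch, assuming PySpark was installed with pip into the same conda environment (any 3.1.x patch release should work, since only the major and minor versions need to agree):

pip uninstall pyspark
pip install pyspark==3.1.3

For option 2, the Hail repository's Makefile can build against a specific Spark version; something along these lines, though the exact flags and version pins should be taken from the Hail build documentation:

git clone https://github.com/hail-is/hail.git
cd hail/hail
make install-on-cluster HAIL_COMPILE_NATIVES=1 SCALA_VERSION=2.12.15 SPARK_VERSION=3.2.1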

It is working now. Thank you very much for your help!
