Hi @danielgoldstein , I use a single node with 2TB of RAM and 128 CPUs. The last lines of the log contain this (error seems to be different than before):
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$4(CompileAndEvaluate.scala:60)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:84)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$2(CompileAndEvaluate.scala:60)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$2$adapted(CompileAndEvaluate.scala:58)
at is.hail.backend.ExecuteContext.$anonfun$scopedExecution$1(ExecuteContext.scala:144)
at is.hail.utils.package$.using(package.scala:664)
at is.hail.backend.ExecuteContext.scopedExecution(ExecuteContext.scala:144)
at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:58)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$apply$1(CompileAndEvaluate.scala:17)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:84)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:17)
at is.hail.expr.ir.TableWriter.apply(TableWriter.scala:51)
at is.hail.expr.ir.Interpret$.run(Interpret.scala:921)
at is.hail.expr.ir.Interpret$.alreadyLowered(Interpret.scala:66)
at is.hail.expr.ir.LowerOrInterpretNonCompilable$.evaluate$1(LowerOrInterpretNonCompilable.scala:20)
at is.hail.expr.ir.LowerOrInterpretNonCompilable$.rewrite$1(LowerOrInterpretNonCompilable.scala:59)
at is.hail.expr.ir.LowerOrInterpretNonCompilable$.apply(LowerOrInterpretNonCompilable.scala:64)
at is.hail.expr.ir.lowering.LowerOrInterpretNonCompilablePass$.transform(LoweringPass.scala:83)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:32)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:84)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:32)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:84)
at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:30)
at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:29)
at is.hail.expr.ir.lowering.LowerOrInterpretNonCompilablePass$.apply(LoweringPass.scala:78)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:21)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:19)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:19)
at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:45)
at is.hail.backend.spark.SparkBackend._execute(SparkBackend.scala:600)
at is.hail.backend.spark.SparkBackend.$anonfun$execute$4(SparkBackend.scala:636)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:84)
at is.hail.backend.spark.SparkBackend.$anonfun$execute$3(SparkBackend.scala:631)
at is.hail.backend.spark.SparkBackend.$anonfun$execute$3$adapted(SparkBackend.scala:630)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:78)
at is.hail.utils.package$.using(package.scala:664)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:78)
at is.hail.utils.package$.using(package.scala:664)
at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:13)
at is.hail.backend.ExecuteContext$.scoped(ExecuteContext.scala:65)
at is.hail.backend.spark.SparkBackend.$anonfun$withExecuteContext$2(SparkBackend.scala:407)
at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:55)
at is.hail.utils.ExecutionTimer$.logTime(ExecutionTimer.scala:62)
at is.hail.backend.spark.SparkBackend.withExecuteContext(SparkBackend.scala:393)
at is.hail.backend.spark.SparkBackend.execute(SparkBackend.scala:630)
at is.hail.backend.BackendHttpHandler.handle(BackendServer.scala:88)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:822)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:794)
at sun.net.httpserver.ServerImpl$DefaultExecutor.execute(ServerImpl.java:199)
at sun.net.httpserver.ServerImpl$Dispatcher.handle(ServerImpl.java:544)
at sun.net.httpserver.ServerImpl$Dispatcher.run(ServerImpl.java:509)
at java.lang.Thread.run(Thread.java:750)
Hail version: 0.2.130-bea04d9c79b5
Error summary: SparkException: Job 16 cancelled because SparkContext was shut down
The first lines of the traceback indicate an issue during ld_prune:
File "<stdin>", line 1, in <module>
File "<decorator-gen-1774>", line 2, in ld_prune
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/typecheck/check.py", line 585, in wrapper
return __original_func(*args_, **kwargs_)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/methods/statgen.py", line 4857, in ld_prune
variants_to_remove = hl.maximal_independent_set(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<decorator-gen-1624>", line 2, in maximal_independent_set
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/typecheck/check.py", line 585, in wrapper
return __original_func(*args_, **kwargs_)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/methods/misc.py", line 152, in maximal_independent_set
edges = edges.checkpoint(new_temp_file())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<decorator-gen-1214>", line 2, in checkpoint
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/typecheck/check.py", line 585, in wrapper
return __original_func(*args_, **kwargs_)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/table.py", line 1960, in checkpoint
self.write(output=output, overwrite=overwrite, stage_locally=stage_locally, _codec_spec=_codec_spec)
File "<decorator-gen-1216>", line 2, in write
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/typecheck/check.py", line 585, in wrapper
return __original_func(*args_, **kwargs_)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/table.py", line 2002, in write
Env.backend().execute(
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/backend/spark_backend.py", line 226, in execute
raise err
File "conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/backend/spark_backend.py", line 218, in execute
return super().execute(ir, timed)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/backend/backend.py", line 190, in execute
raise e.maybe_user_error(ir) from None
File "/.conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/backend/backend.py", line 188, in execute
result, timings = self._rpc(ActionTag.EXECUTE, payload)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "conda/envs/hail-0.2.127/lib/python3.12/site-packages/hail/backend/py4j_backend.py", line 221, in _rpc
raise fatal_error_from_java_error_triplet(
hail.utils.java.FatalError: SparkException: Job 16 cancelled because SparkContext was shut down