I’m using Hail via IPython on my laptop, like this:
import hail as hl
ibd = hl.identity_by_descent(hl.read_matrix_table("/path/to/my.mt"))
When I try to run this, I get a big error message that starts with this:
FatalError: OutOfMemoryError: Java heap space
or
OutOfMemoryError: GC overhead limit exceeded
or
RemoteDisconnected('Remote end closed connection without response')
How do I increase the memory available to the Java process?
You can set the memory using an environment variable:
PYSPARK_SUBMIT_ARGS="--driver-memory 8g --executor-memory 8g pyspark-shell" ipython
This will start an ipython session with 8 GB of driver and executor memory. If you want ipython to always start with 8 GB of memory, you can add this line to your .bashrc (or the equivalent file for your shell):
export PYSPARK_SUBMIT_ARGS="--driver-memory 8g --executor-memory 8g pyspark-shell"
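
If you would rather not export the variable from your shell, here is a minimal sketch of setting it from Python instead. It assumes the variable is set before hl.init() (or any other Hail call), since it is only read when the Spark JVM is launched; the 8g values are placeholders.

import os

# PYSPARK_SUBMIT_ARGS is only read when the Spark JVM starts, so set it
# before hl.init() (or the first Hail call that triggers initialization).
os.environ["PYSPARK_SUBMIT_ARGS"] = "--driver-memory 8g --executor-memory 8g pyspark-shell"

import hail as hl
hl.init()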
olavur (March 4, 2021, 11:39am):
I had the same problem and was able to solve it with the code below.
import hail as hl
hl.init(spark_conf={'spark.driver.memory': '100g'})
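
Putting that together with the example from the question, a minimal sketch (the 8g value is a placeholder; choose something that fits your machine's RAM, and note that hl.init() must run before any other Hail call):

import hail as hl

# Ask the Spark driver JVM for more heap before any computation runs.
hl.init(spark_conf={'spark.driver.memory': '8g'})

mt = hl.read_matrix_table("/path/to/my.mt")
ibd = hl.identity_by_descent(mt)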