Hi community, I'm trying to read a small block of an LD matrix in
BlockMatrix format into a numpy array, but I hit a Java heap-space error:
```python
>>> from hail.linalg import BlockMatrix
>>> import numpy as np
>>> bm_files = "gnomad.genomes.r2.1.1.nfe.common.adj.ld.bm"
>>> bm = BlockMatrix.read(bm_files)
>>> x = bm[0:10000, 0:10000]
Initializing Hail with default parameters...
...
>>> X = x.to_numpy()  # throws "Error summary: OutOfMemoryError: Java heap space"
```
How do I increase the Java heap space? A dense 10k x 10k float64 matrix should only take about 800 MB of RAM. In my application I'd like to read 50k x 50k blocks into memory.
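For reference, here's my back-of-the-envelope math for a dense float64 block (8 bytes per entry):

```python
# Rough in-memory size of a dense n x n float64 matrix, in GB.
def dense_gb(n):
    return n * n * 8 / 1e9

print(dense_gb(10_000))  # 0.8 GB
print(dense_gb(50_000))  # 20.0 GB
```

So even the 50k x 50k case should fit well within the 24G driver memory I'm requesting below.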
I tried the first suggestion in this post, setting
`PYSPARK_SUBMIT_ARGS="--driver-memory 24G pyspark-shell"`, but I still hit the same error.
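Concretely, this is roughly what I did in a fresh session. My understanding (an assumption on my part) is that the variable has to be set before Hail starts the Spark JVM, so I set it before the import:

```python
import os

# PYSPARK_SUBMIT_ARGS is read when Spark's JVM starts, so it must be
# set before Hail is imported/initialized in a fresh Python session.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--driver-memory 24G pyspark-shell"

# import hail as hl   # then read the BlockMatrix as in the snippet above
```

Is this the right way to pass driver memory through to Hail, or does Hail ignore this variable?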