Hello all,
I have a Hail table of variants (~100k rows), and I am trying to annotate it with lookups into a reference-data Hail table (containing all possible mutations at all sites), like this:
# calls is the Hail table, ref_data_ht is the reference Hail table
calls = calls.annotate(
    gnomad_af = hl.if_else(hl.is_defined(ref_data_ht[calls.key]), hl.if_else(hl.is_defined(ref_data_ht[calls.key].gnomad_exomes.AF), ref_data_ht[calls.key].gnomad_exomes.AF, 0.0), 0.0),
    meta_svm = hl.if_else(hl.is_defined(ref_data_ht[calls.key]), hl.if_else(hl.is_defined(ref_data_ht[calls.key].dbnsfp.MetaSVM_pred), ref_data_ht[calls.key].dbnsfp.MetaSVM_pred, 'na'), 'na'),
    cadd = hl.if_else(hl.is_defined(ref_data_ht[calls.key]), hl.if_else(hl.is_defined(ref_data_ht[calls.key].cadd.PHRED), ref_data_ht[calls.key].cadd.PHRED, 0.0), 0.0),
    mpc = hl.if_else(hl.is_defined(ref_data_ht[calls.key]), hl.if_else(hl.is_defined(ref_data_ht[calls.key].mpc.MPC), ref_data_ht[calls.key].mpc.MPC, '0.0'), '0.0'),
    spliceAI_delta = hl.if_else(hl.is_defined(ref_data_ht[calls.key]), hl.if_else(hl.is_defined(ref_data_ht[calls.key].splice_ai.delta_score), ref_data_ht[calls.key].splice_ai.delta_score, 0.0), 0.0),
    spliceAI_consq = hl.if_else(hl.is_defined(ref_data_ht[calls.key]), hl.if_else(hl.is_defined(ref_data_ht[calls.key].splice_ai.splice_consequence), ref_data_ht[calls.key].splice_ai.splice_consequence, 'na'), 'na')
)
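For what it's worth, I believe the nested conditionals collapse into this more compact form (a sketch, assuming that accessing a field of a missing struct just yields missing, so hl.coalesce can supply the defaults in one step):

# bind the join expression once and reuse it for every field
ref = ref_data_ht[calls.key]
calls = calls.annotate(
    gnomad_af = hl.coalesce(ref.gnomad_exomes.AF, 0.0),
    meta_svm = hl.coalesce(ref.dbnsfp.MetaSVM_pred, 'na'),
    cadd = hl.coalesce(ref.cadd.PHRED, 0.0),
    mpc = hl.coalesce(ref.mpc.MPC, '0.0'),
    spliceAI_delta = hl.coalesce(ref.splice_ai.delta_score, 0.0),
    spliceAI_consq = hl.coalesce(ref.splice_ai.splice_consequence, 'na')
)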
However, after this step, whenever I try to either write or export the calls Hail table, I always run into Error summary: OutOfMemoryError: Java heap space.
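For reference, the write/export calls are nothing special (the paths here are just placeholders):

# writing the annotated table to disk
calls.write('calls_annotated.ht', overwrite=True)
# or exporting it as a block-gzipped TSV
calls.export('calls_annotated.tsv.bgz')

Both fail with the same error.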
I am not sure if this step is supposed to be very taxing on memory, but I have already tried increasing the allocation with:
hl.init(log='./log.log', spark_conf={'spark.driver.memory': '30g', 'spark.executor.memory': '30g'}, master='local[60]')
and
import os
# set before hl.init(), since Spark only reads this when the JVM starts
os.environ['PYSPARK_SUBMIT_ARGS'] = '--executor-memory 30G --driver-memory 30G pyspark-shell'
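One thing I have not ruled out is whether these settings actually reach the JVM; I assume they can be checked with something like this (a sketch using the underlying Spark context):

sc = hl.spark_context()
print(sc.getConf().get('spark.driver.memory'))
print(sc.getConf().get('spark.executor.memory'))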
I have even tried giving it 100 GB of memory, but to no avail. Does anyone have advice on how to get around this, or on making this step more efficient? Thank you very much!