Running in local mode. I’m using the following config:
[('spark.jars',
'file:///home/nicholas/miniconda3/envs/hail/lib/python3.7/site-packages/hail/hail-all-spark.jar'),
('spark.hadoop.io.compression.codecs',
'org.apache.hadoop.io.compress.DefaultCodec,is.hail.io.compress.BGzipCodec,is.hail.io.compress.BGzipCodecTbi,org.apache.hadoop.io.compress.GzipCodec'),
('spark.ui.showConsoleProgress', 'false'),
('spark.executor.id', 'driver'),
('spark.logConf', 'true'),
('spark.kryo.registrator', 'is.hail.kryo.HailKryoRegistrator'),
('spark.driver.host', 'sci-pvm-nicholas.calicolabs.local'),
('spark.hadoop.mapreduce.input.fileinputformat.split.minsize', '134217728'),
('spark.serializer', 'org.apache.spark.serializer.KryoSerializer'),
('spark.driver.extraClassPath',
'/home/nicholas/miniconda3/envs/hail/lib/python3.7/site-packages/hail/hail-all-spark.jar'),
('spark.kryoserializer.buffer.max', '1g'),
('spark.driver.port', '35007'),
('spark.driver.maxResultSize', '0'),
('spark.executor.extraClassPath', './hail-all-spark.jar'),
('spark.master', 'local[*]'),
('spark.repl.local.jars',
'file:///home/nicholas/miniconda3/envs/hail/lib/python3.7/site-packages/hail/hail-all-spark.jar'),
('spark.submit.deployMode', 'client'),
('spark.app.name', 'Hail'),
('spark.driver.memory', '32G'),
('spark.app.id', 'local-1585615971418'),
('spark.executor.heartbeatInterval', '10s'),
('spark.network.timeout', '10000s')]
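(For reference, that listing is just what the running session's Spark configuration reports; something along these lines should reproduce it, assuming hl.spark_context() and PySpark's SparkConf.getAll():)

import pprint
import hail as hl

hl.init(min_block_size=128)

# Dump the effective Spark configuration of the Hail session as (key, value) pairs.
pprint.pprint(hl.spark_context().getConf().getAll())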
I increased the network timeout and driver memory, and I initialize Hail like so:
hl.init(min_block_size=128)
I was running out of memory previously, so I made these updates per another thread on the forum. I suppose that could be happening again.
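Roughly, the overrides go in before that call; driver memory has to be set through PYSPARK_SUBMIT_ARGS since the JVM is already running by the time hl.init would see it, and the timeout can be passed through hl.init's spark_conf argument (a sketch, not my exact script):

import os

# spark.driver.memory can't be changed after the JVM starts, so in local
# mode it goes in through the environment before Hail starts Spark.
os.environ['PYSPARK_SUBMIT_ARGS'] = '--driver-memory 32G pyspark-shell'

import hail as hl

# The longer network timeout is passed through to Spark at init time.
hl.init(min_block_size=128, spark_conf={'spark.network.timeout': '10000s'})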