I’m using Hail on Google Cloud to run a GWAS for some expression QTLs. The code 1) runs a series of regressions with linreg_multipheno, 2) after each block of phenotypes, joins the variant_table output to the accumulated output, and 3) after all regressions, repartitions to 100 partitions and exports the resulting key table with parallel=True.
I’m having two problems. First, writing the results takes much longer than the regressions themselves. On the first run, the regressions finished within an hour, while writing the results took the next 9 hours. Any ideas on how to speed this up?
Second, while writing the output files, the job failed with an out-of-memory error (pasted below). Only about 750 MiB of key table results had been written at that point. Possibly relevant: the export had previously caused a stack overflow, which I fixed by increasing the stack size.
[Stage 774:=====> (2535 + 1226) / 22371]
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007fc4b8b80000, 2197815296, 0) failed; error='Cannot allocate memory' (errno=12)
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 2197815296 bytes for committing reserved memory.
# An error report file with more information is saved as:
ERROR: (gcloud.dataproc.jobs.wait) Job [b5dc094b-b3a6-440a-b5f9-cb5f83b93c56] entered state [ERROR] while waiting for [DONE].
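To put the numbers in the log in perspective, here is a small sketch in plain Python (no Hail or Spark calls) that reads the task counts from the progress line and converts the failed allocation size into MiB and GiB. It assumes the usual Spark console progress format of (completed + running) / total tasks; the variable names are my own.

```python
import re

# Values copied verbatim from the log above.
progress_line = "[Stage 774:=====> (2535 + 1226) / 22371]"
failed_alloc_bytes = 2197815296

# Assumed format of the Spark progress bar: (completed + running) / total
match = re.search(r"\((\d+) \+ (\d+)\) / (\d+)", progress_line)
done, running, total = (int(g) for g in match.groups())

fraction_done = done / total             # share of tasks finished in stage 774
alloc_mib = failed_alloc_bytes / 2**20   # size of the failed mmap request in MiB

print(f"tasks finished: {done} of {total} ({fraction_done:.1%}), {running} running")
print(f"failed allocation: {alloc_mib:.0f} MiB ({alloc_mib / 1024:.2f} GiB)")
```

So the stage was running 22371 tasks even though the final repartition was to 100, only a small fraction of them had finished, and the JVM died trying to commit roughly 2 GiB in a single request.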