BGEN filter_cols memory leak

ebartell · April 27, 2021, 12:54pm

Hi!
I am trying to run GWAS using multiple cores on the broad hpc. I have had success following instructions from this forum and have seen similar direction for using many cores on the google cloud, but am hoping to run this on prem and am not sure if this is possible, or if I am making a simple mistake.

I believe the relevant errors are one of two, either
[Stage 3:> (0 + 4) / 293]BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
or the immediately following
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00002b4dc2e41928, pid=8835, tid=0x00002b4da530d700
#
# JRE version: Java™ SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot™ 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libblas.so.3+0x2e0928] dgemm_incopy+0x88
#
#
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.

My setup is as follows:
ish -l h_vmem=8G -pe smp 4 -binding linear:4 (I have also tried submitting this as a job with more memory, but I don’t believe I’ve requested more than 8G per core successfully yet)

use Anaconda3
use Java-1.8
## use -v OpenBLAS ## I’ve tried with and without
use .openblas_for_hail-0.2.20
use -v .mkl-2019.3.199
## conda create -n hail python==3.7 ## already run
source activate hail
conda install pip
pip install -U hail

export PYSPARK_SUBMIT_ARGS="–driver-memory 8G pyspark-shell --executor-memory 8G" ## and with 4G as well. Without this line, I run out of memory a la “java.lang.OutOfMemoryError: Java heap space, in Stage 2.”

Thank you for your help!

danking · April 27, 2021, 1:58pm

Hey @ebartell !

Sorry to hear you’re having trouble with the Broad cluster. I just ran this and it worked for me. Can you share what you’re doing differently?

use UGER
ish -l os=RedHat7
use Anaconda3
use Java-1.8
use .openblas_for_hail-0.2.20
yes | conda env remove -n hail
yes | conda create -n hail python==3.7
conda activate hail
yes | pip3 install ipython hail
python3 -c '
import hail as hl
mt = hl.balding_nichols_model(3, n_variants=1000, n_samples=10)
t = hl.linear_regression_rows(x=mt.GT.n_alt_alleles(), y=mt.pop, covariates=[1])
t.show()'

After running the above sequence of commands at least once, this should suffice:

use UGER
ish -l os=RedHat7
use Anaconda3
use Java-1.8
use .openblas_for_hail-0.2.20
conda activate hail
python3 -c '
import hail as hl
mt = hl.balding_nichols_model(3, n_variants=1000, n_samples=10)
t = hl.linear_regression_rows(x=mt.GT.n_alt_alleles(), y=mt.pop, covariates=[1])
t.show()'

ebartell · April 27, 2021, 2:24pm

Thank you for your help.
I started an interactive session with 2 cores and 4G memory, and running your code works. Certainly I am not running
yes | conda env remove -n hail
every time, and likely I had an older version of the hail library installed; this could be the fix I was missing.
When I try my python script, I need the additional
export PYSPARK_SUBMIT_ARGS="–driver-memory 4G pyspark-shell --executor-memory 4G"
or mine dies on memory; it is currently running and seems to have already passed its usual break point. I can update here if the errors I was seeing reoccur.
Thank you!

ebartell · April 27, 2021, 4:55pm

I don’t believe it needs it, but should I include source activate hail after creating the environment but prior to the pip3 install? When I do, in the future all I seem to need is the source activate hail (and the use lines) to run GWAS.
Thanks!

danking · April 27, 2021, 5:22pm

Ah, yes, I think that’s right. I’m pretty sure anaconda also now recommends conda activate ...

ebartell · May 4, 2021, 4:43pm

Thank you for your help, and apologies to re-open this, I am running into a new error.

I am able to have this run

python3 -c '
import hail as hl
mt = hl.balding_nichols_model(3, n_variants=1000, n_samples=10)
t = hl.linear_regression_rows(x=mt.GT.n_alt_alleles(), y=mt.pop, covariates=[1])
t.show()'

But when I run my larger script, I come upon a new error (progress!):

ERROR:root:Exception while sending command.
...    
raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
...
py4j.protocol.Py4JError: An error occurred while calling o1.pyPersistTable
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (IP address:port)

In order for this to happen, I first launch an interactive session with 2 cores.
ish -pe smp 2 -binding linear:2 -l h_vmem=8G
and then

use Anaconda3
use Java-1.8
use .openblas_for_hail-0.2.20
yes | conda env remove -n hail
yes | conda create -n hail python==3.7
source activate hail
# yes | pip3 install matplotlib
yes | pip3 install ipython hail

python3 -c '
import hail as hl
mt = hl.balding_nichols_model(3, n_variants=1000, n_samples=10)
t = hl.linear_regression_rows(x=mt.GT.n_alt_alleles(), y=mt.pop, covariates=[1])
t.show()'

export PYSPARK_SUBMIT_ARGS="--driver-memory 8G pyspark-shell --executor-memory 8G"

python z030_script_UKB_runHail.py --chrom ${SGE_TASK_ID} --analysisName ${filename}

I am happy to share my hail python script if it will help.

danking · May 4, 2021, 4:58pm

Hey @ebartell ,

Sorry you’re running into trouble again! Do you have the hail .log file generated by the failing run? Can you share that here or upload it to gist.github.com and share that here?

ebartell · May 4, 2021, 5:21pm

Thanks again for your help. I’ve uploaded it here:

gist.github.com

https://gist.github.com/ericbartell/0d9b85f855b87f2a7c4cde70e5beef5c

hail-20210504-1639-0.2.66-50e9d4de51e1.log

2021-05-04 16:39:31 Hail: INFO: Running Hail version 0.2.66-50e9d4de51e1
2021-05-04 16:39:32 SparkContext: WARN: Using an existing SparkContext; some configuration may not take effect.
2021-05-04 16:39:33 root: INFO: RegionPool: initialized for thread 15: Thread-5
2021-05-04 16:39:33 root: INFO: globbing path /broad/ukbb/imputed_v3/ukb_imp_chr22_v3.bgen returned 1 files: ukb_imp_chr22_v3.bgen
2021-05-04 16:39:34 MemoryStore: INFO: Block broadcast_0 stored as values in memory (estimated size 34.7 KiB, free 4.1 GiB)
2021-05-04 16:39:34 MemoryStore: INFO: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.2 KiB, free 4.1 GiB)
2021-05-04 16:39:34 BlockManagerInfo: INFO: Added broadcast_0_piece0 in memory on ugertmp-c021.broadinstitute.org:35029 (size: 3.2 KiB, free: 4.1 GiB)
2021-05-04 16:39:34 SparkContext: INFO: Created broadcast 0 from broadcast at SparkBackend.scala:275
2021-05-04 16:39:34 root: INFO: decoder cache miss (0 hits, 1 misses, 0.000
2021-05-04 16:39:34 root: INFO: instruction count: 3: __C1FSContainer.<init>

This file has been truncated. show original

tpoterba · May 4, 2021, 6:07pm

Can you share your python script?

ebartell · May 4, 2021, 6:56pm

Yes, and thank you for your help! This is the full script, including pieces unrelated to Hail.


print("Running gwas. using HAIL")


print("importing libraries")
import hail as hl
hl.init()

import pandas as pd
#import matplotlib as mpl
import numpy as np
# mpl.use('Agg')
# from matplotlib import pyplot as plt
import scipy.stats as ss
import sys
import os

acceptableSampleSizes = ["all","halfGwas1","halfGwas2","ckbN_halfGwas1","ckbN_halfGwas2","Q1_CKB-N-.5unrel","Q2_CKB-N-.5unrel","Q3_CKB-N-.5unrel","Q4_CKB-N-.5unrel",
                        "1of10unrel","2of10unrel","3of10unrel","4of10unrel"] + [str(x) for x in range(1,51)]

import argparse
parser = argparse.ArgumentParser(description='Run sibship for given list of SNPs (or summstats file).')
# parser.add_argument('--lociFile', dest='lociFile', type=str,
#                    help='File location for SNP   chr list of SNPs (tab separated). If none provided, will run on all SNPs.',
#                    required=False, default="allSnps")
parser.add_argument('--analysisName', dest='analysisName', type=str,
                   help='Analysis name, used to name output.',required=True)
parser.add_argument('--chrom', dest='chrom', type=int,required=True,
                   help='Chrom of analysis (in UKB). Required, used to parallelize with multiple jobs over chroms.')
parser.add_argument('--relatedness', metavar='str', dest='relatedness', type=str, action='store', choices=["all","unrelated","WB"], default="WB",
                   help="Define which set of individuals to use, unrelateds or all. (default: WB for white british).")
parser.add_argument('--sampleSize', metavar='str', dest='sampleSize', type=str, action='store', choices=acceptableSampleSizes, default="all",
                   help="Define which sample size to use, all or predefined subset. (default: all).")
# parser.add_argument('--out', dest='out', type=str,
#                    help='Output location; UNUSED.',required=True)

# parser.add_argument('--lociFileRsidCol', dest='lociFileRsidCol', type=str,
#                    help='RSID column (to ID snps for analysis). default SNP',required=False,default="SNP")
# parser.add_argument('--lociFileChromCol', dest='lociFileChromCol', type=str,
#                    help='Chrom column (used to subset loci file). default CHR',required=False,default="CHR")
# parser.add_argument('--lociFileEffectAlleleCol', dest='lociFileEffectAlleleCol', type=str,
#                    help='Effect allele column (used to match alleles). default ALLELE1',required=False,default="ALLELE1")
# parser.add_argument('--summStatsDelim', metavar='str', dest='summStatsDelim', type=str, action='store', choices=["tab","comma","space","\\s"," ",",","\\t","\t"], default="tab",
#                    help="Use this delimiter when parsing summStats (default: tab).")

parser.add_argument('--verbose', '-v', action='count', default=0)
parser.add_argument('--quiet', '-q', action='count', default=0)

# parser.add_argument('integers', metavar='N', type=int, nargs='+',
#                    help='an integer for the accumulator')

# inArgs = '--chrom 22 --analysisName boltLMM_unrelateds_noPCsLinreg --lociFile ../ht_shr_leg_upperbody/z040_out_bgen_Height_all_unrel_noPCs_filtered_v2_prunedByDist.tsv'.split(" ")

inArgs = '--chrom 22 --analysisName Height_all --relatedness WB --sampleSize all'.split(" ") ##################
args = parser.parse_args(inArgs) ############################### testing ##################################


args = parser.parse_args()
print("args: %s " % args)


chrom = args.chrom
# prunedHeightSnps = pd.read_csv("../ht_shr_leg_upperbody/z040_out_bgen_Height_all_filtered_v2_prunedByDist.tsv",sep="\t")
# inputSSname = "UKB_all"
# prunedHeightSnps = pd.read_csv("/cvar/jhlab/ebartell/Public_GWAS_Summstats/sig_wood_height_pruned_loci_checkTurchinAnalysis.tsv",sep="\t")
# inputSSname = "Wood2014"
# if args.lociFile != "allSnps":
#     print("loading summstats.")
#     prunedHeightSnps = pd.read_csv(args.lociFile,sep="\t")
#     inputSSname = args.analysisName
#     if args.lociFileRsidCol != "SNP":
#         prunedHeightSnps["SNP"] = prunedHeightSnps[args.lociFileRsidCol]
#     if args.lociFileChromCol != "CHR":
#         prunedHeightSnps["CHR"] = prunedHeightSnps[args.lociFileChromCol]
#     if args.lociFileEffectAlleleCol != "ALLELE1":
#         prunedHeightSnps["ALLELE1"] = prunedHeightSnps[args.lociFileEffectAlleleCol]
#     prunedHeightSnps_thisChrom = prunedHeightSnps[prunedHeightSnps.CHR == chrom]
#     prunedHeightSnps_thisChrom_rsids = list(prunedHeightSnps_thisChrom.SNP)
#     print("%s gws variants." % len(prunedHeightSnps_thisChrom_rsids))


print("loading bgen.")
indexMapper = {}
contigRecoding = {}
chromString = "{"
for i in range(1,23):
    indexMapper["/broad/ukbb/imputed_v3/ukb_imp_chr%s_v3.bgen" % i] = "/broad/jhlab_ukbiobank/users/ebartell/qfam/subsetBgens/ukb_imp_chr%s_v3_hail.bgen.idx2" % i
    if i < 10:
        contigRecoding["0%s" % i] = str(i)
        # contigRecoding["%s" % i] = "0" + str(i)
    else:
        contigRecoding["%s" % i] = str(i)
    chromString += "%s," % str(i)

chromString = chromString[:-1] + "}"
### very slow, only runs once. DONE ALREADY
# bgen_idx = hl.index_bgen("/broad/ukbb/imputed_v3/ukb_imp_chr%s_v3.bgen" % chromString, indexMapper,contig_recoding=contigRecoding)


### very fast
bgen = hl.import_bgen("/broad/ukbb/imputed_v3/ukb_imp_chr%s_v3.bgen" % chrom,
                            entry_fields=['GT', 'GP','dosage'], sample_file="/broad/jhlab_ukbiobank/PhenoData/LinkIds_ukb_500K_ImpData_v3_app11898_chrAUT.sample",
                            index_file_map=indexMapper)

print("load phenotypes")
#Load phenotype file, thru hail and also normally in python (easier to use, cuz I don't yet fully know all hail fxns)
phenoFile = "/broad/jhlab_ukbiobank/users/aravikum/prs_sim/prsice_outputs/" + args.analysisName
phenos = pd.read_csv(phenoFile,sep="\t")

colTypes = {"FID": hl.tstr,"IID":hl.tstr,"Noise_added":hl.tfloat64}
phenosHailTable = hl.import_table(phenoFile, impute=True, 
    types=colTypes).key_by('IID')


print("load covariates (including PCs)")
covariatesFile = "/broad/jhlab_ukbiobank/users/ebartell/z020_out_covariates.txt"
colTypesCov = {"FID": hl.tstr,"IID":hl.tstr,"ArrayNum":hl.tint32}
for i in range(1,11):
    colTypesCov["PC%s" % i] = hl.tfloat64

covariatesHailTable = hl.import_table(covariatesFile, 
    types=colTypesCov).key_by('IID')


###################### Load dropped samples and drop them HERE
# print("load sample exclusions and drop them UNIMPLIMENTED")
print("loading individuals always removed.")

individualsToRemove = set()

drop1 = pd.read_csv("/broad/jhlab_ukbiobank/users/jcole/Pheno/bolt.in_plink_but_not_imputed.FID_IID.968.txt",sep=" ",header=None)
individualsToRemove |= set([str(x) for x in drop1[0]])
if args.relatedness in ["WB","unrelated"]:
    drop2 = pd.read_csv("/broad/jhlab_ukbiobank/users/ebartell/bolt.in_v3_sample_but_not_UKBEURN455K.N32265_obtained3.12.19.txt",sep=" ",header=None)
    individualsToRemove |= set([str(x) for x in drop2[0]])

if args.relatedness == "unrelated":
    drop3 = pd.read_csv("/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/samplesNotUnrelated_remove.csv",sep=" ",header=None)
    individualsToRemove |= set([str(x) for x in drop3[0]])

if args.sampleSize != "all":
    print("sample size argument %s is not implemented yet. running for all individuals.")
    subsetSamplesFileRemove = ""
    if args.sampleSize == "halfGwas1":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove_firstHalfComplement.csv"
    elif args.sampleSize == "halfGwas2":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove_secondHalfComplement.csv"
    elif args.sampleSize == "ckbN_halfGwas1":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove_ckb1_complement.csv"
    elif args.sampleSize == "ckbN_halfGwas2":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove_ckb2_complement.csv"
    elif args.sampleSize == "Q1_CKB-N-.5unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_CKB-N.5_1Of4Pieces.csv"
    elif args.sampleSize == "Q2_CKB-N-.5unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_CKB-N.5_2Of4Pieces.csv"
    elif args.sampleSize == "Q3_CKB-N-.5unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_CKB-N.5_3Of4Pieces.csv"
    elif args.sampleSize == "Q4_CKB-N-.5unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_CKB-N.5_4Of4Pieces.csv"
    elif args.sampleSize == "1of10unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_1Of10Pieces.csv"
    elif args.sampleSize == "2of10unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_2Of10Pieces.csv"
    elif args.sampleSize == "3of10unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_3Of10Pieces.csv"
    elif args.sampleSize == "4of10unrel":
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/included_individuals_unrel-sample-fam-remove-complement_4Of10Pieces.csv"
    elif args.sampleSize in set([str(x) in range(1,51)]): ##### 50 subsets
        subsetSamplesFileRemove = "/broad/jhlab_ukbiobank/users/ebartell/ht_shr_leg_upperbody/subsetGwas/subsetSamples/included_individuals_unrel-sample-fam-remove-complement_%sOf50Pieces.csv" % args.sampleSize 
        analysisDirec="unrel50/"
    else:
        print("sample size argument %s is not implemented yet. running for all individuals.")
    if subsetSamplesFileRemove != "":
        drop3 = pd.read_csv(subsetSamplesFileRemove,sep=" ",header=None)
        individualsToRemove |= set([str(x) for x in drop3[0]])



#Subset bgen to only related individuals.
remove_individuals_hl = hl.literal(set(individualsToRemove))
print("subset to relateds")
bgen = bgen.filter_cols(~remove_individuals_hl.contains(bgen.s))


bgen = bgen.annotate_cols(pheno = phenosHailTable[bgen.s])
bgen = bgen.annotate_cols(covariates = covariatesHailTable[bgen.s])


gwas = hl.linear_regression_rows(
    y=bgen.pheno.Noise_added,
    x=bgen.GT.n_alt_alleles(),
    covariates=[1.0, bgen.covariates.PC1, bgen.covariates.PC2, bgen.covariates.PC3,bgen.covariates.PC4, bgen.covariates.PC5, bgen.covariates.PC6, bgen.covariates.PC7,
    bgen.covariates.PC8, bgen.covariates.PC9, bgen.covariates.PC10, bgen.covariates.ArrayNum])




# p = hl.plot.manhattan(gwas.p_value)
# p.
# show(p)

gwas_pd = gwas.to_pandas()
def makeMarkerName(row):
    alleles = row["alleles"]
    alleles.sort()
    return "%s:%s:%s" % (row["locus.contig"],row["locus.position"],":".join(alleles))

gwas_pd["SNP"] = gwas_pd.apply(makeMarkerName,axis=1)
gwas_pd.to_csv("hail/hailGWAS_%s_chrom%s_%s_%s_1core.tsv" % (args.analysisName,chrom,args.relatedness,args.sampleSize),sep='\t',index=False)

tpoterba · May 4, 2021, 7:12pm

To help debug, can you define the following function:

def force_count(mt):
    print(mt.head(1)._force_count_rows())

and call it after these lines:
bgen = bgen.filter_cols(~remove_...
bgen = bgen.annotate_cols(pheno=...
bgen = bgen.annotate_cols(covariates=...

If it fails after one of those, that would be good to know.

ebartell · May 4, 2021, 7:20pm

Thank you for this suggestion, I believe this will help a lot with debugging. I pop the same error in the above code here at the first of the 3 steps:

#testing
def force_count(mt):
    print(mt.head(1)._force_count_rows())

#Subset bgen to only related individuals.
remove_individuals_hl = hl.literal(set(individualsToRemove))
print("subset to relateds")
bgen = bgen.filter_cols(~remove_individuals_hl.contains(bgen.s))
force_count(bgen)
print("filter done")

I get the “subset to relateds” but not the results from

force_count(bgen)
print("filter done")

tpoterba · May 4, 2021, 7:23pm

OK, great. Could you possibly upload the log for that failing run as well? That should hopefully be enough for us to replicate/fix.

ebartell · May 4, 2021, 8:15pm

Yep, this is the link:

gist.github.com

https://gist.github.com/ericbartell/34e38a8076b5f48e30fcaa3f55157572

hail-20210504-1917-0.2.66-50e9d4de51e1.log

2021-05-04 19:17:20 Hail: INFO: Running Hail version 0.2.66-50e9d4de51e1
2021-05-04 19:17:20 SparkContext: WARN: Using an existing SparkContext; some configuration may not take effect.
2021-05-04 19:17:21 root: INFO: RegionPool: initialized for thread 15: Thread-5
2021-05-04 19:17:21 root: INFO: globbing path /broad/ukbb/imputed_v3/ukb_imp_chr22_v3.bgen returned 1 files: ukb_imp_chr22_v3.bgen
2021-05-04 19:17:22 MemoryStore: INFO: Block broadcast_0 stored as values in memory (estimated size 34.7 KiB, free 4.1 GiB)
2021-05-04 19:17:22 MemoryStore: INFO: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.2 KiB, free 4.1 GiB)
2021-05-04 19:17:22 BlockManagerInfo: INFO: Added broadcast_0_piece0 in memory on ugertmp-c021.broadinstitute.org:38861 (size: 3.2 KiB, free: 4.1 GiB)
2021-05-04 19:17:22 SparkContext: INFO: Created broadcast 0 from broadcast at SparkBackend.scala:275
2021-05-04 19:17:22 root: INFO: decoder cache miss (0 hits, 1 misses, 0.000
2021-05-04 19:17:22 root: INFO: instruction count: 3: __C1FSContainer.<init>

This file has been truncated. show original

Apologies, I am not well versed in debugging hail; I looked through the log but do not yet see the issue.

tpoterba · May 5, 2021, 3:55pm

Still investigating – I can replicate locally though, so nothing more necessary from your end.

tpoterba · May 5, 2021, 7:03pm

Fixed here:

github.com/hail-is/hail

[query] Fix bad memory leak triggered by set contains IR function

hail-is:main ← tpoterba:fix-set-contains-leak

opened 07:03PM - 05 May 21 UTC

tpoterba

+52 -7

CHANGELOG: Fixed a memory leak triggered by `hl.literal(...).contains(...) Th…is bug is present elsewhere in the code generator, but the set contains function is probably the worst place for it to happen. This leads to a full copy of the set where the binary search is executed. The core problem was a bug in PArrayBackedContainer not casting its codes properly, leading to the no-op coerce logic in PCanonicalArray being bypassed in favor of the generic copy-the-world implementation of `store`. The test I have added catches the memory leak, but now fails at compile time at the assertion to `PCanonicalArray.store` instead.

We’ll release a new version when that goes in.

tpoterba · May 6, 2021, 6:08pm

0.2.67 is up, that should solve this problem.

ebartell · May 7, 2021, 3:55pm

Thank you so much for your help! The filter step succeeds in interactive multi-core.

Topic		Replies	Views
Heap out of memory Hail Query & hailctl	14	1803	July 21, 2020
Answer from Java side is empty Hail Query & hailctl	1	3081	May 10, 2021
Pc_rel memory issue: ConnectionRefusedError: [Errno 111] Connection refused Hail Query & hailctl	10	773	June 11, 2024
Error summary: OutOfMemoryError: Java heap space Hail Query & hailctl	15	2587	August 18, 2022
Export_vcf OutOfMemoryError: Java heap space despite --driver-memory 8g Hail Query & hailctl	26	2854	January 11, 2019

BGEN filter_cols memory leak

Related topics