Hello,
I am in the process of trying to make some changes to our VCF genotypes that were recommended to our group. For those genotypes where the Reason for No Call (RNC) is equal to 1 (where ‘1’ represents a monoallelic site’), we hope to convert those GT sites to REF/REF or ‘0/0’.
Towards this goal, I was thinking of adding a new definition to our code’s existing dictionary. The last (and newest) definition would flag genotypes where the RNC is equal to ‘11’:
# dict of lambdas for filter expressions
genotype_exprs = {
# DP: Genotypes below reading depth
'GDP' : lambda ds: ds.DP < args.depth_threshold,
# GDP: heterozygous genotypes outside of allelic balance
'GAB' : lambda ds: (
ds.GT.is_het() & (
(ds.AD[0]/ds.DP > max(1 - args.allelic_balance, args.allelic_balance))
| (ds.AD[0]/ds.DP < min(1 - args.allelic_balance, args.allelic_balance))
)
),
# GGQ: Genotypes below quality threshold
'GGQ': lambda ds: ds.GQ < args.genotype_quality,
# GNP: Male het chrX nonpar variants
'GNP': lambda ds: ds.GT.is_het() & ds.locus.in_x_nonpar() & (ds.sa.GENDER == "Male"),
# GRNC: Genotypes with RNC == 1
'GRNC': lambda ds: ds.RNC[0] == "11"
}
From here, we would 1) change the GT to ‘0/0’ for RNC = 11 variants and 2) filter out the flagged variants as shown below
# Apply filters again to actually remove genotypes
vcf = vcf.annotate_entries(
GT=hl.if_else(genotype_exprs['GRNC'](vcf), hl.Call([0, 0]), vcf.GT)
).filter_entries(
genotype_exprs['GDP'](vcf) | genotype_exprs['GAB'](vcf) | genotype_exprs['GGQ'](vcf) | genotype_exprs['GNP'](vcf), keep=False)
However, this generates the following error in its current state. Can anyone suggest an alternative method of performing this change?
Traceback (most recent call last):
File "/data/Segre_Lab/users/mgarcia/variant_qc_tool_prep_UKB_WES/hail_variant_qc_v4.py", line 550, in <module>
GT=hl.if_else(genotype_exprs['GRNC'](vcf), '0/0', vcf.GT)
File "<decorator-gen-738>", line 2, in if_else
File "/gpfs/fs1/home/mg1646/.conda/envs/variantqc_hailenv/lib/python3.8/site-packages/hail/typecheck/check.py", line 584, in wrapper
return __original_func(*args_, **kwargs_)
File "/gpfs/fs1/home/mg1646/.conda/envs/variantqc_hailenv/lib/python3.8/site-packages/hail/expr/functions.py", line 410, in if_else
raise TypeError(f"'if_else' and 'cond' require the 'consequent' and 'alternate' arguments to have the same type\n"
TypeError: 'if_else' and 'cond' require the 'consequent' and 'alternate' arguments to have the same type
consequent: type 'str'
alternate: type 'call'