Hi developer,
I was trying to use the sample_qc function to compute some QC metrics. The ‘PL’ field in my VCF is entirely missing, but the ‘GQ’ field is not. The error message is: HailException: PL cannot have missing elements.
My understanding is that the gq_stats from the sample_qc function are computed from the ‘GQ’ field, so why does Hail complain about missing elements in the PL field? How can I compute gq_stats when my VCF contains only ‘GQ’ and not ‘PL’?
Thanks!
Hey @jialiwang1211 !
I’m sorry you’re having trouble with Hail. The issue is almost certainly not with sample_qc. Are you using split_multi_hts? Hail assumes that either:
- the PL field is missing, or
- the PL field is an array of non-missing values.
It sounds like you have a PL field that is an array of missing values. You can fix this by running this after you read or import your data:
mt = mt.annotate_entries(PL = hl.null(mt.PL.dtype))
Also, be aware that split_multi_hts cannot recalculate an appropriate GQ if the PL field is missing.
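To make the distinction above concrete, here is a small plain-Python sketch of the three states a PL value can be in. It is only a toy model, not Hail's implementation: the function name gq_from_pl is borrowed for illustration, and the rule "GQ is the gap between the two smallest PL values" is a simplifying assumption.

```python
from typing import Optional, Sequence


def gq_from_pl(pl: Optional[Sequence[Optional[int]]]) -> Optional[int]:
    """Toy model (assumption): GQ is the gap between the two smallest PLs."""
    if pl is None:
        # The whole PL field is missing: nothing to compute, GQ stays missing.
        return None
    if any(x is None for x in pl):
        # An array made of missing elements is the case that gets rejected.
        raise ValueError("PL cannot have missing elements.")
    # An array of non-missing values: take the two smallest entries.
    smallest, second = sorted(pl)[:2]
    return second - smallest


print(gq_from_pl(None))           # whole field missing
print(gq_from_pl([0, 69, 1035]))  # fully defined array
```

Under this model, setting the whole PL field to missing moves the data from the rejected case into the first accepted one, which is what the annotate line above is meant to do.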
Hi @danking,
Thank you for your prompt reply!
Yes, I was using split_multi_hts, and yes, the PL field in my VCF is an array of missing values.
The problem is fixed by using the code:
mt = hl.variant_qc(hl.split_multi_hts(mt.drop('PL')), name='qc')
Note that I need to drop the PL field before the split_multi_hts step; otherwise the GQ field becomes NA.
Thanks!
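The original question asked for gq_stats from sample_qc, while the line above calls variant_qc. For anyone after the per-sample statistics, the same workaround might look like the sketch below. The helper name sample_qc_without_pl is hypothetical, and it assumes mt is a MatrixTable that has a GQ entry field and a PL entry field holding arrays of missing values.

```python
def sample_qc_without_pl(mt):
    """Drop PL, split multi-allelic sites, then run per-sample QC.

    Hypothetical helper: `mt` is assumed to be a Hail MatrixTable with
    GQ and PL entry fields.
    """
    # Imported here so that defining the helper does not start Hail.
    import hail as hl

    # Drop PL before splitting so the existing GQ values are carried over.
    mt = mt.drop('PL')
    mt = hl.split_multi_hts(mt)
    # Adds a `sample_qc` column field to the returned MatrixTable.
    return hl.sample_qc(mt)
```

Calling mt = sample_qc_without_pl(mt) and then inspecting mt.sample_qc.gq_stats should give the per-sample GQ summary, provided the GQ entry field is present.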