Hi there,
I’m wondering if there is a best practice for merging VCFs together in Hail, akin to the bcftools merge
command.
Best
Nick
What kind of merge? An inner or outer join on variants, on samples, etc.?
Sorry about not having enough info.
I have 50k sample VCFs, and I want to merge them into a single VCF, keeping all variants found in any sample (outer).
Are these GVCFs or VCFs? If the former, there’s a good solution. If the latter, it’s going to be very difficult or impossible to get a sensible result, partly because VCFs are lossy (no reference blocks).
It’s the latter. I’m okay with a bunch of no-calls in the merged VCF, because these are somatic variant calls and I don’t expect any variant to be common, if that was your concern.
Or is there another concern I’m not catching?
this is a duplicate of:
This is possible to do by writing a hierarchical merge with intermediate checkpoints, but it’ll probably take an hour or two to write down the correct code for that and I don’t have time right now.
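For anyone landing here later, the shape of a hierarchical merge with intermediate checkpoints can be sketched as below. This is only a rough sketch, not tested Hail code: `hierarchical_merge`, `merge_two`, `checkpoint`, and `branch_factor` are names made up for illustration, and the helper is kept generic so it runs without Hail. The commented usage at the bottom assumes `MatrixTable.union_cols` with `row_join_type="outer"` and `MatrixTable.checkpoint` behave as documented, that all inputs share the same row key and entry schema, and that sample IDs are unique across files. The paths there are placeholders.

```python
from functools import reduce
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def hierarchical_merge(
    items: Sequence[T],
    merge_two: Callable[[T, T], T],
    checkpoint: Callable[[T, int, int], T],
    branch_factor: int = 32,
) -> T:
    """Merge items in a tree: groups of `branch_factor` per level.

    After each multi-item group is merged, `checkpoint(merged, level, index)`
    is called so the intermediate result can be materialized before the next
    level builds on it.
    """
    if not items:
        raise ValueError("nothing to merge")
    if branch_factor < 2:
        raise ValueError("branch_factor must be at least 2")

    current: List[T] = list(items)
    level = 0
    while len(current) > 1:
        next_level: List[T] = []
        for start in range(0, len(current), branch_factor):
            group = current[start:start + branch_factor]
            merged = reduce(merge_two, group)
            # A single leftover item was not changed, so skip the checkpoint.
            if len(group) > 1:
                merged = checkpoint(merged, level, start // branch_factor)
            next_level.append(merged)
        current = next_level
        level += 1
    return current[0]


# Hypothetical Hail usage (untested; paths and genome are placeholders):
#
# import hail as hl
# mts = [hl.import_vcf(p, reference_genome="GRCh38") for p in vcf_paths]
# merged = hierarchical_merge(
#     mts,
#     merge_two=lambda a, b: a.union_cols(b, row_join_type="outer"),
#     checkpoint=lambda mt, level, idx: mt.checkpoint(
#         f"{tmp_dir}/level{level}_group{idx}.mt", overwrite=True
#     ),
# )
# hl.export_vcf(merged, "merged.vcf.bgz")
```

The point of the checkpoints is that each level reads from written intermediate results instead of re-running one very deep chain of joins. With an outer join, entries for samples that lack a variant should come out missing, which matches the no-calls mentioned above.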
Oh, that’s funny. Alright, I’ll try writing something up. Thanks!