Combining VDSs with same samples

I’ve built a collection of 22 VDSs, each containing variants from the same 100 samples, but covering a different chromosome:

I want to merge these into a single VDS. However, when I try to run


the combiner thinks I’m loading 2200 samples, i.e. it doesn’t identify the identically-named samples in the different batches.
Is there any way to merge multiple VDSs containing the same sample set into a single VDS containing those samples? This question was answered in 2017, but the solution uses HailContexts, which don’t seem to be accessible in the current Hail version (version 0.2.99-57537fea08d4).

Any help you could offer would be greatly appreciated. Thanks!

Here’s a way – we should add this onto the VDS module:

import hail as hl

# vds_paths: your list of the per-chromosome VDS paths
vdses = [hl.vds.read_vds(path) for path in vds_paths]
def vds_union_rows(vdses):
    # Union along rows (loci); the samples (columns) stay the same
    new_ref_mt = hl.MatrixTable.union_rows(*(vds.reference_data for vds in vdses))
    new_var_mt = hl.MatrixTable.union_rows(*(vds.variant_data for vds in vdses))
    return hl.vds.VariantDataset(new_ref_mt, new_var_mt)
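One caveat: MatrixTable.union_rows expects the column keys to be identical across inputs, including their order, so it can be worth checking the sample lists before merging. Below is a minimal sketch of such a check in plain Python. The helper name check_same_samples and the example sample IDs are hypothetical, not part of Hail. With real data you could probably build the input lists with something like [vds.variant_data.s.collect() for vds in vdses], assuming the column key field is named s.

```python
from typing import List, Sequence


def check_same_samples(sample_lists: Sequence[Sequence[str]]) -> List[str]:
    """Return the shared sample order, or raise if any list differs from the first."""
    if not sample_lists:
        raise ValueError("no sample lists given")
    expected = list(sample_lists[0])
    for i, samples in enumerate(sample_lists[1:], start=1):
        samples = list(samples)
        if samples != expected:
            # Distinguish a pure ordering problem from a real mismatch
            if sorted(samples) == sorted(expected):
                reason = "same samples, different order"
            else:
                reason = "different sample sets"
            raise ValueError(f"input {i} does not match input 0: {reason}")
    return expected


# Hypothetical sample IDs standing in for two per-chromosome VDSs
chr1_samples = ["S1", "S2", "S3"]
chr2_samples = ["S1", "S2", "S3"]
shared = check_same_samples([chr1_samples, chr2_samples])
print(f"{len(shared)} samples shared across all inputs")
```

If the check fails only because of ordering, reordering the columns of each input to a common order before the union should be enough; if the sample sets really differ, a row-wise union is not the right operation.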

I have a PR to add this to mainline, thanks for the tip @jbgaither! See “[query] add VDS.union_rows” by danking, Pull Request #12268 in hail-is/hail on GitHub.