Hi, I was wondering if it’s possible to merge two VCF files (or the resulting MatrixTables) that have different samples but overlapping variants, so that any downstream QC takes into account information from the combined dataset. I am working locally on a single computer.
By “overlapping”, do you mean identical? If so, union_cols will work out-of-the-box. The default row join type is an inner join, which will restrict to variants shared by both datasets.
If the variants only partially overlap, you can use mt.union_cols(mt1, row_join_type='outer') to keep every variant found in either dataset, but the resulting matrix table will have missing entries that may require some special handling in downstream analysis.
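To make the difference between the two row join types concrete, here is a minimal pure-Python sketch. It is a toy model and not Hail code: the `union_cols_toy` function, the dict-based dataset layout, the variant keys, and the sample IDs are all made up for illustration, and it assumes the two datasets have distinct sample IDs.

```python
from typing import Dict, Optional

# Toy stand-in for a matrix table: variant key -> {sample ID -> genotype call}.
# This layout is an assumption for illustration, not how Hail stores data.
Dataset = Dict[str, Dict[str, str]]


def union_cols_toy(
    left: Dataset, right: Dataset, row_join_type: str = "inner"
) -> Dict[str, Dict[str, Optional[str]]]:
    """Combine the samples (columns) of two datasets, joining rows on variant."""
    if row_join_type == "inner":
        # Keep only variants present in both datasets.
        variants = [v for v in left if v in right]
    elif row_join_type == "outer":
        # Keep every variant seen in either dataset.
        variants = list(left) + [v for v in right if v not in left]
    else:
        raise ValueError("row_join_type must be 'inner' or 'outer'")

    left_samples = sorted({s for calls in left.values() for s in calls})
    right_samples = sorted({s for calls in right.values() for s in calls})

    merged: Dict[str, Dict[str, Optional[str]]] = {}
    for v in variants:
        row: Dict[str, Optional[str]] = {}
        # A sample has no call (None) for a variant its dataset never contained.
        for s in left_samples:
            row[s] = left.get(v, {}).get(s)
        for s in right_samples:
            row[s] = right.get(v, {}).get(s)
        merged[v] = row
    return merged


# Hypothetical data: one shared variant, one variant unique to each dataset.
ds1: Dataset = {"1:1000:A:T": {"S1": "0/1"}, "1:2000:G:C": {"S1": "0/0"}}
ds2: Dataset = {"1:1000:A:T": {"S2": "1/1"}, "1:3000:C:T": {"S2": "0/1"}}

inner = union_cols_toy(ds1, ds2)
outer = union_cols_toy(ds1, ds2, row_join_type="outer")
```

In this sketch the inner join keeps only the shared variant with calls for both samples, while the outer join keeps all three variants and leaves `None` wherever a sample's dataset did not contain the variant. Those `None` values correspond to the missing entries mentioned above, which downstream QC metrics such as call rate would need to account for.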