I have a workflow in mind and wanted to ask the community what the ideal solution would be.
- I have an existing MatrixTable with, say, 5000 VCF files imported into it.
- I would like to append another VCF file to that MatrixTable.
- Perform an analysis (say, a regression) on the merged dataset and save the result to disk.
- Repeat steps 2 and 3 many times.
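
To make this concrete, here is a rough sketch of one iteration as I currently imagine it. It assumes the "union" in question is Hail's `union_cols`, that the new VCF covers the same variants as the existing data (the default join keeps only shared rows), and that only `GT` is needed. The function name, all paths, and the phenotype table (keyed by sample ID `s`, with a field called `pheno`) are hypothetical placeholders, not anything from my real setup.

```python
def append_vcf_and_regress(mt_path, vcf_path, pheno_path, merged_path, results_path):
    """One iteration: append a VCF, save the merged data, run a regression.

    All arguments are hypothetical paths; merged_path should differ from
    mt_path so that the table being read is not overwritten mid-pipeline.
    """
    # Imported inside the function so this file loads without Hail installed.
    import hail as hl

    # union_cols needs matching entry and column schemas, so both sides are
    # reduced to the GT entry field and to their column key only.
    existing = hl.read_matrix_table(mt_path).select_entries('GT').select_cols()
    new = hl.import_vcf(vcf_path).select_entries('GT').select_cols()

    # Add the new samples as extra columns (inner join on rows by default).
    merged = existing.union_cols(new)
    merged.write(merged_path, overwrite=True)

    # Read the written copy back so the regression does not redo the union.
    merged = hl.read_matrix_table(merged_path)

    # Assumed phenotype table: keyed by sample ID `s`, numeric field `pheno`.
    pheno_ht = hl.read_table(pheno_path)
    merged = merged.annotate_cols(pheno=pheno_ht[merged.s].pheno)

    # Per-variant linear regression of phenotype on alternate allele count.
    results = hl.linear_regression_rows(
        y=merged.pheno,
        x=merged.GT.n_alt_alleles(),
        covariates=[1.0],  # intercept term
    )
    results.write(results_path, overwrite=True)
    return results_path
```

As far as I can tell, each call rewrites the whole merged dataset, so the cost per iteration grows with the number of samples already stored, which is the part I am worried about.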
The dataset will grow over time, since each new VCF file is appended to the MatrixTable.
Would a union be an option, or is there a more efficient solution? Thanks in advance!