@danking I want to run 79,800 separate GWASes, distributed across our cluster. Which approach do you think is more efficient?
- Merge the geno matrix table with the full pheno table, then write a script that runs a GWAS on a single annotation (e.g., a single column from the pheno table)
or
- Write a script that subsets the pheno table for each iteration before merging pheno and geno, then runs the GWAS
Either way I’ll need a way to preserve the pheno column index, so I’ll likely write a script that takes a column index as an argument.
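
For the column-index part, here is a minimal sketch of what such a script could look like. It assumes the pheno table is a tab-separated file whose first column is the sample ID; that layout, the 0-based indexing, and every name in it (`pheno_column_name`, `read_header`, `main`) are hypothetical. It only resolves the index to a column name so the two stay paired in the output; the GWAS step itself is left out.

```python
import argparse
import csv


def pheno_column_name(header, pheno_index, n_key_columns=1):
    """Map a 0-based phenotype index to its column name.

    The first `n_key_columns` entries of the header (assumed here to be the
    sample ID) are skipped, so index 0 is the first phenotype column.
    """
    pheno_columns = header[n_key_columns:]
    if not 0 <= pheno_index < len(pheno_columns):
        raise IndexError(
            f"pheno index {pheno_index} out of range (0..{len(pheno_columns) - 1})"
        )
    return pheno_columns[pheno_index]


def read_header(path, delimiter="\t"):
    """Read only the header row, so the full pheno table is never loaded here."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle, delimiter=delimiter))


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Resolve a pheno column index to its column name."
    )
    parser.add_argument("pheno_path", help="tab-separated pheno table (assumed layout)")
    parser.add_argument("pheno_index", type=int, help="0-based phenotype column index")
    args = parser.parse_args(argv)

    name = pheno_column_name(read_header(args.pheno_path), args.pheno_index)
    # Emit index and name together so the index travels with that GWAS's results.
    print(f"{args.pheno_index}\t{name}")
    return name

# In a real script, call main() under the usual `if __name__ == "__main__":` guard.
```

Each cluster job would then get its own index, e.g. `main(["phenos.tsv", "42"])`, and the printed index/name pair could be written alongside that job's results.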