GWAS on a subset of UK Biobank

@danking I want to run 79,800 separate GWASes, distributed across our cluster. Which approach do you think is more efficient?

  1. Merge the genotype matrix table with the full phenotype table, then write a script that runs a GWAS on a single annotation (e.g. a single column from the phenotype table).

or

  2. Write a script that subsets the phenotype table on each iteration before merging the phenotype and genotype data, then runs the GWAS.

Either way I’ll need a way to preserve the pheno column index, so I’ll likely write a script that takes a column index as an argument.
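
For the column-index part, here is a rough sketch of what I have in mind. It only covers the bookkeeping: it takes a 0-based column index, looks up the matching phenotype name in the header, and puts the index into the output file name so each result can be traced back to its column. It assumes the phenotype table is a tab-separated file with a header row; the function names and the `gwas_<index>_<name>.tsv` naming scheme are made up for illustration, and the GWAS call itself is left out.

```python
import argparse
import csv


def resolve_phenotype(pheno_path, col_index, delimiter="\t"):
    """Return the header name of the phenotype column at col_index (0-based)."""
    with open(pheno_path, newline="") as f:
        header = next(csv.reader(f, delimiter=delimiter))
    if not 0 <= col_index < len(header):
        raise IndexError(
            f"column index {col_index} is outside 0..{len(header) - 1}"
        )
    return header[col_index]


def output_name(col_index, pheno_name):
    """Build an output file name that keeps the original column index.

    Hypothetical naming scheme; assumes pheno_name is safe to use in a path.
    """
    return f"gwas_{col_index:05d}_{pheno_name}.tsv"


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Run one GWAS for one phenotype column."
    )
    parser.add_argument("pheno_path", help="tab-separated phenotype table")
    parser.add_argument("col_index", type=int, help="0-based column index")
    args = parser.parse_args(argv)

    name = resolve_phenotype(args.pheno_path, args.col_index)
    out = output_name(args.col_index, name)
    # The GWAS for this single phenotype would run here and write to `out`.
    print(out)
    return out
```

Each cluster job would then call `main()` with its own index, for example `main(["pheno.tsv", "42"])`, where `pheno.tsv` is a placeholder path.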