Reading multiple matrix tables

I would like a Hail script that reads the individual per-chromosome matrix tables into one larger matrix table for LD pruning and running pc_relate. I tried using the wildcard chr* but got an error. Is there a way to read in multiple matrix tables?

mt_fn = '/project/adgc/imp.topmed_adsp5k/mt/adgc.aa.imp30r2.topmed_adsp5k.chr*.mt'
mt = hl.read_matrix_table(mt_fn)

Hail version: 0.2.19-c6ec8b76eb26
Error summary: HailException: MatrixTable and Table files are directories; path ‘/project/adgc/imp.topmed_adsp5k/mt/adgc.aa.imp30r2.topmed_adsp5k.chr*.mt’ is not a directory

This is intentional: a MatrixTable is already a composite object (a directory of files), so globbing over several of them shouldn't be a common use case.
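If you still want wildcard-style selection, one option is to expand the pattern yourself before Hail ever sees it. Here is a minimal sketch, assuming the .mt directories live on a local or mounted filesystem that Python's glob module can see (HDFS or cloud storage would need a different listing call). The names expand_mt_paths and chrom_sort_key are made up for illustration and are not part of Hail:

```python
import glob
import re


def chrom_sort_key(path):
    """Sort key: numeric chromosomes in numeric order, then X, Y, and any others."""
    match = re.search(r'chr([0-9]+|[A-Za-z]+)\.mt$', path.rstrip('/'))
    label = match.group(1) if match else ''
    if label.isdigit():
        return (0, int(label), '')
    return (1, 0, label)


def expand_mt_paths(pattern):
    """Expand a wildcard into a chromosome-ordered list of matrix table directories."""
    return sorted(glob.glob(pattern), key=chrom_sort_key)


# Usage (not run here, needs Hail and real data):
# import hail as hl
# mts = [hl.read_matrix_table(p) for p in expand_mt_paths('/path/to/prefix.chr*.mt')]
```

Sorting by chromosome is only a convenience so the tables are combined in a predictable order.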

You can just iterate in Python. First, let's define a helper function that combines the tables pairwise in rounds, which makes the nested union N log N rather than quadratic:

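import hail as hl

def union_rows_all(mts):
    # Per-chromosome tables hold the same samples but different variants,
    # so they are stacked with union_rows (union_cols is for adding samples).
    # Assumes every table has the same columns (samples).
    mts = list(mts)

    iteration = 0
    while len(mts) > 1:
        iteration += 1
        print(f'iteration {iteration}')
        tmp = []
        for i in range(0, len(mts), 2):
            if i + 1 < len(mts):
                tmp.append(mts[i].union_rows(mts[i + 1]))
            else:
                # Odd table out: carry it forward to the next round.
                tmp.append(mts[i])
        mts = tmp
    return mts[0]

And then read and union in Python:

# chr1..chr22 plus X and Y (range(23) would also produce a nonexistent chr0)
files = [f'/path/to/chr{chrom}' for chrom in list(range(1, 23)) + ['X', 'Y']]
mt = union_rows_all([hl.read_matrix_table(file) for file in files])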
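To see how the pairwise rounds behave without touching Hail, here is the same reduction written against plain Python values. This is only an illustrative sketch: tree_reduce and its combine argument are hypothetical names, and strings stand in for matrix tables.

```python
def tree_reduce(items, combine):
    """Combine items pairwise in rounds until one remains; returns (result, rounds)."""
    items = list(items)
    if not items:
        raise ValueError('need at least one item')
    rounds = 0
    while len(items) > 1:
        rounds += 1
        # Pair up neighbours; stop one short so an odd last item is never indexed past.
        paired = [combine(items[i], items[i + 1]) for i in range(0, len(items) - 1, 2)]
        if len(items) % 2 == 1:
            # Odd item out is carried into the next round unchanged.
            paired.append(items[-1])
        items = paired
    return items[0], rounds


chroms = [str(c) for c in range(1, 23)] + ['X', 'Y']
merged, rounds = tree_reduce(chroms, lambda a, b: f'({a}+{b})')
print(rounds)  # 24 -> 12 -> 6 -> 3 -> 2 -> 1, so 5 rounds
```

With Hail, combine would be the pairwise union call, for example lambda a, b: a.union_rows(b) for tables that share samples and differ in variants. As far as I know, union_rows also accepts several tables in a single call, so it is worth checking the docs for your version before reaching for a helper at all.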