Slow speed when using the gnomAD v3 callset

I am trying to build my own ancestry classifier using genotypes from the 1000 Genomes and HGDP datasets, taken from the gnomAD Downloads page (broadinstitute.org). The speed I observe is very slow (for example during LD pruning and filtering) compared to when I analyzed my own dataset. In both cases, I initialize Hail like this:

hl.init(tmp_dir='/scratch/hail',
        local_tmpdir='/scratch/hail',
        master='local[128]',
        spark_conf={'spark.driver.memory': '1800g',
                    'spark.executor.memory': '1800g',
                    'spark.local.dir': '/scratch/hail',
                    'java.io.tmpdir': '/scratch/hail'})

I’ve noticed that the gnomAD dataset contains 50,000 partitions, so I tried to change the partitioning during read_matrix_table, but I didn’t observe a difference. Is there anything else I could try?
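For reference, this is roughly how I changed the partitioning at read time (the path below is a placeholder for the gnomAD MatrixTable I downloaded; `_n_partitions` is the read-time repartitioning argument, and `naive_coalesce` is the shuffle-free alternative I also considered):

```python
import hail as hl

# Placeholder path to the downloaded gnomAD HGDP + 1KG MatrixTable
path = 'hgdp_1kg_subset_dense.mt'

# Repartition at read time instead of keeping the original 50,000 partitions
mt = hl.read_matrix_table(path, _n_partitions=5000)

# Alternative: merge adjacent partitions after reading, without a shuffle
# mt = hl.read_matrix_table(path).naive_coalesce(5000)
```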