I am using Hail to compute a polygenic score (PGS) in AllofUs, but the computation seems very slow. The AllofUs genotype array has 1,824,517 variants and 165,127 individuals, and I used `hl.agg.sum()` to aggregate a score for each individual:

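```
import os
import hail as hl

# `bucket` is assumed to be defined earlier in the notebook (output bucket path).
mt_array_path = os.getenv("MICROARRAY_HAIL_STORAGE_PATH")
mt_array = hl.read_matrix_table(mt_array_path)

# Generate a random effect size per variant.
mt_array = mt_array.annotate_rows(rand=hl.rand_unif(-1, 1))

# Per-individual score: sum over variants of effect size * alt-allele count.
mt_array = mt_array.annotate_cols(
    score=hl.agg.sum(mt_array.rand * mt_array.GT.n_alt_alleles())
)
mt_array.cols().export(f'{bucket}/test/Score_rand.tsv')
```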
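To be explicit about what this aggregation computes, here is a toy pure-Python sketch of the same per-individual sum. The data and the function name `polygenic_scores` are made up for illustration, and skipping missing genotypes is my understanding of how `hl.agg.sum` treats missing values, not something I have verified on this dataset.

```python
from typing import List, Optional


def polygenic_scores(
    effects: List[float],
    genotypes: List[List[Optional[int]]],
) -> List[float]:
    """Sum effect * alt-allele count over variants, one total per sample.

    `genotypes` is variant-major: genotypes[i][j] is the alt-allele count
    of sample j at variant i, or None if the call is missing.
    """
    n_samples = len(genotypes[0]) if genotypes else 0
    scores = [0.0] * n_samples
    for beta, row in zip(effects, genotypes):
        for j, n_alt in enumerate(row):
            if n_alt is not None:  # assumption: missing calls contribute nothing
                scores[j] += beta * n_alt
    return scores


# Toy data: 3 variants (rows) x 3 samples (columns).
effects = [0.5, -0.25, 1.0]
genotypes = [
    [0, 1, 2],
    [2, None, 0],
    [1, 1, 0],
]
scores = polygenic_scores(effects, genotypes)
print(scores)  # [0.5, 1.5, 1.0]
```

The real job does the same thing over 1,824,517 rows and 165,127 columns, so every genotype entry has to be read once.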

The above computation took more than 30 minutes on a cluster with 2 workers and 10 preemptible workers.

For comparison, I simulated the PGS calculation for a single individual in R on my Mac, using the same number of variants:

```
rand = runif(1824517, -1, 1)
geno = rbinom(1824517, 2, 0.2)
microbenchmark::microbenchmark(sum(rand * geno), times = 100)
```

It took about 10 ms to aggregate the score for one individual, or ~25 minutes for all 165,127 samples.
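The all-samples figure is just the per-sample time multiplied by the sample count. A rough sketch of that arithmetic in Python, where the helper name `extrapolate_minutes` is made up for illustration and 10 ms is the rounded benchmark figure:

```python
# Back-of-envelope extrapolation of the single-sample R benchmark.
N_SAMPLES = 165_127          # individuals in the array data
PER_SAMPLE_SECONDS = 0.010   # ~10 ms per individual (rounded)


def extrapolate_minutes(per_sample_seconds: float, n_samples: int) -> float:
    """Total serial runtime in minutes if every sample costs the same."""
    return per_sample_seconds * n_samples / 60.0


total_minutes = extrapolate_minutes(PER_SAMPLE_SECONDS, N_SAMPLES)
print(f"{total_minutes:.1f} minutes")  # 27.5 minutes
```

With the rounded 10 ms this comes out slightly above the ~25 minutes quoted, but it is the same ballpark: a single laptop core, working on data already in memory, is roughly on par with the Hail cluster run.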

Any suggestions on speeding up PGS calculation would be really appreciated!