Hello,
I am trying to run LD score regression on my GWAS results to assess polygenicity versus confounding in my population. The GWAS was a linear regression over ~8.8 million variants and 900 samples.
I followed the example available at the link below, and I was able to replicate the analysis on one of the 1kg datasets available through Hail, on chromosome 21.
https://ibg.colorado.edu/cdrom2023/session/Day-5b%20Hail%20I,%20Tim%20Poterba/2021_IBG_Hail/05-advanced-hail-functionality.ipynb
import hail as hl

gwas_control_gene = hl.read_table(gwas_results_file)
mt = hl.read_matrix_table(mt_gwas_ready_file)
ht_scores = hl.experimental.ld_score(
    entry_expr=mt.n_alt_alleles,
    locus_expr=mt.locus,
    radius=1e6
)
betas = gwas_control_gene
betas = betas.annotate(z_score = betas.beta / betas.standard_error)
betas = betas.annotate(chi_sq_statistic = betas.z_score ** 2)
ht_results = hl.experimental.ld_score_regression(
    weight_expr=ht_scores[ht.locus].univariate,
    ld_score_expr=ht_scores[ht.locus].univariate,
    chi_sq_exprs=betas[ht.key].chi_sq_statistic,
    n_samples_exprs=betas[ht.key].n
)
However, when I run it on my own GWAS data, I get the results shown below. Could you provide more information on what would cause a NaN result?
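One possibility I can think of is that some rows have an undefined chi-squared statistic, for example where the standard error is zero or the beta is missing. The plain-Python sketch below (not Hail code) mirrors the z-score and chi-squared step above and counts such rows. The helper names and the toy rows are made up for illustration, and I have not confirmed that this is what happens in my data or how Hail itself treats these rows.

```python
import math

def chi_sq_from_beta(beta, standard_error):
    """Return (beta / standard_error) ** 2, or NaN when the ratio is undefined."""
    # Missing inputs cannot produce a usable statistic.
    if beta is None or standard_error is None:
        return math.nan
    # NaN inputs or a zero standard error also leave the ratio undefined.
    if math.isnan(beta) or math.isnan(standard_error) or standard_error == 0:
        return math.nan
    z_score = beta / standard_error
    return z_score ** 2

def count_non_finite(values):
    """Count entries that are NaN or infinite."""
    return sum(1 for v in values if not math.isfinite(v))

# Hypothetical (beta, standard_error) pairs, not taken from my GWAS.
rows = [(0.12, 0.04), (0.30, 0.0), (None, 0.05), (-0.2, 0.1)]

chi_sq = [chi_sq_from_beta(beta, se) for beta, se in rows]
n_bad = count_non_finite(chi_sq)
print(f"{n_bad} of {len(rows)} rows have a non-finite chi-squared statistic")
```

If a check like this finds non-finite values in the real table, filtering those rows out before the regression would be one thing to try.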
Thanks,
Andrew