Running ld_matrix on multiallelic variants

KatalinaBobowik · February 7, 2022, 2:27am

Hi,

I’m trying to calculate LD for all variants within a dataset using ld_matrix() (see the script here). The dataset contains multiallelic variants that have been split using split_multi_hts(), and I’m curious whether it’s appropriate to run ld_matrix on variants that have been split. For example, is there any situation where two common, independent alleles might interfere with one another in the LD calculation?

Thanks in advance!

pwc2 · February 7, 2022, 5:16pm

Hey @KatalinaBobowik!

My first thought would be that it would make more sense to only include biallelic variants, since the ld_matrix method is just computing the windowed pairwise correlation between variants.
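To make the "pairwise correlation" point concrete, here is a minimal pure-Python sketch, not Hail's implementation. It assumes that after splitting, each row of a formerly multiallelic site carries the dosage of its own alternate allele only, with the other alternate alleles counted as reference. The toy genotypes and the helper names `split_dosages` and `pearson_r` are made up for illustration.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data: diploid calls at ONE triallelic site,
# written as pairs of allele indices (0 = ref, 1 and 2 = alt alleles).
calls = [(0, 1), (1, 2), (0, 0), (2, 2), (0, 2), (1, 1), (0, 0), (0, 1)]


def split_dosages(calls, allele):
    """Dosage of a single alt allele, treating every other allele as ref."""
    return [pair.count(allele) for pair in calls]


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length dosage vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


dosage_alt1 = split_dosages(calls, 1)  # row for alt allele 1 after splitting
dosage_alt2 = split_dosages(calls, 2)  # row for alt allele 2 after splitting

r = pearson_r(dosage_alt1, dosage_alt2)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

In this toy example the two split rows come out negatively correlated. That is expected, because each chromosome can carry at most one allele at the site, so a sample with more copies of one alternate allele has less room for the other. A nonzero entry between split rows of the same site may therefore reflect this constraint rather than LD between distinct loci.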

I was just looking around a bit, and on the PLINK 2.0 linkage disequilibrium page it mentions:

Since two-variant r² only makes sense for biallelic variants, these collapse multiallelic variants down to most common allele vs. the rest.
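One way to read "most common allele vs. the rest" is sketched below in plain Python. This is only an interpretation of the quoted sentence, not PLINK's actual code, and the toy genotypes and the function name `collapse_to_major_vs_rest` are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: diploid calls at one triallelic site,
# written as pairs of allele indices (0 = ref, 1 and 2 = alt alleles).
calls = [(0, 1), (1, 2), (0, 0), (2, 2), (0, 2), (1, 1), (0, 0), (0, 1)]


def collapse_to_major_vs_rest(calls):
    """Recode a multiallelic site as biallelic: most common allele vs. all others.

    Returns the index of the most common allele and, per sample, the number
    of chromosomes that do NOT carry that allele (0, 1, or 2).
    """
    allele_counts = Counter(allele for pair in calls for allele in pair)
    major, _ = allele_counts.most_common(1)[0]
    dosages = [sum(1 for allele in pair if allele != major) for pair in calls]
    return major, dosages


major, collapsed = collapse_to_major_vs_rest(calls)
print("most common allele:", major)
print("collapsed dosages:", collapsed)
```

The collapsed vector gives a single biallelic-style row per site, so the site contributes one entry to a pairwise r² calculation instead of one per alternate allele.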

And this paper, “eLD: entropy-based linkage disequilibrium index between multiallelic sites”, makes it look like including multiallelic sites in LD calculations is a bit more involved. The abstract also states:

Commonly used LD indices such as r² handle LD of biallelic variants for two sites.

I’m not 100% sure here, though, so it may be worth running ld_matrix twice, once on just the biallelic variants and once on the biallelic variants plus the split multiallelic variants, and comparing the results.

KatalinaBobowik · February 7, 2022, 11:57pm

Thanks @pwc2, those are great resources and a very helpful approach. I’ll test running ld_matrix on biallelic variants only and then compare that to the results with multiallelic variants included.

Thanks again!
