Identity by Descent Estimation

danking · September 7, 2016, 9:12pm

The following is the proposed documentation for the identity-by-descent estimation command. I’m not sure how to expose the output. A sample-by-sample matrix is very large, so we probably should not store it in the global annotations. If the data were sharded by sample, we could store each row with its sample.

I’m not yet familiar with the vocabulary of the community, so someone should check that I’ve appropriately warned about bi-allelic data and LD pruning.

`ibd`

Computes an estimate of identity by descent for each pair of samples. Conceptually, this command’s output is a symmetric, sample-by-sample matrix. The implementation is based on the IBD algorithm described in the PLINK paper.

This command assumes the dataset is bi-allelic. It does not perform LD pruning, so variants in linkage disequilibrium may bias the results.
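For readers unfamiliar with the PLINK approach, here is a minimal sketch of the method-of-moments estimate for a single pair of samples. It is not the command's actual implementation: it assumes bi-allelic genotypes encoded as alternate-allele counts (0, 1, 2, or `None` when missing), uses the simple Hardy-Weinberg expectations, and omits PLINK's finite-sample bias correction and its adjustments for estimates that fall outside [0, 1]. The function name `estimate_ibd` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def estimate_ibd(
    g1: Sequence[Optional[int]],
    g2: Sequence[Optional[int]],
    freqs: Sequence[float],
) -> Tuple[float, float, float, float]:
    """Return (Z0, Z1, Z2, pi_hat) for one pair of samples.

    g1, g2: alternate-allele counts per variant (0, 1, 2, or None if missing).
    freqs:  allele frequency p per variant (assumed 0 < p < 1).
    """
    n = [0.0, 0.0, 0.0]  # observed counts of IBS0, IBS1, IBS2 variants
    # Expected IBS counts conditional on the IBD state, summed over variants.
    e00 = e10 = e20 = 0.0  # E[IBS0|IBD0], E[IBS1|IBD0], E[IBS2|IBD0]
    e11 = e21 = 0.0        # E[IBS1|IBD1], E[IBS2|IBD1]
    e22 = 0.0              # E[IBS2|IBD2]

    for a, b, p in zip(g1, g2, freqs):
        if a is None or b is None:
            continue  # skip variants where either genotype is missing
        q = 1.0 - p
        n[2 - abs(a - b)] += 1.0  # shared alleles identical by state
        e00 += 2 * p**2 * q**2
        e10 += 4 * p**3 * q + 4 * p * q**3
        e20 += p**4 + 4 * p**2 * q**2 + q**4
        e11 += 2 * p**2 * q + 2 * p * q**2
        e21 += p**3 + p**2 * q + p * q**2 + q**3
        e22 += 1.0

    if e00 == 0.0 or e11 == 0.0:
        raise ValueError("no informative variants shared by this pair")

    # Solve the moment equations from IBS0 upward.
    z0 = n[0] / e00
    z1 = (n[1] - z0 * e10) / e11
    z2 = (n[2] - z0 * e20 - z1 * e21) / e22
    pi_hat = z1 / 2 + z2  # estimated proportion of the genome shared IBD
    return z0, z1, z2, pi_hat
```

Running this over every pair of samples would fill the symmetric matrix described above. Because the expectations are symmetric in p and 1 - p, it does not matter in this sketch whether `freqs` holds the minor or the alternate allele frequency.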

Usage

-m | --minor-allele-frequency <expr>: a Hail language expression for the minor allele frequency of the given variant, v. You may also access the variant annotations, va. The expression is evaluated once for each variant. If no expression is given, the minor allele frequency is calculated from the dataset.
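As a rough illustration of the fallback when no expression is given, the following sketch computes a minor allele frequency from one variant's called genotypes. This is an assumption about what "calculated from the dataset" could mean, not a description of the command's internals. Genotypes are assumed to be alternate-allele counts (0, 1, 2, or `None` when missing), and the function name is hypothetical.

```python
from typing import Optional, Sequence


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> Optional[float]:
    """Estimate the MAF of one bi-allelic variant from its called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no called genotypes, so the frequency is undefined
    # Each diploid genotype contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)
```

A variant with no called genotypes yields `None` here, which is the analogue of an expression evaluating to NA.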

Examples

... ibd --minor-allele-frequency 'va.mafs[v]'

danking · September 7, 2016, 9:47pm

If the --minor-allele-frequency expression evaluates to NA, I currently trigger an error and ask the user to fix the expression. Is this a sensible response?
