Genotype difference between two pairs of individuals

Hi there,

I’m interested in using Hail to run a GWAS on the genotype difference between two pairs of individuals. Is there a way to subtract from one person’s allele count at a locus the corresponding allele count of another person at the same locus? I tried to subtract a number from mt.GT.n_alt_alleles(), but it returned SyntaxError: can’t assign to function call. Could you suggest how to do this? Thanks!

That SyntaxError means the code tried to assign to the result of a function call, which Python does not allow. Instead, you can do something like:

mt = mt.annotate_entries(adjusted_n_alt_alleles = mt.GT.n_alt_alleles() - ....)

where you fill in the … with whatever you want to subtract. Then later on you just use mt.adjusted_n_alt_alleles instead of mt.GT.n_alt_alleles(). You’re creating a new entry field that reflects what you want, and then using that in your analysis.

Thanks for the suggestion. However, I encountered another error when doing this. Here is my code:

mt1 = hl.import_plink(bed='person1.bed', bim='person1.bim', fam='person1.fam', quant_pheno=True)
mt2 = hl.import_plink(bed='person2.bed', bim='person2.bim', fam='person2.fam', quant_pheno=True)
mt1 = mt1.annotate_entries(adjusted_n_alt_alleles = mt1.GT.n_alt_alleles() - mt2.GT.n_alt_alleles())

And then the following error showed up:

hail.expr.expressions.base_expression.ExpressionException: Cannot combine expressions from different source objects.
Found fields from 1 objects:
<hail.matrixtable.MatrixTable object at 0x7fdc5b152a20>: ['GT']

May I ask for some suggestions about this? Thanks!

You can’t subtract two entry fields from two different matrix tables – the way to do this is with a join. The syntax below is shorthand for MatrixTable.index_entries, which conceptually does a lookup in mt2 per entry of mt1.

mt1 = hl.import_plink(bed='person1.bed', bim='person1.bim', fam='person1.fam', quant_pheno=True)
mt2 = hl.import_plink(bed='person2.bed', bim='person2.bim', fam='person2.fam', quant_pheno=True)
mt1 = mt1.annotate_entries(adjusted_n_alt_alleles = mt1.GT.n_alt_alleles() - mt2[mt1.row_key, mt1.col_key].GT.n_alt_alleles())
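
As a rough plain-Python analogy of that per-entry lookup (this is not Hail code; the toy keys, the values, and the subtract_by_key helper are all made up for illustration), each entry of the first dataset looks up the entry with the same (variant, sample) key in the second one, and a key with no match gives a missing value:

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy key: (variant ID, sample ID). In Hail the keys would be
# the row key and column key of the matrix table instead.
Key = Tuple[str, str]


def subtract_by_key(
    left: Dict[Key, int], right: Dict[Key, int]
) -> Dict[Key, Optional[int]]:
    """For every entry of `left`, look up the same key in `right` and subtract.

    A key that is absent from `right` yields None, standing in for a
    missing (NA) value.
    """
    result: Dict[Key, Optional[int]] = {}
    for key, n_alt in left.items():
        other = right.get(key)
        result[key] = None if other is None else n_alt - other
    return result


# Toy alt-allele counts for illustration only.
person1 = {("rs1", "A"): 2, ("rs2", "A"): 1}
person2 = {("rs1", "A"): 0, ("rs2", "B"): 1}  # second entry uses another sample ID

diff = subtract_by_key(person1, person2)
print(diff)
```

The entry whose key exists in both dictionaries gets a numeric difference, while the entry whose sample ID differs comes back as None.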

Alright I’ve figured it out following your suggestion, thank you very much!

It returns NA… because the names are not the same in the two matrix tables, so the lookup in mt2 finds no matching entry.
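
One way to check for mismatched names before importing is to compare the sample IDs in the two .fam files directly. The sketch below is plain Python, not part of Hail, and the helper names read_fam_ids and compare_fam_ids are hypothetical. It assumes whitespace-delimited .fam files and assumes the sample ID is the second column (the within-family ID), which is my understanding of what import_plink uses as the column key; please verify that against the Hail documentation for your version.

```python
from pathlib import Path
from typing import Dict, Set


def read_fam_ids(fam_path: str) -> Set[str]:
    """Collect the second column of a whitespace-delimited PLINK .fam file.

    Assumption: the second column (within-family ID) is the sample ID.
    """
    ids: Set[str] = set()
    for line in Path(fam_path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            ids.add(fields[1])
    return ids


def compare_fam_ids(fam1: str, fam2: str) -> Dict[str, Set[str]]:
    """Report which sample IDs are shared and which appear in only one file."""
    ids1, ids2 = read_fam_ids(fam1), read_fam_ids(fam2)
    return {
        "shared": ids1 & ids2,
        "only_first": ids1 - ids2,
        "only_second": ids2 - ids1,
    }
```

Calling compare_fam_ids('person1.fam', 'person2.fam') and finding an empty "shared" set would be consistent with every joined entry coming back as NA. In that case the IDs in one dataset would need to be renamed to match the other before the join.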