Hi all, is there a quick way to calculate PVE (percent variance explained) and include it in the result table, besides locus, alleles, beta, p-value, etc?
You should be able to write a result.annotate statement to compute it from the existing statistics:
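As a rough illustration of the arithmetic, here is one commonly used single-variant approximation of PVE, written in plain Python rather than as a Hail expression. The function name pve and its arguments (beta, se, maf, n) are hypothetical, and the sketch assumes a standard error for beta is available or can be derived from the result table, which may not hold for every regression method.

```python
def pve(beta, se, maf, n):
    """Approximate proportion of variance explained by a single variant.

    beta: effect size estimate
    se:   standard error of beta (assumed to be available)
    maf:  minor allele frequency, expected to lie strictly between 0 and 1
    n:    number of samples
    """
    # Variance attributed to the variant under an additive model.
    genetic = 2.0 * beta ** 2 * maf * (1.0 - maf)
    # Remaining variance, scaled by sample size.
    residual = se ** 2 * 2.0 * n * maf * (1.0 - maf)
    total = genetic + residual
    if total == 0.0:
        # Monomorphic variant or degenerate statistics: PVE is undefined.
        return float("nan")
    return genetic / total


# Hypothetical numbers, for illustration only.
example = pve(beta=0.1, se=0.01, maf=0.5, n=1000)
```

The same arithmetic could be written with Hail expressions inside result.annotate once the needed fields are present on the table.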
I finally did something like:
result_lm1 = hl.linear_mixed_regression_rows(data_model1.GT.n_alt_alleles(), lm1)
maf_table = data_model1.annotate_rows(MAF = hl.min(data_model1.variant_qc.AF)).rows()
result_lm1_ann = result_lm1.annotate(MAF = maf_table[result_lm1.locus, result_lm1.alleles].MAF)
etc., but it looks like it takes ages to annotate the results with the MAF. Am I missing something here?
Hail is lazy, which means that we build a list of the operations and do not execute it until you “observe” the value. You can observe values with methods such as show(), collect(), or write().
Because of that, it’s hard to comment on the performance without seeing the full script. Can you share it?
This also means that if you split a matrix table like:
mt1 = mt.annotate(...)
mt2 = mt.annotate(...)
And then try to join them:
mt1 = mt1.annotate_rows(... mt2.rows()[mt1.row_key])
You’ll do all the work to produce mt twice. It looks like you might have this pattern.
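To make the "twice" concrete, here is a toy model in plain Python. It is not how Hail is implemented; Lazy, expensive_source, and calls are hypothetical names used only to show that two lazy branches built from one source each re-run that source when they are joined and observed.

```python
class Lazy:
    """Toy stand-in for a lazy table: it stores a recipe and runs it on collect()."""

    def __init__(self, compute):
        self._compute = compute

    def map(self, f):
        # Record the extra step; nothing is executed yet.
        return Lazy(lambda: [f(x) for x in self._compute()])

    def collect(self):
        # "Observing" the value runs the whole recipe from the start.
        return self._compute()


calls = {"n": 0}


def expensive_source():
    # Counts how many times the upstream work is actually performed.
    calls["n"] += 1
    return [1, 2, 3]


base = Lazy(expensive_source)
a = base.map(lambda x: x + 1)    # first branch
b = base.map(lambda x: x * 10)   # second branch
runs_before = calls["n"]         # still 0: only recipes have been built

# Joining the branches and observing the result replays the source once per branch.
joined = Lazy(lambda: list(zip(a.collect(), b.collect())))
result = joined.collect()
runs_after = calls["n"]          # the source has now run twice
```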
hl.linear_mixed_regression_rows has the pass_through argument to help address this issue.
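A minimal sketch of that pattern, assuming the matrix table already has a variant_qc row field (as in the snippet above) and that the installed Hail version still provides hl.linear_mixed_regression_rows. The helper name lmm_with_maf is hypothetical.

```python
def lmm_with_maf(mt, model):
    """Run the LMM row regression and carry MAF through to the result table.

    mt:    MatrixTable with GT entries and a variant_qc row field (assumed)
    model: a fitted LinearMixedModel, like lm1 above
    """
    # Imported here so that defining the helper does not itself require Hail.
    import hail as hl

    # Compute MAF once, as a row field on the matrix table that is regressed.
    mt = mt.annotate_rows(MAF=hl.min(mt.variant_qc.AF))

    # pass_through copies the named row field into the result table,
    # so no join back against mt.rows() is needed afterwards.
    return hl.linear_mixed_regression_rows(
        mt.GT.n_alt_alleles(), model, pass_through=['MAF'])
```

With MAF already on the result table, later steps can use a plain annotate on that table instead of a keyed join.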
hl.linear_mixed_regression_rows is not as fast as other LMM methods like BOLT-LMM or SAIGE. Have you tried using those instead?
Thanks, I think the pass_through argument is all I need, did not notice that! I can calculate MAF at the start, use the argument to include it in the result table, and then use annotate_rows. Also, I’m impatient, it did not take ages, after all.