How to generate Manhattan plot for gene(burden) hail matrix table after regression analysis?

Abhishek · September 26, 2021, 3:41pm

I have a gene(burden) hail matrix table where I condense all variants into genes for easier computation.

mt = mt.annotate_entries(
            mac_adj=hl.if_else(
                mt.variant_qc.AF[1] <= 0.5,
                mt.GT.n_alt_alleles(),
                2 - mt.GT.n_alt_alleles()
            )
        )
gene_mt = mt.group_rows_by(mt.symbol, mt.feature).aggregate(
            mac=hl.agg.sum(mt.mac_adj),
            mac_HC=hl.agg.sum(hl.if_else(mt.lof == 'HC', mt.mac_adj, 0))

However, group_rows_by doesn’t have option of annotating with locus attribute.
Later on, when I do regression analysis on matrix table and get the result, I am unable to plot Manhattan plot as it requires locus attribute.

AttributeError: Table instance has no field, method, or property ‘locus’
Hint: use ‘describe()’ to show the names of all data fields.

How do you get Manhattan plot for gene matrix table based analysis?

PS: QQ plot is working fine

kumarveerapen · September 26, 2021, 6:17pm

Hi, @Abhishek . Usually with gene burden tests, we describe our results with a table than with a Manhattan plot. If you do want to create a Manhattan plot, you could perhaps use chr#:start-position for each gene for visualization.

Abhishek · September 26, 2021, 6:21pm

Hi Veerapen,

Yes, I am generating a table separately as well for the results.

Can you provide me the hail code to generate chr#:start-position for each gene as a column, so I can annotate it in my final hail table(results) or gene hail matrix table?

Abhishek · September 27, 2021, 5:28am

I came up with this(which does not seem to be the optimized version but it works)

var_t = mt.rows()
var_t = var_t.group_by(vep_symbol=var_t.symbol, vep_feature=var_t.feature).aggregate(
            num_variants=hl.agg.count(),
            gene_contig=hl.agg.take(var_t.locus.contig, 1)[0],
            gene_pos=hl.agg.min(var_t.locus.position),
            gene_ref=hl.agg.take(var_t.locus.dtype.reference_genome.name, 1)[0]
            )

gene_mt = mt.group_rows_by(mt.symbol, mt.feature).aggregate(
            mac=hl.agg.sum(mt.mac_adj)
            )

gene_mt = gene_mt.annotate_rows(
            locus=hl.locus(var_t[gene_mt.row_key].gene_contig, var_t[gene_mt.row_key].gene_pos,
                           var_t[gene_mt.row_key].gene_ref.take(1)[0])
        )

tpoterba · September 27, 2021, 12:30pm

I think your last approach works, but I’ll say a bit more info about why a manhattan plot isn’t really appropriate here. We plot GWAS results using genomic coordinate as the x-axis and see “skyscrapers” in the results because linkage disequilibrium results in correlated genotypes and therefore correlated p-values in nearby sites. If you plot results per gene in the same way (using start locus or end locus or whatever), you won’t see any of the same structure because the results for nearby genes should not be correlated.

Abhishek · September 28, 2021, 5:49am

What will be a better plot for gene matrix table downstream analysis after regression?

kumarveerapen · September 28, 2021, 2:36pm

Perhaps a volcano plot where the x-axis would be the effect size (log odds ratio) and y-axis being the -log10(p-value). Was inspired by a presentation by @Siwei this AM who used a similar visualization to present her gene-based test results today.

Abhishek · September 28, 2021, 5:19pm

It sounds like a good plot. Do you have a reference image or link to the presentation by @Siwei ?
I would like to run through by my team.

Topic		Replies	Views
Adding gene name and rs id to Manhattan plot Hail Query & hailctl	2	644	September 27, 2021
How to plot a value vs variant location Hail Query & hailctl	4	390	October 24, 2022
Manhattan plot not showing up Hail Query & hailctl	5	24	April 15, 2025
Aggregate GT over rows Hail Query & hailctl	2	206	January 9, 2024
Errors in manhattan plot Hail Query & hailctl	6	435	October 29, 2020

How to generate Manhattan plot for gene(burden) hail matrix table after regression analysis?

Related topics