Adding gene name and rs id to Manhattan plot

Hello, is there a way to annotate my Manhattan plot with gene_name and rsid? I have been trying to figure it out but have not gotten anywhere. Thank you.

I have the same requirement. Did you figure it out?

You can pass a dictionary of the fields you would like to show when hovering over data points to the hover_fields parameter of hl.plot.manhattan. To get both rsID and gene name, you could annotate the rows of your dataset with the dbSNP table that is available via the Hail annotation database.

Here’s an example that assumes a matrix table mt prepared as in the Hail GWAS tutorial, from which we produce the Hail table of GWAS results that we want to plot.

import hail as hl

# Run GWAS (mt is assumed to be the matrix table from the Hail GWAS tutorial)
gwas = hl.linear_regression_rows(
    y=mt.pheno.CaffeineConsumption,
    x=mt.GT.n_alt_alleles(),
    covariates=[1.0, mt.pheno.isFemale, mt.scores[0], mt.scores[1], mt.scores[2]]
)
# Create annotation DB instance and annotate GWAS results table with dbSNP dataset
db = hl.experimental.DB(region='us', cloud='gcp')
gwas = db.annotate_rows_db(gwas, 'dbSNP')
# Extract just the rsID and GENEINFO fields, then drop rest of the dbSNP fields
gwas = gwas.annotate(rsid=gwas.dbSNP.rsid, gene_info=gwas.dbSNP.info.GENEINFO)
gwas = gwas.drop('dbSNP')

Then you can create a Manhattan plot with hover fields for rsid and gene_info:

from bokeh.io import show, output_notebook
# Render Bokeh plots inline in the notebook
output_notebook()

hover_fields = {'rsid': gwas.rsid, 'gene_info': gwas.gene_info}
p = hl.plot.manhattan(gwas.p_value, hover_fields=hover_fields)

show(p)
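
One follow-up on the gene name part: in dbSNP VCFs the GENEINFO value packs gene symbol and gene ID pairs into a single string, with a colon between symbol and ID and a vertical bar between pairs, so the hover text shows that raw string and not a bare gene name. Below is a minimal sketch in plain Python of pulling out just the symbols. The helper name gene_symbols and the sample values are hypothetical and for illustration only, and it assumes the field follows that symbol:id|symbol:id layout.

```python
def gene_symbols(geneinfo):
    """Return the gene symbols from a dbSNP-style GENEINFO string.

    Assumes the 'SYMBOL:ID|SYMBOL:ID' layout; a missing or empty
    value yields an empty list.
    """
    if not geneinfo:
        return []
    symbols = []
    for pair in geneinfo.split("|"):
        # Keep only the part before the first colon (the gene symbol)
        symbol = pair.split(":", 1)[0].strip()
        if symbol:
            symbols.append(symbol)
    return symbols


# Hypothetical GENEINFO value, not taken from a real dbSNP record
print(gene_symbols("GENE_A:1|GENE_B:2"))  # ['GENE_A', 'GENE_B']
```

The same split can probably be expressed on the Hail side with string expression methods such as split before passing the field to hover_fields, but treat that as an untested suggestion.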