Hello, is there a way to annotate my Manhattan plot with gene_name and rsid? I have been trying to figure it out but have not gotten anywhere. Thank you.
I have the same requirement. Did you figure it out?
You can pass a dictionary of fields that you would like to show up when hovering over data points to hover_fields when plotting. To get both rsID and gene name, you could annotate the rows of your dataset with the dbSNP table that is available via the Hail annotation database.
Here’s an example assuming we already have a Hail table with GWAS results that we want to plot, like in the Hail GWAS tutorial.
# Run GWAS
gwas = hl.linear_regression_rows(
    y=mt.pheno.CaffeineConsumption,
    x=mt.GT.n_alt_alleles(),
    covariates=[1.0, mt.pheno.isFemale, mt.scores[0], mt.scores[1], mt.scores[2]]
)
# Create annotation DB instance and annotate GWAS results table with dbSNP dataset
db = hl.experimental.DB(region='us', cloud='gcp')
gwas = db.annotate_rows_db(gwas, 'dbSNP')
# Extract just the rsID and GENEINFO fields, then drop rest of the dbSNP fields
gwas = gwas.annotate(rsid=gwas.dbSNP.rsid, gene_info=gwas.dbSNP.info.GENEINFO)
gwas = gwas.drop('dbSNP')
Then you can create a Manhattan plot with hover fields for rsid and gene_info with
from bokeh.io import show, output_notebook
# Render Bokeh plots inline in the notebook
output_notebook()
hover_fields = {'rsid': gwas.rsid, 'gene_info': gwas.gene_info}
p = hl.plot.manhattan(gwas.p_value, hover_fields=hover_fields)
show(p)
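One note on gene names: in dbSNP VCF files, GENEINFO is a string of gene symbol:gene ID pairs separated by a vertical bar, so the hover text will show something like DDX11L1:100287102|WASH7P:653635 rather than a bare gene name. If you only want the symbols, here is a rough plain-Python sketch of the parsing. The gene_symbols helper is my own hypothetical name, not part of Hail, and the input string is only an illustrative value; inside a Hail pipeline you would need to express the same split with Hail expressions before plotting.

```python
def gene_symbols(geneinfo):
    """Return the gene symbols from a dbSNP-style GENEINFO string.

    Assumes the 'SYMBOL:ID|SYMBOL:ID' layout; returns an empty list for
    missing (None) or empty values.
    """
    if not geneinfo:
        return []
    symbols = []
    for pair in geneinfo.split('|'):
        # Keep only the part before the first colon (the gene symbol)
        symbol, _, _gene_id = pair.partition(':')
        if symbol:
            symbols.append(symbol)
    return symbols


# Illustrative input only
print(gene_symbols('DDX11L1:100287102|WASH7P:653635'))
```

Joining the result with ', '.join(...) gives a compact label for a single variant that overlaps several genes.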