Static plotting or dataframe extraction

I have just started investigating the use of Hail for GWAS analysis. I created a Jupyter notebook of your GWAS demo and everything seemed to go well locally. However, when I saved this notebook to my GitHub repository, I noticed that the plots did not render. I see that this has to do with Bokeh’s use of JavaScript to load the images and GitHub’s disabling of JavaScript functionality. I’ve seen some workarounds that suggest loading INLINE, but I have not been able to implement them successfully. In addition, I see no mention of a way to programmatically save or export these plots as PNG when Hail is used outside of a notebook. I do see a save icon, along with others, next to each plot in the notebook. I also see that Bokeh has some ways to save or export plots as PNGs, but they are all failing in their own way.

The interactivity of the Bokeh plots is nice, and I’d like to continue using GitHub to share my notebooks, but I’d also like to have access to the raw plots as images to include in other documents and potential publications.

Is there a way to render and/or save these plots that I’m not seeing?
Can hail have the option to use matplotlib to plot instead of bokeh?
Can these data be extracted from the hail objects and into a dataframe or similar, then plotted with matplotlib?

I’m guessing that the latter is likely the simplest, but I still haven’t figured it out.

Thanks in advance,
Jake

Is there a way to render and/or save these plots that I’m not seeing?

These bokeh functions should work to export plots, I think. How are these failing?

I’m not totally sure how GitHub’s renderer works – the GWAS tutorial in our documentation is generated using nbconvert to export HTML, which seems to work fine.

Can hail have the option to use matplotlib to plot instead of bokeh?

We do intend to build multiple plotting backends, but it’s not a high priority right now. It feels like we really need to build our own plotting dialect which the Hail plotting functionality rests on, then build backends for that in bokeh, plotly, matplotlib, etc.

Can these data be extracted from the hail objects and into a dataframe or similar, then plotted with matplotlib?

That’s probably the first step to building support for multiple backends. We generally do this internally:

Again, not a super high priority to address right now, but pull requests are welcome!
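
In the meantime, a minimal sketch of the extract-then-plot route with matplotlib might look like the following. This is not how Hail draws its own plots. `manhattan_png` and the `rows` layout are made-up names for illustration, and the commented-out Hail lines assume `gwas` is the tutorial’s results table, keyed by locus, with a `p_value` field.

```python
import math

import matplotlib

matplotlib.use("Agg")  # headless backend, so this also works outside a notebook
import matplotlib.pyplot as plt


def manhattan_png(rows, path):
    """Write a basic Manhattan plot to `path` from (contig, position, p_value) tuples."""
    # Drop missing or non-positive p-values, which have no -log10.
    rows = [(c, pos, p) for c, pos, p in rows if p is not None and p > 0]

    # Lay contigs end to end on the x axis, in first-seen order.
    spans = {}
    for contig, pos, _ in rows:
        spans[contig] = max(spans.get(contig, 0), pos)
    offsets, running = {}, 0
    for contig, span in spans.items():
        offsets[contig] = running
        running += span

    fig, ax = plt.subplots(figsize=(10, 4))
    for i, contig in enumerate(spans):
        xs = [offsets[contig] + pos for c, pos, _ in rows if c == contig]
        ys = [-math.log10(p) for c, _, p in rows if c == contig]
        # Alternate two colours so neighbouring contigs are distinguishable.
        ax.scatter(xs, ys, s=4, color=("tab:blue", "tab:orange")[i % 2])
    ax.set_xlabel("position (contigs laid end to end)")
    ax.set_ylabel("-log10(p)")
    fig.savefig(path, dpi=150)
    plt.close(fig)
    return path


# Assumed Hail usage (not run here): collect the fields into plain Python tuples.
# ht = gwas.select(gwas.p_value)
# rows = [(r.locus.contig, r.locus.position, r.p_value) for r in ht.collect()]
# manhattan_png(rows, "manhattan_plot.png")
```

Because the output is a static PNG, it can be embedded in a notebook that GitHub renders without JavaScript, at the cost of the interactivity.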

Thank you for the quick response. It is my understanding that GitHub escapes all JavaScript in the file prior to rendering, so the Bokeh JavaScript never runs and the images never load. Plots that I’ve previously generated with matplotlib are included inline as PNGs in the notebook, so no JavaScript is required to render them. Of course, I don’t believe that they are interactive the way the Bokeh plots are.

“Failing” is not really a fair assessment.

Let’s say

p = hl.plot.manhattan(gwas.p_value)

from the tutorial.

The following is interesting: it produces an HTML page complete with all JavaScript and images encoded in the HTML text. Not terribly helpful for me.

import bokeh.plotting
bokeh.plotting.output_file("all_of_my_plots.HTML")
bokeh.plotting.save(p)

Both of the following styles of image access fail with respect to phantomjs.

from bokeh.io.export import get_screenshot_as_png
from selenium import webdriver
image = get_screenshot_as_png(p, height=100, width=300, driver=webdriver)

yields …

AttributeError: module 'selenium.webdriver' has no attribute 'get'

The above is likely my fault, as I need to better understand what to pass as “driver”.
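
A note on that error: the traceback is consistent with the `selenium.webdriver` module itself having been passed as `driver`, whereas Bokeh expects a WebDriver instance, which is the object that has a `.get` method. A minimal sketch, assuming a recent Bokeh and Selenium plus a local Chrome with a matching chromedriver (none of which this thread confirms); `screenshot_plot` is a hypothetical helper name.

```python
def screenshot_plot(plot, path):
    """Save a Bokeh plot to `path` as PNG via a headless Chrome WebDriver instance."""
    # Imported lazily so that defining the helper does not require a browser stack.
    from bokeh.io.export import get_screenshot_as_png
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # no visible browser window
    driver = webdriver.Chrome(options=options)  # an instance, not the module
    try:
        image = get_screenshot_as_png(plot, driver=driver)  # returns a PIL image
        image.save(path)
    finally:
        driver.quit()  # always release the browser process
    return path


# Assumed usage with the plot from the tutorial:
# screenshot_plot(p, "manhattan_plot.png")
```

Older Bokeh releases looked for PhantomJS instead, so whether this works with the Bokeh version that a given Hail release depends on is an open question.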

And finally, after …

pip install --upgrade --user pillow selenium
sudo port install npm6
sudo npm install -g phantomjs-prebuilt --ignore-scripts

from bokeh.io import export_png
export_png(p, filename="manhattan_plot.png")

it fails with …

RuntimeError: Error encountered in PhantomJS detection: 'internal/validators.js:125\n throw new ERR_INVALID_ARG_TYPE(name, 'string', value);\n ^\n\nTypeError [ERR_INVALID_ARG_TYPE]: The "file" argument must be of type string. Received type object\n at validateString (internal/validators.js:125:11)\n at normalizeSpawnArguments (child_process.js:411:3)\n at spawn (child_process.js:545:16)\n at Object. (/opt/local/lib/node_modules/phantomjs-prebuilt/bin/phantomjs:22:10)\n at Module._compile (internal/modules/cjs/loader.js:776:30)\n at Object.Module._extensions…js (internal/modules/cjs/loader.js:787:10)\n at Module.load (internal/modules/cjs/loader.js:653:32)\n at tryModuleLoad (internal/modules/cjs/loader.js:593:12)\n at Function.Module._load (internal/modules/cjs/loader.js:585:3)\n at Function.Module.runMain (internal/modules/cjs/loader.js:829:12)\n'

Again, “fail” isn’t really accurate. At this point it is more a lack of understanding of the usage on my part.

Eventually, I’ll put down my hammer and figure it out.

Thanks again,
Jake

I think part of the problem is that we pin our bokeh dependency to an ancient version – Hail 0.2.15 (to be released in the next day or two) updates to the latest.

For reference, I also now remember hitting those same issues with bokeh, though things are working now.

Understood. Thank you. I’ll wait until this release before I dig any deeper.

(it’s out)