Extracting DP into a list for plotting

Hello! I am new to Hail and I would like some advice. I can access the DP from my MatrixTable with this: mt.rows().info.DP.show()
And I will get this:


How can I extract the DP values into a single python list, or any other format that I can use? Thank you so much.

You can do:

all_dp_values = mt.aggregate_rows(hl.agg.collect(mt.info.DP))

Note this is going to take a while and put a LOT of data in memory in Python. Collecting data into Python lists is not generally recommended – what in particular are you trying to do with the DP values? There’s probably a way to do that in Hail.

Thank you for your response! I am trying to plot a histogram of the values. I am not too familiar with all the Hail functions, so I tend to extract the data and work with it in Python or R.

Ah, I see. This will work fine with small data, but what if you want to plot a histogram of all GQ values (an entry/genotype field)? Collecting these as a list in Python is not feasible beyond tiny datasets.

Hail has some plotting functions in hl.plot, like hl.plot.histogram, which make plots for you in ways that can scale to large datasets. These plots are generated using the bokeh library.
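For example, here is a rough sketch of both routes, assuming mt is the MatrixTable from the question and info.DP is a numeric row field. The function names, the 0 to 100 range, and the 50 bins are placeholders I made up for illustration, so adjust them to your data. The first function lets Hail do the binning and hands back a bokeh figure; the second uses hl.agg.hist to bring back only the bin edges and counts, in case you would rather draw the plot yourself in matplotlib or R.

```python
def plot_dp_histogram(mt, max_dp=100, bins=50):
    """Plot a histogram of the row field info.DP without collecting it."""
    # Imported inside the function so the sketch loads without a Hail session.
    import hail as hl
    from bokeh.io import show

    # Hail computes the bin counts at scale; only the counts reach Python.
    p = hl.plot.histogram(mt.info.DP, range=(0, max_dp), bins=bins,
                          legend='DP', title='DP histogram')
    # In a Jupyter notebook, call bokeh.io.output_notebook() once beforehand.
    show(p)
    return p


def dp_bin_counts(mt, max_dp=100, bins=50):
    """Return (bin_edges, bin_freq) for info.DP as plain Python lists."""
    import hail as hl

    # hl.agg.hist returns a struct with bin_edges, bin_freq,
    # n_smaller and n_larger; the raw DP values are never collected.
    hist = mt.aggregate_rows(hl.agg.hist(mt.info.DP, 0, max_dp, bins))
    return list(hist.bin_edges), list(hist.bin_freq)
```

As far as I know, the same hl.plot.histogram call also accepts an entry field such as mt.GQ, which is the case where collecting into a list stops being practical.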