UKBiobank chromosome XY

tpoterba · December 7, 2018, 5:17pm

This looks like two related bugs we’ve fixed:

Try an update?

hhx037 · December 10, 2018, 10:21am

OK, every time I update things go wrong and I end up re-installing the whole thing, so I must be doing something wrong…

git clone https://github.com/hail-is/hail.git
cd hail/hail
./gradlew -Dspark.version=2.2.0 shadowJar archiveZip
cd …

That compiles, and then I copy hail over to where the previous version is located: cp -R hail /usr/local/
After that, it doesn’t work anymore. What am I missing?

tpoterba · December 10, 2018, 10:34am

it doesn’t work anymore

Can you elaborate?

hhx037 · December 10, 2018, 11:04am

I get back into python and it doesn’t find Hail.
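When Python "doesn't find Hail" after a rebuild, it can help to check which interpreter is running and where, if anywhere, it would import `hail` from. The following is a minimal diagnostic sketch using only the standard library; `locate_module` is a hypothetical helper name, not part of Hail, and the environment variables printed are only the ones commonly involved in a from-source setup.

```python
import importlib.util
import os
import sys


def locate_module(name):
    """Return the file Python would import `name` from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Which interpreter is running (system Python vs. a conda environment)?
print("interpreter:", sys.executable)
# Environment variables that a from-source Hail install typically relies on.
print("HAIL_HOME:", os.environ.get("HAIL_HOME"))
print("PYTHONPATH:", os.environ.get("PYTHONPATH"))
# None here means the copied build is not on this interpreter's import path.
print("hail found at:", locate_module("hail"))
```

If the last line prints `None`, or a path under the old install, the interpreter is not picking up the copied build.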

hhx037 · December 10, 2018, 11:12am

OK, never mind, now it works!
Before I wasn’t using Conda, I think that was the issue.

hhx037 · December 10, 2018, 11:29am

Or does it… I’m still getting the same error

After copying the new Hail over, I updated the conda environment as follows:
conda-env update -n hail -f $HAIL_HOME/python/hail/environment.yml

After attempting the logistic regression on X:

Hail version: 0.2.5-b9537d16564d
Error summary: AssertionError: assertion failed: is_female not in struct{__y: float64, __cov0: float64, __cov1: float64, __cov2: float64, __cov3: float64, __cov4: float64, __cov5: float64, __cov6: float64, __cov7: float64, __cov8: float64}

tpoterba · December 10, 2018, 11:45am

ok, must be something different – can you give us the full stack trace and the pipeline that replicates it?

hhx037 · December 10, 2018, 12:07pm

Sure, here is the script, will send you the log:

import hail as hl
import hail.expr.aggregators as agg
hl.init()
from pprint import pprint
from bokeh.io import output_notebook, show, export_png
from bokeh.layouts import gridplot
from bokeh.models import Span

import os

# Load the QCed, annotated matrix table.
ds = hl.read_matrix_table('/mnt/output/sb/V/M/imputed_genotypes/HRC.vcfs/HRT_QCed_annotated_final.mt')

# Split the contigs of the dataset's reference genome into autosomes and X.
rg = ds.locus.dtype.reference_genome
x_contigs = set(rg.x_contigs)
y_contigs = set(rg.y_contigs)
autosomes = [c for c in rg.contigs if c not in x_contigs and c not in y_contigs]

mt_auto = hl.filter_intervals(ds, [hl.parse_locus_interval(c, rg) for c in autosomes])
mt_x = hl.filter_intervals(ds, [hl.parse_locus_interval(c, rg) for c in x_contigs])

# Dosage on X: standard GP dosage for females and PAR loci, otherwise the default branch.
x_chr_var = (
    hl.case()
    .when(mt_x.is_female | mt_x.locus.in_x_par(), hl.gp_dosage(mt_x.GP))
    .default(hl.sum(mt_x.GP * [0, 2]))
)

# Wald logistic regression on X.
gwas_x = hl.logistic_regression_rows(
    x=x_chr_var,
    y=mt_x.pheno_case,
    covariates=[1, mt_x.is_female, mt_x.age, mt_x.weight,
                mt_x.PC1, mt_x.PC2, mt_x.PC3, mt_x.PC4, mt_x.PC5],
    test='wald',
    pass_through=[mt_x.rsid, mt_x.variant_qc, mt_x.EA, mt_x.NEA, mt_x.EAF],
)
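For readers following the `x_chr_var` expression, here is a plain-Python sketch of the per-genotype arithmetic it describes, with no Hail dependency. It assumes that `hl.gp_dosage` computes `GP[1] + 2 * GP[2]` on a three-element array, and it treats `GP * [0, 2]` as an elementwise product, which in turn assumes `GP` has two entries in that branch. The function names are hypothetical and this is not a reproduction of the reported error.

```python
def gp_dosage(gp):
    """Expected alt-allele dosage from three genotype probabilities."""
    if len(gp) != 3:
        raise ValueError("gp_dosage expects exactly three probabilities")
    return gp[1] + 2 * gp[2]


def weighted_sum(gp, weights):
    """Sum of the elementwise product; the two lengths must match."""
    if len(gp) != len(weights):
        raise ValueError("gp and weights must have the same length")
    return sum(g * w for g, w in zip(gp, weights))


def x_dosage(gp, is_female, in_x_par):
    """Mirror the case/when/default branching of x_chr_var in the script."""
    if is_female or in_x_par:
        return gp_dosage(gp)
    # Default branch: weights [0, 2] taken from the script as posted.
    return weighted_sum(gp, [0, 2])


print(x_dosage([0.25, 0.5, 0.25], is_female=True, in_x_par=False))
print(x_dosage([0.5, 0.5], is_female=False, in_x_par=False))
```

Note that under the elementwise reading, a three-element `GP` reaching the default branch would fail the length check, so it may be worth confirming the shape of `GP` for male samples outside the PAR.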

tpoterba · December 10, 2018, 7:20pm

ok, I’m pretty baffled.

Could you put the following above the last line and paste the output here (feel free to edit names of fields):

print(mt_x._jmir.typ().colType().parsableString())

tpoterba · December 10, 2018, 7:28pm

also print(mt_x.col.dtype)

tpoterba · December 10, 2018, 7:30pm

ah wait I might see an issue…

tpoterba · December 10, 2018, 7:32pm

can you also send the Python stack trace in an email?

tpoterba · December 10, 2018, 7:32pm

(or here)
