Using Genotype Probabilities in logreg from import_vcf

Hi,

I can’t see how to use the genotype probabilities (GP) instead of the hard calls (GT) when importing a VCF. This is what I tried:

ds = hl.import_vcf('22.vcf.bgz', call_fields=["GP"], skip_invalid_loci=True)

Hail version: 0.2-3b08196a75cb
Error summary: HailException: Can only convert a header line with type 'String' to a call type. Found 'Float'.

ds.describe()

Global fields:
    None
Column fields:
    's': str
Row fields:
    'locus': locus
    'alleles': array
    'rsid': str
    'qual': float64
    'filters': set
    'info': struct {
        AC: array,
        AN: int32,
        RefPanelAF: array,
        TYPED: bool,
        INFO: float64
    }
Entry fields:
    'GT': call
    'ADS': array
    'DS': float64
    'GP': array
Column key: ['s']
Row key: ['locus', 'alleles']

I couldn’t find anything about using genotype probabilities in Hail 0.2, help please.

Stephane

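Hail treats call fields (GT, PGT) specially by assigning them the call type. As the error message says, only fields declared as String in the VCF header can be converted to calls, and GP is declared as Float, so it should not be listed in call_fields. Without that argument, GP is imported as a plain array<float64>. Depending on what you want to do downstream, there are functions that can help manipulate it, like computing the dosage: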

https://hail.is/docs/devel/functions/genetics.html?highlight=gp_dosage#hail.expr.functions.gp_dosage
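
To make the dosage concrete, here is a minimal pure-Python sketch of the quantity involved: the expected number of alternate alleles at a biallelic site. It assumes GP is ordered as [P(hom-ref), P(het), P(hom-alt)]. The helper name dosage_from_gp is hypothetical and is not part of Hail; it only illustrates the arithmetic.

```python
def dosage_from_gp(gp):
    """Expected alternate-allele count from biallelic genotype probabilities.

    Assumes gp = [P(hom-ref), P(het), P(hom-alt)]. Hypothetical helper for
    illustration only; not a Hail function.
    """
    if len(gp) != 3:
        raise ValueError("expected exactly 3 genotype probabilities")
    # A het genotype carries 1 alternate allele, a hom-alt genotype carries 2.
    return gp[1] + 2 * gp[2]


# A confident heterozygous call and a confident hom-alt call.
print(dosage_from_gp([0.0, 1.0, 0.0]))
print(dosage_from_gp([0.0, 0.0, 1.0]))
```

The result is a continuous value between 0 and 2, which is why it can stand in for the hard call as the predictor in a regression.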

That does sound like what I’m looking for.
How do I include the gp dosage in logreg, something like this?

hl.logistic_regression_rows(test='wald', y=ds.is_case, x=hl.gp_dosage(ds.GP), covariates=[1, ds.is_female])

yes, that looks good