Negating logical operators

Is there a way in Hail, within filtering operations, to combine positive and negative logical statements?

e.g. (matrix.X.contains(a_value)) !& (matrix.Y.contains(another_value))

An example here is filtering out clinvar benign variants from a data set following VEP annotation. I couldn’t find filtering conditions that also allowed a row to be kept if the clinvar annotations were entirely missing values. The logic I would like to use looks like:

matrix = matrix.filter_rows((not hl.is_missing(clinvar))
& (clinvar.contains(benign) & (clinvar.stars > 0)), keep=False)

As there is missing data present in the clinvar annotations, I had to use a replacement step to make sure the logical operations on the values were carried out correctly:

clinvar = hl.if_else(hl.is_missing(clinvar), hl.str(''), clinvar)

I’ve tried using a few different variations:

  • ‘not’ operator
  • !(logical condition)
  • !& operator (I’m not sure if this exists in python at all…)

I think I’ve checked through all the documentation and didn’t find anything that was a good fit. I don’t mind using the replacement method, but I have quite a few similar conditions in mind, and doubling the number of operations instead of offering an alternative path in the logic is reasonably expensive.

Please let me know if something like this exists in the current syntax!

As a side note, is this a bug in the query syntax parsing? In an example MatrixTable ‘m2’ here, the clinvar values are all missing strings (displayed value ‘NA’). Both the keep=True and keep=False forms of this query filter out all rows:

.contains('benign'), keep=False).rows().show()



I’m not sure if this is because the specific query against missing values creates an error state… hl.or_else can be used within the query to replace the missing string with an empty one, but intuitively it feels like this adds extra computation.

The negation operator in Hail is ~, as in ~hl.is_missing(clinvar). Likewise, use & and | in place of and and or. This is the best we can do, because Python’s and, or, and not keywords all require actual booleans; they don’t work with Hail’s BooleanExpression. (Python has no ! or !& operator at all.)
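To illustrate why the keyword operators cannot work on a lazy expression, here is a minimal pure-Python sketch. The BoolExpr class is a hypothetical stand-in written for this example, not Hail's actual BooleanExpression; it only shows which Python hooks each operator goes through.

```python
class BoolExpr:
    """Hypothetical stand-in for a lazy boolean expression (not Hail's class)."""

    def __init__(self, text):
        self.text = text

    def __invert__(self):
        # Python calls this for ~x, so a library can return a new expression.
        return BoolExpr(f"(!{self.text})")

    def __and__(self, other):
        # Python calls this for x & y.
        return BoolExpr(f"({self.text} && {other.text})")

    def __or__(self, other):
        # Python calls this for x | y.
        return BoolExpr(f"({self.text} || {other.text})")

    def __bool__(self):
        # `not x`, `x and y`, `x or y` and `if x` all need a real bool right
        # away, which a lazy expression cannot provide.
        raise TypeError("lazy expression has no truth value; use ~, & and |")


is_missing = BoolExpr("is_missing(clinvar)")
is_benign = BoolExpr("clinvar.contains('benign')")

# ~ binds tighter than &, so this parses as (~is_missing) & is_benign.
combined = ~is_missing & is_benign
print(combined.text)  # ((!is_missing(clinvar)) && clinvar.contains('benign'))

try:
    not is_missing
except TypeError as err:
    print("not failed:", err)
```

One practical consequence in Python generally: & and | bind more tightly than comparisons such as > and ==, so a comparison used as an operand needs its own parentheses, as in (clinvar.stars > 0).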

For your second question: rows for which the filter expression evaluates to missing are always filtered out, regardless of keep. See the warning here: Hail | MatrixTable
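The behaviour in the m2 example can be modelled in plain Python. This is a toy sketch, not Hail code: None stands in for a missing value, and filter_rows, contains_benign and or_default are hypothetical helpers written for this illustration.

```python
def filter_rows(rows, predicate, keep=True):
    """Toy filter: a missing (None) predicate result removes the row either way."""
    kept = []
    for row in rows:
        result = predicate(row)  # True, False, or None (missing)
        if result is None:
            continue             # dropped for keep=True and for keep=False
        if result == keep:
            kept.append(row)
    return kept


def contains_benign(row):
    # A method called on a missing string yields missing, not False.
    sig = row["clinical_significance"]
    return None if sig is None else "benign" in sig


def or_default(predicate, default):
    """Wrap a predicate so that missing results become `default`."""
    def wrapped(row):
        result = predicate(row)
        return default if result is None else result
    return wrapped


all_missing = [
    {"id": 1, "clinical_significance": None},
    {"id": 2, "clinical_significance": None},
]

dropped_keep_true = filter_rows(all_missing, contains_benign, keep=True)
dropped_keep_false = filter_rows(all_missing, contains_benign, keep=False)
rescued = filter_rows(all_missing, or_default(contains_benign, False), keep=False)

print(len(dropped_keep_true), len(dropped_keep_false), len(rescued))  # 0 0 2
```

Both plain calls return nothing because every predicate result is missing; only the version that substitutes a default for missing results keeps the rows.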

Use hl.coalesce(m2.clinvar.clinical_significance.contains('benign'), True) if you want a missing result to be treated as True, or pass False if you’d prefer.
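On the original question of combining a "not missing" check with the benign check: if the & operator follows three-valued (Kleene) logic, so that False combined with missing gives False, then putting a definedness check first may avoid a separate replacement step. That is an assumption about Hail's behaviour and worth verifying on a small table. The sketch below models only the logic in plain Python; kleene_and, is_defined, contains and is_benign are hypothetical helpers, with None standing in for missing.

```python
def kleene_and(a, b):
    """Three-valued AND: False wins over missing (None); otherwise missing spreads."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def is_defined(value):
    # Always a real boolean, even for a missing input.
    return value is not None


def contains(value, needle):
    # Missing input gives a missing result.
    return None if value is None else needle in value


def is_benign(clinvar):
    # Models: is_defined(clinvar) & clinvar.contains("benign")
    return kleene_and(is_defined(clinvar), contains(clinvar, "benign"))


results = {
    label: is_benign(value)
    for label, value in [
        ("missing", None),
        ("benign", "benign"),
        ("pathogenic", "pathogenic"),
    ]
}
print(results)
```

Under this model the missing annotation produces False rather than missing, so a filter with keep=False would retain that row while still removing the benign one.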