Pass default for hail expressions in function

I’m working on a function that accepts hail expressions for some of the parameters:

from typing import Union

import hail as hl

def filter_to_clinvar_pathogenic(
    t: Union[hl.MatrixTable, hl.Table],
    clnrevstat_expr: hl.ArrayExpression,
    clnsig_expr: hl.ArrayExpression,
    clnsigconf_expr: hl.ArrayExpression,
    remove_no_assertion: bool = True,
    remove_conflicting: bool = True,
):
    ...

Is it possible to pass in default values for hail expression parameters, e.g. clnsig_expr: hl.ArrayExpression = default_ht.clnsig_expr? We weren’t sure how hail expressions would behave as default arguments.

There’s a github discussion thread that explains what we’re interested in doing in more detail here: https://github.com/broadinstitute/gnomad_methods/pull/257#discussion_r546100414

Where does default_ht come from here? Are all of these expressions supposed to be fields of t, or are they from different tables?

I believe default_ht should be t in the example above, and the expressions should all be fields of t. We have discussed making the defaults field names (strings) instead of expressions, i.e. looking up t['clnrevstat'] in place of passing clnrevstat_expr, but we are not certain whether that would affect downstream evaluation (lazy vs. repeated). Are there advantages to passing the expression rather than the field name? Thank you.

So I think, depending on the interface you want, you have some options:

  1. You can just have all the arguments be strings, look them up in the table with t['clnrevstat'] the way you described. That’s not any less performant than passing in t.clnrevstat. The nice part about this method is that it’s easy to specify the defaults, and it’s impossible to pass in multiple expressions from different tables by accident.

  2. You can accept a mix of expressions and strings, with the defaults specified as strings. In this scenario, you’ll have to check the type of each argument, converting all strings to expressions with the square bracket syntax.

  3. You take in expressions, and you let the default arguments be None. Then you check each argument to see if it’s None, and if it is you look up the field of the table you want to be the default with the . syntax or the [] syntax.

For 2 and 3, if you expect all the fields to be from the same table, you’ll probably want some assertions to check that.
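
Here is a rough sketch of how options 2 and 3 could share one helper. The names resolve_expr and resolve_clinvar_exprs are hypothetical, and the default field names 'clnsig' and 'clnsigconf' are assumptions modeled on t['clnrevstat'] above, so adjust them to your schema. The helper only relies on t[name] lookup, so it does not import hail itself. Using None as the default also gets around the fact that Python evaluates default argument values once, when the function is defined, before any t exists.

```python
from typing import Any, Tuple


def resolve_expr(t: Any, arg: Any, default_field: str) -> Any:
    """Turn `arg` into an expression on `t` (hypothetical helper).

    - None -> look up the default field on `t` (option 3)
    - str  -> treat it as a field name and look it up on `t` (option 2)
    - else -> assume it is already an expression and return it unchanged
    """
    if arg is None:
        return t[default_field]
    if isinstance(arg, str):
        return t[arg]
    return arg


def resolve_clinvar_exprs(
    t: Any,
    clnrevstat_expr: Any = None,
    clnsig_expr: Any = None,
    clnsigconf_expr: Any = None,
) -> Tuple[Any, Any, Any]:
    """Resolve the three ClinVar arguments against `t`.

    The default field names below are assumptions, not confirmed schema.
    """
    return (
        resolve_expr(t, clnrevstat_expr, "clnrevstat"),
        resolve_expr(t, clnsig_expr, "clnsig"),
        resolve_expr(t, clnsigconf_expr, "clnsigconf"),
    )
```

Inside filter_to_clinvar_pathogenic you would call resolve_clinvar_exprs(t, clnrevstat_expr, clnsig_expr, clnsigconf_expr) first and then filter on the results. The check that every expression comes from t is left out of this sketch and would still need to be added.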