Issue with hl.experimental.densify()

Hi! I’m currently having an issue with hl.experimental.densify() in a script that was previously working. When I remove the densify call, the script runs successfully. Could this be a bug?

Below is the stack trace:

Traceback (most recent call last):
  File "<string>", line 20, in <module>
  File "<string>", line 12, in <module>
  File "/usr/local/lib/python3.10/dist-packages/hailtop/batch/job.py", line -1, in wrapped
  File "/tob-wgs/scripts/eqtl_hail_batch/generate_eqtl_spearman.py", line 210, in prepare_genotype_info
  File "/usr/local/lib/python3.10/dist-packages/hail/experimental/vcf_combiner/densify.py", line 39, in densify
    dense = hl.rbind(t.locus.position,
  File "/usr/local/lib/python3.10/dist-packages/hail/expr/functions.py", line 566, in rbind
    return hl.bind(f, *args, _ctx=_ctx)
  File "<decorator-gen-672>", line 2, in bind
  File "/usr/local/lib/python3.10/dist-packages/hail/typecheck/check.py", line 577, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/usr/local/lib/python3.10/dist-packages/hail/expr/functions.py", line 515, in bind
    lambda_result = to_expr(f(*args))
  File "/usr/local/lib/python3.10/dist-packages/hail/experimental/vcf_combiner/densify.py", line 40, in <lambda>
    lambda pos: hl._zip_func(scan, t.__entries,
  File "/usr/local/lib/python3.10/dist-packages/hail/expr/functions.py", line 3738, in _zip_func
    return construct_expr(
  File "<decorator-gen-632>", line 2, in construct_expr
  File "/usr/local/lib/python3.10/dist-packages/hail/typecheck/check.py", line 577, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/usr/local/lib/python3.10/dist-packages/hail/expr/expressions/typed_expressions.py", line 4573, in construct_expr
    x.assign_type(type)
  File "/usr/local/lib/python3.10/dist-packages/hail/ir/base_ir.py", line 308, in assign_type
    assert computed == typ, (computed, typ)
AssertionError: (dtype('array<struct{RGQ: int32, END: int32, gvcf_info: struct{AC: array<int32>, AF: array<float64>, AN: int32, AS_BaseQRankSum: array<float64>, AS_FS: array<float64>, AS_InbreedingCoeff: array<float64>, AS_MQ: array<float64>, AS_MQRankSum: array<float64>, AS_QD: array<float64>, AS_QUALapprox: array<int32>, AS_RAW_BaseQRankSum: str, AS_RAW_MQ: array<float64>, AS_RAW_MQRankSum: array<tuple(float64, int32)>, AS_RAW_ReadPosRankSum: array<tuple(float64, int32)>, AS_ReadPosRankSum: array<float64>, AS_SB_TABLE: array<array<int32>>, AS_SOR: array<float64>, AS_VarDP: array<int32>, BaseQRankSum: float64, ExcessHet: float64, FS: float64, InbreedingCoeff: float64, MQ: float64, MQRankSum: float64, MQ_DP: int32, QD: float64, QUALapprox: int32, RAW_GT_COUNT: array<int32>, RAW_MQandDP: array<int32>, ReadPosRankSum: float64, SOR: float64, VarDP: int32}, DP: int32, GQ: int32, MIN_DP: int32, PID: str, PS: int32, SB: array<int32>, GT: call, PGT: call, AD: array<int32>, PL: array<int32>, __contig: int32}>'), dtype('array<tuple(struct{RGQ: int32, END: int32, gvcf_info: struct{AC: array<int32>, AF: array<float64>, AN: int32, AS_BaseQRankSum: array<float64>, AS_FS: array<float64>, AS_InbreedingCoeff: array<float64>, AS_MQ: array<float64>, AS_MQRankSum: array<float64>, AS_QD: array<float64>, AS_QUALapprox: array<int32>, AS_RAW_BaseQRankSum: str, AS_RAW_MQ: array<float64>, AS_RAW_MQRankSum: array<tuple(float64, int32)>, AS_RAW_ReadPosRankSum: array<tuple(float64, int32)>, AS_ReadPosRankSum: array<float64>, AS_SB_TABLE: array<array<int32>>, AS_SOR: array<float64>, AS_VarDP: array<int32>, BaseQRankSum: float64, ExcessHet: float64, FS: float64, InbreedingCoeff: float64, MQ: float64, MQRankSum: float64, MQ_DP: int32, QD: float64, QUALapprox: int32, RAW_GT_COUNT: array<int32>, RAW_MQandDP: array<int32>, ReadPosRankSum: float64, SOR: float64, VarDP: int32}, DP: int32, GQ: int32, MIN_DP: int32, PID: str, PS: int32, SB: array<int32>, GT: call, PGT: call, AD: array<int32>, PL: 
array<int32>, __contig: int32}, struct{RGQ: int32, END: int32, gvcf_info: struct{AC: array<int32>, AF: array<float64>, AN: int32, AS_BaseQRankSum: array<float64>, AS_FS: array<float64>, AS_InbreedingCoeff: array<float64>, AS_MQ: array<float64>, AS_MQRankSum: array<float64>, AS_QD: array<float64>, AS_QUALapprox: array<int32>, AS_RAW_BaseQRankSum: str, AS_RAW_MQ: array<float64>, AS_RAW_MQRankSum: array<tuple(float64, int32)>, AS_RAW_ReadPosRankSum: array<tuple(float64, int32)>, AS_ReadPosRankSum: array<float64>, AS_SB_TABLE: array<array<int32>>, AS_SOR: array<float64>, AS_VarDP: array<int32>, BaseQRankSum: float64, ExcessHet: float64, FS: float64, InbreedingCoeff: float64, MQ: float64, MQRankSum: float64, MQ_DP: int32, QD: float64, QUALapprox: int32, RAW_GT_COUNT: array<int32>, RAW_MQandDP: array<int32>, ReadPosRankSum: float64, SOR: float64, VarDP: int32}, DP: int32, GQ: int32, MIN_DP: int32, PID: str, PS: int32, SB: array<int32>, GT: call, PGT: call, AD: array<int32>, PL: array<int32>, __contig: int32})>'))

Thanks!

Hi @KatalinaBobowik, thanks for the bug report! This should be fixed by #12020.

I am facing a similar issue. A script that calls hl.experimental.densify() used to work, but it now fails with:

AssertionError: (dtype('array<struct{LA: array<int32>, LGT: call, LAD: array<int32>, LPGT: call, LPL: array<int32>, RGQ: int32, END: int32, gvcf_info: struct{BaseQRankSum: float64, ClippingRankSum: float64, DS: bool, ExcessHet: float64, InbreedingCoeff: float64, MLEAC: array<int32>, MLEAF: array<float64>, MQ: float64, MQRankSum: float64, RAW_MQ: float64, ReadPosRankSum: float64}, DP: int32, GQ: int32, MIN_DP: int32, PID: str, SB: array<int32>, __contig: int32}>'), dtype('array<tuple(struct{LA: array<int32>, LGT: call, LAD: array<int32>, LPGT: call, LPL: array<int32>, RGQ: int32, END: int32, gvcf_info: struct{BaseQRankSum: float64, ClippingRankSum: float64, DS: bool, ExcessHet: float64, InbreedingCoeff: float64, MLEAC: array<int32>, MLEAF: array<float64>, MQ: float64, MQRankSum: float64, RAW_MQ: float64, ReadPosRankSum: float64}, DP: int32, GQ: int32, MIN_DP: int32, PID: str, SB: array<int32>, __contig: int32}, struct{LA: array<int32>, LGT: call, LAD: array<int32>, LPGT: call, LPL: array<int32>, RGQ: int32, END: int32, gvcf_info: struct{BaseQRankSum: float64, ClippingRankSum: float64, DS: bool, ExcessHet: float64, InbreedingCoeff: float64, MLEAC: array<int32>, MLEAF: array<float64>, MQ: float64, MQRankSum: float64, RAW_MQ: float64, ReadPosRankSum: float64}, DP: int32, GQ: int32, MIN_DP: int32, PID: str, SB: array<int32>, __contig: int32})>'))

Should #12020 have fixed this as well? We are on hail version 0.2.97.

Yeah that’s the same bug. Somewhat surprisingly, we haven’t released since July 7th. I’ll cut a release ASAP and post here when 0.2.98 is out.

@jsarro, Hail 0.2.98 has been released and includes a fix for this bug.
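
For anyone landing here later, a quick way to tell whether an environment already has the fix is to compare the installed Hail version against 0.2.98 before calling densify. The sketch below is only an illustration: the helper names (parse_version, has_densify_fix) are hypothetical and not part of Hail, and it assumes the version string starts with three dotted integers, optionally followed by a suffix such as a commit hash.

```python
import re

# First release reported in this thread to include the densify fix.
FIXED_IN = (0, 2, 98)


def parse_version(version: str) -> tuple:
    """Return the leading 'X.Y.Z' of a version string as a tuple of ints.

    Assumes strings shaped like '0.2.97' or '0.2.97-abc123' (the suffix
    here is a made-up example, not a real Hail build hash).
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_densify_fix(version: str) -> bool:
    """True if the given version is 0.2.98 or newer (hypothetical helper)."""
    return parse_version(version) >= FIXED_IN
```

With Hail installed, this could be called as has_densify_fix(hl.version()) after import hail as hl. If it returns False, upgrading (for example with pip install -U hail) should pick up the release mentioned above.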