Unable to write MT after split_multi_hts

I’m encountering the following error when I try to write a matrix table to GCS. It only happens after I’ve applied split_multi_hts, as shown below (“mt_smard_test” writes fine, but “mt_smard_test2” doesn’t). Is it possible to write a matrix table after splitting multi-allelic variants?


2022-08-10 20:12:07 Hail: INFO: wrote matrix table with 26521 rows and 98590 columns in 8 partitions to [ ]/data/smard1_test.mt

mt_smard_test2 = hl.split_multi_hts(mt_smard_test)


HailUserError Traceback (most recent call last)
/tmp/ipykernel_118/3960666400.py in
----> 1 mt_smard_test2.write(f'{bucket}/data/smard1_test2.mt')

in write(self, output, overwrite, stage_locally, _codec_spec, _partitions, _checkpoint_file)

/opt/conda/lib/python3.7/site-packages/hail/typecheck/check.py in wrapper(__original_func, *args, **kwargs)
575 def wrapper(__original_func, *args, **kwargs):
576 args_, kwargs_ = check_all(__original_func, args, kwargs, checkers, is_method=is_method)
--> 577 return __original_func(*args_, **kwargs_)
579 return wrapper

/opt/conda/lib/python3.7/site-packages/hail/matrixtable.py in write(self, output, overwrite, stage_locally, _codec_spec, _partitions, _checkpoint_file)
2555 writer = ir.MatrixNativeWriter(output, overwrite, stage_locally, _codec_spec, _partitions, _partitions_type, _checkpoint_file)
--> 2556 Env.backend().execute(ir.MatrixWrite(self._mir, writer))
2558 class _Show:

/opt/conda/lib/python3.7/site-packages/hail/backend/py4j_backend.py in execute(self, ir, timed)
102 return (value, timings) if timed else value
103 except FatalError as e:
--> 104 self._handle_fatal_error_from_backend(e, ir)
106 async def _async_execute(self, ir, timed=False):

/opt/conda/lib/python3.7/site-packages/hail/backend/backend.py in _handle_fatal_error_from_backend(self, err, ir)
187 'Hail stack trace:\n'
188 f'{better_stack_trace}')
--> 189 raise HailUserError(message_and_trace) from None

HailUserError: Error summary: HailException: array index out of bounds: index=2, length=2

Hail stack trace:
File "/tmp/ipykernel_118/1588062174.py", line 1, in
mt_smard_test2 = hl.split_multi_hts(mt_smard_test)

File "/opt/conda/lib/python3.7/site-packages/hail/methods/statgen.py", line 2374, in split_multi_hts
[hl.sum(split.AD) - split.AD[split.a_index], split.AD[split.a_index]])

File "/opt/conda/lib/python3.7/site-packages/hail/expr/expressions/typed_expressions.py", line 481, in __getitem__
return self._method("indexArray", self.dtype.element_type, item)

File "/opt/conda/lib/python3.7/site-packages/hail/expr/expressions/base_expression.py", line 695, in _method
x = ir.Apply(name, ret_type, self._ir, *(a._ir for a in args))

File "/opt/conda/lib/python3.7/site-packages/hail/ir/ir.py", line 2263, in __init__

Does your dataset include sex chromosomes? If so, can you share what a VCF entry looks like for a male on chrX?
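
For context on why I ask: the stack trace points at `split.AD[split.a_index]`, and "index=2, length=2" suggests that at least one entry has an AD array with only two values at a site that has two or more alternate alleles. Whether that comes from chrX calls is only a guess at this point. Here is a rough plain-Python sketch of the arithmetic (this is not Hail code; `split_ad` and the example depths are made up for illustration):

```python
from typing import List


def split_ad(ad: List[int], n_alleles: int) -> List[List[int]]:
    """Mimic the AD rewrite shown in the stack trace, in plain Python.

    For each alternate allele index a_index (1 .. n_alleles - 1), build
    [sum(AD) - AD[a_index], AD[a_index]].
    """
    out = []
    for a_index in range(1, n_alleles):
        # Raises IndexError when AD has fewer values than there are alleles.
        out.append([sum(ad) - ad[a_index], ad[a_index]])
    return out


# Well-formed entry: 3 alleles (ref + 2 alts) and 3 AD values.
ok = split_ad([10, 4, 6], n_alleles=3)

# Hypothetical malformed entry: 3 alleles but only 2 AD values.
# a_index=2 is out of bounds for a length-2 array, matching the
# "index=2, length=2" in the error summary.
try:
    split_ad([10, 4], n_alleles=3)
    failed = False
except IndexError:
    failed = True
```

If that is what's going on, counting entries where the AD length differs from the number of alleles on the unsplit table should confirm it, e.g. something along the lines of `mt_smard_test.aggregate_entries(hl.agg.count_where(hl.len(mt_smard_test.AD) != hl.len(mt_smard_test.alleles)))`. It would also explain why the error only shows up on write: Hail evaluates lazily, so the split isn't actually computed until something forces it.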