Hi,
I am new to Hail 0.2, having previously used Hail 0.1.
I am having an issue using split_multi and/or split_multi_hts on my imported .vcf file. The .vcf file is a multi-sample file (approx. 300 samples) of trios containing WGS data, on which I ultimately want to run de novo calls. I am running Hail 0.2 in Jupyter notebooks.
I am able to import the VCF and run split_multi or split_multi_hts. However, the problem comes when I then run either sample_qc or de_novo. I get the same error in both cases:
HailUserError Traceback (most recent call last)
in
----> 1 results.export(output='/scratch/c.sbi9hc/DRAGEN_analysis/hail_de_novo.tsv',delimiter = "\t")
in export(self, output, types_file, header, parallel, delimiter)
/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/typecheck/check.py in wrapper(__original_func, *args, **kwargs)
575 def wrapper(__original_func, *args, **kwargs):
576 args, kwargs = check_all(__original_func, args, kwargs, checkers, is_method=is_method)
--> 577 return __original_func(*args, **kwargs)
578
579 return wrapper
/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/table.py in export(self, output, types_file, header, parallel, delimiter)
1044 parallel = ir.ExportType.default(parallel)
1045 Env.backend().execute(
-> 1046 ir.TableWrite(self._tir, ir.TableTextWriter(output, types_file, header, parallel, delimiter)))
1047
1048 def group_by(self, *exprs, **named_exprs) -> 'GroupedTable':
/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/backend/py4j_backend.py in execute(self, ir, timed)
94 'Hail stack trace:\n'
95 f'{better_stack_trace}')
---> 96 raise HailUserError(message_and_trace) from None
97
98 raise e
HailUserError: Error summary: HailException: array index out of bounds: index=2, length=2
Hail stack trace:
File "", line 1, in
mt = hl.split_multi_hts(mt)
File "/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/methods/statgen.py", line 2322, in split_multi_hts
(hl.range(0, 3).map(lambda i:
File "/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/methods/statgen.py", line 2326, in
).map(lambda j: split.PL[j]))))))
File "/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/methods/statgen.py", line 2326, in
).map(lambda j: split.PL[j]))))))
File "/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/expr/expressions/typed_expressions.py", line 481, in __getitem__
return self._method("indexArray", self.dtype.element_type, item)
File "/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/expr/expressions/base_expression.py", line 596, in _method
x = ir.Apply(name, ret_type, self._ir, *(a._ir for a in args))
File "/apps/genomics/hail/0.2/el7/AVX512/gnu-7.3/hail/ir/ir.py", line 2138, in __init__
self.save_error_info()
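To make the error concrete, here is a plain-Python sketch (not Hail code) of what I understand the PL step in the trace to be doing: for each of the 3 biallelic genotypes it takes the minimum over the original PL entries that downcode to it, indexing the original PL array by diploid genotype index. The function names are my own, and the idea that some entries in my VCF carry a PL array shorter than the diploid length (for example haploid calls) is only my guess at the cause, not something I have confirmed.

```python
def expected_pl_length(n_alleles: int) -> int:
    """Number of PL values for an unphased diploid genotype with n_alleles alleles."""
    return n_alleles * (n_alleles + 1) // 2


def genotype_pairs(n_alleles: int):
    """All unphased diploid genotypes (a, b) with a <= b, in VCF PL order."""
    return [(a, b) for b in range(n_alleles) for a in range(b + 1)]


def split_pl(pl, n_alleles: int, a_index: int):
    """Downcode a multi-allelic PL array to the 3 biallelic PLs for allele a_index.

    Simplified stand-in for the PL logic visible in the stack trace; it assumes
    pl has one entry per diploid genotype and fails otherwise.
    """
    pairs = genotype_pairs(n_alleles)
    out = []
    for i in range(3):  # new genotype index: 0 = hom-ref, 1 = het, 2 = hom-alt
        candidates = [
            pl[j]  # IndexError here if pl is shorter than len(pairs)
            for j, (a, b) in enumerate(pairs)
            if (a == a_index) + (b == a_index) == i
        ]
        out.append(min(candidates))
    return out


if __name__ == "__main__":
    # A well-formed triallelic entry: 3 alleles, 6 PL values.
    print(split_pl([0, 10, 20, 30, 40, 50], n_alleles=3, a_index=1))

    # A biallelic entry with only 2 PL values (hypothetical haploid-style call).
    short_pl = [0, 25]
    print(len(short_pl), "PL values, diploid expects", expected_pl_length(2))
    try:
        split_pl(short_pl, n_alleles=2, a_index=1)
    except IndexError as err:
        print("fails at index 2 of a length-2 array:", err)
```

If this is what is going on, it would match the "index=2, length=2" in the error summary, but I do not know how to confirm which entries are affected.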
I have looked at these threads:
However, neither seems to solve my issue. I have considered using vcf_combiner, but I would prefer to work out why this is happening. Any help would be massively appreciated.
Thank you