ArrayIndexOutOfBoundsException

I’m trying to export per-chromosome VCFs but am running into an ArrayIndexOutOfBoundsException after running hl.filter_intervals:

```python
mt = hl.read_matrix_table(release_mt_path(data_source, freeze, nested=False, temp=True))
rg = hl.get_reference('GRCh38')
contigs = rg.contigs[:24]
for contig in contigs:
    contig_mt = hl.filter_intervals(mt, [hl.parse_locus_interval(contig)])
    hl.export_vcf(contig_mt, release_vcf_path(data_source, freeze, contig=contig), metadata=header_dict)
```

mt.rows().show() succeeds before the for loop, but contig_mt.rows().show() inside the loop throws the error above. The mt has 10,000 partitions, 12,476,588 rows, and 99,399 columns.
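For anyone following along: rg.contigs[:24] on GRCh38 should correspond to the primary contigs chr1–chr22 plus chrX and chrY, so the loop runs once per standard chromosome. As a rough plain-Python sketch of that list (not taken from the script above):

```python
# The 24 primary GRCh38 contigs, assuming the usual ordering
# (autosomes first, then chrX and chrY) — what rg.contigs[:24]
# is expected to return.
contigs = [f'chr{i}' for i in range(1, 23)] + ['chrX', 'chrY']
print(len(contigs))  # 24
```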

I would appreciate any thoughts/suggestions!

Can we have the full stack trace?

Should be easy enough for us to replicate once we have a little more info.

  File "/tmp/569a6bb8745a423dad88fc816df62135/prepare_data_release.py", line 839, in <module>
    main(args)
  File "/tmp/569a6bb8745a423dad88fc816df62135/prepare_data_release.py", line 808, in main
    contig_mt.rows().show()
  File "</opt/conda/default/lib/python3.6/site-packages/decorator.py:decorator-gen-904>", line 2, in show
  File "/opt/conda/default/lib/python3.6/site-packages/hail/typecheck/check.py", line 585, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1450, in show
    handler(self._show(n, width, truncate, types))
  File "/opt/conda/default/lib/python3.6/site-packages/IPython/core/display.py", line 282, in display
    print(*objs)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1241, in __str__
    return self._ascii_str()
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1268, in _ascii_str
    rows, has_more, dtype = self.data()
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1251, in data
    rows, has_more = t._take_n(self.n)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1372, in _take_n
    rows = self.take(n + 1)
  File "</opt/conda/default/lib/python3.6/site-packages/decorator.py:decorator-gen-916>", line 2, in take
  File "/opt/conda/default/lib/python3.6/site-packages/hail/typecheck/check.py", line 585, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 2064, in take
    return self.head(n).collect(_localize)
  File "</opt/conda/default/lib/python3.6/site-packages/decorator.py:decorator-gen-910>", line 2, in collect
  File "/opt/conda/default/lib/python3.6/site-packages/hail/typecheck/check.py", line 585, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1863, in collect
    return Env.backend().execute(e._ir)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/backend/backend.py", line 109, in execute
    result = json.loads(Env.hc()._jhc.backend().executeJSON(self._to_java_ir(ir)))
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/opt/conda/default/lib/python3.6/site-packages/hail/utils/java.py", line 225, in deco
    'Error summary: %s' % (deepest, full, hail.__version__, deepest)) from None
hail.utils.java.FatalError: ArrayIndexOutOfBoundsException: 1

Java stack trace:
java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.spark.sql.catalyst.expressions.GenericRow.get(rows.scala:174)
	at is.hail.annotations.RegionValueBuilder.addRow(RegionValueBuilder.scala:299)
	at is.hail.annotations.RegionValueBuilder.addAnnotation(RegionValueBuilder.scala:542)
	at is.hail.annotations.RegionValueBuilder.addAnnotation(RegionValueBuilder.scala:580)
	at is.hail.annotations.RegionValueBuilder.addAnnotation(RegionValueBuilder.scala:531)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1$$anonfun$apply$1.apply(EmitFunctionBuilder.scala:229)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1$$anonfun$apply$1.apply(EmitFunctionBuilder.scala:229)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1.apply(EmitFunctionBuilder.scala:229)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1.apply(EmitFunctionBuilder.scala:225)
	at is.hail.utils.package$.using(package.scala:596)
	at is.hail.annotations.Region$.scoped(Region.scala:18)
	at is.hail.expr.ir.EmitFunctionBuilder.encodeLiterals(EmitFunctionBuilder.scala:225)
	at is.hail.expr.ir.EmitFunctionBuilder.resultWithIndex(EmitFunctionBuilder.scala:559)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:55)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:80)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:120)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:129)
	at is.hail.expr.ir.TableFilter.execute(TableIR.scala:447)
	at is.hail.expr.ir.TableKeyByAndAggregate.execute(TableIR.scala:1340)
	at is.hail.expr.ir.TableExplode.execute(TableIR.scala:1186)
	at is.hail.expr.ir.TableMapRows.execute(TableIR.scala:967)
	at is.hail.expr.ir.Interpret$.apply(Interpret.scala:749)
	at is.hail.expr.ir.Interpret$.apply(Interpret.scala:90)
	at is.hail.expr.ir.Interpret$.apply(Interpret.scala:59)
	at is.hail.expr.ir.InterpretNonCompilable$$anonfun$5.apply(InterpretNonCompilable.scala:16)
	at is.hail.expr.ir.InterpretNonCompilable$$anonfun$5.apply(InterpretNonCompilable.scala:16)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
	at is.hail.expr.ir.InterpretNonCompilable$.apply(InterpretNonCompilable.scala:16)
	at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:37)
	at is.hail.expr.ir.CompileAndEvaluate$$anonfun$2.apply(CompileAndEvaluate.scala:37)
	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:24)
	at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:37)
	at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:57)
	at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:57)
	at is.hail.utils.package$.using(package.scala:596)
	at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:10)
	at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:9)
	at is.hail.utils.package$.using(package.scala:596)
	at is.hail.annotations.Region$.scoped(Region.scala:18)
	at is.hail.expr.ir.ExecuteContext$.scoped(ExecuteContext.scala:9)
	at is.hail.backend.Backend.execute(Backend.scala:57)
	at is.hail.backend.Backend.executeJSON(Backend.scala:63)
	at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)



Hail version: 0.2.26-2dcc3d963867
Error summary: ArrayIndexOutOfBoundsException: 1
ERROR: (gcloud.dataproc.jobs.submit.pyspark) Job [569a6bb8745a423dad88fc816df62135] failed with error:
Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found in 'gs://dataproc-1aca38e4-67fe-4b64-b451-258ef1aea4d1-us-central1/google-cloud-dataproc-metainfo/43a6cb23-0aae-4450-8866-d4999bba8067/jobs/569a6bb8745a423dad88fc816df62135/driveroutput'.
Traceback (most recent call last):
  File "/Users/kchao/.conda/envs/hail/bin/hailctl", line 10, in <module>
    sys.exit(main())
  File "/Users/kchao/.conda/envs/hail/lib/python3.6/site-packages/hailtop/hailctl/__main__.py", line 94, in main
    cli.main(args)
  File "/Users/kchao/.conda/envs/hail/lib/python3.6/site-packages/hailtop/hailctl/dataproc/cli.py", line 107, in main
    jmp[args.module].main(args, pass_through_args)
  File "/Users/kchao/.conda/envs/hail/lib/python3.6/site-packages/hailtop/hailctl/dataproc/submit.py", line 75, in main
    check_call(cmd)
  File "/Users/kchao/.conda/envs/hail/lib/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['gcloud', 'dataproc', 'jobs', 'submit', 'pyspark', 'prepare_data_release.py', '--cluster=kc', '--files=', '--py-files=/var/folders/8_/hllv8xr94p3539n63wzrm0c11_2fgd/T/pyscripts_hc66tjdw.zip', '--properties=', '--', '-f', '4', '--prepare_release_vcf']' returned non-zero exit status 1.

Whoah, weird. I saw something similar while working on a change last week, so it’s possible that this has been fixed in the latest master (but not latest release). We’re planning to make a release today, as soon as https://github.com/hail-is/hail/pull/7479 goes in

thanks! I have a basic question: is there any way I can test whether this works in the latest version of Hail? I’m hoping to finish writing the VCFs before a meeting tomorrow.

yes, I can give you a wheel to install.

thank you so much!!

Download and pip install this public wheel:

gs://hail-common/hailctl/dataproc/tpoterba-dev/0.2.26-2706ad7eee6a/hail-0.2.26-py3-none-any.whl

thank you so much for this! Unfortunately I’m still hitting the same error; here’s the new stack trace:

Traceback (most recent call last):
  File "/tmp/4a4566b10b97448eb948461c601340cd/prepare_data_release.py", line 839, in <module>
    main(args)
  File "/tmp/4a4566b10b97448eb948461c601340cd/prepare_data_release.py", line 808, in main
    contig_mt.rows().show()
  File "</opt/conda/default/lib/python3.6/site-packages/decorator.py:decorator-gen-918>", line 2, in show
  File "/opt/conda/default/lib/python3.6/site-packages/hail/typecheck/check.py", line 585, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1450, in show
    handler(self._show(n, width, truncate, types))
  File "/opt/conda/default/lib/python3.6/site-packages/IPython/core/display.py", line 282, in display
    print(*objs)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1241, in __str__
    return self._ascii_str()
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1268, in _ascii_str
    rows, has_more, dtype = self.data()
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1251, in data
    rows, has_more = t._take_n(self.n)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1372, in _take_n
    rows = self.take(n + 1)
  File "</opt/conda/default/lib/python3.6/site-packages/decorator.py:decorator-gen-930>", line 2, in take
  File "/opt/conda/default/lib/python3.6/site-packages/hail/typecheck/check.py", line 585, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 2064, in take
    return self.head(n).collect(_localize)
  File "</opt/conda/default/lib/python3.6/site-packages/decorator.py:decorator-gen-924>", line 2, in collect
  File "/opt/conda/default/lib/python3.6/site-packages/hail/typecheck/check.py", line 585, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/table.py", line 1863, in collect
    return Env.backend().execute(e._ir)
  File "/opt/conda/default/lib/python3.6/site-packages/hail/backend/backend.py", line 109, in execute
    result = json.loads(Env.hc()._jhc.backend().executeJSON(self._to_java_ir(ir)))
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/opt/conda/default/lib/python3.6/site-packages/hail/utils/java.py", line 225, in deco
    'Error summary: %s' % (deepest, full, hail.__version__, deepest)) from None
hail.utils.java.FatalError: ArrayIndexOutOfBoundsException: 1

Java stack trace:
java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.spark.sql.catalyst.expressions.GenericRow.get(rows.scala:174)
	at is.hail.annotations.RegionValueBuilder.addRow(RegionValueBuilder.scala:299)
	at is.hail.annotations.RegionValueBuilder.addAnnotation(RegionValueBuilder.scala:542)
	at is.hail.annotations.RegionValueBuilder.addAnnotation(RegionValueBuilder.scala:580)
	at is.hail.annotations.RegionValueBuilder.addAnnotation(RegionValueBuilder.scala:531)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1$$anonfun$apply$1.apply(EmitFunctionBuilder.scala:249)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1$$anonfun$apply$1.apply(EmitFunctionBuilder.scala:249)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1.apply(EmitFunctionBuilder.scala:249)
	at is.hail.expr.ir.EmitFunctionBuilder$$anonfun$encodeLiterals$1.apply(EmitFunctionBuilder.scala:245)
	at is.hail.utils.package$.using(package.scala:596)
	at is.hail.annotations.Region$.scoped(Region.scala:18)
	at is.hail.expr.ir.EmitFunctionBuilder.encodeLiterals(EmitFunctionBuilder.scala:245)
	at is.hail.expr.ir.EmitFunctionBuilder.resultWithIndex(EmitFunctionBuilder.scala:610)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:55)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:80)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:121)
	at is.hail.expr.ir.Compile$.apply(Compile.scala:130)
	at is.hail.expr.ir.TableFilter.execute(TableIR.scala:447)
	at is.hail.expr.ir.TableKeyByAndAggregate.execute(TableIR.scala:1359)
	at is.hail.expr.ir.TableExplode.execute(TableIR.scala:1205)
	at is.hail.expr.ir.TableMapRows.execute(TableIR.scala:995)
	at is.hail.expr.ir.Interpret$.apply(Interpret.scala:722)
	at is.hail.expr.ir.Interpret$.alreadyLowered(Interpret.scala:67)
	at is.hail.expr.ir.InterpretNonCompilable$.interpretAndCoerce$1(InterpretNonCompilable.scala:16)
	at is.hail.expr.ir.InterpretNonCompilable$.is$hail$expr$ir$InterpretNonCompilable$$rewrite$1(InterpretNonCompilable.scala:53)
	at is.hail.expr.ir.InterpretNonCompilable$$anonfun$1.apply(InterpretNonCompilable.scala:25)
	at is.hail.expr.ir.InterpretNonCompilable$$anonfun$1.apply(InterpretNonCompilable.scala:25)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at is.hail.expr.ir.InterpretNonCompilable$.rewriteChildren$1(InterpretNonCompilable.scala:25)
	at is.hail.expr.ir.InterpretNonCompilable$.is$hail$expr$ir$InterpretNonCompilable$$rewrite$1(InterpretNonCompilable.scala:54)
	at is.hail.expr.ir.InterpretNonCompilable$.apply(InterpretNonCompilable.scala:58)
	at is.hail.expr.ir.lowering.InterpretNonCompilablePass$.transform(LoweringPass.scala:45)
	at is.hail.expr.ir.lowering.LoweringPass$class.apply(LoweringPass.scala:12)
	at is.hail.expr.ir.lowering.InterpretNonCompilablePass$.apply(LoweringPass.scala:40)
	at is.hail.expr.ir.lowering.LoweringPipeline$$anonfun$apply$1.apply(LoweringPipeline.scala:19)
	at is.hail.expr.ir.lowering.LoweringPipeline$$anonfun$apply$1.apply(LoweringPipeline.scala:17)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
	at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:17)
	at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:14)
	at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:57)
	at is.hail.backend.Backend$$anonfun$execute$1.apply(Backend.scala:57)
	at is.hail.utils.package$.using(package.scala:596)
	at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:10)
	at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:9)
	at is.hail.utils.package$.using(package.scala:596)
	at is.hail.annotations.Region$.scoped(Region.scala:18)
	at is.hail.expr.ir.ExecuteContext$.scoped(ExecuteContext.scala:9)
	at is.hail.backend.Backend.execute(Backend.scala:57)
	at is.hail.backend.Backend.executeJSON(Backend.scala:63)
	at sun.reflect.GeneratedMethodAccessor44.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)



Hail version: 0.2.26-2706ad7eee6a
Error summary: ArrayIndexOutOfBoundsException: 1
ERROR: (gcloud.dataproc.jobs.submit.pyspark) Job [4a4566b10b97448eb948461c601340cd] failed with error:
Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found at 'https://console.cloud.google.com/dataproc/jobs/4a4566b10b97448eb948461c601340cd?project=maclab-ukbb&region=us-central1' and in 'gs://dataproc-1aca38e4-67fe-4b64-b451-258ef1aea4d1-us-central1/google-cloud-dataproc-metainfo/9dc3e497-f7d8-4930-ac01-2a417920aca5/jobs/4a4566b10b97448eb948461c601340cd/driveroutput'.
Traceback (most recent call last):
  File "/Users/kchao/.conda/envs/hail-test/bin/hailctl", line 8, in <module>
    sys.exit(main())
  File "/Users/kchao/.conda/envs/hail-test/lib/python3.8/site-packages/hailtop/hailctl/__main__.py", line 94, in main
    cli.main(args)
  File "/Users/kchao/.conda/envs/hail-test/lib/python3.8/site-packages/hailtop/hailctl/dataproc/cli.py", line 107, in main
    jmp[args.module].main(args, pass_through_args)
  File "/Users/kchao/.conda/envs/hail-test/lib/python3.8/site-packages/hailtop/hailctl/dataproc/submit.py", line 75, in main
    check_call(cmd)
  File "/Users/kchao/.conda/envs/hail-test/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['gcloud', 'dataproc', 'jobs', 'submit', 'pyspark', 'prepare_data_release.py', '--cluster=kc', '--files=', '--py-files=/var/folders/8_/hllv8xr94p3539n63wzrm0c11_2fgd/T/pyscripts_k1ygihg5.zip', '--properties=', '--', '-f', '4', '--prepare_release_vcf']' returned non-zero exit status 1.

Ah, OK. Let me see if I can replicate now.

Do you have the log?

yes! I think the site is preventing me from uploading the log, but I can Slack it to you

yep, that’s fine

I noticed something odd with the newest wheel (gs://hail-common/hailctl/dataproc/tpoterba-dev/0.2.26-228d3a30149b/hail-0.2.26-py3-none-any.whl). I think variant_qc is doing something strange:

```python
import hail as hl

hl.init(log='/filter_intervals_test.log')

mt = hl.read_matrix_table('gs://broad-ukbb/broad.freeze_4/hardcalls/hardcalls.split.mt')
mt = hl.filter_intervals(mt, [hl.parse_locus_interval('chr20', reference_genome='GRCh38')])

mt.rows().show()
mt = hl.variant_qc(mt)

mt.rows().show()

contigs = mt.aggregate_rows(hl.agg.counter(mt.locus.contig))
print(contigs)
```

In the first show, all of the rows are on chr20, but in the second show, they’re all on chr19. Also, print(contigs) shows {'chr20': 311049}.
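For anyone unfamiliar with hl.agg.counter: it tallies rows per key, much like collections.Counter in plain Python, which is why {'chr20': 311049} indicates that every row’s locus is still on chr20 even though show() printed chr19. A rough plain-Python analogue (not Hail code; the sample values are made up):

```python
from collections import Counter

# Stand-in for the per-row contig values that hl.agg.counter
# would aggregate over; every row is on chr20.
loci = ['chr20'] * 5
contigs = dict(Counter(loci))
print(contigs)  # {'chr20': 5}
```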

can I have the log again?

This is quite weird.

filter_intervals_test.log (1.9 MB)

what if you do:

```python
mt = hl.read_matrix_table('gs://broad-ukbb/broad.freeze_4/hardcalls/hardcalls.split.mt')
mt = hl.filter_intervals(mt, [hl.parse_locus_interval('chr20', reference_genome='GRCh38')])
hl.summarize_variants(mt)
```
```
==============================
Alleles per variant
-------------------
  2 alleles: 311049 variants
==============================
Variants per contig
-------------------
  chr20: 311049 variants
==============================
Allele type distribution
------------------------
        SNP: 281922 alternate alleles
   Deletion: 17989 alternate alleles
  Insertion: 11138 alternate alleles
==============================
```

This input data isn’t public, is it? I can’t replicate anything similar locally.