Efficient way to get maximum and minimum locus position from a mt?

danking · July 21, 2021, 5:47pm

Are you running this on a Spark cluster or something else? If you’re not on a Spark cluster, you probably need to set the PYSPARK_SUBMIT_ARGS environment variable before starting Hail, so that Hail is permitted to use more memory.
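For reference, here is a minimal sketch of setting that variable from Python. The `8g` value and the `set_driver_memory` helper name are assumptions for illustration, so pick a size that fits your machine. The variable has to be set before Hail launches Spark, i.e. before `hl.init()` or your first Hail operation.

```python
import os

def set_driver_memory(memory="8g"):
    """Set PYSPARK_SUBMIT_ARGS so a local Spark driver gets more memory.

    `memory` is a hypothetical default; adjust it to your machine.
    Returns the string that was written to the environment.
    """
    args = f"--driver-memory {memory} pyspark-shell"
    os.environ["PYSPARK_SUBMIT_ARGS"] = args
    return args

# Call this before Hail starts Spark (before hl.init() or the first query).
submit_args = set_driver_memory("8g")
print(submit_args)
```

You can also export the same string in your shell before launching Python, which avoids any ordering issues inside a script.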

I also strongly, strongly recommend you convert your data into a Hail native format before continuing with any serious analysis. You can do that like this:

import hail as hl
# One-time conversion: parse the block-gzipped VCF and write it out as a native MatrixTable
hl.import_vcf('c1_b2_v1.vcf.bgz').write('c1_b2_v1.mt')

Then you can load that dataset with:

mt = hl.read_matrix_table('c1_b2_v1.mt')
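Once the data is in native format, one way to get at the original question is a single row aggregation. This is only a sketch, assuming the matrix table has the usual `locus` row key (as it does after `import_vcf`); `locus_position_range` is a hypothetical helper name, not something Hail provides. It groups by contig because a position is only meaningful within a contig.

```python
def locus_position_range(mt):
    """Return a dict mapping contig -> Struct(min_pos, max_pos).

    Assumes `mt` is a Hail MatrixTable with a `locus` row field.
    """
    # Imported here so the helper can be defined without Hail being loaded yet.
    import hail as hl

    return mt.aggregate_rows(
        hl.agg.group_by(
            mt.locus.contig,
            hl.struct(
                min_pos=hl.agg.min(mt.locus.position),
                max_pos=hl.agg.max(mt.locus.position),
            ),
        )
    )

# Usage, with the dataset written above:
#   import hail as hl
#   mt = hl.read_matrix_table('c1_b2_v1.mt')
#   ranges = locus_position_range(mt)
```

This makes one pass over the rows only and does not need to read the genotype entries, which should be much cheaper on the native format than on the VCF.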
