As of hash cf235511d2ee
index_bgen:
- Changed the index file format. You will need to rerun index_bgen in order to load BGEN files into Hail.
- Added a new optional argument index_file_map, which allows you to write the index files to a different location than where the BGEN files are stored. Be aware that the index file paths must end in .idx2.
- Options for contig_recoding, skip_invalid_loci, and reference_genome were moved from import_bgen to index_bgen.
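A minimal sketch of writing index files to a separate location with index_file_map. The helper name `build_index_file_map`, the example paths, and the flat index directory layout are illustrative assumptions, not part of Hail; the keyword arguments passed to index_bgen are the ones named in this post, and the exact signature may differ in other Hail versions.

```python
import posixpath


def build_index_file_map(bgen_paths, index_dir):
    """Map each BGEN path to an index path under index_dir ending in .idx2.

    Hypothetical helper (not part of Hail). Assumes BGEN file names are
    unique, since all index files land in one directory.
    """
    index_file_map = {}
    for bgen_path in bgen_paths:
        file_name = posixpath.basename(bgen_path)
        # Index file paths must end in .idx2.
        index_file_map[bgen_path] = posixpath.join(index_dir, file_name + ".idx2")
    return index_file_map


def index_with_custom_location(bgen_paths, index_dir, reference_genome="GRCh37"):
    """Index BGEN files, writing the index files to index_dir (requires Hail)."""
    import hail as hl  # imported lazily so the helper above works without Hail

    hl.index_bgen(
        bgen_paths,
        index_file_map=build_index_file_map(bgen_paths, index_dir),
        # These options now belong to index_bgen rather than import_bgen.
        reference_genome=reference_genome,
        skip_invalid_loci=False,
    )
```

The same mapping would then need to be passed to import_bgen so that it can find the index files.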
import_bgen:
- Removed arguments contig_recoding, skip_invalid_loci, and reference_genome. Use these options with index_bgen instead.
- Added a new optional argument variants, which allows you to specify either a Python list of variants (Struct with locus and alleles), a StructExpression with two fields (locus and alleles), or a Table that is keyed by locus and alleles. This can significantly improve performance when a pipeline does not need to look at all variants in the file.
- Added a new optional argument index_file_map, which allows you to specify which index file to use for a given BGEN input file. By default, Hail looks for an index file at the BGEN file's path with “.idx2” appended, that is, in the same directory as the BGEN file.
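A rough sketch of restricting an import to a handful of variants with the variants argument. The helper names, the tuple representation of a variant, and the choice of `entry_fields=["GT"]` are assumptions made for illustration; `hl.eval(hl.parse_variant(...))` is used here as one way to obtain a Python Struct with locus and alleles, and the import_bgen signature may differ in the Hail version tied to this hash.

```python
def default_index_path(bgen_path):
    """Index path looked up when index_file_map has no entry: path + '.idx2'."""
    return bgen_path + ".idx2"


def variant_strings(variants):
    """Format (contig, position, ref, alt) tuples as 'contig:pos:ref:alt'."""
    return [f"{contig}:{pos}:{ref}:{alt}" for contig, pos, ref, alt in variants]


def import_selected_variants(bgen_path, sample_path, variants):
    """Load only the requested variants from a BGEN file (requires Hail)."""
    import hail as hl  # imported lazily so the helpers above work without Hail

    # Each parsed variant evaluates to a Struct with fields locus and alleles.
    wanted = [hl.eval(hl.parse_variant(v)) for v in variant_strings(variants)]
    return hl.import_bgen(
        bgen_path,
        entry_fields=["GT"],
        sample_file=sample_path,
        variants=wanted,
    )
```

For example, `import_selected_variants(path, sample_path, [("1", 2000, "A", "G")])` would ask Hail for a single variant instead of scanning the whole file.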
Things to be aware of:
- When loading multiple BGEN files with import_bgen, all of the files must have been indexed with the same reference_genome argument. For example, you cannot index one file with GRCh37 and another with GRCh38 and then load both files at the same time.
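One way to guard against mixing reference genomes is to keep your own record of which reference each file was indexed with and check it before loading. The sketch below assumes such a record exists as a plain dict; the function name and the dict are hypothetical bookkeeping on the user's side and do not read anything from the index files or from Hail.

```python
def check_single_reference(reference_by_file):
    """Return the shared reference genome name, or raise if files disagree.

    reference_by_file maps each BGEN path to the reference_genome name that
    was passed to index_bgen for it (a user-kept record, not a Hail API).
    """
    references = set(reference_by_file.values())
    if len(references) > 1:
        raise ValueError(
            "BGEN files were indexed with different reference genomes: "
            + ", ".join(sorted(references))
        )
    # An empty record has no reference genome to report.
    return references.pop() if references else None
```

Calling this on the files you intend to pass to import_bgen fails early with a readable message instead of partway through a pipeline.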