When I run the download command, the connection is too slow and I cannot download anything. Does anyone have a direct web link for the data? Thanks so much for your help!
You will need two files:
- https://storage.googleapis.com/hail-tutorial/1kg.vcf.bgz, and
- https://storage.googleapis.com/hail-tutorial/1kg_annotations.txt
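If the tutorial's own download command stalls, the two files above can also be fetched directly. Here is a minimal sketch using only the Python standard library. The helper names `local_path` and `download_all` and the `data` directory are illustrative assumptions, not part of Hail:

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse

# The two URLs listed above.
URLS = [
    "https://storage.googleapis.com/hail-tutorial/1kg.vcf.bgz",
    "https://storage.googleapis.com/hail-tutorial/1kg_annotations.txt",
]


def local_path(url, dest_dir):
    """Return the path under dest_dir where the file at url will be saved."""
    return os.path.join(dest_dir, os.path.basename(urlparse(url).path))


def download_all(urls, dest_dir):
    """Stream each URL into dest_dir, skipping files that are already complete."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in urls:
        target = local_path(url, dest_dir)
        if os.path.exists(target):
            continue  # already downloaded on an earlier run
        partial = target + ".part"
        # Write to a temporary name first so an interrupted download
        # is never mistaken for a finished file.
        with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
        os.replace(partial, target)
```

Calling `download_all(URLS, "data")` should leave `data/1kg.vcf.bgz` and `data/1kg_annotations.txt` on disk. If the connection drops, running it again skips any file that already finished.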
Are you using a cluster or a single computer? If you are using a cluster, ensure these two files are in HDFS or in another filesystem that is readable by the entire cluster.
To convert the compressed VCF file into a Hail Matrix Table (and save it on disk), execute this:
import hail as hl
imported = hl.import_vcf('/path/to/1kg.vcf.bgz', min_partitions=16)
imported.write('/some/other/path/1kg.mt', overwrite=True)
In the notebook, you must replace
mt = hl.read_matrix_table('data/1kg.mt')
with:
mt = hl.read_matrix_table('/some/other/path/1kg.mt')
You must also replace the path to 1kg_annotations.txt here:
table = (hl.import_table('data/1kg_annotations.txt', impute=True)
         .key_by('Sample'))
Thanks! I downloaded the data using your links and commands. :smile: