I am trying to read a large VCF file. Checking its size with `ls -lh` on our Unix server gives the following output:

```
-rw-rw-r-- 1 Test Test 42G Sep 10 13:31 merged_file.vcf.gz
```
May I check with you roughly how long Hail should take to read/import a file of this size? If I don't provide the `force_bgz` argument, the import throws an error, so I have provided it. The import has now been running for more than an hour. Is such a delay expected for my file size, or could there be another reason for it?
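In case it is relevant: my understanding (an assumption on my part, based on the docs) is that `force_bgz=True` tells `hl.import_vcf` to treat a `.gz` file as BGZF (block gzip), which can be read in parallel, whereas plain gzip is decompressed on a single thread. Here is a small stdlib-only check I can run to confirm the file really is BGZF before forcing it; the path is just an example:

```python
import struct

def is_bgzf(path: str) -> bool:
    """Return True if the file starts with a BGZF (block gzip) header.

    BGZF files begin with a gzip header that has the FEXTRA flag set and
    an extra subfield identified by the bytes 'B', 'C' (the block-size
    field defined in the SAM/BAM specification).
    """
    with open(path, "rb") as f:
        header = f.read(18)  # fixed gzip header (10 bytes) + BGZF extra field
    if len(header) < 18:
        return False
    # Bytes 0-1: gzip magic; byte 3: FLG, bit 2 (FEXTRA) must be set.
    if header[0:2] != b"\x1f\x8b" or not (header[3] & 0x04):
        return False
    # Bytes 10-11: XLEN (little-endian); bytes 12-13: subfield id 'BC'.
    xlen = struct.unpack("<H", header[10:12])[0]
    return xlen >= 6 and header[12:14] == b"BC"

# Example usage (hypothetical path on my server):
# print(is_bgzf("merged_file.vcf.gz"))
```

If this prints `False`, the file would be plain gzip, which I assume would explain a slow, single-threaded import.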
Also, can Hail read multiple files one by one and append them to a single MatrixTable?
Please let me know if you would like me to run any other Unix commands to report the hardware details of my server.
The `lscpu` command gave me the below output:

```
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              24
On-line CPU(s) list: 0-23
Thread(s) per core:  2
Core(s) per socket:  12
Socket(s):           1
NUMA node(s):        1
CPU family:          6
Model:               79
Stepping:            1
CPU MHz:             2499.921
CPU max MHz:         2900.0000
CPU min MHz:         1200.0000
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0-23
```
The `lsblk` command gave the below output:

```
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
Testa           8:0    0  3.7T  0 disk
└─Testa1        8:1    0  3.7T  0 part /data/T1
Testb           8:16   0  3.7T  0 disk
└─Testb1        8:17   0  3.7T  0 part /data/T2
Testc           8:32   0  3.7T  0 disk
└─Testc1        8:33   0  3.7T  0 part /data/T3
nvme0n1       259:0    0  477G  0 disk
├─nvme0n1p1   259:1    0  476G  0 part /
├─nvme0n1p2   259:2    0    1K  0 part
└─nvme0n1p5   259:3    0  975M  0 part
```