So I tried to run the same command as above, but on an even smaller locus, ACE2:
hl.experimental.run_combiner(sample_paths=inputs, out_file=output_file, tmp_path=temp_bucket, intervals=intervals, reference_genome='GRCh38', overwrite=True, contig_recoding=contig_recoding)
And it looks like it’s getting through the first 5 stages based on the log file, but then it hangs for a really long time and the log repeatedly prints:
2020-06-09 23:30:26 NettyRpcEnv: WARN: Ignored failure: java.util.concurrent.TimeoutException: Cannot receive any reply from compute-0-9.local:39381 in 10 seconds
2020-06-09 23:30:28 Executor: WARN: Issue communicating with driver in heartbeater
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
Over and over again. I’m not sure if I should just let it run for a very long time. At your advice I am running this on a qlogin node with a lot of cores and memory: qlogin -pe smp 64 -l s_vmem=2G. Previously, I used h_vmem=2G and I kept hitting Java heap errors, but I found that can happen when running Java with the hard limit h_vmem set instead of the soft limit s_vmem. Now it just kind of hangs for a long time. Does choosing a small interval like
intervals=[hl.Interval(hl.Locus("chrX", 15360138, reference_genome='GRCh38'), hl.Locus("chrX", 15802945, reference_genome='GRCh38'), includes_end=True)]
not make it run much faster? It’s only grabbing that one small region for so many files. Is it creating the whole gVCF first and then shrinking it to that interval? If that is the case, I should just make the whole gVCF the matrix table. I’ll try running overnight and see if it gets anywhere; in parallel I’m trying GLnexus again as well.
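Since the warning says the timeout is controlled by spark.executor.heartbeatInterval, one thing that might be worth trying is raising it. Below is a minimal sketch of the settings, assuming they are passed through the spark_conf argument of hl.init. The values 60s and 600s are placeholders I picked, not recommendations, and I don't know whether this addresses the underlying hang; the only constraint I'm relying on is that Spark expects the heartbeat interval to stay well below spark.network.timeout.

```python
# Hypothetical values; the only requirement relied on here is that the
# heartbeat interval stays well below spark.network.timeout.
spark_conf = {
    "spark.executor.heartbeatInterval": "60s",
    "spark.network.timeout": "600s",
}


def seconds(value: str) -> int:
    """Parse a Spark duration given in whole seconds, e.g. '60s'."""
    return int(value.rstrip("s"))


# Sanity check on the ordering of the two timeouts.
heartbeat = seconds(spark_conf["spark.executor.heartbeatInterval"])
network = seconds(spark_conf["spark.network.timeout"])
assert heartbeat < network

# Passed to Hail before any other Hail call, e.g.:
#   import hail as hl
#   hl.init(spark_conf=spark_conf)
```

If hl.init has already been called in the session, the settings would need to go in at startup instead.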
This does finish on a small test set. In that case, after running:
mt=hl.methods.read_matrix_table(output_file)
mt2=hl.experimental.densify(mt)
hl.export_vcf(mt2, 'ace2.vcf.bgz')
I receive the error:
ValueError: Method 'export_vcf' requires row key to be two fields 'locus' (type 'locus<any>') and 'alleles' (type 'array<str>')
Found:
'locus': locus<GRCh38>
Not quite sure what is going on there either. Don’t know why this combination did not add the alleles field necessary to export the MatrixTable as a VCF.
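Reading the error literally, the rows are keyed by locus only, so a possible workaround might be to re-key the rows before exporting. This is a minimal sketch, assuming the densified table still carries alleles as an ordinary row field (I have not confirmed that); rekey_for_vcf is a helper name made up for illustration, while key_rows_by is the standard MatrixTable method.

```python
def rekey_for_vcf(mt):
    """Key rows by locus and alleles, the row key export_vcf asks for.

    Assumes `mt` already has an `alleles` row field; this only changes
    the row key and does not create the field.
    """
    return mt.key_rows_by("locus", "alleles")


# Usage sketch (requires Hail and the combiner output from above):
#   import hail as hl
#   mt = hl.read_matrix_table(output_file)
#   mt2 = rekey_for_vcf(hl.experimental.densify(mt))
#   hl.export_vcf(mt2, 'ace2.vcf.bgz')
```

This only targets the row-key error above; the export may still complain about other fields in the combined table.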
Let me know your thoughts. And thanks for sticking with it.
Best,
Jim