Expanding cores available to Hail by using extra VMs


#1

Sorry if this is not a Hail question per se, but I suppose you’d know the answer.
I am limited to 46 cores per VM, and at the moment I’m successfully running Hail 0.2 on one such VM.
I have to analyse UK Biobank data, so I think it would be a good idea to have more cores available for the computations. I can create other VMs (also with 46 cores each), so my questions are:

  1. Is it possible to expand Hail computations onto cores from other VMs using spark-submit?
  2. If so, would I have to install Hail etc. on top of Spark on these external VMs?
  3. Apart from the RAM per core, will I need to allocate a large amount of local disk space for caching or something like that?