Failed to install Hail on Azure following the tutorial

I followed this tutorial and everything seems to be okay until I came to Step 21.

“Save the changes and restart the affected services. These changes need a restart of Spark2 service. Ambari UI will prompt a required restart reminder, click Restart to restart all affected services.”

After I initiated the restart, one of the items in the restart dialog (“run_customscriptaction”) seems to have failed.

I clicked into the “Run Customscriptaction Actionexecute” screen and this is the error log:

The stdout is quite large, but the bottom of it looks like this

statsmodels-0.  99% |############################## | Time: 0:00:00  76.77 MB/s
statsmodels-0.  99% |############################## | Time: 0:00:00  76.78 MB/s
statsmodels-0.  99% |############################## | Time: 0:00:00  76.80 MB/s
statsmodels-0. 100% |###############################| Time: 0:00:00  76.80 MB/s
statsmodels-0. 100% |###############################| Time: 0:00:00  76.72 MB/s
anaconda-2020.   0% |                              | ETA:  --:--:--   0.00  B/s
anaconda-2020. 100% |###############################| Time: 0:00:00  39.12 MB/s
anaconda-2020. 100% |###############################| Time: 0:00:00  27.61 MB/s
('Start downloading script locally: ', u'https://raw.githubusercontent.com/TheEagleByte/azure-hail/master/hail-install.sh')
Fromdos line ending conversion successful
('Unexpected error:', "('Execution of custom script failed with exit code', 1)")
Removing temp location of the script

Command failed after 1 tries

It seems that the installation script is looking for a particular pip that was not installed. Do you think a misconfiguration causes this?

Thanks a lot!

@kumarveerapen

I wonder if @IqviaGarrettBromley could chime in on this since he has had experience in setting up Hail on Azure.

1 Like

Hi @xunzhustjude,

Can you share a screenshot or something of the script actions page for the cluster in the Azure dashboard? That is where they should have executed and displayed a status of successful/failed.

Also, at the moment, not sure I would consider setting up in Azure, as there is a known issue where you are unable to write any hail table or matrix table to the ADLS Gen 2 storage. You will receive errors regarding “the stream has already been closed”. The last working version on Azure is 0.2.34, and with 0.2.35 there were ~2.6k lines of changes around the io package where the bug was introduced.

If you have the option to, I would recommend moving to a different cloud provider, or use that version (which I wouldn’t necessarily recommend).

Thanks for your help, @IqviaGarrettBromley!

Can you share a screenshot or something of the script actions page for the cluster in the Azure dashboard?

When clicking into the failed action:

I attached the entire debug information here: debug_information.txt (6.2 KB)

If you have the option to, I would recommend moving to a different cloud provider, or use that version (which I wouldn’t necessarily recommend).

Currently Azure is what we prefer because it is a close collaborator of St. Jude. However, I will discuss with our team members regarding the possibility of switching to another cloud provider for Hail.


Sorry for making multiple posts, but it seems that as a new user I am not able to upload more than one picture per post.

Thanks for the info. I just used that script last month, so not sure why it’s suddenly throwing an error. Can you show what the cluster configuration looks like? Versions and storage options and such?

Has this been solved? :slight_smile:

@IqviaGarrettBromley

Thanks for helping!

The Cluster type as shown on Azure is: Spark 2.4 (HDI 4.0)

Here are the rest of the versions

Which “configuration” file do you need?

@kumarveerapen Sorry Kumar, I was distracted by something else for the past few days. I’m coming back to this now.

1 Like