Hail on Google Cloud with Windows OS

Hello,

I have several installation issues on Google Cloud. I followed this tutorial:

And I ask myself: could it be an OS problem? I work on Windows.
Let me explain. On the Neale Lab GitHub (GitHub - Nealelab/cloudtools: Scripts for working with Google Cloud Dataproc service), there is an important point:
Prerequisites:
Mac OS X
Python 2
Google Cloud SDK …

I guess you can of course adapt the scripts … but I feel that everything has been designed and built for Mac?

Thanks

Ines

Hi @ines,

Hail itself will probably not run on Windows.

I’m sorry you are having trouble following the “Using Hail with Jupyter Notebooks on Google Cloud” post! That post is really out of date. My colleague Tim has added a warning to that post.

The directions at https://github.com/Nealelab/cloudtools are up to date. You're correct that cloudtools was designed for Mac OS X; however, it should also work, with some minor fixes, on any Windows machine with the Windows Subsystem for Linux. In particular, you will need to modify line 59 of connect.py to:

        r'/path/to/chrome.exe',

If you make this change, I expect cluster start and cluster connect NAME notebook will both work for you.
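
For reference, this is roughly the kind of change meant above. The variable name and surrounding code are illustrative (not the actual contents of connect.py), and the chrome.exe path is just the default Windows install location, so adjust it to your machine:

    # Illustrative sketch only, not the real connect.py: replace the hard-coded
    # Mac OS X Chrome path with the path to chrome.exe on your Windows machine.
    chrome_path = r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'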

Thank you for this answer.

Unfortunately the problem is not solved.

I managed to install Hail with Dataproc on Google Cloud, but I cannot also get a working Jupyter notebook.

Let me explain my attempts in detail:
A) Install Hail with gs://hail-common/hail-init.sh: it works, I can submit jobs to the cluster, so everything is good.
B) Install Hail with gs://hail-common/init_notebook.py:
The cluster is created and I can submit Python jobs, but I can't access the Jupyter notebook (on port 8123).
C) Install Hail with gs://hail-common/cloudtools/init_notebook1.py: nothing works. The cluster is not created; here is the error:
ERROR: (gcloud.dataproc.clusters.create) Operation [] failed: initialization action failed. Failed action 'gs://hail-common/cloudtools/init_notebook1.py', see output in: gs://dataprocm/dataproc-initialization-script-0_output.

So I have several questions:

  1. I believe there are several versions of this script: gs://hail-common/cloudtools/init_notebook1.py, init_notebook2, 3 …? Why? Which one should I use?
  2. I find it surprising that attempt B works; maybe everything is actually installed correctly and I just can't access it? If so, why write a new script?

It is really important for me to have access to a Jupyter notebook with Hail. Please help …

Best,

Inès

What is the error that caused the notebook to fail? It should be in the output file referenced in the error message.

Also, how did you create your cluster? Are you using cloudtools? If not, what is the full gcloud dataproc clusters create command?

I don't use cloudtools (I tried, without success).

The command is:

gcloud dataproc clusters create ines11 --zone us-east1-d --master-machine-type n1-highmem-8 --master-boot-disk-size 100 --num-workers 2 --worker-machine-type n1-highmem-8 --worker-boot-disk-size 75 --num-worker-local-ssds 1 --num-preemptible-workers 4 --image-version 1.1 --project avl-hail-ines --properties "spark:spark.driver.extraJavaOptions=-Xss4M,spark:spark.executor.extraJavaOptions=-Xss4M,spark:spark.driver.memory=45g,spark:spark.driver.maxResultSize=30g,spark:spark.task.maxFailures=20,spark:spark.kryoserializer.buffer.max=1g,hdfs:dfs.replication=1" --initialization-actions gs://hail-common/cloudtools/init_notebook.py

The correct usage of the new-style init_notebook scripts requires a conda installation script to run beforehand. This is an example gcloud invocation from cloudtools:

gcloud dataproc clusters create t1 \
    --image-version=1.2 \
    --master-machine-type=n1-highmem-8 \
    --metadata=JAR=gs://hail-common/builds/devel/jars/hail-devel-aa83f2a1d041-Spark-2.2.0.jar,ZIP=gs://hail-common/builds/devel/python/hail-devel-aa83f2a1d041.zip,MINICONDA_VERSION=4.4.10 \
    --master-boot-disk-size=100GB \
    --num-master-local-ssds=0 \
    --num-preemptible-workers=0 \
    --num-worker-local-ssds=0 \
    --num-workers=2 \
    --preemptible-worker-boot-disk-size=40GB \
    --worker-boot-disk-size=40GB \
    --worker-machine-type=n1-highmem-8 \
    --zone=us-central1-b \
    --properties=spark:spark.driver.memory=41g,spark:spark.driver.maxResultSize=0,spark:spark.task.maxFailures=20,spark:spark.kryoserializer.buffer.max=1g,spark:spark.driver.extraJavaOptions=-Xss4M,spark:spark.executor.extraJavaOptions=-Xss4M,hdfs:dfs.replication=1 \
    --initialization-actions=gs://dataproc-initialization-actions/conda/bootstrap-conda.sh,gs://hail-common/cloudtools/init_notebook1.py

And the output file for the error:

gsutil cat gs://dataproc-7d953528-6da9-4479-840f-10cef05dbc8f-us/google-cloud-dataproc-metainfo/7c24f861-2120-4e58-8039-f45dce00e0d1/ines12-m/dataproc-initialization-script-0_output
Traceback (most recent call last):
  File "/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0", line 54, in <module>
    jar_path = get_metadata('JAR')
  File "/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0", line 15, in get_metadata
    return decode_f(check_output(['/usr/share/google/get_metadata_value', 'attributes/{}'.format(key)]))
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['/usr/share/google/get_metadata_value', 'attributes/JAR']' returned non-zero exit status 22
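
This traceback just means the init script could not read the location of the Hail JAR from the cluster's custom metadata: the new-style script expects the --metadata=JAR=...,ZIP=... values from the invocation above, and without them the metadata lookup exits non-zero and the whole initialization action aborts. A rough sketch of the pattern (illustrative, not the real script):

    # Rough sketch of the metadata lookup the init script performs (illustrative).
    from subprocess import check_output

    def get_metadata(key):
        # Reads a custom metadata attribute set at cluster-creation time via --metadata.
        # If the attribute was never set, get_metadata_value exits non-zero and
        # check_output raises CalledProcessError, which is the failure shown above.
        return check_output(
            ['/usr/share/google/get_metadata_value', 'attributes/{}'.format(key)]).decode()

    jar_path = get_metadata('JAR')   # e.g. gs://hail-common/builds/devel/jars/...
    zip_path = get_metadata('ZIP')   # e.g. gs://hail-common/builds/devel/python/...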

Ok I will try that! Thanks

OK, the cluster is created.
How do I access the notebook?

I tried:

gcloud compute ssh --zone=us-central1-b --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N" "t1-m"

"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "http://t1-m:8123" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir="C:/temp"

I have access to the cluster on port 8088 but not to the notebook.

And also:
gcloud dataproc jobs submit pyspark --cluster=t1 --project=avl-hail-ines gs://ines-python/start.py
where the file start.py is from cloudtools, but with the change that @danking suggested to me (r'/path/to/chrome.exe')

Neither of these works …

Hello,

If I connect to the master node and type jupyter notebook, I get this error:

[C 13:14:22.736 NotebookApp] Bad config encountered during initialization:
[C 13:14:22.736 NotebookApp] The 'contents_manager_class' trait of <notebook.notebookapp.NotebookApp object at 0x7fa44cf39978> instance must be a type, but 'jgscm.GoogleStorageContentManager' could not be imported

The command used to create the cluster:

gcloud dataproc clusters create hail1 --image-version=1.2 --master-machine-type=n1-highmem-8 --metadata=JAR=gs://hail-common/builds/devel/jars/hail-devel-aa83f2a1d041-Spark-2.2.0.jar,ZIP=gs://hail-common/builds/devel/python/hail-devel-aa83f2a1d041.zip,MINICONDA_VERSION=4.4.10 --master-boot-disk-size=100GB --num-master-local-ssds=0 --project avl-hail-ines --num-preemptible-workers=0 --num-worker-local-ssds=0 --num-workers=2 --preemptible-worker-boot-disk-size=40GB --worker-boot-disk-size=40GB --worker-machine-type=n1-highmem-8 --zone=us-east1-d --properties=spark:spark.driver.memory=41g,spark:spark.driver.maxResultSize=0,spark:spark.task.maxFailures=20,spark:spark.kryoserializer.buffer.max=1g,spark:spark.driver.extraJavaOptions=-Xss4M,spark:spark.executor.extraJavaOptions=-Xss4M,hdfs:dfs.replication=1 --initialization-actions=gs://dataproc-initialization-actions/conda/bootstrap-conda.sh,gs://hail-common/cloudtools/init_notebook1.py

Thank you for your help !

Hi @ines!

I'm sorry you encountered this issue! There was a recent breaking change to Google Cloud's Python library that we were unaware of. We are releasing a fix for this now. To avoid this situation in the future, we have pinned a specific version of Google Cloud's Python library and will upgrade only after we have verified that a new version works.
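
In the meantime, if you want to check the state of things yourself, the error above just means the Jupyter contents-manager plugin failed to import, and you can reproduce the import named in the error message directly (a quick diagnostic, nothing Hail-specific):

    # Run in a Python shell on the cluster's master node. If this import raises,
    # the notebook server will refuse to start with the error shown above.
    from jgscm import GoogleStorageContentManager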

This is the correct way to set up an SSH tunnel. However, instead of http://t1-m:8123, can you try http://localhost:8123? Since you are tunneling to t1-m (where the Jupyter notebook server is running) through the SOCKS proxy, the notebook server can be accessed through localhost.
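
For what it's worth, this pair of steps is essentially what cloudtools' connect.py automates. A rough sketch of the idea on Windows (paths, names, and argument handling here are illustrative, not the actual connect.py code):

    # Illustrative sketch of the tunnel + browser launch that connect.py automates.
    import subprocess

    cluster = 't1'
    chrome = r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'

    # 1. Open a SOCKS tunnel from local port 10000 to the cluster's master node.
    tunnel = subprocess.Popen([
        'gcloud', 'compute', 'ssh', '{}-m'.format(cluster),
        '--zone=us-central1-b',
        '--ssh-flag=-D', '--ssh-flag=10000', '--ssh-flag=-N'])

    # 2. Launch Chrome through that proxy, pointed at localhost:8123, where the
    #    Jupyter notebook server on the master node is listening.
    subprocess.Popen([
        chrome, 'http://localhost:8123',
        '--proxy-server=socks5://localhost:10000',
        '--host-resolver-rules=MAP * 0.0.0.0 , EXCLUDE localhost',
        '--user-data-dir=C:/temp'])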

Thank you

But it still does not work with http://localhost:8123

The two command lines:
gcloud compute ssh --project=bla --zone=us-central1-b --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N" "t1-m"

"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "http://localhost:8123" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir="C:/temp"

Can you describe what does happen? Can you include a screenshot of the Chrome window?

So it's in French, but it means:
This site is inaccessible
The connection has been reset.
Try the suggestions below:

Check the connection
Check the proxy and the firewall
Run Windows Network Diagnostics
ERR_CONNECTION_RESET

BUT I can see the cluster.

Alright. I suspect the jupyter notebook is not starting correctly. Can you share these log files:

  • /var/log/dataproc-initialization-script-0.log
  • /var/log/dataproc-initialization-script-1.log
  • /var/log/dataproc-startup-script.log

I have the log files. How do I send them? .txt files are not accepted as attachments.

Hi, I had similar issues setting up Hail on Google Cloud yesterday using the Windows Subsystem for Linux. I think the problem was that Chrome on Windows could not write to /temp. I managed to get around this by installing chromium-browser, updating connect.py to point to it (/usr/bin/chromium-browser), and then displaying it through an Xming server. Hope this helps.

Thank you so much! With Chromium, does everything work with all the steps I've described above?