Hailctl breaks with latest GCP SDK 294.0.0

Hello,

I recently updated my GCP SDK to 294.0.0 and received the following error when starting a cluster with hailctl.

wma13-a4c:~ 250 1 $ hailctl dataproc start mw --properties=spark:spark.speculation=true --num-preemptible-workers=2 --worker-machine-type=n1-highmem-8 --max-idle=120m --packages slackclient==2.0.0,websocket-client,sklearn,statsmodels,scikit-learn,hdbscan,matplotlib,google-cloud-bigquery,gnomad
Your active configuration is: [seqr-work]
gcloud beta dataproc clusters create
mw
--image-version=1.4-debian9
--properties=spark:spark.task.maxFailures=20,spark:spark.driver.extraJavaOptions=-Xss4M,spark:spark.executor.extraJavaOptions=-Xss4M,hdfs:dfs.replication=1,dataproc:dataproc.logging.stackdriver.enable=false,dataproc:dataproc.monitoring.stackdriver.enable=false,spark:spark.speculation=true,spark:spark.driver.memory=41g
--initialization-actions=gs://hail-common/hailctl/dataproc/0.2.33/init_notebook.py
--metadata=^|||^WHEEL=gs://hail-common/hailctl/dataproc/0.2.33/hail-0.2.33-py3-none-any.whl|||PKGS=aiohttp>=3.6,<3.7|aiohttp_session>=2.7,<2.8|asyncinit>=0.2.4,<0.3|bokeh>1.1,<1.3|decorator<5|gcsfs==0.2.1|humanize==1.0.0|hurry.filesize==0.9|nest_asyncio|numpy<2|pandas>0.24,<0.26|parsimonious<0.9|PyJWT|python-json-logger==0.1.11|requests>=2.21.0,<2.21.1|scipy>1.2,<1.4|tabulate==0.8.3|tqdm==4.42.1|slackclient==2.0.0|websocket-client|sklearn|statsmodels|scikit-learn|hdbscan|matplotlib|google-cloud-bigquery|gnomad
--master-machine-type=n1-highmem-8
--master-boot-disk-size=100GB
--num-master-local-ssds=0
--num-preemptible-workers=2
--num-worker-local-ssds=0
--num-workers=2
--preemptible-worker-boot-disk-size=40GB
--worker-boot-disk-size=40GB
--worker-machine-type=n1-highmem-8
--zone=us-central1-b
--initialization-action-timeout=20m
--labels=creator=mwilson_broadinstitute_org
--max-idle=120m
Starting cluster 'mw'...
WARNING: The --num-preemptible-workers flag is deprecated. Use the --num-secondary-workers flag instead.
WARNING: The --preemptible-worker-boot-disk-size flag is deprecated. Use the --secondary-worker-boot-disk-size flag instead.
ERROR: (gcloud.beta.dataproc.clusters.create) Error parsing [cluster].
The [cluster] resource is not properly specified.
Failed to find attribute [region]. The attribute can be set in the following ways:

  • provide the argument [--region] on the command line
  • set the property [dataproc/region]
Traceback (most recent call last):
  File "/Users/mwilson/anaconda3/bin/hailctl", line 8, in <module>
    sys.exit(main())
  File "/Users/mwilson/anaconda3/lib/python3.6/site-packages/hailtop/hailctl/main.py", line 94, in main
    cli.main(args)
  File "/Users/mwilson/anaconda3/lib/python3.6/site-packages/hailtop/hailctl/dataproc/cli.py", line 107, in main
    jmp[args.module].main(args, pass_through_args)
  File "/Users/mwilson/anaconda3/lib/python3.6/site-packages/hailtop/hailctl/dataproc/start.py", line 202, in main
    sp.check_call(cmd)
  File "/Users/mwilson/anaconda3/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['gcloud', 'beta', 'dataproc', 'clusters', 'create', 'mw', '--image-version=1.4-debian9', '--properties=spark:spark.task.maxFailures=20,spark:spark.driver.extraJavaOptions=-Xss4M,spark:spark.executor.extraJavaOptions=-Xss4M,hdfs:dfs.replication=1,dataproc:dataproc.logging.stackdriver.enable=false,dataproc:dataproc.monitoring.stackdriver.enable=false,spark:spark.speculation=true,spark:spark.driver.memory=41g', '--initialization-actions=gs://hail-common/hailctl/dataproc/0.2.33/init_notebook.py', '--metadata=^|||^WHEEL=gs://hail-common/hailctl/dataproc/0.2.33/hail-0.2.33-py3-none-any.whl|||PKGS=aiohttp>=3.6,<3.7|aiohttp_session>=2.7,<2.8|asyncinit>=0.2.4,<0.3|bokeh>1.1,<1.3|decorator<5|gcsfs==0.2.1|humanize==1.0.0|hurry.filesize==0.9|nest_asyncio|numpy<2|pandas>0.24,<0.26|parsimonious<0.9|PyJWT|python-json-logger==0.1.11|requests>=2.21.0,<2.21.1|scipy>1.2,<1.4|tabulate==0.8.3|tqdm==4.42.1|slackclient==2.0.0|websocket-client|sklearn|statsmodels|scikit-learn|hdbscan|matplotlib|google-cloud-bigquery|gnomad', '--master-machine-type=n1-highmem-8', '--master-boot-disk-size=100GB', '--num-master-local-ssds=0', '--num-preemptible-workers=2', '--num-worker-local-ssds=0', '--num-workers=2', '--preemptible-worker-boot-disk-size=40GB', '--worker-boot-disk-size=40GB', '--worker-machine-type=n1-highmem-8', '--zone=us-central1-b', '--initialization-action-timeout=20m', '--labels=creator=mwilson_broadinstitute_org', '--max-idle=120m']' returned non-zero exit status 1.

It looks like Google renamed the preemptible-worker flags. I reverted to 259.0.0, but figured I'd let you know. Since I hadn't updated the Google Cloud SDK in a while, I'm not sure when this change happened; I couldn't find any documentation on it in a brief search.
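For reference, the replacement flag names come straight from the deprecation warnings above; a hand-written gcloud call on the newer SDK would look something like this (cluster name, sizes, and region are just example values mirroring the command above, not the exact command hailctl generates):

gcloud dataproc clusters create mw --region=us-central1 --num-secondary-workers=2 --secondary-worker-boot-disk-size=40GB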

Sorry, I just realized the main cause of the error was the missing region flag, which I think was addressed elsewhere…
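In case it helps anyone else who hits this, the error output above suggests setting the dataproc/region property; something like the following should do it (us-central1 is just an example matching the us-central1-b zone in my command):

gcloud config set dataproc/region us-central1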

The preemptible-worker flags are just deprecation warnings, yeah. In the next release we're also going to make sure you have the region set, either as an arg or in config.
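In the meantime, hailctl forwards flags it doesn't recognize on to gcloud (that's the pass_through_args in the traceback), so I believe you can also pass the region explicitly on the hailctl command line; a rough sketch, with an example region and the other flags omitted:

hailctl dataproc start mw --region=us-central1 --num-preemptible-workers=2 ...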