Basic question on running hailctl first command

Hi everyone, I’m new to Hail and want to learn how to run it. I’ve set up an HDInsight cluster on Azure, in the same resource group as the storage account (for example: my_cluster, my_sa, my_rg).

Then I run my first hailctl command: hailctl hdinsight start mycluster my_sa my_rg, and I see the following error:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/davidmai/.local/lib/python3.10/site-packages/hailtop/hailctl/hdinsight/cli.py:88 in start  │
│                                                                                                  │
│    85 │   def default_artifact(filename: str) -> str:                                            │
│    86 │   │   return f'https://raw.githubusercontent.com/hail-is/hail/{hail_version}/hail/pyth   │
│    87 │                                                                                          │
│ ❱  88 │   hdinsight_start(                                                                       │
│    89 │   │   cluster_name,                                                                      │
│    90 │   │   storage_account,                                                                   │
│    91 │   │   resource_group,                                                                    │
│                                                                                                  │
│ /home/davidmai/.local/lib/python3.10/site-packages/hailtop/hailctl/hdinsight/start.py:53 in      │
│ start                                                                                            │
│                                                                                                  │
│    50 │   if http_password is None:                                                              │
│    51 │   │   http_password = secret_alnum_string(12) + '_aA0'                                   │
│    52 │                                                                                          │
│ ❱  53 │   exec(                                                                                  │
│    54 │   │   'az',                                                                              │
│    55 │   │   'hdinsight',                                                                       │
│    56 │   │   'create',                                                                          │
│                                                                                                  │
│ /home/davidmai/.local/lib/python3.10/site-packages/hailtop/hailctl/hdinsight/start.py:14 in exec │
│                                                                                                  │
│    11                                                                                            │
│    12                                                                                            │
│    13 def exec(*args):                                                                           │
│ ❱  14 │   subprocess.check_call(args)                                                            │
│    15                                                                                            │
│    16                                                                                            │
│    17 class VepVersion(str, Enum):                                                               │
│                                                                                                  │
│ /usr/lib/python3.10/subprocess.py:364 in check_call                                              │
│                                                                                                  │
│    361 │                                                                                         │
│    362 │   check_call(["ls", "-l"])                                                              │
│    363 │   """                                                                                   │
│ ❱  364 │   retcode = call(*popenargs, **kwargs)                                                  │
│    365 │   if retcode:                                                                           │
│    366 │   │   cmd = kwargs.get("args")                                                          │
│    367 │   │   if cmd is None:                                                                   │
│                                                                                                  │
│ /usr/lib/python3.10/subprocess.py:345 in call                                                    │
│                                                                                                  │
│    342 │                                                                                         │
│    343 │   retcode = call(["ls", "-l"])                                                          │
│    344 │   """                                                                                   │
│ ❱  345 │   with Popen(*popenargs, **kwargs) as p:                                                │
│    346 │   │   try:                                                                              │
│    347 │   │   │   return p.wait(timeout=timeout)                                                │
│    348 │   │   except:  # Including KeyboardInterrupt, wait handled that.                        │
│                                                                                                  │
│ /usr/lib/python3.10/subprocess.py:969 in __init__                                                │
│                                                                                                  │
│    966 │   │   │   │   │   self.stderr = io.TextIOWrapper(self.stderr,                           │
│    967 │   │   │   │   │   │   │   encoding=encoding, errors=errors)                             │
│    968 │   │   │                                                                                 │
│ ❱  969 │   │   │   self._execute_child(args, executable, preexec_fn, close_fds,                  │
│    970 │   │   │   │   │   │   │   │   pass_fds, cwd, env,                                       │
│    971 │   │   │   │   │   │   │   │   startupinfo, creationflags, shell,                        │
│    972 │   │   │   │   │   │   │   │   p2cread, p2cwrite,                                        │
│                                                                                                  │
│ /usr/lib/python3.10/subprocess.py:1778 in _execute_child                                         │
│                                                                                                  │
│   1775 │   │   │   │   │   │   │   for dir in os.get_exec_path(env))                             │
│   1776 │   │   │   │   │   fds_to_keep = set(pass_fds)                                           │
│   1777 │   │   │   │   │   fds_to_keep.add(errpipe_write)                                        │
│ ❱ 1778 │   │   │   │   │   self.pid = _posixsubprocess.fork_exec(                                │
│   1779 │   │   │   │   │   │   │   args, executable_list,                                        │
│   1780 │   │   │   │   │   │   │   close_fds, tuple(sorted(map(int, fds_to_keep))),              │
│   1781 │   │   │   │   │   │   │   cwd, env_list,                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: expected str, bytes or os.PathLike object, not int

Any suggestions on how to troubleshoot and fix the issue?

Also, I’m curious how hailctl authenticates to my Azure account to find these resources. Is there any behind-the-scenes documentation on the workflow, please?

Thx
David

Hey @David_Mai! I think you’re the first person besides myself to attempt to use HDInsight with Hail. Unfortunately, in version 0.2.119 we changed the command-line parser, and that introduced a bug where the number of workers is passed as an int instead of a str. I’ll PR a fix for that. In the meantime, you can either downgrade to a release earlier than 0.2.119 or run the bash commands below to patch your file in place.

We don’t run the HDInsight tests regularly due to the need to spin up and spin down clusters, so you may run into some bitrot. Please do post any issues and we’ll get them fixed!


pushd /home/davidmai/.local/lib/python3.10/site-packages/hailtop/hailctl/hdinsight
cp start.py start.py.bak; sed 's/num_workers,$/str(num_workers),/' start.py.bak > start.py
popd
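For anyone who would rather see what that sed substitution does before touching an installed file, here is a minimal Python sketch of the same rewrite applied to a string. The helper name `patch_num_workers` and the sample text are made up for illustration; only the pattern and replacement come from the sed command.

```python
import re


def patch_num_workers(source: str) -> str:
    """Wrap a line-final `num_workers,` in str(), like the sed command does."""
    # re.MULTILINE makes `$` match at the end of every line, as sed does per line.
    return re.sub(r'num_workers,$', 'str(num_workers),', source, flags=re.MULTILINE)


# Hypothetical excerpt shaped like the argument list in start.py.
sample = (
    "        '--workernode-count',\n"
    "        num_workers,\n"
    "        '--ssh-password',\n"
)

patched = patch_num_workers(sample)
print(patched)

# Applying the patch a second time changes nothing, because the patched
# line ends in `num_workers),` and no longer matches the pattern.
print(patch_num_workers(patched) == patched)
```

Because the pattern only matches `num_workers,` at the very end of a line, an already patched line is left alone.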

The sed command makes the following change to start.py:

modified   hail/python/hailtop/hailctl/hdinsight/start.py
@@ -69,7 +69,7 @@ def start(
         '--location',
         location,
         '--workernode-count',
-        num_workers,
+        str(num_workers),
         '--ssh-password',
         sshuser_password,
         '--ssh-user',
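
To see why wrapping the value in str() is enough, here is a small standalone sketch of the failure. It assumes nothing about Hail itself: the `run` wrapper just mirrors the `exec` helper visible in the traceback, and the child process is the current Python interpreter standing in for `az`.

```python
import subprocess
import sys


def run(*args):
    # Same shape as the thin wrapper in the traceback: pass args straight through.
    subprocess.check_call(args)


def raises_type_error(*args) -> bool:
    """Return True if launching the command fails with TypeError."""
    try:
        run(*args)
    except TypeError:
        return True
    return False


# Stand-in command: a child Python process that echoes its arguments.
base = (sys.executable, '-c', 'import sys; print(sys.argv[1:])', '--workernode-count')
num_workers = 2

# An int in the argument list is rejected before the child process starts.
print(raises_type_error(*base, num_workers))

# Converting to str first lets the command run normally.
print(raises_type_error(*base, str(num_workers)))
```

subprocess expects every element of the argument list to be a str, bytes, or path-like object, so the int has to be converted before the list is handed to check_call.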

OK, thanks Dan! This is helpful. We have some Azure credits, so we’re not running on GCP. If you want, we’re happy to run your build, test it out, and work together to document anything we find. Just let me know, thanks!