Trouble Installing Hail

Hello,

I’ve been attempting to install Hail for the past few hours and I just can’t seem to get it to work.
So far I have tried re-downloading Java and have used both Python 3.8 and Python 3.7, but I just keep encountering roadblocks. With Python 3.7, running “pip3 install hail” results in this block of error text:

C:\Users\rizza>pip3 install hail
Collecting hail
  Using cached hail-0.2.49-py3-none-any.whl (62.8 MB)
Collecting scipy<1.4,>1.2
  Using cached scipy-1.3.3-cp37-cp37m-win_amd64.whl (30.5 MB)
Collecting python-json-logger==0.1.11
  Using cached python-json-logger-0.1.11.tar.gz (6.0 kB)
Collecting aiohttp-session<2.8,>=2.7
  Using cached aiohttp_session-2.7.0-py3-none-any.whl (14 kB)
Collecting pandas<0.26,>0.24
  Using cached pandas-0.25.3-cp37-cp37m-win_amd64.whl (9.2 MB)
Collecting decorator<5
  Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Collecting parsimonious<0.9
  Using cached parsimonious-0.8.1.tar.gz (45 kB)
Collecting pyspark<2.4.2,>=2.4
  Using cached pyspark-2.4.1.tar.gz (215.7 MB)
    ERROR: Command errored out with exit status 1:
     command: 'c:\users\rizza\appdata\local\programs\python\python37\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-install-p20v89uz\\pyspark\\setup.py'"'"'; __file__='"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-install-p20v89uz\\pyspark\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\rizza\AppData\Local\Temp\pip-pip-egg-info-9opsnj4f'
         cwd: C:\Users\rizza\AppData\Local\Temp\pip-install-p20v89uz\pyspark\
    Complete output (47 lines):
    Could not import pypandoc - required to package PySpark
    WARNING: The wheel package is not available.
      ERROR: Command errored out with exit status 1:
       command: 'c:\users\rizza\appdata\local\programs\python\python37\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-wheel-oxsd6ynl\\pypandoc\\setup.py'"'"'; __file__='"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-wheel-oxsd6ynl\\pypandoc\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\rizza\AppData\Local\Temp\pip-wheel-7psu2qmp'
           cwd: C:\Users\rizza\AppData\Local\Temp\pip-wheel-oxsd6ynl\pypandoc\
      Complete output (8 lines):
      no pandoc found, building platform unspecific wheel...
      use 'python setup.py download_pandoc' to download pandoc.
      usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
         or: setup.py --help [cmd1 cmd2 ...]
         or: setup.py --help-commands
         or: setup.py cmd --help

      error: invalid command 'bdist_wheel'
      ----------------------------------------
      ERROR: Failed building wheel for pypandoc
    ERROR: Failed to build one or more wheels
    Traceback (most recent call last):
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\installer.py", line 128, in fetch_build_egg
        subprocess.check_call(cmd)
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\subprocess.py", line 363, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command '['c:\\users\\rizza\\appdata\\local\\programs\\python\\python37\\python.exe', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', 'C:\\Users\\rizza\\AppData\\Local\\Temp\\tmps1st36c1', '--quiet', 'pypandoc']' returned non-zero exit status 1.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\rizza\AppData\Local\Temp\pip-install-p20v89uz\pyspark\setup.py", line 224, in <module>
        'Programming Language :: Python :: Implementation :: PyPy']
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\__init__.py", line 143, in setup
        _install_setup_requires(attrs)
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\__init__.py", line 138, in _install_setup_requires
        dist.fetch_build_eggs(dist.setup_requires)
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\dist.py", line 698, in fetch_build_eggs
        replace_conflicting=True,
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\pkg_resources\__init__.py", line 783, in resolve
        replace_conflicting=replace_conflicting
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\pkg_resources\__init__.py", line 1066, in best_match
        return self.obtain(req, installer)
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\pkg_resources\__init__.py", line 1078, in obtain
        return installer(requirement)
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\dist.py", line 754, in fetch_build_egg
        return fetch_build_egg(self, req)
      File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\installer.py", line 130, in fetch_build_egg
        raise DistutilsError(str(e))
    distutils.errors.DistutilsError: Command '['c:\\users\\rizza\\appdata\\local\\programs\\python\\python37\\python.exe', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', 'C:\\Users\\rizza\\AppData\\Local\\Temp\\tmps1st36c1', '--quiet', 'pypandoc']' returned non-zero exit status 1.
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

When I try with Python 3.8 installed, I can pip install Hail, but I can’t import it into Python. Here’s what happens when I try:

C:\Users\rizza>py
Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\hail\__init__.py", line 28, in <module>
    from .table import Table, GroupedTable, asc, desc
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\hail\table.py", line 4, in <module>
    import pyspark
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\context.py", line 31, in <module>
    from pyspark import accumulators
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\serializers.py", line 71, in <module>
    from pyspark import cloudpickle
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)

I am incredibly stumped. Can someone help me through this? Which version of Python should I be using, and how do I get it to agree with Hail and its dependencies?

Hail isn’t known to work (or even install!) on Windows. I think your best bet is to use Docker, as described here:

I saw this same PySpark installation error on Linux in a GitHub Actions workflow and was able to resolve it by installing the wheel package.
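
To expand on that a little: the Python 3.7 log above contains "WARNING: The wheel package is not available." and "error: invalid command 'bdist_wheel'", which fits a missing wheel package. Below is a rough diagnostic sketch, not an official Hail tool, that reports the things this thread touches on: whether wheel is importable, the operating system, and the Python version and bitness. The function names environment_report and hints_for are my own (hypothetical), and the hints only restate what was observed in this thread.

```python
import importlib.util
import platform
import struct
import sys


def environment_report():
    """Collect facts about the running interpreter (hypothetical helper)."""
    return {
        "python": tuple(sys.version_info[:2]),   # e.g. (3, 7)
        "bits": struct.calcsize("P") * 8,        # pointer size: 32 or 64
        "os": platform.system(),                 # "Windows", "Linux", "Darwin"
        # find_spec returns None when the package cannot be imported
        "has_wheel": importlib.util.find_spec("wheel") is not None,
    }


def hints_for(report):
    """Turn a report into hints based only on what this thread observed."""
    hints = []
    if not report["has_wheel"]:
        hints.append("wheel is not installed; try: python -m pip install wheel")
    if report["os"] == "Windows":
        hints.append("running on Windows, where Hail isn't known to work")
    if report["python"] >= (3, 8):
        hints.append("Python 3.8+: the pyspark pinned by hail 0.2.49 "
                     "failed to import on 3.8 in the traceback above")
    return hints


if __name__ == "__main__":
    report = environment_report()
    print(report)
    for hint in hints_for(report):
        print("hint:", hint)
```

If it reports that wheel is missing, running "python -m pip install wheel" and then retrying "pip3 install hail" is what fixed it in my case on Linux. I can't promise that is enough on Windows, given the reply above.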