Hello,
I’ve been attempting to install Hail for the past few hours and I just can’t seem to get it to work.
So far I have tried reinstalling Java and have used both Python 3.8 and Python 3.7, but I keep running into roadblocks. With Python 3.7, running “pip install hail” produces this block of error text:
C:\Users\rizza>pip3 install hail
Collecting hail
  Using cached hail-0.2.49-py3-none-any.whl (62.8 MB)
Collecting scipy<1.4,>1.2
  Using cached scipy-1.3.3-cp37-cp37m-win_amd64.whl (30.5 MB)
Collecting python-json-logger==0.1.11
  Using cached python-json-logger-0.1.11.tar.gz (6.0 kB)
Collecting aiohttp-session<2.8,>=2.7
  Using cached aiohttp_session-2.7.0-py3-none-any.whl (14 kB)
Collecting pandas<0.26,>0.24
  Using cached pandas-0.25.3-cp37-cp37m-win_amd64.whl (9.2 MB)
Collecting decorator<5
  Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Collecting parsimonious<0.9
  Using cached parsimonious-0.8.1.tar.gz (45 kB)
Collecting pyspark<2.4.2,>=2.4
  Using cached pyspark-2.4.1.tar.gz (215.7 MB)
ERROR: Command errored out with exit status 1:
command: 'c:\users\rizza\appdata\local\programs\python\python37\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-install-p20v89uz\\pyspark\\setup.py'"'"'; __file__='"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-install-p20v89uz\\pyspark\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\rizza\AppData\Local\Temp\pip-pip-egg-info-9opsnj4f'
cwd: C:\Users\rizza\AppData\Local\Temp\pip-install-p20v89uz\pyspark\
Complete output (47 lines):
Could not import pypandoc - required to package PySpark
WARNING: The wheel package is not available.
ERROR: Command errored out with exit status 1:
command: 'c:\users\rizza\appdata\local\programs\python\python37\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-wheel-oxsd6ynl\\pypandoc\\setup.py'"'"'; __file__='"'"'C:\\Users\\rizza\\AppData\\Local\\Temp\\pip-wheel-oxsd6ynl\\pypandoc\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\rizza\AppData\Local\Temp\pip-wheel-7psu2qmp'
cwd: C:\Users\rizza\AppData\Local\Temp\pip-wheel-oxsd6ynl\pypandoc\
Complete output (8 lines):
no pandoc found, building platform unspecific wheel...
use 'python setup.py download_pandoc' to download pandoc.
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help
error: invalid command 'bdist_wheel'
----------------------------------------
ERROR: Failed building wheel for pypandoc
ERROR: Failed to build one or more wheels
Traceback (most recent call last):
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\installer.py", line 128, in fetch_build_egg
    subprocess.check_call(cmd)
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['c:\\users\\rizza\\appdata\\local\\programs\\python\\python37\\python.exe', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', 'C:\\Users\\rizza\\AppData\\Local\\Temp\\tmps1st36c1', '--quiet', 'pypandoc']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\rizza\AppData\Local\Temp\pip-install-p20v89uz\pyspark\setup.py", line 224, in <module>
    'Programming Language :: Python :: Implementation :: PyPy']
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\__init__.py", line 143, in setup
    _install_setup_requires(attrs)
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\__init__.py", line 138, in _install_setup_requires
    dist.fetch_build_eggs(dist.setup_requires)
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\dist.py", line 698, in fetch_build_eggs
    replace_conflicting=True,
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\pkg_resources\__init__.py", line 783, in resolve
    replace_conflicting=replace_conflicting
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\pkg_resources\__init__.py", line 1066, in best_match
    return self.obtain(req, installer)
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\pkg_resources\__init__.py", line 1078, in obtain
    return installer(requirement)
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\dist.py", line 754, in fetch_build_egg
    return fetch_build_egg(self, req)
  File "c:\users\rizza\appdata\local\programs\python\python37\lib\site-packages\setuptools\installer.py", line 130, in fetch_build_egg
    raise DistutilsError(str(e))
distutils.errors.DistutilsError: Command '['c:\\users\\rizza\\appdata\\local\\programs\\python\\python37\\python.exe', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', 'C:\\Users\\rizza\\AppData\\Local\\Temp\\tmps1st36c1', '--quiet', 'pypandoc']' returned non-zero exit status 1.
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
When I try with Python 3.8 installed, I can pip install Hail, but I can’t import it into Python. Here’s what happens when I try:
C:\Users\rizza>py
Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\hail\__init__.py", line 28, in <module>
    from .table import Table, GroupedTable, asc, desc
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\hail\table.py", line 4, in <module>
    import pyspark
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\context.py", line 31, in <module>
    from pyspark import accumulators
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\serializers.py", line 71, in <module>
    from pyspark import cloudpickle
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "C:\Users\rizza\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyspark\cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)
I am incredibly stumped. Can someone help me through this? Which version of Python should I be using, and how do I get it to work with Hail and its dependencies?
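In case it helps with diagnosis, here is a minimal sketch of a script that could be run under each Python install to report the interpreter version, whether it is a 32-bit or 64-bit build, and whether a `java` executable is on the PATH. The name `describe_interpreter` is my own and not part of Hail or PySpark; it only uses the standard library and does not check anything Hail-specific.

```python
import platform
import shutil
import struct
import sys


def describe_interpreter():
    """Collect basic facts about the Python interpreter running this script.

    Hypothetical helper for troubleshooting; not part of Hail or PySpark.
    """
    return {
        # Major.minor.micro of the running interpreter, e.g. "3.7.7"
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Pointer size in bits: 32 for a 32-bit build, 64 for a 64-bit build
        "bits": struct.calcsize("P") * 8,
        # Full path of the python executable actually being used
        "executable": sys.executable,
        # Operating system description as reported by the platform module
        "platform": platform.platform(),
        # Path to a java executable found on PATH, or None if there is none
        "java_on_path": shutil.which("java"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running it once with `py` and once with `python` (or `pip3 --version` alongside it) should show whether both commands point at the same installation.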