Hi guys,
I am wondering if anyone has successfully compiled Hail 0.2 on CentOS 7?
We have enabled devtoolset-8, which provides gcc 8.3.1, and the lz4 libraries and header files are installed:
$ ls /usr/lib64/liblz4.so
/usr/lib64/liblz4.so
$ ls /usr/include/lz4.h
/usr/include/lz4.h
But I still get an error:
$ make install-on-cluster HAIL_COMPILE_NATIVES=1 SCALA_VERSION=2.12 SPARK_VERSION=3.1.1
gmake -C src/main/c prebuilt
gmake[1]: Entering directory ‘/share/apps/luffy/binf/hail/hail/src/main/c’
gmake[1]: *** No rule to make target ‘lz4.h’, needed by ‘build/Decoder.o’. Stop.
gmake[1]: Leaving directory ‘/share/apps/luffy/binf/hail/hail/src/main/c’
gmake: *** [Makefile:374: native-lib-prebuilt] Error 2
Any ideas will be appreciated.
Cheers,
Derrick
Hi Derrick, try running make clean first, then build again.
Hi Dan,
make clean did the trick; the compilation completed. Thank you.
May I also get some clarification on the benefits of compiling Hail this way compared to a normal pip install?
I suspect there is little or no performance benefit of compiling Hail from source. Currently, the vast majority of pipelines do not make use of Hail-developed native code. We do use native linear algebra libraries like LAPACK and BLAS, but those are dynamically linked at run-time.
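Since LAPACK and BLAS are resolved dynamically at run time, you can check whether the dynamic linker can locate them on your system. A minimal sketch, assuming the conventional library names "blas", "lapack", and "lz4" (the exact names Hail's backend loads may differ on your distribution):

```python
import ctypes.util

# Ask the dynamic linker which shared libraries it can resolve.
# The names below are assumptions about what Hail's native code and
# linear-algebra backend might load at run time.
for name in ("blas", "lapack", "lz4"):
    path = ctypes.util.find_library(name)
    print(f"{name}: {path or 'not found'}")
```

If a library prints "not found", installing the corresponding system package (e.g. via yum on CentOS) is usually the fix.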
We recommend compiling Hail from source on clusters for two reasons. First, we compile on an Ubuntu system, so it's likely there are some subtle incompatibilities with non-Debian distributions (Debian-based systems seem uncommon on clusters?). Second, the PyPI package depends on PySpark and, in my experience, Spark is not usually installed in a way that pip recognizes. As a result, after pip install hail
you'll likely have two installations: your real Spark installation and a local-mode-only PySpark. As you might imagine, this has led to some pretty confusing bug reports!
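One way to spot the two-installations situation is to compare where Python finds PySpark against where the cluster's Spark lives. A hedged sketch; the helper name `diagnose_spark_installs` is hypothetical, and it assumes the cluster Spark is advertised via the conventional SPARK_HOME environment variable:

```python
import importlib.util
import os


def diagnose_spark_installs():
    """Hypothetical helper: report a pip-installed PySpark (if any) and the
    cluster Spark pointed to by SPARK_HOME (if set), and flag a likely
    conflict when both exist in different locations."""
    spec = importlib.util.find_spec("pyspark")
    pip_pyspark = spec.origin if spec and spec.origin else None
    spark_home = os.environ.get("SPARK_HOME")
    if pip_pyspark and spark_home and spark_home not in pip_pyspark:
        status = "conflict: pip PySpark may shadow the cluster Spark"
    elif pip_pyspark:
        status = "only pip PySpark found (local-mode only)"
    elif spark_home:
        status = "only cluster Spark found (SPARK_HOME)"
    else:
        status = "no Spark installation detected"
    return pip_pyspark, spark_home, status


if __name__ == "__main__":
    for item in diagnose_spark_installs():
        print(item)
```

If this reports a conflict, the local-mode PySpark from pip is what `import pyspark` will pick up, regardless of what your cluster uses.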