Hi, I’m studying Hail and installing it on Spark.
I plan to run a GWAS on the 1000 Genomes data (see the sketch at the end of this post), so I installed and set up Hail on a Spark cluster.
Linux: CentOS 7.8
Python: 3.7.3 (Anaconda)
Apache Spark: spark-2.2.0-bin-hadoop2.6
Hadoop: hadoop-2.6.0
java -version (note: I’m using a Linux server provided by a Korean institution, so I don’t have root permission)
openjdk version "1.8.0_262"
OpenJDK Runtime Environment (build 1.8.0_262-b10)
OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)
Hail version: 0.2.68
- Ran start-master.sh and start-slaves.sh in the Spark sbin directory.
- Ran pyspark from bash.
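Inside the pyspark shell I then try to initialize Hail, roughly like this (a minimal sketch; hl.init(sc) is the standard Hail 0.2 call for attaching to an existing SparkContext, and range_table is just a sanity check):

import hail as hl

# pyspark already provides `sc` (the SparkContext); attach Hail to it.
hl.init(sc)

# Trivial sanity check: build and show a small table on the cluster.
hl.utils.range_table(10).show()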
I got the message below.
How can I set up Hail on Spark?
Do I need to change my Java version?
Thank you for your help.
My <.bashrc>, <conf/spark-defaults.conf>, and <conf/spark-env.sh> are below.
<.bashrc>
# Hail (exported first: the Spark PYTHONPATH below uses $HAIL_HOME)
export HAIL_HOME=/home/edu1/miniconda2/envs/Hail-on-spark/lib/python3.7/site-packages/hail
export PATH=$PATH:$HAIL_HOME/bin
export SPARK_CLASSPATH=$HAIL_HOME/backend/hail-all-spark.jar
# Spark
export SPARK_HOME=/home/edu1/tools/spark-2.2.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/python
export PYTHONPATH=$HAIL_HOME/python:$SPARK_HOME/python:$(echo ${SPARK_HOME}/python/lib/py4j-*-src.zip):$PYTHONPATH
# JAVA (I can only modify .bashrc, so this may not change the system Java that is actually used.)
export JAVA_HOME=/home/edu1/tools/jdk-1.8.0_231
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=$JAVA_HOME/lib/tools.jar
# Hadoop
export HADOOP_INSTALL=/home/edu1/tools/hadoop-2.6.0
export PATH=$PATH:$HADOOP_INSTALL/bin
export LD_LIBRARY_PATH=$HADOOP_INSTALL/lib/native
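As a quick check that the variables above are actually picked up (a sketch, nothing Hail-specific):

import os

# Print the paths exported in .bashrc; empty output means the file
# was not sourced in this shell.
for var in ("SPARK_HOME", "HAIL_HOME", "JAVA_HOME", "PYTHONPATH"):
    print(var, "=", os.environ.get(var))

import hail
print("hail resolves to:", hail.__file__)  # should be under the conda env above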
</spark/conf/spark-defaults.conf>
spark.master spark://training.server:7077
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator is.hail.kryo.HailKryoRegistrator
spark.speculation true
spark.driver.memory 37414m
spark.executor.memory 37414m
spark.executor.instances 1
spark.driver.extraClassPath /home/edu1/miniconda2/envs/Hail-on-spark/lib/python3.7/site-packages/hail/backend/hail-all-spark.jar
spark.executor.extraClassPath /home/edu1/miniconda2/envs/Hail-on-spark/lib/python3.7/site-packages/hail/backend/hail-all-spark.jar
spark.jars /home/edu1/miniconda2/envs/Hail-on-spark/lib/python3.7/site-packages/hail/backend/hail-all-spark.jar
spark.eventLog.enabled true
spark.history.fs.logDirectory file:/tmp/spark-events
spark.eventLog.dir file:/tmp/spark-events
spark.ui.reverseProxy true
spark.ui.reverseProxyUrl spark://training.server/spark
spark.executor.extraJavaOptions -Dlog4j.debug=true
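For what it's worth, the Hail-related entries above can also be set programmatically when building the SparkContext; this sketch assumes the same jar path and master URL as in the config:

from pyspark import SparkConf, SparkContext

# Mirror the Hail entries from spark-defaults.conf in code.
jar = "/home/edu1/miniconda2/envs/Hail-on-spark/lib/python3.7/site-packages/hail/backend/hail-all-spark.jar"
conf = (SparkConf()
        .setMaster("spark://training.server:7077")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryo.registrator", "is.hail.kryo.HailKryoRegistrator")
        .set("spark.jars", jar)
        .set("spark.driver.extraClassPath", jar)
        .set("spark.executor.extraClassPath", jar))
sc = SparkContext(conf=conf)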
</spark/conf/spark-env.sh>
export SPARK_WORKER_INSTANCES=1
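And for context, the kind of GWAS I eventually want to run on the 1000 Genomes data, sketched after the standard Hail pattern (the VCF path and the phenotype are placeholders for illustration, not my real data):

import hail as hl

hl.init()

# Placeholder input: any 1000 Genomes-style VCF.
mt = hl.import_vcf("1kg.vcf.bgz", reference_genome="GRCh37")

# Hypothetical phenotype, generated just so the sketch is self-contained.
mt = mt.annotate_cols(pheno=hl.rand_norm())

# Basic QC, then a linear regression of phenotype on genotype dosage.
mt = hl.variant_qc(mt)
mt = mt.filter_rows(mt.variant_qc.AF[1] > 0.01)
gwas = hl.linear_regression_rows(
    y=mt.pheno,
    x=mt.GT.n_alt_alleles(),
    covariates=[1.0],  # intercept
)
gwas.show()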