Config setup - Hail Installation

Hello Everyone,

I am trying to install Hail on Ubuntu, but it looks like the documentation has been updated.

In the current docs I no longer see the config statements below, which were present earlier:

import pyspark
import hail as hl

# hail_jars is the path to the Hail jar (hail-all-spark.jar) on the driver, set elsewhere.
conf = pyspark.SparkConf().setAll([
    ('spark.jars', str(hail_jars)),
    ('spark.driver.extraClassPath', str(hail_jars)),
    ('spark.executor.extraClassPath', './hail-all-spark.jar'),
    ('spark.serializer', 'org.apache.spark.serializer.KryoSerializer'),
    ('spark.kryo.registrator', 'is.hail.kryo.HailKryoRegistrator'),
    ('spark.driver.memory', '180g'),
    ('spark.executor.memory', '180g'),
    ('spark.local.dir', '/t1,/data/abcd/spark')
])
sc = pyspark.SparkContext('local[*]', 'Hail', conf=conf)
hl.init(sc)

Does that mean Hail installation has been simplified and we no longer have to do all of these config steps?

Under the Linux and Spark cluster sections of the recently updated docs I don't see any config statements like the ones above, so do we no longer need to do all of this?

If any of the above config statements are still required, could you please point me to the page in the updated Hail docs where I can find them?

Yes, that’s exactly what this means! You may still want to update memory settings / temp dirs specific to your cluster, but I think it’ll be easier to do that as arguments to the pyspark executable.
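
For reference, here is a minimal sketch of what that can look like with a pip-installed Hail 0.2, which bundles the jar and sets up the Spark classpath for you. The memory values and temp directory are just the ones from the question, not recommendations, and passing them through hl.init's spark_conf argument is one option alongside giving the equivalent flags (e.g. --driver-memory, --conf) to the pyspark executable.

import hail as hl

# Sketch only: with a pip-installed Hail, no manual SparkConf / jar wiring is needed.
# Cluster-specific settings can be supplied via spark_conf; values here are illustrative.
hl.init(
    master='local[*]',
    spark_conf={
        'spark.driver.memory': '180g',
        'spark.executor.memory': '180g',
        'spark.local.dir': '/t1,/data/abcd/spark',
    },
)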