Hi @danking,
Yes, it's working. I am able to see the matrix table. However, one quick question:
I am used to working in Jupyter notebooks and Python, so I wanted to use PySpark. Running the command below from the instructions in the documentation,

```sh
(pip3 show hail | grep Location | awk -F' ' '{print $2 "/hail"}')
```
I got the Hail directory path, which is as shown below:

```
/home/abcd/.pyenv/versions/3.7.2/envs/bio/lib/python3.7/site-packages/hail
```
Later, when I run the following commands in my Jupyter notebook:

```python
from pathlib import Path

import pyspark
import hail as hl

hail_home = Path('/home/abcd/.pyenv/versions/3.7.2/envs/bio/lib/python3.7/site-packages/hail')
hail_jars = hail_home / 'build' / 'libs' / 'hail-all-spark.jar'

conf = pyspark.SparkConf().setAll([
    ('spark.jars', str(hail_jars)),
    ('spark.driver.extraClassPath', str(hail_jars)),
    ('spark.executor.extraClassPath', './hail-all-spark.jar'),
    ('spark.serializer', 'org.apache.spark.serializer.KryoSerializer'),
    ('spark.kryo.registrator', 'is.hail.kryo.HailKryoRegistrator'),
    ('spark.driver.memory', '80g'),
    ('spark.executor.memory', '80g'),
    ('spark.local.dir', '/tmp,/data/volume03/spark'),
])
sc = pyspark.SparkContext('local[*]', 'Hail', conf=conf)
hl.init(sc)
```
I get the error shown below:

```
TypeError: 'JavaPackage' object is not callable
```
Is anything wrong with my `hail_home` path?
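A quick check of my own (a minimal sketch, using the `hail_jars` variable from above, not something from the Hail docs) confirms that the jar path the docs produce does not exist:

```python
# My own sanity check: does the jar the docs point to actually exist on disk?
print(hail_jars, hail_jars.exists())  # prints False on my install
```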
I realize that I don't have the folder `build` under `hail_home`, which is causing the issue while identifying the `java_package`.
But the command in the docs gives the path below for `hail_home`:

```
/home/abcd/.pyenv/versions/3.7.2/envs/bio/lib/python3.7/site-packages/hail
```
Post update:
I see it's under the `backend` folder. May I check why the path is different? Have I installed it in an incorrect location, since the documentation mentions a different location?
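For reference, one quick way to locate the jar (a sketch of my own, just using `pathlib`):

```python
# Recursively search the installed hail package for the Spark jar.
from pathlib import Path

hail_home = Path('/home/abcd/.pyenv/versions/3.7.2/envs/bio/lib/python3.7/site-packages/hail')
print(list(hail_home.rglob('hail-all-spark.jar')))
# On my install this prints the jar under hail/backend/, not hail/build/libs/
```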
So I updated it to `hail_jars = hail_home / 'backend' / 'hail-all-spark.jar'`.
Now it works, I guess.
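For anyone else who hits this, the full working setup on my machine is the same config as above with just the corrected jar path:

```python
from pathlib import Path

import pyspark
import hail as hl

hail_home = Path('/home/abcd/.pyenv/versions/3.7.2/envs/bio/lib/python3.7/site-packages/hail')
# The jar ships under backend/ in this install, not build/libs/ as I expected.
hail_jars = hail_home / 'backend' / 'hail-all-spark.jar'

conf = pyspark.SparkConf().setAll([
    ('spark.jars', str(hail_jars)),
    ('spark.driver.extraClassPath', str(hail_jars)),
    ('spark.executor.extraClassPath', './hail-all-spark.jar'),
    ('spark.serializer', 'org.apache.spark.serializer.KryoSerializer'),
    ('spark.kryo.registrator', 'is.hail.kryo.HailKryoRegistrator'),
    ('spark.driver.memory', '80g'),
    ('spark.executor.memory', '80g'),
    ('spark.local.dir', '/tmp,/data/volume03/spark'),
])
sc = pyspark.SparkContext('local[*]', 'Hail', conf=conf)
hl.init(sc)
```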