Anyone also trying to run Hail on AWS EMR clusters and having issues? Let's huddle

I am trying to run Hail on a Spark cluster on AWS (we don’t have access to GCS) and am running into all kinds of issues. In particular, the VEP annotation step just fails without any error message, other than this:

    [Stage 0:==>                                                (2438 + 24) / 54195]
    [Stage 0:==>                                                (2438 + 21) / 54195]
    Command exiting with ret '1'

Since most people here seem to be running on GCS/Terra, I am looking for kindred spirits to help each other debug and optimize on AWS EMR clusters… I got most of it working except for the VEP annotation of the full UKBB 200K dataset, but I am not sure whether I just need a bigger cluster.
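
For context, the failing step is essentially a call like this (a minimal sketch, not my exact pipeline: the bucket paths and the VEP config location are placeholders):

    import hail as hl

    # Sketch: initialize Hail on the EMR-managed Spark cluster, then run VEP.
    # All paths below are placeholders, not my actual locations.
    hl.init(default_reference='GRCh38')

    mt = hl.read_matrix_table('s3://my-bucket/ukbb_200k.mt')
    mt = hl.vep(mt, 'file:///vep_data/vep-config.json')  # VEP config staged on every node
    mt.write('s3://my-bucket/ukbb_200k.vep.mt', overwrite=True)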

Ping me if you are interested in running Hail on an EMR cluster on AWS!

Thon

Hi,

We were doing a similar analysis. However, we were facing a different error.


Yes, I ran into this as well… I set it to 10,000, but you are saying it only needs to be raised slightly?

And are you talking about “fs.s3.maxConnections”, or a different setting?

Here is my current Spark config setting…

    def _get_spark_conf(self):
        '''The Spark configurations needed for Hail, Glow, and Delta Lake on Spark,
        in EMR classification format.'''
        s = [
            {
                # Register the Hail JAR and its Kryo serializer with Spark.
                "Classification": "spark-defaults",
                "Properties": {
                    "spark.jars": "/usr/local/lib/python3.7/site-packages/hail/backend/hail-all-spark.jar",
                    "spark.kryo.registrator": "is.hail.kryo.HailKryoRegistrator",
                    "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
                }
            },
            {
                # Let EMR size the executors and driver to use the whole cluster.
                "Classification": "spark",
                "Properties": {
                    "maximizeResourceAllocation": "true"
                }
            },
            {
                # Disable YARN's virtual-memory check, which can kill healthy JVM containers.
                "Classification": "yarn-site",
                "Properties": {
                    "yarn.nodemanager.vmem-check-enabled": "false"
                }
            },
            {
                # Raise EMRFS's S3 connection-pool limit.
                "Classification": "emrfs-site",
                "Properties": {
                    "fs.s3.maxConnections": "10000"
                }
            }
        ]
        return s
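
In case it is useful, here is roughly how that classification list plugs into an EMR launch request via boto3 (a minimal sketch, not our production launcher: the release label, instance types, counts, and IAM roles are placeholders):

    import boto3

    spark_conf = [...]  # the classification list from _get_spark_conf() above

    emr = boto3.client('emr')
    emr.run_job_flow(
        Name='hail-emr',                     # placeholder cluster name
        ReleaseLabel='emr-5.31.0',           # placeholder EMR release
        Applications=[{'Name': 'Spark'}],
        Configurations=spark_conf,           # applied at cluster creation
        Instances={
            'MasterInstanceType': 'm5.xlarge',   # placeholder instance types
            'SlaveInstanceType': 'r5.4xlarge',
            'InstanceCount': 10,
            'KeepJobFlowAliveWhenNoSteps': True,
        },
        JobFlowRole='EMR_EC2_DefaultRole',
        ServiceRole='EMR_DefaultRole',
    )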

Show me yours! :slight_smile:

Thon

Thanks for the code snippet; it helps me understand your setup better.

Yes, I am talking about fs.s3.maxConnections. We had to set it to roughly 1.2 × the total number of cores in the cluster.
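
Generically, that rule of thumb looks like this (a sketch only; the helper name and the 20-node r5.4xlarge example are illustrative, not from our client setup):

    # Rule of thumb: fs.s3.maxConnections ≈ 1.2 × total cores across the cluster.
    def s3_max_connections(instance_count, cores_per_instance):
        return str(int(1.2 * instance_count * cores_per_instance))

    emrfs_site = {
        "Classification": "emrfs-site",
        "Properties": {
            # e.g. 20 r5.4xlarge workers × 16 vCPUs -> "384"
            "fs.s3.maxConnections": s3_max_connections(20, 16),
        },
    }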

I can’t share our Spark configurations here, as they are in our client’s environment.
