Submitting to cluster 'kc'...
gcloud command:
gcloud dataproc jobs submit pyspark pipeline/regional_constraint.py \
    --files= \
    --py-files=/var/folders/xq/8jnhrt2s2h58ts2v0br5g8gm0000gp/T/pyscripts_cxsvxj2n.zip \
    --properties= \
    -- \
    --search-for-simul-breaks \
    --slack-channel \
    @kc (she/her) \
    --chisq-threshold \
    6.6
Job [70781f611e214b7f9cf269cff9615de6] submitted.
Waiting for job output...
2021-09-10 17:57:55 WARN SparkConf:69 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2021-09-10 17:57:55 WARN SparkConf:69 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2021-09-10 17:57:56 WARN SparkConf:69 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2021-09-10 17:57:59 WARN SparkConf:69 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2021-09-10 17:57:59 INFO SparkContext:57 - Running Spark version 3.1.1
2021-09-10 17:57:59 INFO ResourceUtils:57 - ==============================================================
2021-09-10 17:57:59 INFO ResourceUtils:57 - No custom resources configured for spark.driver.
2021-09-10 17:57:59 INFO ResourceUtils:57 - ==============================================================
2021-09-10 17:57:59 INFO SparkContext:57 - Submitted application: Hail
2021-09-10 17:57:59 INFO SparkContext:57 - Spark configuration: spark.app.name=Hail spark.app.startTime=1631296679651 spark.driver.extraClassPath=/opt/conda/miniconda3/lib/python3.8/site-packages/hail/backend/hail-all-spark.jar spark.driver.extraJavaOptions=-Xss4M spark.driver.maxResultSize=0 spark.driver.memory=41g spark.dynamicAllocation.enabled=true spark.dynamicAllocation.maxExecutors=10000 spark.dynamicAllocation.minExecutors=1 spark.eventLog.dir=hdfs://kc-m/user/spark/eventlog spark.eventLog.enabled=true spark.executor.cores=4 spark.executor.extraClassPath=./hail-all-spark.jar spark.executor.extraJavaOptions=-Xss4M spark.executor.instances=2 spark.executor.memory=15g spark.executorEnv.OPENBLAS_NUM_THREADS=1 spark.executorEnv.PYTHONHASHSEED=0 spark.hadoop.hive.execution.engine=mr spark.hadoop.io.compression.codecs=org.apache.hadoop.io.compress.DefaultCodec,is.hail.io.compress.BGzipCodec,is.hail.io.compress.BGzipCodecTbi,org.apache.hadoop.io.compress.GzipCodec spark.hadoop.mapreduce.input.fileinputformat.split.minsize=0 spark.history.fs.logDirectory=hdfs://kc-m/user/spark/eventlog spark.jars=file:/opt/conda/miniconda3/lib/python3.8/site-packages/hail/backend/hail-all-spark.jar spark.kryo.registrator=is.hail.kryo.HailKryoRegistrator spark.kryoserializer.buffer.max=1g spark.logConf=true spark.master=yarn spark.repl.local.jars=file:///opt/conda/miniconda3/lib/python3.8/site-packages/hail/backend/hail-all-spark.jar spark.rpc.message.maxSize=512 spark.scheduler.minRegisteredResourcesRatio=0.0 spark.scheduler.mode=FAIR spark.serializer=org.apache.spark.serializer.KryoSerializer spark.shuffle.service.enabled=true spark.speculation=true spark.sql.adaptive.enabled=true spark.sql.autoBroadcastJoinThreshold=115m spark.sql.catalogImplementation=hive
spark.sql.cbo.enabled=true spark.sql.cbo.joinReorder.enabled=true spark.submit.deployMode=client spark.submit.pyFiles=/tmp/70781f611e214b7f9cf269cff9615de6/pyscripts_cxsvxj2n.zip spark.task.maxFailures=20 spark.ui.port=0 spark.ui.showConsoleProgress=false spark.yarn.am.memory=640m spark.yarn.dist.jars=file:///opt/conda/miniconda3/lib/python3.8/site-packages/hail/backend/hail-all-spark.jar spark.yarn.dist.pyFiles=file:///tmp/70781f611e214b7f9cf269cff9615de6/pyscripts_cxsvxj2n.zip spark.yarn.executor.memoryOverhead=25g spark.yarn.historyServer.address=kc-m:18080 spark.yarn.isPython=true spark.yarn.jars=local:/usr/lib/spark/jars/* spark.yarn.tags=dataproc_hash_ccb0a766-b2ae-3f8b-a5f5-2ae3f68956d9,dataproc_job_70781f611e214b7f9cf269cff9615de6,dataproc_master_index_0,dataproc_uuid_72470ed4-7524-3bd1-9dbd-21f6d5a6883a spark.yarn.unmanagedAM.enabled=true 2021-09-10 17:57:59 INFO ResourceProfile:57 - Default ResourceProfile created, executor resources: Map(memoryOverhead -> name: memoryOverhead, amount: 25600, script: , vendor: , cores -> name: cores, amount: 4, script: , vendor: , memory -> name: memory, amount: 15360, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) 2021-09-10 17:57:59 INFO ResourceProfile:57 - Limiting resource is cpus at 4 tasks per executor 2021-09-10 17:57:59 INFO ResourceProfileManager:57 - Added ResourceProfile id: 0 2021-09-10 17:57:59 INFO SecurityManager:57 - Changing view acls to: root 2021-09-10 17:57:59 INFO SecurityManager:57 - Changing modify acls to: root 2021-09-10 17:57:59 INFO SecurityManager:57 - Changing view acls groups to: 2021-09-10 17:57:59 INFO SecurityManager:57 - Changing modify acls groups to: 2021-09-10 17:57:59 INFO SecurityManager:57 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 2021-09-10 17:58:00 INFO Utils:57 - Successfully started service 'sparkDriver' on port 46295. 2021-09-10 17:58:00 INFO SparkEnv:57 - Registering MapOutputTracker 2021-09-10 17:58:00 INFO SparkEnv:57 - Registering BlockManagerMaster 2021-09-10 17:58:00 INFO BlockManagerMasterEndpoint:57 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 2021-09-10 17:58:00 INFO BlockManagerMasterEndpoint:57 - BlockManagerMasterEndpoint up 2021-09-10 17:58:00 INFO SparkEnv:57 - Registering BlockManagerMasterHeartbeat 2021-09-10 17:58:00 INFO DiskBlockManager:57 - Created local directory at /hadoop/spark/tmp/blockmgr-db1bde2e-eab3-4bf2-8a4f-976a6b5719fc 2021-09-10 17:58:00 INFO MemoryStore:57 - MemoryStore started with capacity 21.7 GiB 2021-09-10 17:58:00 INFO SparkEnv:57 - Registering OutputCommitCoordinator 2021-09-10 17:58:00 INFO log:169 - Logging initialized @5852ms to org.sparkproject.jetty.util.log.Slf4jLog 2021-09-10 17:58:00 INFO Server:375 - jetty-9.4.36.v20210114; built: 2021-01-14T16:44:28.689Z; git: 238ec6997c7806b055319a6d11f8ae7564adc0de; jvm 1.8.0_282-b08 2021-09-10 17:58:00 INFO Server:415 - Started @5983ms 2021-09-10 17:58:00 INFO AbstractConnector:331 - Started ServerConnector@60fcab16{HTTP/1.1, (http/1.1)}{0.0.0.0:37617} 2021-09-10 17:58:00 INFO Utils:57 - Successfully started service 'SparkUI' on port 37617. 
2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@5611b339{/jobs,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@770209f4{/jobs/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@3d95051d{/jobs/job,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@5bbada0d{/jobs/job/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@4aef86a7{/stages,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@73c08ed6{/stages/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@1fe83ecc{/stages/stage,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@79b728c3{/stages/stage/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@2117bc3{/stages/pool,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@5eaeaf9a{/stages/pool/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@20e59df7{/storage,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@6e8b63b7{/storage/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@52bde63c{/storage/rdd,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@34085268{/storage/rdd/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@993e874{/environment,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@26f52f99{/environment/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@171003af{/executors,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@dfcaca0{/executors/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@2fd814b9{/executors/threadDump,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@13939fe9{/executors/threadDump/json,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@57c31fcc{/static,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@305743bc{/,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@254df5f{/api,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@69099211{/jobs/job/kill,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@1748de45{/stages/stage/kill,null,AVAILABLE,@Spark} 2021-09-10 17:58:00 INFO SparkUI:57 - Bound SparkUI to 0.0.0.0, and started at http://kc-m.c.broad-mpg-gnomad.internal:37617 2021-09-10 17:58:00 INFO SparkContext:57 - Added JAR file:/opt/conda/miniconda3/lib/python3.8/site-packages/hail/backend/hail-all-spark.jar at 
spark://kc-m.c.broad-mpg-gnomad.internal:46295/jars/hail-all-spark.jar with timestamp 1631296679651 2021-09-10 17:58:00 INFO FairSchedulableBuilder:57 - Creating Fair Scheduler pools from default file: fairscheduler.xml 2021-09-10 17:58:00 INFO FairSchedulableBuilder:57 - Created pool: default, schedulingMode: FAIR, minShare: 0, weight: 1 2021-09-10 17:58:01 INFO Utils:57 - Using initial executors = 2, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances 2021-09-10 17:58:01 INFO RMProxy:134 - Connecting to ResourceManager at kc-m/10.128.0.41:8032 2021-09-10 17:58:01 INFO AHSProxy:42 - Connecting to Application History server at kc-m/10.128.0.41:10200 2021-09-10 17:58:01 INFO Client:57 - Requesting a new application from cluster with 30 NodeManagers 2021-09-10 17:58:02 INFO Configuration:2795 - resource-types.xml not found 2021-09-10 17:58:02 INFO ResourceUtils:442 - Unable to find 'resource-types.xml'. 2021-09-10 17:58:02 INFO Client:57 - Verifying our application has not requested more than the maximum memory capability of the cluster (48048 MB per container) 2021-09-10 17:58:02 INFO Client:57 - Will allocate AM container, with 1024 MB memory including 384 MB overhead 2021-09-10 17:58:02 INFO Client:57 - Setting up container launch context for our AM 2021-09-10 17:58:02 INFO Client:57 - Setting up the launch environment for our AM container 2021-09-10 17:58:02 INFO Client:57 - Preparing resources for our AM container 2021-09-10 17:58:02 INFO Client:57 - Uploading resource file:/opt/conda/miniconda3/lib/python3.8/site-packages/hail/backend/hail-all-spark.jar -> hdfs://kc-m/user/root/.sparkStaging/application_1631280245753_0005/hail-all-spark.jar 2021-09-10 17:58:03 INFO Client:57 - Uploading resource file:/usr/lib/spark/python/lib/pyspark.zip -> hdfs://kc-m/user/root/.sparkStaging/application_1631280245753_0005/pyspark.zip 2021-09-10 17:58:03 INFO Client:57 - Uploading resource file:/usr/lib/spark/python/lib/py4j-0.10.9-src.zip -> hdfs://kc-m/user/root/.sparkStaging/application_1631280245753_0005/py4j-0.10.9-src.zip 2021-09-10 17:58:03 INFO Client:57 - Uploading resource file:/tmp/70781f611e214b7f9cf269cff9615de6/pyscripts_cxsvxj2n.zip -> hdfs://kc-m/user/root/.sparkStaging/application_1631280245753_0005/pyscripts_cxsvxj2n.zip 2021-09-10 17:58:03 INFO Client:57 - Uploading resource file:/hadoop/spark/tmp/spark-c11da9d2-3e67-4dd2-b8dc-0042b7f772cd/__spark_conf__7345539158989889263.zip -> hdfs://kc-m/user/root/.sparkStaging/application_1631280245753_0005/__spark_conf__.zip 2021-09-10 17:58:03 INFO SecurityManager:57 - Changing view acls to: root 2021-09-10 17:58:03 INFO SecurityManager:57 - Changing modify acls to: root 2021-09-10 17:58:03 INFO SecurityManager:57 - Changing view acls groups to: 2021-09-10 17:58:03 INFO SecurityManager:57 - Changing modify acls groups to: 2021-09-10 17:58:03 INFO SecurityManager:57 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 2021-09-10 17:58:03 INFO Client:57 - Submitting application application_1631280245753_0005 to ResourceManager 2021-09-10 17:58:03 INFO YarnClientImpl:329 - Submitted application application_1631280245753_0005 2021-09-10 17:58:04 INFO Client:57 - Application report for application_1631280245753_0005 (state: ACCEPTED) 2021-09-10 17:58:04 INFO Client:57 - client token: N/A diagnostics: AM container is 
launched, waiting for AM container to Register with RM ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1631296683314 final status: UNDEFINED tracking URL: http://kc-m:8088/proxy/application_1631280245753_0005/ user: root 2021-09-10 17:58:04 INFO SecurityManager:57 - Changing view acls to: root 2021-09-10 17:58:04 INFO SecurityManager:57 - Changing modify acls to: root 2021-09-10 17:58:04 INFO SecurityManager:57 - Changing view acls groups to: 2021-09-10 17:58:04 INFO SecurityManager:57 - Changing modify acls groups to: 2021-09-10 17:58:04 INFO SecurityManager:57 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 2021-09-10 17:58:04 WARN SparkConf:69 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead. 2021-09-10 17:58:04 INFO RMProxy:134 - Connecting to ResourceManager at kc-m/10.128.0.41:8030 2021-09-10 17:58:04 INFO YarnRMClient:57 - Registering the ApplicationMaster 2021-09-10 17:58:04 INFO YarnClientSchedulerBackend:57 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> kc-m, PROXY_URI_BASES -> http://kc-m:8088/proxy/application_1631280245753_0005), /proxy/application_1631280245753_0005 2021-09-10 17:58:04 INFO ApplicationMaster:57 - Preparing Local resources 2021-09-10 17:58:04 INFO ApplicationMaster:57 - =============================================================================== Default YARN executor launch context: env: SPARK_WORKER_WEBUI_PORT -> 18081 SPARK_ENV_LOADED -> 1 CLASSPATH -> ./hail-all-spark.jar{{PWD}}{{PWD}}/__spark_conf__{{PWD}}/__spark_libs__/*/usr/lib/spark/jars/*:/etc/hive/conf:/usr/share/java/mysql.jar:/usr/local/share/google/dataproc/lib/*{{PWD}}/__spark_conf__/__hadoop_conf__ SPARK_LOG_DIR -> /var/log/spark SPARK_LOCAL_DIRS -> /hadoop/spark/tmp SPARK_DIST_CLASSPATH -> :/etc/hive/conf:/usr/share/java/mysql.jar:/usr/local/share/google/dataproc/lib/* SPARK_SUBMIT_OPTS -> -Dscala.usejavacp=true SPARK_CONF_DIR -> /usr/lib/spark/conf PYTHONHASHSEED -> 0 SPARK_HOME -> /usr/lib/spark/ PYTHONPATH -> /usr/lib/spark/python/lib/py4j-0.10.9-src.zip:/usr/lib/spark/python/lib/pyspark.zip{{PWD}}/pyspark.zip{{PWD}}/py4j-0.10.9-src.zip{{PWD}}/pyscripts_cxsvxj2n.zip SPARK_MASTER_PORT -> 7077 OPENBLAS_NUM_THREADS -> 1 SPARK_WORKER_DIR -> /hadoop/spark/work SPARK_WORKER_PORT -> 7078 SPARK_DAEMON_MEMORY -> 4000m SPARK_MASTER_WEBUI_PORT -> 18080 SPARK_LIBRARY_PATH -> :/usr/lib/hadoop/lib/native SPARK_SCALA_VERSION -> 2.12 command: {{JAVA_HOME}}/bin/java \ -server \ -Xmx15360m \ '-Xss4M' \ -Djava.io.tmpdir={{PWD}}/tmp \ '-Dspark.driver.port=46295' \ '-Dspark.ui.port=0' \ '-Dspark.rpc.message.maxSize=512' \ -Dspark.yarn.app.container.log.dir= \ -XX:OnOutOfMemoryError='kill %p' \ org.apache.spark.executor.YarnCoarseGrainedExecutorBackend \ --driver-url \ spark://CoarseGrainedScheduler@kc-m.c.broad-mpg-gnomad.internal:46295 \ --executor-id \ \ --hostname \ \ --cores \ 4 \ --app-id \ application_1631280245753_0005 \ --resourceProfileId \ 0 \ --user-class-path \ file:$PWD/__app__.jar \ --user-class-path \ file:$PWD/hail-all-spark.jar \ 1>/stdout \ 2>/stderr resources: pyscripts_cxsvxj2n.zip -> resource { scheme: "hdfs" host: "kc-m" port: -1 file: 
"/user/root/.sparkStaging/application_1631280245753_0005/pyscripts_cxsvxj2n.zip" } size: 507728 timestamp: 1631296683093 type: FILE visibility: PRIVATE __spark_conf__ -> resource { scheme: "hdfs" host: "kc-m" port: -1 file: "/user/root/.sparkStaging/application_1631280245753_0005/__spark_conf__.zip" } size: 267095 timestamp: 1631296683254 type: ARCHIVE visibility: PRIVATE pyspark.zip -> resource { scheme: "hdfs" host: "kc-m" port: -1 file: "/user/root/.sparkStaging/application_1631280245753_0005/pyspark.zip" } size: 886359 timestamp: 1631296683026 type: FILE visibility: PRIVATE py4j-0.10.9-src.zip -> resource { scheme: "hdfs" host: "kc-m" port: -1 file: "/user/root/.sparkStaging/application_1631280245753_0005/py4j-0.10.9-src.zip" } size: 41587 timestamp: 1631296683057 type: FILE visibility: PRIVATE hail-all-spark.jar -> resource { scheme: "hdfs" host: "kc-m" port: -1 file: "/user/root/.sparkStaging/application_1631280245753_0005/hail-all-spark.jar" } size: 99426066 timestamp: 1631296682957 type: FILE visibility: PRIVATE =============================================================================== 2021-09-10 17:58:04 INFO Utils:57 - Using initial executors = 2, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances 2021-09-10 17:58:04 INFO YarnAllocator:57 - Resource profile 0 doesn't exist, adding it 2021-09-10 17:58:04 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:57 - ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@kc-m.c.broad-mpg-gnomad.internal:46295) 2021-09-10 17:58:04 INFO YarnAllocator:57 - Will request 2 executor container(s) for ResourceProfile Id: 0, each with 4 core(s) and 40960 MB memory. with custom resources: 2021-09-10 17:58:04 INFO YarnAllocator:57 - Submitted 2 unlocalized container requests. 2021-09-10 17:58:04 INFO ApplicationMaster:57 - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals 2021-09-10 17:58:05 INFO Client:57 - Application report for application_1631280245753_0005 (state: RUNNING) 2021-09-10 17:58:05 INFO Client:57 - client token: N/A diagnostics: N/A ApplicationMaster host: 10.128.0.41 ApplicationMaster RPC port: -1 queue: default start time: 1631296683314 final status: UNDEFINED tracking URL: http://kc-m:8088/proxy/application_1631280245753_0005/ user: root 2021-09-10 17:58:05 INFO YarnClientSchedulerBackend:57 - Application application_1631280245753_0005 has started running. 2021-09-10 17:58:05 INFO YarnScheduler:57 - Starting speculative execution thread 2021-09-10 17:58:05 INFO Utils:57 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35657. 
2021-09-10 17:58:05 INFO NettyBlockTransferService:81 - Server created on kc-m.c.broad-mpg-gnomad.internal:35657
2021-09-10 17:58:05 INFO BlockManager:57 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2021-09-10 17:58:05 INFO BlockManagerMaster:57 - Registering BlockManager BlockManagerId(driver, kc-m.c.broad-mpg-gnomad.internal, 35657, None)
2021-09-10 17:58:05 INFO BlockManagerMasterEndpoint:57 - Registering block manager kc-m.c.broad-mpg-gnomad.internal:35657 with 21.7 GiB RAM, BlockManagerId(driver, kc-m.c.broad-mpg-gnomad.internal, 35657, None)
2021-09-10 17:58:05 INFO BlockManagerMaster:57 - Registered BlockManager BlockManagerId(driver, kc-m.c.broad-mpg-gnomad.internal, 35657, None)
2021-09-10 17:58:05 INFO BlockManager:57 - external shuffle service port = 7337
2021-09-10 17:58:05 INFO BlockManager:57 - Initialized BlockManager: BlockManagerId(driver, kc-m.c.broad-mpg-gnomad.internal, 35657, None)
2021-09-10 17:58:05 INFO ServerInfo:57 - Adding filter to /metrics/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2021-09-10 17:58:05 INFO ContextHandler:916 - Started o.s.j.s.ServletContextHandler@174badf5{/metrics/json,null,AVAILABLE,@Spark}
2021-09-10 17:58:05 INFO SingleEventLogFileWriter:57 - Logging events to hdfs://kc-m/user/spark/eventlog/application_1631280245753_0005.inprogress
2021-09-10 17:58:05 INFO Utils:57 - Using initial executors = 2, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
2021-09-10 17:58:05 INFO YarnAllocator:57 - Resource profile 0 doesn't exist, adding it
2021-09-10 17:58:05 INFO YarnClientSchedulerBackend:57 - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
2021-09-10 17:58:05 INFO Hail:28 - SparkUI: http://kc-m.c.broad-mpg-gnomad.internal:37617
Running on Apache Spark version 3.1.1
SparkUI available at http://kc-m.c.broad-mpg-gnomad.internal:37617
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.74-0c3a74d12093
LOGGING: writing to /RMC.log
INFO (regional_missense_constraint 356): Searching for two simultaneous breaks in transcripts that didn't have a single significant break...
INFO (regional_missense_constraint 376): Getting start and end positions and total size for each transcript...
INFO (regional_missense_constraint_generic 247): Total number of bases in the exome: 54426835
INFO (regional_missense_constraint_generic 248): Total number of missense variants in gnomAD exomes: 5257859
INFO (regional_missense_constraint_generic 251): Getting average bases between missense variants and returning...
INFO (regional_missense_constraint 422): Minimum window size (window size needed to observe 10 missense variants on average): 100
INFO (regional_missense_constraint 428): Searching for transcripts with simultaneous breaks...
INFO (constraint_utils 1272): Getting smallest window end and post-window positions...
INFO (constraint_utils 1273): Also getting maximum simultaneous break size (0.900000 * largest transcript size)...
INFO (constraint_utils 1008): Annotating each transcript with max window size...
2021-09-10 18:01:37 Hail: INFO: Ordering unsorted dataset with network shuffle
INFO (constraint_utils 1011): Maximum window size: 2083174
INFO (constraint_utils 1058): HT count: 22750897
INFO (constraint_utils 1287): Checking all possible window sizes...
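
Note: the minimum window size of 100 reported above is consistent with the two exome totals in the log, assuming the average number of bases between missense variants is rounded to the nearest integer before being scaled to the 10 variants that must be observed per window. A minimal sketch of that arithmetic; the rounding rule and function name are assumptions, not taken from the pipeline code:

    # Hypothetical reconstruction of the window-size arithmetic reported in the log above.
    # Only the two totals and the final value of 100 come from the log; the rounding is an assumption.
    EXOME_BASES = 54_426_835       # "Total number of bases in the exome"
    MISSENSE_VARIANTS = 5_257_859  # "Total number of missense variants in gnomAD exomes"
    REQUIRED_MISSENSE = 10         # missense variants to observe per window on average

    def min_window_size(total_bases: int, total_missense: int, required_obs: int) -> int:
        """Average bases between missense variants, scaled to the required observation count."""
        avg_bases_per_missense = round(total_bases / total_missense)  # 54426835 / 5257859 ~= 10.35 -> 10
        return avg_bases_per_missense * required_obs                  # 10 * 10 = 100

    print(min_window_size(EXOME_BASES, MISSENSE_VARIANTS, REQUIRED_MISSENSE))  # 100
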
+ 2) / 8043] 2021-09-10 18:14:15 Hail: INFO: wrote table with 22750897 rows in 8043 partitions to gs://regional_missense_constraint/temp/simul_break_100_temp.ht Total size: 136.89 GiB * Rows: 136.89 GiB * Globals: 1.89 KiB * Smallest partition: 105 rows (7.35 KiB) * Largest partition: 52494 rows (17.12 GiB) ---------------------------------------- Global fields: 'plateau_models': struct { total: dict> } 'plateau_x_models': struct { total: dict> } 'plateau_y_models': struct { total: dict> } 'coverage_model': tuple ( float64, float64 ) 'plateau_models_1': struct { total: dict> } 'plateau_x_models_1': struct { total: dict> } 'plateau_y_models_1': struct { total: dict> } 'coverage_model_1': tuple ( float64, float64 ) 'obs_mis_10': set 'obs_mis_10_window_size': int32 'obs_mis_20': set 'obs_mis_20_window_size': int32 'obs_mis_50': set 'obs_mis_50_window_size': int32 'all_simul_transcripts': set ---------------------------------------- Row fields: 'transcript': str 'locus': locus 'min_window_end': int32 'start_pos': int32 'end_pos': int32 'transcript_size': int32 '_localize': bool 'pos_per_transcript': array 'n_pos_per_transcript': int32 'mu_snp': float64 'total_exp': float64 '_mu_scan': dict 'total_mu': float64 'cumulative_obs': dict 'observed': int32 'cumulative_exp': float64 'total_obs': int64 'reverse': struct { obs: int64, exp: float64 } 'forward_oe': float64 'overall_oe': float64 'break_sizes': array 'break_chisqs': array 'window_ends': array 'post_window_pos': int32 'post_window_index': int32 ---------------------------------------- Key: ['locus', 'transcript'] ---------------------------------------- INFO (constraint_utils 1217): Preparing HT to search for two breaks... INFO (constraint_utils 1094): Annotating HT with obs, exp, OE values for pre-window positions... INFO (constraint_utils 1096): Pre-window values are cumulative values minus values at start of window of constraint... INFO (constraint_utils 1124): Creating HT with obs, exp, OE values for post-window positions... 2021-09-10 18:19:06 Hail: INFO: wrote table with 22750897 rows in 8043 partitions to gs://regional_missense_constraint/temp/next.ht Total size: 1.10 GiB * Rows: 1.10 GiB * Globals: 1.89 KiB * Smallest partition: 59 rows (1.03 KiB) * Largest partition: 52494 rows (2.48 MiB) INFO (constraint_utils 1152): Annotating HT with obs, exp, OE values for positions in window of constraint... INFO (constraint_utils 1181): Annotating HT with obs, exp, OE values for post-window positions... 
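
For context, the "wrote table with 22750897 rows in 8043 partitions to gs://regional_missense_constraint/temp/..." messages above, together with the field listings that follow them, are what Hail prints when a Table is checkpointed and then described. A minimal, self-contained sketch of that pattern, using a placeholder table and path; the pipeline's real tables are built in rmc/utils/constraint.py and written under gs://regional_missense_constraint/temp/:

    import hail as hl

    # Placeholder table; the pipeline's actual table carries the locus/transcript fields shown above.
    ht = hl.utils.range_table(1000)

    # checkpoint() writes the table and reads it back, producing the
    # "wrote table with N rows in M partitions to ..." INFO line seen in the log.
    ht = ht.checkpoint("example_temp.ht", overwrite=True)  # hypothetical path; the pipeline uses a gs://.../temp/ path

    # describe() prints the "Global fields / Row fields / Key" summary seen in the log.
    ht.describe()
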
2021-09-10 18:25:09 Hail: INFO: wrote table with 22750897 rows in 8043 partitions to gs://regional_missense_constraint/temp/simul_break_temp_annot.ht Total size: 138.70 GiB * Rows: 138.70 GiB * Globals: 1.89 KiB * Smallest partition: 105 rows (12.71 KiB) * Largest partition: 52494 rows (17.12 GiB) ---------------------------------------- Global fields: 'plateau_models': struct { total: dict> } 'plateau_x_models': struct { total: dict> } 'plateau_y_models': struct { total: dict> } 'coverage_model': tuple ( float64, float64 ) 'plateau_models_1': struct { total: dict> } 'plateau_x_models_1': struct { total: dict> } 'plateau_y_models_1': struct { total: dict> } 'coverage_model_1': tuple ( float64, float64 ) 'obs_mis_10': set 'obs_mis_10_window_size': int32 'obs_mis_20': set 'obs_mis_20_window_size': int32 'obs_mis_50': set 'obs_mis_50_window_size': int32 'all_simul_transcripts': set ---------------------------------------- Row fields: 'transcript': str 'locus': locus 'min_window_end': int32 'start_pos': int32 'end_pos': int32 'transcript_size': int32 '_localize': bool 'pos_per_transcript': array 'n_pos_per_transcript': int32 'mu_snp': float64 'total_exp': float64 '_mu_scan': dict 'total_mu': float64 'cumulative_obs': dict 'observed': int32 'cumulative_exp': float64 'total_obs': int64 'reverse': struct { obs: int64, exp: float64 } 'forward_oe': float64 'overall_oe': float64 'break_sizes': array 'break_chisqs': array 'window_ends': array 'post_window_pos': int32 'post_window_index': int32 'exp_at_start': float64 'pre_obs': int64 'pre_exp': float64 'pre_oe': float64 'next_values': struct { cum_obs: int64, obs: int32, exp: float64, mu_snp: float64, oe: float64, reverse_obs: int64, reverse_exp: float64 } 'exp_at_end': float64 'window_obs': int64 'window_exp': float64 'window_oe': float64 'post_obs': int64 'post_exp': float64 'post_oe': float64 ---------------------------------------- Key: ['locus', 'transcript'] ---------------------------------------- INFO (constraint_utils 1222): Searching for two breaks... INFO (constraint_utils 526): Creating section null (no regional variability in missense depletion) and alt (evidence of domains of missense constraint) expressions... INFO (constraint_utils 623): Multiplying all section nulls and all section alts... INFO (constraint_utils 631): Adding chisq value and getting max chisq... 
---------------------------------------- Global fields: 'plateau_models': struct { total: dict> } 'plateau_x_models': struct { total: dict> } 'plateau_y_models': struct { total: dict> } 'coverage_model': tuple ( float64, float64 ) 'plateau_models_1': struct { total: dict> } 'plateau_x_models_1': struct { total: dict> } 'plateau_y_models_1': struct { total: dict> } 'coverage_model_1': tuple ( float64, float64 ) 'obs_mis_10': set 'obs_mis_10_window_size': int32 'obs_mis_20': set 'obs_mis_20_window_size': int32 'obs_mis_50': set 'obs_mis_50_window_size': int32 'all_simul_transcripts': set ---------------------------------------- Row fields: 'transcript': str 'locus': locus 'min_window_end': int32 'start_pos': int32 'end_pos': int32 'transcript_size': int32 '_localize': bool 'pos_per_transcript': array 'n_pos_per_transcript': int32 'mu_snp': float64 'total_exp': float64 '_mu_scan': dict 'total_mu': float64 'cumulative_obs': dict 'observed': int32 'cumulative_exp': float64 'total_obs': int64 'reverse': struct { obs: int64, exp: float64 } 'forward_oe': float64 'overall_oe': float64 'break_sizes': array 'break_chisqs': array 'window_ends': array 'post_window_pos': int32 'post_window_index': int32 'exp_at_start': float64 'pre_obs': int64 'pre_exp': float64 'pre_oe': float64 'next_values': struct { cum_obs: int64, obs: int32, exp: float64, mu_snp: float64, oe: float64, reverse_obs: int64, reverse_exp: float64 } 'exp_at_end': float64 'window_obs': int64 'window_exp': float64 'window_oe': float64 'post_obs': int64 'post_exp': float64 'post_oe': float64 'section_nulls': array 'section_alts': array 'total_null': float64 'total_alt': float64 'chisq': float64 'max_chisq': float64 'is_break': bool ---------------------------------------- Key: ['locus', 'transcript'] ---------------------------------------- ---------------------------------------- Global fields: None ---------------------------------------- Row fields: 'transcript': str 'locus': locus 'min_window_end': int32 'start_pos': int32 'end_pos': int32 'transcript_size': int32 'pos_per_transcript': array 'n_pos_per_transcript': int32 'mu_snp': float64 'total_exp': float64 '_mu_scan': dict 'total_mu': float64 'cumulative_obs': dict 'observed': int32 'cumulative_exp': float64 'total_obs': int64 'reverse': struct { obs: int64, exp: float64 } 'forward_oe': float64 'overall_oe': float64 'break_sizes': array 'break_chisqs': array 'window_ends': array 'post_window_pos': int32 'post_window_index': int32 'exp_at_start': float64 'exp_at_end': float64 'window_obs': int64 'window_exp': float64 'window_oe': float64 'post_obs': int64 'post_exp': float64 'post_oe': float64 'section_nulls': array 'section_alts': array 'total_null': float64 'total_alt': float64 'chisq': float64 'max_chisq': float64 'is_break': bool ---------------------------------------- Key: ['locus', 'transcript'] ---------------------------------------- 2021-09-10 18:27:32 Hail: INFO: Ordering unsorted dataset with network shuffle3] 2021-09-10 18:29:27 Hail: INFO: Ordering unsorted dataset with network shuffle3] INFO (regional_missense_constraint 687): Copying hail log to logging bucket...1]]] 2021-09-10 19:40:24 Hail: INFO: copying log to 'gs://regional_missense_constraint/logs/RMC.log'... 
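
The section_nulls, section_alts, chisq, max_chisq, and is_break row fields in the schema above, together with the --chisq-threshold 6.6 argument on the command line, point to a per-position likelihood-ratio test. A minimal sketch of how such an annotation could look, assuming the conventional 2 * (ln L_alt - ln L_null) statistic and a per-transcript maximum; the actual expressions live in rmc/utils/constraint.py and are not reproduced in this log:

    import hail as hl

    CHISQ_THRESHOLD = 6.6  # value passed via --chisq-threshold above

    def annotate_breaks(ht: hl.Table, threshold: float = CHISQ_THRESHOLD) -> hl.Table:
        """Illustrative only: assumes `ht` has `transcript`, `total_null`, and `total_alt` fields."""
        # Likelihood-ratio chi-square from the combined null and alternative likelihoods.
        ht = ht.annotate(chisq=2 * (hl.log(ht.total_alt) - hl.log(ht.total_null)))
        # Per-transcript maximum, mirroring the `max_chisq` row field in the schema above.
        per_transcript = ht.group_by(ht.transcript).aggregate(max_chisq=hl.agg.max(ht.chisq))
        ht = ht.annotate(max_chisq=per_transcript[ht.transcript].max_chisq)
        # A row is a candidate break if it attains the transcript maximum and clears the threshold.
        return ht.annotate(is_break=(ht.chisq == ht.max_chisq) & (ht.chisq >= threshold))
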
Traceback (most recent call last):
  File "/tmp/70781f611e214b7f9cf269cff9615de6/regional_constraint.py", line 807, in <module>
    main(args)
  File "/tmp/70781f611e214b7f9cf269cff9615de6/regional_constraint.py", line 429, in main
    context_ht = search_two_break_windows(
  File "/tmp/70781f611e214b7f9cf269cff9615de6/pyscripts_cxsvxj2n.zip/rmc/utils/constraint.py", line 1378, in search_two_break_windows
  File "", line 2, in checkpoint
  File "/opt/conda/default/lib/python3.8/site-packages/hail/typecheck/check.py", line 577, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.8/site-packages/hail/table.py", line 1238, in checkpoint
    self.write(output=output, overwrite=overwrite, stage_locally=stage_locally, _codec_spec=_codec_spec)
  File "", line 2, in write
  File "/opt/conda/default/lib/python3.8/site-packages/hail/typecheck/check.py", line 577, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/opt/conda/default/lib/python3.8/site-packages/hail/table.py", line 1271, in write
    Env.backend().execute(ir.TableWrite(self._tir, ir.TableNativeWriter(output, overwrite, stage_locally, _codec_spec)))
  File "/opt/conda/default/lib/python3.8/site-packages/hail/backend/py4j_backend.py", line 98, in execute
    raise e
  File "/opt/conda/default/lib/python3.8/site-packages/hail/backend/py4j_backend.py", line 74, in execute
    result = json.loads(self._jhc.backend().executeJSON(jir))
  File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__
  File "/opt/conda/default/lib/python3.8/site-packages/hail/backend/py4j_backend.py", line 30, in deco
    raise FatalError('%s\n\nJava stack trace:\n%s\n'
hail.utils.java.FatalError: SparkException: Job aborted due to stage failure: ResultStage 13 (runJob at ContextRDD.scala:238) has failed the maximum allowable number of times: 4.
Most recent failure reason: org.apache.spark.shuffle.FetchFailedException at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:770) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:685) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:70) at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:29) at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200) at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:128) at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:106) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:131) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
java.nio.channels.ClosedChannelException at org.apache.spark.network.client.StreamInterceptor.channelInactive(StreamInterceptor.java:62) at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:223) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ... 1 more Java stack trace: org.apache.spark.SparkException: Job aborted due to stage failure: ResultStage 13 (runJob at ContextRDD.scala:238) has failed the maximum allowable number of times: 4. 
Most recent failure reason: org.apache.spark.shuffle.FetchFailedException at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:770) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:685) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:70) at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:29) at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200) at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:128) at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:106) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:131) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
java.nio.channels.ClosedChannelException at org.apache.spark.network.client.StreamInterceptor.channelInactive(StreamInterceptor.java:62) at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:223) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ... 1 more at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2254) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2203) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2202) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2202) at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1763) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2438) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2383) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2372) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:868) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2202) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2223) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2255) at is.hail.sparkextras.ContextRDD.crunJobWithIndex(ContextRDD.scala:238) at is.hail.rvd.RVD$.getKeyInfo(RVD.scala:1264) at is.hail.rvd.RVD$.makeCoercer(RVD.scala:1339) at is.hail.rvd.RVD$.coerce(RVD.scala:1295) at is.hail.rvd.RVD.changeKey(RVD.scala:176) at is.hail.rvd.RVD.changeKey(RVD.scala:169) at is.hail.rvd.RVD.enforceKey(RVD.scala:161) at is.hail.expr.ir.TableKeyBy.execute(TableIR.scala:1263) at is.hail.expr.ir.TableMapRows.execute(TableIR.scala:1903) at is.hail.expr.ir.Interpret$.run(Interpret.scala:790) at is.hail.expr.ir.Interpret$.alreadyLowered(Interpret.scala:56) at 
is.hail.expr.ir.InterpretNonCompilable$.interpretAndCoerce$1(InterpretNonCompilable.scala:16) at is.hail.expr.ir.InterpretNonCompilable$.rewrite$1(InterpretNonCompilable.scala:53) at is.hail.expr.ir.InterpretNonCompilable$.apply(InterpretNonCompilable.scala:58) at is.hail.expr.ir.lowering.InterpretNonCompilablePass$.transform(LoweringPass.scala:67) at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:15) at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81) at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:15) at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81) at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:13) at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:12) at is.hail.expr.ir.lowering.InterpretNonCompilablePass$.apply(LoweringPass.scala:62) at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:14) at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:12) at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36) at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38) at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:12) at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:29) at is.hail.backend.spark.SparkBackend._execute(SparkBackend.scala:381) at is.hail.backend.spark.SparkBackend.$anonfun$execute$1(SparkBackend.scala:365) at is.hail.expr.ir.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:47) at is.hail.utils.package$.using(package.scala:627) at is.hail.expr.ir.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:47) at is.hail.utils.package$.using(package.scala:627) at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:17) at is.hail.expr.ir.ExecuteContext$.scoped(ExecuteContext.scala:46) at is.hail.backend.spark.SparkBackend.withExecuteContext(SparkBackend.scala:275) at is.hail.backend.spark.SparkBackend.execute(SparkBackend.scala:362) at is.hail.backend.spark.SparkBackend.$anonfun$executeJSON$1(SparkBackend.scala:406) at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52) at is.hail.backend.spark.SparkBackend.executeJSON(SparkBackend.scala:404) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748) Hail version: 0.2.74-0c3a74d12093 Error summary: SparkException: Job aborted due to stage failure: ResultStage 13 (runJob at ContextRDD.scala:238) has failed the maximum allowable number of times: 4. 
Most recent failure reason: org.apache.spark.shuffle.FetchFailedException at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:770) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:685) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:70) at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:29) at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200) at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:128) at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:106) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:131) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
java.nio.channels.ClosedChannelException at org.apache.spark.network.client.StreamInterceptor.channelInactive(StreamInterceptor.java:62) at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:223) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ... 1 more
ERROR: (gcloud.dataproc.jobs.submit.pyspark) Job [70781f611e214b7f9cf269cff9615de6] failed with error:
Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found at:
https://console.cloud.google.com/dataproc/jobs/70781f611e214b7f9cf269cff9615de6?project=broad-mpg-gnomad&region=us-central1
gcloud dataproc jobs wait '70781f611e214b7f9cf269cff9615de6' --region 'us-central1' --project 'broad-mpg-gnomad'
https://console.cloud.google.com/storage/browser/dataproc-faa46220-ec08-4f5b-92bd-9722e1963047-us-central1/google-cloud-dataproc-metainfo/7f5f4152-d9c0-4962-9165-79bda3b6e4f3/jobs/70781f611e214b7f9cf269cff9615de6/
gs://dataproc-faa46220-ec08-4f5b-92bd-9722e1963047-us-central1/google-cloud-dataproc-metainfo/7f5f4152-d9c0-4962-9165-79bda3b6e4f3/jobs/70781f611e214b7f9cf269cff9615de6/driveroutput
Traceback (most recent call last):
  File "/Users/kchao/anaconda3/envs/hail/bin/hailctl", line 8, in <module>
    sys.exit(main())
  File "/Users/kchao/anaconda3/envs/hail/lib/python3.7/site-packages/hailtop/hailctl/__main__.py", line 100, in main
    cli.main(args)
  File "/Users/kchao/anaconda3/envs/hail/lib/python3.7/site-packages/hailtop/hailctl/dataproc/cli.py", line 122, in main
    jmp[args.module].main(args, pass_through_args)
  File "/Users/kchao/anaconda3/envs/hail/lib/python3.7/site-packages/hailtop/hailctl/dataproc/submit.py", line 78, in main
    gcloud.run(cmd)
  File "/Users/kchao/anaconda3/envs/hail/lib/python3.7/site-packages/hailtop/hailctl/dataproc/gcloud.py", line 9, in run
    return subprocess.check_call(["gcloud"] + command)
  File "/Users/kchao/anaconda3/envs/hail/lib/python3.7/subprocess.py", line 328, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['gcloud', 'dataproc', 'jobs', 'submit', 'pyspark', 'pipeline/regional_constraint.py', '--cluster=kc', '--files=', '--py-files=/var/folders/xq/8jnhrt2s2h58ts2v0br5g8gm0000gp/T/pyscripts_cxsvxj2n.zip', '--properties=',
'--', '--search-for-simul-breaks', '--slack-channel', '@kc (she/her)', '--chisq-threshold', '6.6']' returned non-zero exit status 1.