Spark tuning for Elasticsearch export?

Pleased to report that setting spark.dynamicAllocation.enabled = false fixed the Elasticsearch slowness/crashing problem without me having to fiddle with any other memory settings. It seems that AWS EMR turns dynamic allocation on by default, and that (possibly along with other AWS-specific oddness) caused the problems in the export. Thanks to @pavlos for the tip in the thread "Small MatrixTable hangs on write into Google bucket".
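
For anyone wanting to try the same thing, here is a rough sketch of one way to pass the setting, via the `--conf` flag of `spark-submit`. The script name `export_to_es.py` and the helper `to_submit_flags` are made up for illustration; only the property name and value come from my fix above, and your cluster may need the setting applied somewhere else (for example in the cluster configuration).

```python
import shlex

# Property from the post; "false" turns off dynamic executor allocation.
spark_conf = {"spark.dynamicAllocation.enabled": "false"}


def to_submit_flags(conf):
    """Turn a dict of Spark properties into spark-submit --conf arguments."""
    flags = []
    for key, value in sorted(conf.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags


# export_to_es.py is a hypothetical script name, not from the original post.
command = ["spark-submit", *to_submit_flags(spark_conf), "export_to_es.py"]
print(shlex.join(command))
```

Keeping the properties in a dict makes it easy to add other settings later if this one alone turns out not to be enough.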
