AWS – Dynamically update your running EMR cluster with reconfiguration for instance fleets
Amazon EMR on EC2 now supports real-time updates to application configurations for EMR instance fleets without requiring cluster termination or restart. With this feature, you can dynamically adjust application configurations, such as Spark’s executor memory, YARN’s resource allocation, and HDFS settings, on a running cluster, minimizing interruptions to your workloads. This is particularly useful for adjusting resource allocation and fine-tuning applications to match data processing and job performance requirements while keeping resource utilization efficient.
Amazon EMR is a cloud big data platform for data processing, interactive analysis, and machine learning using open-source frameworks such as Apache Spark, Apache Flink, and Trino. Previously, you had to terminate and relaunch instance fleet clusters to apply new configurations. This process resulted in downtime, increased operational effort, and delayed workflow adjustments. With support for reconfiguration, EMR dynamically applies the updated configurations to cluster nodes on a rolling basis while maintaining cluster stability and resource availability. It sends notifications via Amazon CloudWatch and EMR events. In the event of a failure or an incompatible update, EMR rolls back the changes so that your cluster remains operational. You can continue to run workloads on the cluster during the update process.
You can use this feature on all EMR 5.21 and later releases through the AWS CLI or API. This capability is available in all AWS Regions where Amazon EMR on EC2 is available, including the AWS GovCloud (US) Regions. To learn more, refer to the Amazon EMR documentation.
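As a rough illustration of what a reconfiguration request through the API might look like, the Python sketch below builds a request body that raises Spark's executor memory for every instance type in a fleet. It is a sketch under stated assumptions, not a definitive implementation: the helper name `build_fleet_reconfiguration`, the fleet and cluster IDs, and the instance types are hypothetical, and the payload shape (an `InstanceTypeConfigs` list carrying `Configurations` entries, passed to the EMR `ModifyInstanceFleet` operation) is assumed rather than taken from this announcement, so check it against the current API reference before use.

```python
import json


def build_fleet_reconfiguration(fleet_id, instance_types, classification, properties):
    """Build an instance fleet modification payload (assumed shape, see lead-in).

    Every instance type in the fleet receives the same configuration
    classification, for example "spark-defaults" or "yarn-site".
    """
    if not instance_types:
        raise ValueError("at least one instance type is required")

    def configuration():
        # A fresh dict per instance type; property values are sent as strings.
        return {
            "Classification": classification,
            "Properties": {key: str(value) for key, value in properties.items()},
        }

    return {
        "InstanceFleetId": fleet_id,
        "InstanceTypeConfigs": [
            {"InstanceType": instance_type, "Configurations": [configuration()]}
            for instance_type in instance_types
        ],
    }


# Hypothetical fleet ID and instance types, for illustration only.
payload = build_fleet_reconfiguration(
    fleet_id="if-EXAMPLE",
    instance_types=["m5.xlarge", "m5.2xlarge"],
    classification="spark-defaults",
    properties={"spark.executor.memory": "4g"},
)
print(json.dumps(payload, indent=2))

# Assumed usage with boto3 (not executed here; requires AWS credentials):
#   boto3.client("emr").modify_instance_fleet(
#       ClusterId="j-EXAMPLE", InstanceFleet=payload)
```

Building the payload separately from the API call keeps the example runnable without AWS credentials and makes it easy to review the change before it is submitted to a running cluster.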