AWS – Amazon EMR Managed Scaling is now Spark shuffle data aware
Amazon EMR Managed Scaling automatically resizes EMR clusters for best performance and resource utilization. Today, we are excited to announce a new capability in Managed Scaling that prevents it from scaling down instances that store intermediate shuffle data for Apache Spark. When such an instance is removed, Spark has to regenerate the lost shuffle data by rerunning the tasks that produced it. Scaling down clusters without removing these instances avoids those job re-attempts and recomputations, which leads to better performance and lower cost.
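The idea of shuffle-aware scale-down can be illustrated with a minimal sketch. This is not EMR's actual algorithm or API: the `Node` type, the `has_active_shuffle_data` flag, and the `pick_nodes_to_remove` helper are hypothetical names introduced only to show the selection rule described above.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical model of a cluster instance; not an EMR data type.
    node_id: str
    has_active_shuffle_data: bool  # True if the node still holds shuffle data a job needs


def pick_nodes_to_remove(nodes, count):
    """Return up to `count` node IDs, choosing only nodes without active shuffle data."""
    safe = [n.node_id for n in nodes if not n.has_active_shuffle_data]
    # If fewer safe nodes exist than requested, scale down by less rather than
    # remove a node whose shuffle data would have to be recomputed.
    return safe[:count]


nodes = [
    Node("node-a", True),
    Node("node-b", False),
    Node("node-c", False),
    Node("node-d", True),
]
to_remove = pick_nodes_to_remove(nodes, 3)
print(to_remove)  # ['node-b', 'node-c']
```

In this sketch a request to remove three nodes yields only two, because the remaining candidates still hold shuffle data; the trade-off is a slower scale-down in exchange for avoiding recomputation.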