AWS – Amazon EMR enables enhanced Apache Spark capabilities for Lake Formation tables with full table access
Amazon EMR now supports read and write operations from Apache Spark jobs on tables registered with AWS Lake Formation when the job role has full table access. This capability enables Data Definition Language (DDL) and Data Manipulation Language (DML) operations, including CREATE, ALTER, DELETE, UPDATE, and MERGE INTO statements, on Apache Hive and Iceberg tables from within the same Apache Spark application.
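As an illustration of the kind of write operation described above, here is a minimal Python sketch that builds a Spark SQL MERGE INTO statement (an upsert) and hands it to a Spark session. The helper names `merge_into_sql` and `run_upsert`, and the table and column names in the usage comment, are hypothetical and not taken from the announcement. The sketch assumes the Spark session has already been configured for Lake Formation full table access as described in the EMR documentation; that configuration is not shown here.

```python
from typing import Sequence


def merge_into_sql(target: str, source: str, key: str, columns: Sequence[str]) -> str:
    """Build a Spark SQL MERGE INTO statement that upserts `source` rows into `target`.

    Rows are matched on `key`; matched rows have `columns` updated,
    unmatched rows are inserted. All identifiers are caller-supplied
    and must be trusted (no escaping is performed in this sketch).
    """
    all_columns = [key, *columns]
    assignments = ", ".join(f"t.{c} = s.{c}" for c in columns)
    insert_cols = ", ".join(all_columns)
    insert_vals = ", ".join(f"s.{c}" for c in all_columns)
    return (
        f"MERGE INTO {target} t "
        f"USING {source} s "
        f"ON t.{key} = s.{key} "
        f"WHEN MATCHED THEN UPDATE SET {assignments} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


def run_upsert(spark, target: str, source: str, key: str, columns: Sequence[str]) -> None:
    """Execute the upsert on an existing SparkSession (assumed to be
    configured for Lake Formation full table access)."""
    spark.sql(merge_into_sql(target, source, key, columns))


# Hypothetical usage inside an EMR Spark job:
#   run_upsert(spark, "glue_catalog.sales_db.orders", "staged_orders",
#              "order_id", ["status", "amount"])
```

Building the statement separately from executing it keeps the SQL easy to inspect or log before the job touches the table.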
While Lake Formation’s fine-grained access control (FGAC) offers granular security controls at the row, column, and cell levels, many ETL workloads only need full table access. This feature lets Apache Spark read and write data directly when full table access is granted, removing the FGAC limitations that previously restricted certain ETL operations. You can now use advanced Spark capabilities, including RDDs, custom libraries, UDFs, and custom images (AMIs for EMR on EC2, custom images for EMR Serverless), with Lake Formation tables. Additionally, data teams can run complex, interactive Spark applications through SageMaker Unified Studio in compatibility mode while maintaining Lake Formation’s table-level security boundaries.
This feature is available in all AWS Regions where Amazon EMR and AWS Lake Formation are supported.
To learn more about this feature, see the Lake Formation unfiltered access section in the EMR Serverless documentation.