2025 05 29

AWS – Amazon EMR enables enhanced Apache Spark capabilities for Lake Formation tables with full table access

Amazon EMR now supports read and write operations from Apache Spark jobs on AWS Lake Formation registered tables when the job role has full table access. This capability enables Data Manipulation Language (DML) operations including CREATE, ALTER, DELETE, UPDATE, and MERGE INTO statements on Apache Hive and Iceberg tables from within the same Apache Spark application.

While Lake Formation’s fine-grained access control (FGAC) offers granular security controls at row, column, and cell levels, many ETL workloads simply need full table access. This new feature enables Apache Spark to directly read and write data when full table access is granted, removing FGAC limitations that previously restricted certain ETL operations. You can now leverage advanced Spark capabilities including RDDs, custom libraries, UDFs, and custom images (AMIs for EMR on EC2, custom images for EMR-Serverless) with Lake Formation tables. Additionally, data teams can run complex, interactive Spark applications through SageMaker Unified Studio in compatibility mode while maintaining Lake Formation’s table-level security boundaries.

This feature is available in all AWS Regions where Amazon EMR and AWS Lake Formation are supported.

To learn more about this feature, visit the Lake Formation unfiltered access section in EMR Serverless documentation.

AWS – Amazon EMR enables enhanced Apache Spark capabilities for Lake Formation tables with full table access

Related Posts

AWS – Amazon VPC Route Server now available in new regions

GCP – Palo Alto Networks automates customer intelligence document creation with agentic design

GCP – Vibe querying: Write SQL queries faster with Comments to SQL in BigQuery