AWS – Announcing Amazon S3 Tables – Fully managed Apache Iceberg tables optimized for analytics workloads
Amazon S3 Tables make S3 the first cloud object store with built-in Apache Iceberg support and the easiest way to store tabular data at scale. S3 Tables are specifically optimized for analytics workloads, resulting in up to 3x faster query throughput and up to 10x higher transactions per second compared to self-managed tables. Because S3 Tables support the Apache Iceberg standard, your tabular data can be easily queried by popular AWS and third-party query engines. Additionally, S3 Tables are designed to perform continual table maintenance to automatically optimize query efficiency and storage cost over time, even as your data lake scales and evolves. The S3 Tables integration with AWS Glue Data Catalog is in preview; it allows you to stream, query, and visualize data, including S3 Metadata tables, using AWS analytics services such as Amazon Data Firehose, Athena, Redshift, EMR, and QuickSight.
S3 Tables introduce table buckets, a new bucket type that is purpose-built to store tabular data. With table buckets, you can quickly create tables and set up table-level permissions to manage access to your data lake. You can then load and query data in your tables with standard SQL, and take advantage of Apache Iceberg’s advanced analytics capabilities such as row-level transactions, queryable snapshots, schema evolution, and more. Table buckets also provide policy-driven table maintenance, helping you to automate operational tasks such as compaction, snapshot management, and unreferenced file removal.
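As a rough illustration of the table bucket workflow described above, the following Python sketch creates a table bucket, a namespace, and an Iceberg table. It assumes the boto3 `s3tables` client and its `create_table_bucket`, `create_namespace`, and `create_table` operations with the parameter and response shapes shown in the comments; check these against the current SDK documentation before relying on them. The function name `provision_table` and all bucket, namespace, and table names are hypothetical.

```python
from typing import Any, Dict


def provision_table(client: Any, bucket_name: str, namespace: str,
                    table_name: str) -> Dict[str, str]:
    """Create a table bucket, a namespace in it, and one Iceberg table.

    `client` is assumed to behave like boto3.client("s3tables").
    """
    # Assumption: create_table_bucket returns the new bucket's ARN under "arn".
    bucket_arn = client.create_table_bucket(name=bucket_name)["arn"]

    # Assumption: create_namespace takes the namespace as a one-element list.
    client.create_namespace(tableBucketARN=bucket_arn, namespace=[namespace])

    # Assumption: create_table takes the namespace as a plain string and
    # returns the table ARN under "tableARN".
    table = client.create_table(
        tableBucketARN=bucket_arn,
        namespace=namespace,
        name=table_name,
        format="ICEBERG",
    )
    return {"bucket_arn": bucket_arn, "table_arn": table["tableARN"]}
```

With AWS credentials configured, the client would typically come from `boto3.client("s3tables", region_name="us-east-1")`, and a call such as `provision_table(client, "example-analytics-bucket", "sales", "daily_orders")` would return the two ARNs. Passing the client in as an argument keeps the function easy to exercise with a stub. Loading and querying data would then go through an Iceberg-compatible query engine rather than this API.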
Amazon S3 Tables are now available in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions, and are coming soon to additional Regions. For pricing details, visit the S3 pricing page. To learn more, visit the product page, documentation, and AWS News Blog.