Amazon Aurora Serverless v2 now offers up to 30% improved performance for databases running on the latest serverless platform version (version 3), and also supports scaling from 0 up to 256 Aurora Capacity Units (ACUs). Aurora Serverless v2 measures capacity in ACUs where each ACU is a combination of approximately 2 gibibytes (GiB) of memory, corresponding CPU, and networking. You specify the capacity range and the database scales within this range to support your application’s needs.
With improved performance, you can now use Aurora Serverless for even more demanding workloads. All new clusters, database restores, and new clones will launch on the latest platform version. Existing clusters can be upgraded by stopping and restarting the cluster or by using Blue/Green Deployments. You can determine the cluster’s platform version in the AWS Console’s instance configuration section or via the RDS API’s ServerlessV2PlatformVersion parameter for a DB cluster.
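As a rough illustration, here is a minimal sketch of checking a cluster’s platform version and ACU range with the AWS SDK for Python (boto3); the cluster identifier is a placeholder, and the ServerlessV2PlatformVersion field is the one referenced in this announcement.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Describe the cluster; the identifier below is a placeholder.
cluster = rds.describe_db_clusters(
    DBClusterIdentifier="my-aurora-serverless-cluster"
)["DBClusters"][0]

# Per the announcement, the DB cluster exposes its serverless platform version.
print("Platform version:", cluster.get("ServerlessV2PlatformVersion"))

# The configured ACU range the cluster scales within.
scaling = cluster.get("ServerlessV2ScalingConfiguration", {})
print("ACU range:", scaling.get("MinCapacity"), "-", scaling.get("MaxCapacity"))
```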
The latest platform version is available in all AWS Regions including the AWS GovCloud (US) Regions. Aurora Serverless is an on-demand, automatic scaling configuration for Amazon Aurora. For pricing details and Region availability, visit Amazon Aurora Pricing. To learn more, read the documentation, and get started by creating an Aurora Serverless v2 database using only a few steps in the AWS Management Console.
The age of AI is here, and organizations everywhere are racing to deploy powerful models to drive innovation, enhance products, and create entirely new user experiences. But moving from a trained model in a lab to a scalable, cost-effective, and production-grade inference service is a significant engineering challenge. It requires deep expertise in infrastructure, networking, security, and all of the Ops (MLOps, LLMOps, DevOps, etc.).
Today, we’re making it dramatically simpler. We’re excited to announce the GKE inference reference architecture: a comprehensive, production-ready blueprint for deploying your inference workloads on Google Kubernetes Engine (GKE).
This isn’t just another guide; it’s an actionable, automated, and opinionated framework designed to give you the best of GKE for inference, right out of the box.
Start with a strong foundation: The GKE base platform
Before you can run, you need a solid place to stand. This reference architecture is built on the GKE base platform. Think of this as the core, foundational layer that provides a streamlined and secure setup for any accelerated workload on GKE.
Built on infrastructure-as-code (IaC) principles using Terraform, the base platform establishes a robust foundation with the following:
Automated, repeatable deployments: Define your entire infrastructure as code for consistency and version control.
Built-in scalability and high availability: Get a configuration that inherently supports autoscaling and is resilient to failures.
Security best practices: Implement critical security measures like private clusters, Shielded GKE Nodes, and secure artifact management from the start.
Integrated observability: Seamlessly connect to Google Cloud Observability for deep visibility into your infrastructure and applications.
Starting with this standardized base ensures you’re building on a secure, scalable, and manageable footing, accelerating your path to production.
Why the inference-optimized platform?
The base platform provides the foundation, and the GKE inference reference architecture is the specialized, high-performance engine that’s built on top of it. It’s an extension that’s tailored specifically to solve the unique challenges of serving machine learning models.
Here’s why you should start with our accelerated platform for your AI inference workloads:
1. Optimized for performance and cost
Inference is a balancing act between latency, throughput, and cost. This architecture is fine-tuned to master that balance.
Intelligent accelerator use: It streamlines the use of GPUs and TPUs, with custom compute classes ensuring that your pods land on the exact hardware they need. With node auto-provisioning (NAP), the cluster automatically provisions the right resources when you need them.
Smarter scaling: Go beyond basic CPU and memory scaling. We integrate a custom metrics adapter that allows the Horizontal Pod Autoscaler (HPA) to scale your models based on real-world inference metrics like queries per second (QPS) or latency, ensuring you only pay for what you use.
Faster model loading: Large models mean large container images. We leverage the Container File System API and Image streaming in GKE along with Cloud Storage FUSE to dramatically reduce pod startup times. Your containers can start while the model data streams in the background, minimizing cold-start latency.
2. Built to scale any inference pattern
Whether you’re doing real-time fraud detection, batch processing analytics, or serving a massive frontier model, this architecture is designed to handle it. It provides a framework for the following:
Real-time (online) inference: Prioritizes low-latency responses for interactive applications.
Batch (offline) inference: Efficiently processes large volumes of data for non-time-sensitive tasks.
Streaming inference: Continuously processes data as it arrives from sources like Pub/Sub.
The architecture leverages GKE features like the cluster autoscaler and the Gateway API for advanced, flexible, and powerful traffic management that can handle massive request volumes gracefully.
3. Simplified operations for complex models
We’ve baked in features to abstract away the complexity of serving modern AI models, especially LLMs. The architecture includes guidance and integrations for advanced model optimization techniques such as quantization (INT8/INT4), tensor and pipeline parallelism, and KV Cache optimizations like Paged and Flash Attention.
Furthermore, with GKE in Autopilot mode, you can offload node management entirely to Google, so you can focus on your models, not your infrastructure.
We’ve included examples for deploying popular workloads like ComfyUI, as well as general-purpose online inference with GPUs and TPUs, to help you get started quickly.
By combining the rock-solid foundation of the GKE base platform with the performance and operational enhancements of the inference reference architecture, you can deploy your AI workloads with confidence, speed, and efficiency. Stop reinventing the wheel and start building the future on GKE.
The future of AI on GKE
The GKE inference reference architecture is more than just a collection of tools; it’s a reflection of Google’s commitment to making GKE the best platform for running your inference workloads. By providing a clear, opinionated, and extensible architecture, we are empowering you to accelerate your AI journey and bring your innovative ideas to life.
We’re excited to see what you’ll build with the GKE inference reference architecture. Your feedback is welcome! Please share your thoughts in the GitHub repository.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C7g instances are available in the AWS Middle East (Bahrain), AWS Africa (Cape Town), and AWS Asia Pacific (Jakarta) Regions. These instances are powered by AWS Graviton3 processors that provide up to 25% better compute performance compared to AWS Graviton2 processors, and are built on the AWS Nitro System, a collection of AWS-designed innovations that deliver efficient, flexible, and secure cloud services with isolated multi-tenancy, private networking, and fast local storage.
Amazon EC2 Graviton3 instances also use up to 60% less energy than comparable EC2 instances for the same performance, reducing your cloud carbon footprint. For increased scalability, these instances are available in 9 different instance sizes, including bare metal, and offer up to 30 Gbps networking bandwidth and up to 20 Gbps of bandwidth to Amazon Elastic Block Store (Amazon EBS).
Amazon DynamoDB is announcing support for Console-to-Code, powered by Amazon Q Developer. Console-to-Code makes it simple, fast, and cost-effective to create DynamoDB resources at scale by getting you started with your automation code.
DynamoDB is a serverless, NoSQL, fully managed database with single-digit millisecond performance at any scale. Customers use the DynamoDB console to learn and prototype cloud solutions. Console-to-Code helps you record those actions and uses generative AI to suggest code in your preferred infrastructure-as-code (IaC) format for the actions you want. You can use this code as a starting point for infrastructure automation and further customize it for your production workloads. For example, with Console-to-Code, you can record creating an Amazon DynamoDB table and choose to generate code for the AWS CDK (TypeScript, Python, or Java) or CloudFormation (YAML or JSON).
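For illustration only, the sketch below shows the general shape of AWS CDK (Python) code that Console-to-Code might hand you after recording a table creation; the table name, key schema, and billing mode are hypothetical, and the actual generated output will differ.

```python
from aws_cdk import App, Stack, RemovalPolicy
from aws_cdk import aws_dynamodb as dynamodb
from constructs import Construct


class MusicTableStack(Stack):
    """Hypothetical stack for a DynamoDB table recorded in the console."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        dynamodb.Table(
            self, "MusicTable",
            table_name="Music",  # placeholder name
            partition_key=dynamodb.Attribute(
                name="Artist", type=dynamodb.AttributeType.STRING),
            sort_key=dynamodb.Attribute(
                name="SongTitle", type=dynamodb.AttributeType.STRING),
            billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
            removal_policy=RemovalPolicy.RETAIN,  # keep the table if the stack is deleted
        )


app = App()
MusicTableStack(app, "MusicTableStack")
app.synth()
```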
We’re excited to announce the general availability of two new Amazon CloudWatch metrics for AWS Outposts racks: VifConnectionStatus and VifBgpSessionState. These metrics provide you with greater visibility into the connectivity status of your Outposts racks’ Local Gateway (LGW) and Service Link Virtual Interfaces (VIFs) with your on-premises devices.
These metrics provide you with the ability to monitor Outposts VIF connectivity status directly within the CloudWatch console, without having to rely on external networking tools or coordination with other teams. You can use these metrics to set alarms, troubleshoot connectivity issues, and ensure your Outposts racks are properly integrated with your on-premises infrastructure. The VifConnectionStatus metric indicates whether an Outposts VIF is successfully connected, configured, and ready to forward traffic. A value of “1” means that the VIF is operational, while “0” means that it is not ready. The VifBgpSessionState metric shows the current state of the BGP session between the Outposts VIF and the on-premises device, with values ranging from 1 (IDLE) to 6 (ESTABLISHED).
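As a rough sketch, you might wire an alarm to VifConnectionStatus with boto3 as shown below; the namespace and dimension names are assumptions here, so confirm them against the Outposts metrics documentation, and the Outpost ID, VIF ID, and SNS topic are placeholders.

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Alarm when the VIF stops forwarding traffic (VifConnectionStatus drops to 0).
cloudwatch.put_metric_alarm(
    AlarmName="outposts-lgw-vif-down",
    Namespace="AWS/Outposts",                      # assumed namespace
    MetricName="VifConnectionStatus",
    Dimensions=[
        {"Name": "OutpostId", "Value": "op-0123456789abcdef0"},        # placeholder
        {"Name": "VirtualInterfaceId", "Value": "vif-placeholder"},    # assumed dimension name
    ],
    Statistic="Minimum",
    Period=60,
    EvaluationPeriods=3,
    Threshold=1,
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="breaching",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:network-alerts"],  # placeholder topic
)
```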
The VifConnectionStatus and VifBgpSessionState metrics are available for all Outposts VIFs in all commercial AWS Regions where Outposts racks are available.
Amazon Relational Database Service (Amazon RDS) for SQL Server now supports Cumulative Update CU20 for SQL Server 2022 (RDS version 16.00.4205.1.v1), and General Distribution Releases (GDR) for SQL Server 2016 SP3 (RDS version 13.00.6460.7.v1), SQL Server 2017 (RDS version 14.00.3495.9.v1), and SQL Server 2019 (RDS version 15.00.4435.7.v1). The new CU20 and GDR releases address the vulnerabilities described in CVE-2025-49717, CVE-2025-49718, and CVE-2025-49719. Additionally, CU20 includes important security fixes, performance improvements, and bug fixes. For additional information, see the Microsoft SQL Server 2022 CU20 documentation and GDR release notes KB5058717, KB5058714, KB5058722, and KB5058721.
We recommend that you upgrade your Amazon RDS for SQL Server instances to these latest versions using the Amazon RDS Management Console, the AWS SDK, or the AWS CLI. You can learn more about upgrading your database instances in the Amazon RDS User Guide.
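A minimal sketch of triggering the engine version upgrade with boto3, assuming a SQL Server 2022 instance; the instance identifier is a placeholder, and ApplyImmediately=False defers the change to the next maintenance window.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Apply the SQL Server 2022 CU20 engine version from this announcement.
rds.modify_db_instance(
    DBInstanceIdentifier="my-sqlserver-2022-instance",  # placeholder
    EngineVersion="16.00.4205.1.v1",
    ApplyImmediately=False,  # upgrade during the next maintenance window
)
```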
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M7i and M7i-flex instances powered by custom 4th Gen Intel Xeon Scalable processors (code-named Sapphire Rapids) are available in the Asia Pacific (Osaka) Region. These custom processors, available only on AWS, offer up to 15% better performance over comparable x86-based Intel processors utilized by other cloud providers.
M7i-flex instances are the easiest way for you to get price-performance benefits for a majority of general-purpose workloads. They deliver up to 19% better price-performance compared to M6i. M7i-flex instances offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don’t fully utilize all compute resources, such as web and application servers, virtual desktops, batch processing, and microservices.
M7i instances deliver up to 15% better price-performance compared to M6i. They are a great choice for workloads that need the largest instance sizes or continuous high CPU usage, such as gaming servers, CPU-based machine learning (ML), and video streaming. M7i instances offer larger sizes, up to 48xlarge, and two bare-metal sizes (metal-24xl, metal-48xl). These bare-metal sizes support built-in Intel accelerators: Data Streaming Accelerator, In-Memory Analytics Accelerator, and QuickAssist Technology, which facilitate efficient offload and acceleration of data operations and optimize performance for your workloads.
AWS customers can now view and manage their support cases from the AWS Console Mobile App. Customers can view and reply to case correspondence, and resolve, reopen, or create support cases while on the go and away from their workstations. Visit the Services tab and select “Support” to get started.
The AWS Console Mobile App enables AWS customers to monitor and manage a select set of resources and receive push notifications to stay informed and connected with their AWS resources while on the go. The sign-in process supports biometric authentication, making access to AWS resources simple, secure, and quick. For AWS services not available natively in the app, customers can open the AWS Management Console via an in-app browser to reach service pages without additional authentication, manual navigation, or the need to switch from the app to a browser.
With this launch, Amazon VPC Reachability Analyzer and Amazon VPC Network Access Analyzer are now available in Asia Pacific (Jakarta), Asia Pacific (Malaysia), Asia Pacific (Thailand), Europe (Zurich), and Middle East (UAE).
VPC Reachability Analyzer allows you to diagnose network reachability between a source resource and a destination resource in your virtual private clouds (VPCs) by analyzing your network configurations. For example, Reachability Analyzer can help you identify a missing entry in your VPC route table that is blocking network reachability between an EC2 instance in Account A and another EC2 instance in Account B in your AWS Organization.
VPC Network Access Analyzer allows you to identify unintended network access to your AWS resources, helping you meet your security and compliance guidelines. For example, you can create a scope to verify that all paths from your web applications to the internet traverse the firewall, and detect any paths that bypass the firewall.
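Below is a minimal same-account sketch of running Reachability Analyzer with boto3 (cross-account paths, as in the Account A/B example, require resource ARNs rather than IDs); the instance IDs and port are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-3")  # Asia Pacific (Jakarta)

# Define a path between two instances; the IDs and port are placeholders.
path = ec2.create_network_insights_path(
    Source="i-0123456789abcdef0",
    Destination="i-0fedcba9876543210",
    Protocol="tcp",
    DestinationPort=443,
)["NetworkInsightsPath"]

# Run the analysis, then fetch the result (poll until Status leaves "running").
analysis = ec2.start_network_insights_analysis(
    NetworkInsightsPathId=path["NetworkInsightsPathId"]
)["NetworkInsightsAnalysis"]

result = ec2.describe_network_insights_analyses(
    NetworkInsightsAnalysisIds=[analysis["NetworkInsightsAnalysisId"]]
)["NetworkInsightsAnalyses"][0]
print(result["Status"], result.get("NetworkPathFound"))
```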
Today, Amazon QuickSight is announcing the general availability of a native Apache Impala connector.
Apache Impala is a massively parallel processing (MPP) SQL query engine that runs natively on Apache Hadoop. QuickSight customers can now connect using their Impala username and password credentials and import their data into SPICE.
The Apache Impala connector for Amazon QuickSight is now available in the following regions: US East (N. Virginia and Ohio), US West (Oregon), Canada (Central), South America (Sao Paulo), Africa (South Africa), Europe (Frankfurt, Zurich, Stockholm, Milan, Spain, Ireland, London, Paris), Asia Pacific (Mumbai, Singapore, Tokyo, Seoul, Jakarta, Sydney). For more details, click here.
Google is committed to helping federal agencies meet their mission, more securely and more efficiently, with innovative cloud technologies. Today, we’re reinforcing our commitment to FedRAMP 20x, an innovative pilot program that marks a paradigm shift in federal cloud authorization. FedRAMP 20x is a new assessment process designed to move away from traditional narrative-based requirements towards continuous compliance and automated validation of machine-readable evidence. Our approach is built around Google Cloud Compliance Manager (now available for public preview) and is designed to transform the path to FedRAMP authorization for our partners and customers.
Compliance Manager accelerates the FedRAMP authorization process by automating end-to-end compliance management for partners and customers building on Google Cloud. By providing automated, externally validated cloud controls to demonstrate compliance with FedRAMP 20x Key Security Indicators (KSIs), Compliance Manager allows partners to spend fewer resources manually collecting evidence and is designed to reduce the time required to achieve FedRAMP authorization. Compliance Manager will natively support FedRAMP 20x compliance with general availability later this year.
During a recent proof of concept demonstration to the FedRAMP Program Management Office (PMO), Google showcased how Compliance Manager enables strategic Google Cloud partners such as stackArmor to submit applications for 20x Phase One authorization and beyond.
Google Cloud’s latest capabilities are an exciting step forward in accelerating the FedRAMP 20x cloud-native approach to security assessment and validation. We need true innovation from industry to realize this vision of automated security and Google Cloud is leading the way by building it natively into their platform. As Google goes to market in support of FedRAMP 20x, we can’t help but wonder who’s next?
Pete Waterman
Director, FedRAMP
Compliance Manager’s ability to automate KSI compliance is also being assessed by Coalfire, a FedRAMP recognized Third Party Assessment Organization (3PAO). Coalfire is providing independent validation that agencies can benefit from a much faster, more automated path to deploying secure Google Cloud solutions, directly accelerating their access to critical cloud technologies.
Google is dedicated to accelerating federal compliance through both the existing FedRAMP Rev5 authorization path and the pilot FedRAMP 20x process. Recent Rev5 High authorizations include Google Cloud services such as Agent Assist, Looker (Google Cloud core), and Vertex AI Vector Search.
If you are spending more effort than expected on compliance and audits, you can get started with Compliance Manager and streamline compliance and audits for your organization. Want to learn more? Register for the Google Public Sector Summit on October 29, 2025, in Washington, D.C., where you will gain crucial insights and skills to navigate this new era of innovation and harness the latest cloud technologies.
In a prior blog post, we introduced BigQuery’s advanced runtime, where we detailed enhanced vectorization and discussed techniques like dictionary and run-length encoded data, vectorized filter evaluation and compute pushdown, and parallelizable execution.
This blog post dives into short query optimizations, a key component of BigQuery’s advanced runtime. These optimizations can significantly speed up “short” queries, all while using fewer BigQuery “slots” (our term for computational capacity). Such queries are commonly issued by business intelligence (BI) tools such as Looker Studio, or by custom applications powered by BigQuery.
Similar to other BigQuery optimization techniques, the system uses a set of internal rules to determine if it should consolidate a distributed query plan into a single, more efficient step for short queries. These rules consider factors like:
The estimated amount of data to be read
How effectively the filters are reducing the data size
The type and physical arrangement of the data in storage
The overall query structure
The runtime statistics of past query executions
Along with enhanced vectorization, short query optimizations are an example of how we work to continuously improve performance and efficiency for BigQuery users.
Optimizations specific to short query execution
BigQuery’s short query optimizations dramatically speed up short, eligible queries, significantly reducing slot usage and improving query latency. Normally, BigQuery breaks down queries into multiple stages, each with smaller tasks processed in parallel across a distributed system. However, for suitable queries, short query optimizations skip this multi-stage distributed execution, leading to substantial gains in performance and efficiency. When possible, the runtime also uses multithreaded execution, including for queries with joins and aggregations. BigQuery automatically determines whether a query is eligible and dispatches it to a single stage. BigQuery also employs history-based optimization (HBO), which learns from past query executions. HBO helps BigQuery decide whether a query should run in a single stage or multiple stages based on its historical performance, ensuring the single-stage approach remains beneficial even as workloads evolve.
Full view of data
Short query optimizations process the entire query as a single stage, giving the runtime complete visibility into all tables involved. This allows the runtime to read both sides of the join and gather precise metadata, including column cardinalities. For instance, the join columns in the query below have low cardinality, so they are efficiently stored using dictionary and run-length encoding (RLE). Consequently, the runtime devises a much simpler query execution plan by leveraging these encodings during execution.
The following query calculates top-ranked products and their performance metrics for an e-commerce scenario. It’s based on a type of query observed in Google’s internal data pipelines that benefited from short query optimizations. The following query uses this BigQuery public dataset, allowing you to replicate the results. Metrics throughout this blog were captured during internal Google testing.
```sql
WITH
  AllProducts AS (
    SELECT
      id,
      name AS product_name
    FROM
      `bigquery-public-data.thelook_ecommerce.products`
  ),
  AllUsers AS (
    SELECT
      id
    FROM
      `bigquery-public-data.thelook_ecommerce.users`
  ),
  AllSales AS (
    SELECT
      oi.user_id,
      oi.sale_price,
      ap.product_name
    FROM
      `bigquery-public-data.thelook_ecommerce.order_items` AS oi
      INNER JOIN AllProducts AS ap
        ON oi.product_id = ap.id
      INNER JOIN AllUsers AS au
        ON oi.user_id = au.id
  ),
  ProductPerformanceMetrics AS (
    SELECT
      product_name,
      ROUND(SUM(sale_price), 2) AS total_revenue,
      COUNT(*) AS units_sold,
      COUNT(DISTINCT user_id) AS unique_customers
    FROM
      AllSales
    GROUP BY
      product_name
  ),
  RankedProducts AS (
    SELECT
      product_name,
      total_revenue,
      units_sold,
      unique_customers,
      RANK() OVER (ORDER BY total_revenue DESC) AS revenue_rank
    FROM
      ProductPerformanceMetrics
  )
SELECT
  revenue_rank,
  product_name,
  total_revenue,
  units_sold,
  unique_customers
FROM
  RankedProducts
ORDER BY
  revenue_rank
LIMIT 25;
```
Figure 1: Query calculating top-ranked products and their performance metrics.
By skipping the shuffle layer, overall query execution requires less CPU, memory, and network bandwidth. In addition to that, short query optimizations take full advantage of enhanced vectorization described in the Understanding BigQuery enhanced vectorization blog post.
Figure 2: Internal Google testing: one-stage plan for the query from Figure 1.
Queries with joins and aggregations
In data analytics, it’s common to join data from several tables and then calculate aggregate results. Typically, a query performing this distributed operation will go through many stages. Each stage can involve shuffling data around, which adds overhead and slows things down. BigQuery’s short query optimizations can dramatically improve this process. When enabled, BigQuery intelligently recognizes if the amount of data being queried is small enough to be handled by a much simpler plan. This optimization leads to substantial improvements: for the query described in Figure 3, during internal Google testing we observed 2x to 8x faster execution times and an average of 9x reduction in slot-seconds.
```sql
SELECT
  p.category,
  dc.name AS distribution_center_name,
  u.country AS user_country,
  SUM(oi.sale_price) AS total_sales_amount,
  COUNT(DISTINCT o.order_id) AS total_unique_orders,
  COUNT(DISTINCT o.user_id) AS total_unique_customers_who_ordered,
  AVG(oi.sale_price) AS average_item_sale_price,
  SUM(CASE WHEN oi.status = 'Complete' THEN 1 ELSE 0 END) AS completed_order_items_count,
  COUNT(DISTINCT p.id) AS total_unique_products_sold,
  COUNT(DISTINCT ii.id) AS total_unique_inventory_items_sold
FROM
  `bigquery-public-data.thelook_ecommerce.orders` AS o,
  `bigquery-public-data.thelook_ecommerce.order_items` AS oi,
  `bigquery-public-data.thelook_ecommerce.products` AS p,
  `bigquery-public-data.thelook_ecommerce.inventory_items` AS ii,
  `bigquery-public-data.thelook_ecommerce.distribution_centers` AS dc,
  `bigquery-public-data.thelook_ecommerce.users` AS u
WHERE
  o.order_id = oi.order_id
  AND oi.product_id = p.id
  AND ii.product_distribution_center_id = dc.id
  AND oi.inventory_item_id = ii.id
  AND o.user_id = u.id
GROUP BY
  p.category,
  dc.name,
  u.country
ORDER BY
  total_sales_amount DESC
LIMIT 1000;
```
Figure 3: Join-aggregate query over the thelook_ecommerce public dataset.
We can see how the execution graph changes when short query optimizations are applied to the query in Figure 3.
Figure 4: Internal Google testing: execution of the join-aggregate query in 9 stages using BigQuery distributed execution.
Figure 5: Internal Google testing: with advanced runtime short query optimizations, join-aggregate query completes in 1 stage.
Optional job creation
Short query optimizations and optional job creation mode are two independent yet complementary features that enhance query performance. While optional job creation mode contributes significantly to the efficiency of short queries regardless of short query optimizations, they work even better together. When both are enabled, the advanced runtime streamlines internal operations and utilizes the query cache more efficiently, which leads to even faster delivery of the results.
Better throughput
By reducing the resources required for queries, short query optimizations not only deliver performance gains but also significantly improve overall throughput. This efficiency means that more queries can be executed concurrently within the same resource allocation.
The following graph, captured from a Google internal data pipeline, shows an example query that benefits from short query optimizations. The blue line shows the maximum QPS (or throughput) that could be sustained before the change. The red line shows QPS on the same reservation after the advanced runtime was enabled. In addition to better latency, the same reservation can now handle over 3x higher throughput.
Figure 6: Internal Google testing: Throughput comparison, red line shows improvements from short query optimizations in the advanced runtime
Optimal impact
BigQuery’s short query optimizations feature is designed primarily for BI queries intended for human-readable output. BigQuery uses a dynamic algorithm to determine which queries are eligible, and the feature works alongside other performance-enhancing features such as history-based optimizations, BI Engine, and optional job creation. Nevertheless, some workload patterns will benefit from short query optimizations more than others.
This optimization may not significantly improve queries that read or produce a lot of data. To optimize for short query performance, it is crucial to keep the query working set and result size small through pre-aggregation and filtering. Implementing partitioning and clustering strategies appropriate to your workload can also significantly reduce the amount of data processed, and utilizing the optional job creation mode is beneficial for short-lived queries that can be easily retried.
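As one example of that advice (with hypothetical dataset, table, and column names), the sketch below uses the BigQuery Python client to create a pre-aggregated table that is partitioned by date and clustered on a frequently filtered column, keeping the working set of short queries small.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses your default project

# Hypothetical dataset/table/column names; pre-aggregating, partitioning by date,
# and clustering on a commonly filtered column reduces the data each short query reads.
ddl = """
CREATE TABLE IF NOT EXISTS my_dataset.daily_product_sales
PARTITION BY order_date
CLUSTER BY product_name AS
SELECT
  DATE(created_at) AS order_date,
  product_id,
  product_name,
  SUM(sale_price) AS revenue,
  COUNT(*) AS units_sold
FROM my_dataset.order_items
GROUP BY order_date, product_id, product_name
"""
client.query(ddl).result()  # wait for the DDL job to finish
```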
Short query optimizations in action
Let’s see how these optimizations actually impact our test query by looking closer at the query in Figure 1 and its query plan in Figure 2. The query shape is based on actual workloads observed in production and made to work against a BigQuery public dataset, so you can test it for yourself.
Despite the query scanning only 6.5 MB, running this query without advanced runtime takes over 1 second and consumes about 20 slot-seconds (execution time may vary depending on available resources in the project).
Figure 7: Internal Google testing: Sample query execution details without Advanced Runtime
With BigQuery’s enhanced vectorization in the advanced runtime, during internal Google testing this query finishes in 0.5 seconds while consuming 50x fewer resources.
Figure 8: Internal Google testing: Sample query execution details with Advanced Runtime Short Query Optimizations
This magnitude of improvement is less common; it reflects a real workload improvement observed in Google internal pipelines. We have also seen classic BI queries with several aggregations, filters, group by and sort, or snowflake joins achieve faster performance and better slot utilization.
Try it for yourself
Short query optimizations boost query price/performance, allowing for higher throughput and lower latencies for common small BI queries. They achieve this by combining cutting-edge algorithms with Google’s latest innovations across storage, compute, and networking. This is just one of many performance improvements that we’re continually delivering to BigQuery customers, alongside enhanced vectorization, history-based optimizations, optional job creation mode, column metadata index (CMETA), and others.
Now that both key pillars of the advanced runtime are in public preview, all you have to do to test it with your workloads is enable it using the single ALTER PROJECT command as documented here. This enables both enhanced vectorization and short query optimizations. If you already did this earlier for enhanced vectorization, your project is also automatically enabled for short query optimizations.
Try it now with your own workload following steps in BigQuery advanced runtime documentation here, and share your feedback and experience with us at bqarfeedback@google.com.
In a world where data is your most valuable asset, protecting it isn’t just a nice-to-have — it’s a necessity. That’s why we are thrilled to announce a significant leap forward in protecting the data in your Cloud SQL instances, with Enhanced Backups for Cloud SQL.
This powerful new capability integrates the Google Cloud Backup and DR Service directly into Cloud SQL, providing a robust, centralized, and secure solution to help ensure business continuity for your database workloads. The Backup and DR Service already protects Compute Engine VMs, Persistent Disk, and Hyperdisk, and this integration extends that protection to your Cloud SQL workloads.
Modern defense for modern threats
Enhanced Backups for Cloud SQL provides advanced protection by storing database backups in logically air-gapped and immutable backup vaults. Managed by Google and completely separate from your source project, these vaults provide a critical defense against threats that could compromise your entire environment.
For customers like JFrog, Cloud SQL Enhanced Backup with Google Cloud Backup and DR is proving to be a superior and robust alternative:
“Using this integration will help us significantly bolster our security posture by offering logically air-gapped and immutable backup vaults, creating a vital defense layer against diverse data-loss scenarios.” – Shiran Melamed, DevOps Group Leader, JFrog
Control, compliance, and peace of mind
We designed Enhanced Backups to be both powerful and easy to use, giving you fine-grained control over your data protection strategy. These capabilities are now available in Preview for both Cloud SQL Enterprise and Enterprise Plus editions, and offer key features to help ensure your data is always secure and recoverable:
Immutable, air-gapped vaults: Protect your data with immutable backups stored in a secure, logically air-gapped vault. Setting minimum enforced retention and retention locks ensures backups cannot be deleted or changed for a predefined period, while a zero-trust access policy provides granular control.
Business continuity: Your data is safeguarded against both source-instance and source-project deletion, so you can recover your data even if the source project itself becomes unavailable.
Flexible policies that fit your needs: Your business isn’t one-size-fits-all, and your backup strategy shouldn’t be either. We offer highly customizable backup schedules, including hourly, daily, weekly, monthly, and yearly options. You can store backups for periods ranging from days to decades.
Centralized command and control: Manage everything from a single, unified dashboard in the Google Cloud console. Monitor job status, identify unprotected resources, and generate reports, all in one place.
But you don’t have to take our word for it. See how customers like SQUARE ENIX and Rotoplas are already benefiting from Enhanced Backups for Cloud SQL:
“At SQUARE ENIX, protecting our users’ data is paramount. Google Cloud SQL’s Enhanced Backup integrated with the Backup and DR service is essential to our resiliency strategy. Its robust protection against instance- and even project-level deletion, combined with a secure, isolated vault and long-term retention, provides a critical safeguard for our most valuable asset. This capability will give us confidence in our data’s integrity and recoverability, allowing our teams to focus on creating the unforgettable experiences our users expect.” – Kazutaka Iga, SRE, SQUARE ENIX
“Google Cloud SQL’s Enhanced Backup feature along with Google Professional Services support is a value add to our backup strategy at Rotoplas. The ability to centralize management, flexibly schedule backups, and store them independent of the source project gives us unprecedented control. This streamlined approach simplifies our operations and enhances security, ensuring our data is always protected and easily recoverable.” – Agustín Chávez Cabrera, DevOps Manager, Rotoplas
Get started with Enhanced Backups
Getting started with Enhanced Backups is simple. Here’s how you can enable this enhanced protection for your Cloud SQL instances:
1. Create or select a backup vault: In the Backup and DR service, either create a new backup vault or use an existing one.
2. Create a backup plan: Define a backup plan for Cloud SQL within your chosen backup vault, setting your desired backup frequency and retention rules.
3. Apply the backup plan to the Cloud SQL instances: Apply your new backup plan to existing or new Cloud SQL instances.
Once you apply a backup plan, your backups will automatically be scheduled and moved to the secure backup vault based on the rules you defined. The entire experience can be managed through the tools you already use — whether it’s the Google Cloud console, gcloud command-line tool, or APIs — so there’s no additional infrastructure for you to deploy or manage.
Protect your data now
With Enhanced Backups for Cloud SQL, you can build a superior data protection strategy that enhances security, simplifies operations, and strengthens your overall data resilience for Cloud SQL instances.
Get started and use it yourself. The new features are available now in supported regions.
Experience the new management solution in the console.
Amazon OpenSearch Serverless now offers automatic semantic enrichment, a breakthrough feature that simplifies semantic search implementation. You can now boost your search relevance with minimal effort, eliminating complex manual configurations through an automated setup process.
Semantic search goes beyond keyword matching by understanding the context and meaning of search queries. For example, when searching for “how to treat a headache,” semantic search intelligently returns relevant results about “migraine remedies” or “pain management techniques” even when these exact terms aren’t present in the query.
Previously, implementing semantic search required ML (Machine Learning) expertise, model hosting, and OpenSearch integration. Automatic semantic enrichment simplifies this process dramatically. With automatic semantic enrichment, you simply specify which fields need semantic search capabilities. OpenSearch Service handles all semantic enrichment automatically during data ingestion.
The feature launches with support for two language variants: English-only and Multi-lingual, covering 15 languages including Arabic, Chinese, Finnish, French, Hindi, Japanese, Korean, Spanish, and more. You pay only for actual usage during data ingestion, with no ongoing costs for storage or search queries.
This new feature is automatically enabled for all serverless collections and is now available in the following regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (Spain), Europe (Stockholm). To get started, visit our technical documentation, read our blog post, and see Amazon OpenSearch Service semantic search pricing. Check the AWS Regional Services List for availability in your region.
AWS Private Certificate Authority (AWS Private CA) now supports AWS PrivateLink with all AWS Private CA Federal Information Processing Standard (FIPS) endpoints that are available in commercial AWS Regions and the AWS GovCloud (US) Regions. With this launch, you can establish a private connection between your virtual private cloud (VPC) and AWS Private CA FIPS endpoints instead of connecting over the public internet, helping you meet your organization’s business, compliance, and regulatory requirements to limit public internet connectivity.
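A minimal boto3 sketch of creating an interface VPC endpoint for a Private CA FIPS endpoint follows; the FIPS service name shown is an assumption (list the endpoint services in your Region to confirm the exact name), and the VPC, subnet, and security group IDs are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# List the endpoint services available in this Region and pick the acm-pca FIPS entry.
services = ec2.describe_vpc_endpoint_services()["ServiceNames"]
service_name = "com.amazonaws.us-east-1.acm-pca-fips"  # assumption; verify against `services`

ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",               # placeholder
    ServiceName=service_name,
    SubnetIds=["subnet-0123456789abcdef0"],      # placeholder
    SecurityGroupIds=["sg-0123456789abcdef0"],   # placeholder
    PrivateDnsEnabled=True,
)
```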
AWS Private CA offers FIPS endpoints in the following AWS Regions: US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Canada (Central), Canada West (Calgary), AWS GovCloud (US-East), and AWS GovCloud (US-West).
AWS announces the general availability of Automated Reasoning checks, a safeguard within Amazon Bedrock Guardrails that uses formal verification techniques to validate the accuracy and policy compliance of outputs from generative AI models. Automated Reasoning checks deliver up to 99% accuracy at detecting correct responses from LLMs – giving you provable assurance in detecting AI hallucinations while also assisting with ambiguity detection in model responses.
Automated Reasoning checks provides a fundamentally different approach from traditional testing methods. Unlike sampling outputs for quality, Automated Reasoning checks offers mathematically rigorous guarantees that AI responses adhere to defined business rules and domain knowledge. This is especially valuable for enterprises in regulated industries that require unambiguous validation of AI outputs before deployment.
Automated Reasoning checks in Amazon Bedrock Guardrails is now generally available in the US (N. Virginia), US (Ohio), US (Oregon), Europe (Frankfurt), Europe (Ireland), and Europe (Paris) Regions. Customers can access the service through the Amazon Bedrock console, as well as the Amazon Bedrock Python SDK.
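For illustration, the sketch below uses the boto3 ApplyGuardrail operation to validate a candidate model response against a guardrail; it assumes you have already created a guardrail with an Automated Reasoning policy attached, and the guardrail ID, version, and text are placeholders.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Validate a model output against a guardrail (assumed to have an Automated
# Reasoning policy attached). Identifier, version, and text are placeholders.
response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier="gr-placeholder",
    guardrailVersion="1",
    source="OUTPUT",
    content=[{"text": {"text": "Employees accrue 30 vacation days in year one."}}],
)

print(response["action"])              # e.g. GUARDRAIL_INTERVENED or NONE
for assessment in response.get("assessments", []):
    print(assessment)                  # inspect the policy findings
```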
To learn more about Automated Reasoning checks and how you can integrate it into your generative AI workflows, please visit the Amazon Bedrock documentation. You can also reach out to your AWS account team or an AWS Solutions Architect to discuss your specific use case and requirements. CloudFormation support will be coming soon. Read the news blog, review the documentation, or visit the Guardrails webpage to learn more.
Editor’s note: Today’s post is by Asad Rahman, Director of IT for Wayfair, a leading e-commerce company specializing in furniture and home goods. Wayfair chose ChromeOS and ChromeOS Flex devices to support contact center staff, improve productivity, and streamline security tasks.
“ChromeOS is a natural fit for how Wayfair operates today. We’ve embraced Google Workspace and Gemini to power collaboration and insights, and our teams are fast-moving and globally distributed. ChromeOS keeps pace with us, providing a modern infrastructure that’s simple, scalable, and streamlined.”
Snapshot:
Easy deployment across multiple setups: Successfully transitioned 127 employees across 150+ devices with ChromeOS Flex.
Cost savings: Potential savings of over $1.2 million over the next 6 months.
Improved security: ChromeOS’s layered security model, automatic updates, and sandboxing have dramatically reduced the attack surface.
As the destination for all things home, Wayfair’s goal is to make it easy for customers to create a home that is just right for them. Delivering on that promise for millions of customers and an online catalog of over 30 million items requires scalable and secure technology that empowers our global teams. ChromeOS has been a reliable partner in our operations for years, powering digital displays in our offices and serving our retail and grab-and-go spots. We are now building on that success by expanding our ChromeOS deployment to our call centers, further enhancing the productivity and collaboration that are crucial to providing a seamless customer experience.
Recognizing the need for change
For years, we wrestled with a frustrating paradox. We had fully embraced Google Workspace, leveraging tools like Gemini and NotebookLM. Yet our endpoints were still stuck on a different platform. We had a vision to provide a cost-efficient, cloud-based, remote, secure environment that would eliminate expensive VDI licenses and complex engineering overhead, improve user experience, and align with a zero-trust architecture for future workforce expansion. Our users needed to collaborate seamlessly and do their jobs efficiently, but the solution we had just wasn’t cutting it. Sales and service agents, along with our BPO partners, faced daily performance and reliability challenges. Log-in times were a nightmare, often upwards of 10 minutes in our virtualized environments, and live updates would bog down productivity in the middle of the workday. Our end users were constantly asking for something simpler, something that just worked.
On the IT side, we were stretched thin. Most of our resources and time were spent maintaining our stack instead of innovating. A reduction in personnel coupled with the departure of our lead IT architect further underscored the need for a solution that was inherently easier to manage and less resource-intensive. We knew we needed a change, and we needed it fast.
Easy, secure, and built for our future
The decision to lean into ChromeOS for our call centers felt like a natural evolution. Given our existing familiarity in other areas, and our grounding in Google Workspace, it just made sense. We needed a user experience that was intuitive and easy while being simple to manage for our IT team. The security and ease of deployment, while not our initial driving force, quickly became a critical factor once we saw the potential for a broader rollout.
We decided to launch ChromeOS in our Georgia-based call center. The good news? We could leverage our existing devices by simply deploying ChromeOS Flex. This meant a smoother transition with less capital expenditure. We successfully transitioned 127 employees across over 150 devices.
Easy deployment across multiple setups
The deployment process was remarkably easy. Our call center agents were already somewhat familiar with the ChromeOS interface, which certainly helped. ChromeOS worked seamlessly with tools that our users depend on, like voice-over-internet communication and workplace management tools. Additionally, we can rest assured that our business process outsourcing (BPO) partners have secure remote access to company resources, since ChromeOS works with various hardware setups, including BYOD. The feedback we received has been stellar. We are now expanding beyond our in-office agents and deploying ChromeOS to our remote workers, both stateside and globally.
$1.2 million in savings
The great news is that our initial estimates for moving to ChromeOS are looking even better than we projected! We’re now looking at a potential savings of more than $1.2 million over the next 6 months. With ChromeOS, we’ve also successfully reduced our fixed costs, meaning we actually see savings when we reduce users, a significant improvement. Plus, we’re no longer tied into those long-term, expensive, iron-clad contracts, giving us much more flexibility.
This transition has been great for our IT team as well. Previously, the engineering effort to keep everything running was pretty complex and expensive, but ChromeOS requires fewer resources from them. This is a stark contrast to our old system, where users understandably faced frustration with performance slowdowns, especially audio and lag issues. Those problems resulted in a huge number of support issues—we were dealing with over 7,500 tickets a year! The shift to ChromeOS has streamlined things for everyone.
Don’t send me back: The unmistakable benefits of ChromeOS
The results have been nothing short of transformative. Overall feedback was positive, with users praising ChromeOS’s simplicity and speed, and the elimination of VDI screen switching. For years, VDI issues were consistently one of our top three IT tickets. Since moving to ChromeOS, those tickets have been reduced to zero. Now, if there’s an issue, it’s almost always user error, not a problem with the underlying technology.
I’ve heard direct quotes from our users like, “Don’t send me back!” That’s the kind of enthusiasm that validates every decision we made. Log-in times are now measured in seconds, not the agonizing 10 minutes they once were. We’ve also seen less latency on certain applications, which has made a real difference in our agents’ ability to serve our customers efficiently.
And when it comes to security? ChromeOS’s security model, automatic updates, and sandboxing have dramatically reduced our attack surface. Our security team has provided positive feedback, noting the peace of mind that comes with knowing devices are inherently more secure, with less manual intervention required on their end. The built-in protection is a game-changer for a company of our size and with our current IT resources.
Looking ahead, we’ll begin assessing the remaining call center population, with the goal of replacing all legacy laptops and desktops with ChromeOS by the end of the year. Embracing ChromeOS has been one of the best decisions we’ve made. It’s not just about simpler IT; it’s about empowering our employees, improving productivity, and building a more secure and resilient foundation for Wayfair, one “Don’t send me back” at a time.
The world is not just changing; it’s being re-engineered in real-time by data and AI. The way we interact with data is undergoing a fundamental transformation, moving beyond human-led analysis to a collaborative partnership with intelligent agents. This is the agentic shift, a new era where specialized AI agents work autonomously and cooperatively to unlock insights at a scale and speed that was previously unimaginable. At Google Cloud, we’re not just participants in this shift — we are building the core intelligence, interconnected ecosystems, and AI-native data platforms that power it.
To make this agentic reality possible, you need a different kind of data platform — not a collection of siloed tools, but a single, unified, AI-native cloud. That’s Google’s Data Cloud. At its heart are our unified analytical and operational engines, which remove the historic divide between business transaction data and strategic analysis. Google Data Cloud provides agents with a complete, real-time understanding of the business, transforming it from a collection of processes into a self-aware, self-tuning, reliable organization.
Today, we are delivering major innovations across three key areas that bring this vision to life:
A new suite of data agents: specialized AI agents designed to act as expert partners for every data user, from data scientists and engineers to business analysts.
An interconnected network for agent collaboration: a suite of APIs, tools, and protocols that allow developers to integrate Google agents with their own agents and AI efforts, creating a single, intelligent ecosystem.
A unified, AI-native foundation: a platform that enables intelligent agents by unifying data, providing persistent memory, and embedding AI-driven reasoning.
Specialized data agents as expert partners
The agentic era begins with a new workforce of specialized AI agents, providing an AI-native interface to turn intent into action.
For data engineers: We are introducing the Data Engineering Agent (Preview) in BigQuery to simplify and automate complex data pipelines. You can now use natural language prompts to streamline the entire workflow, from data ingestion from sources like Google Cloud Storage to performing transformations and maintaining data quality. Simply describe what you need — “Create a pipeline to load a CSV file, cleanse these columns, and join it with another table” — and the agent generates and orchestrates the entire workflow.
Fig. 1 – Data engineering agent for automation of complex data pipelines
For data scientists: We are reimagining an AI-first Colab Enterprise Notebook experience available in BigQuery and Vertex AI, featuring a new Data Science Agent (Preview). Powered by Gemini, the Data Science Agent triggers entire autonomous analytical workflows, including exploratory data analysis (EDA), data cleaning, featurization, machine learning predictions, and much more. It creates a plan, executes the code, reasons about the results, and presents its findings, all while allowing you to provide feedback and collaborate in sync.
Fig. 2 – Data science agent to transform each stage of data science tasks
For business users and analysts: Last year, we introduced the Conversational Analytics Agent, empowering users to get answers from their data using natural language. Today, we’re taking that agent to the next level, with our Code Interpreter (Preview). This enhancement supports the many critical business questions that go beyond what simple SQL can answer — for example, “Perform a customer segmentation analysis to group customers into distinct cohorts?” Powered by Gemini’s advanced reasoning capabilities, and developed in partnership with Google DeepMind, the Code Interpreter translates complex natural language questions into executable Python code. It delivers a complete analytical flow — generating code, providing clear natural language explanations, and creating interactive visualizations — all within the governed and secure environment of the Google Data Cloud.
Fig 3 – Conversational Analytics with Code Interpreter for advanced analysis
Building the interconnected agent ecosystem
The agentic ecosystem is not a closed loop; it’s an open platform for builders. The true potential of the agentic shift is realized when developers not only use existing agents, but also extend and connect them to their own intelligent systems, creating a broader network. Our first-party agents provide powerful, out-of-the-box capabilities as well as foundational building blocks including APIs, tools, and protocols to build custom agents, integrate conversational intelligence into existing applications, and orchestrate complex, multi-agent workflows that solve unique business problems.
To enable this, we are launching the Gemini Data Agents APIs, with the first being the new Conversational Analytics API (Preview). This API provides the building blocks to integrate Looker’s powerful natural language processing and Code Interpreter capabilities directly into your own applications, products, and workflows. This allows you to create unique, engaging, and accessible data experiences that meet your specific business needs.
Beyond conversational experiences, we are providing the tools to create custom agents from the ground up. Our new Data Agents API and the Agent Development Kit (ADK) allow you to build specialized agents tailored to your unique business processes. The foundation for all this secure interaction is our investment in Model Context Protocol (MCP), including the MCP Toolbox for Databases and the addition of the new Looker MCP Server (Preview).
Fig 4 – Gemini CLI querying semantic layer from Looker MCP Server
A unified and AI-native data foundation
Intelligent agents and the networks they form cannot operate on a traditional data stack. They need a cognitive foundation that unifies data from across the enterprise and provides new capabilities to understand meaning and provide a persistent memory to reason against.
A core requirement of this AI-native foundation is that it unifies live transactional and historical analytical data stored in OLTP and OLAP systems. We started down this path with a columnar engine for AlloyDB to supercharge analytics for PostgreSQL workloads. Today, we are extending that performance commitment to our flagship scale-out database with the new Spanner columnar engine (Preview); analytical queries on Spanner columnar engine perform up to 200× faster than on Spanner’s row store on the SSD tier — right on your transactional data. As part of our unified Data Cloud, this innovation directly benefits our analytical engine, BigQuery via Data Boost, which leverages the Spanner columnar engine to close the gap between transactional and analytical workloads and make it faster for BigQuery to analyze live operational data.
With this unified data plane in place, the next requirement is giving agents a comprehensive memory that is grounded in your company’s factual data. To ensure trustworthy agents and prevent hallucinations, they must use a technique called Retrieval-Augmented Generation (RAG). The foundation of effective RAG is vector search that spans both real-time operational data and deep historical, analytical data. This is why we embed vector search and generation capabilities directly into our data foundations — to give agents access to both transactional and analytical memory.
However, optimizing vector search is complex, often forcing developers to make tough trade-offs between performance, quality, and operational overhead. In AlloyDB AI, new capabilities like adaptive filtering (Preview) solve this for transactional memory, automatically maintaining vector indexes and optimizing for fast queries on live operational data. To provide deep analytical memory, we are also bringing autonomous vector embeddings and generation to BigQuery. Now, BigQuery can automatically prepare and index multimodal data for vector search, a crucial step in building a rich, long-term semantic memory for your agents.
Finally, on top of this unified and accessible data, we are embedding AI reasoning directly into our query engines. With the new AI Query Engine in BigQuery (Preview), all data practitioners can perform AI-powered computations on structured and unstructured data right inside BigQuery, quickly and easily getting answers to subjective questions like “Which of these customer reviews sound the most frustrated?”
AI Query Engine brings the power of LLMs directly to SQL
The future is agentic
The announcements today — from specialized agents for every user to the AI-native foundation that powers them — are more than just a roadmap. They are the building blocks for the new agentic enterprise. By bringing together a new workforce of intelligent agents, enabling them to collaborate within an open and interconnected network, and grounding them in a unified data cloud that erases the line between operational and analytical worlds, we are providing a platform that lets you be an innovator, not just an integrator. This is a fundamental shift in how your organization will interact with its data, moving from complex human-led analysis to a powerful partnership between your teams and intelligent agents. The agentic era is here. We are incredibly excited to see what you will build, and we invite you to join us on this journey to redefine what’s possible with data.
For years, organizations have struggled with the workload conflict between online transaction processing (OLTP) and analytical query processing. OLTP systems such as Spanner are optimized for high-volume, low-latency transactions, and use row-oriented storage that’s efficient for individual record access. Analytical workloads, conversely, require rapid aggregations and scans across large datasets. These tasks are traditionally handled by separate data warehouses that employ columnar storage and incoming data pipelines from transaction systems. Separating OLTP and analytical workflows requires periodic data transfers, which often leads to stale data, complex ETL pipelines, and operational overhead.
Today, we’re thrilled to announce Spanner columnar engine, which brings new analytical capabilities directly to Spanner databases. Just as AlloyDB’s columnar engine enhanced PostgreSQL analytics, Spanner’s new columnar engine lets you analyze vast amounts of operational data in real-time, all while maintaining Spanner’s global consistency, high availability, and strong transactional guarantees — and without impacting transactional workloads.
The power of the Spanner columnar engine helps organizations such as Verisoul.ai eliminate the data silos that typically arise when combining high-volume transactional systems with fast analytics. “Detecting fraud in real time is only half the story—showing customers the ‘why’ helps them act faster and turn trust into measurable ROI,” say Raine Scott and Niel Ketkar, founders of Verisoul.ai, a machine-learning platform that stops fake users and fraud. “Spanner’s new columnar engine allows high-velocity transactional writes and rich analytics in one place, eliminating data copies and replication lag so customers get instant answers.”
Columnar storage meets vectorized execution
Figure: Spanner columnar engine architecture
The heart of the Spanner columnar engine is its innovative architecture, which combines columnar storage with vectorized query execution.
Columnar storage in Spanner: a hybrid architecture
Unlike traditional row-oriented storage, where an entire row is stored contiguously, columnar storage stores data column by column. This offers several advantages for analytical workloads:
Reduced I/O: Analytical queries often access only a few columns at a time. With columnar storage, only the relevant columns need to be read from disk, significantly reducing I/O operations.
Improved compression: Data within a single column is typically of the same data type and often exhibits similar storage patterns, leading to much higher compression ratios. This means more data can fit in memory and fewer bytes need to be read.
Efficient scans: Consecutive values within a column can be processed together, speeding up data processing.
Spanner columnar engine integrates a columnar format alongside its existing row-oriented storage. This unified transactional and analytical processing design allows Spanner to maintain its OLTP performance while accelerating analytical queries up to 200X on your live operational data.
Vectorized execution: turbocharging your queries
To complement columnar storage, the columnar engine makes use of Spanner’s vectorized execution capabilities. While traditional query engines process data tuple-by-tuple (row by row), a vectorized engine processes data in batches (vectors) of rows. This approach dramatically improves CPU utilization, with:
Reduced function call overhead: Instead of calling a function for each individual row, vectorized engines call functions once for an entire batch, significantly reducing overhead.
Optimized memory access: Vectorized processing often results in more cache-friendly memory access patterns, further boosting performance.
The combination of columnar storage and vectorized execution means that analytical queries on Spanner can run orders of magnitude faster, allowing for real-time insights on your global-scale data.
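For example, a query can opt into the columnar path with a statement hint; the minimal sketch below uses a hypothetical orders table, and the same @{scan_method=columnar} hint appears in the sample queries later in this post.

SQL
-- Hypothetical table; the hint requests the columnar scan path for this statement.
@{scan_method=columnar}
SELECT o_orderstatus, COUNT(*) AS order_count, SUM(o_totalprice) AS total_revenue
FROM orders
GROUP BY o_orderstatus;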
Better with BigQuery: Accelerating federated queries
The Spanner columnar engine takes its integration with Google’s Data Cloud ecosystem a step further, specifically enhancing the integration between Spanner and BigQuery. For enterprises that leverage BigQuery for data warehousing and analytics, federating queries directly to Spanner has always been a valuable capability. Now, with the Spanner columnar engine, this integration becomes even more potent by delivering faster insights on operational data.
Data Boost, Spanner’s fully managed, elastically scalable compute service for analytical workloads, is at the forefront of this acceleration. When BigQuery issues a federated query to Spanner, and that query can benefit from columnar scans and vectorized execution, Data Boost automatically leverages the Spanner columnar engine. This provides:
Faster analytical insights: Complex analytical queries initiated from BigQuery that target your Spanner data execute significantly faster, bringing near-real-time operational data into your broader analytical landscape.
Reduced impact on OLTP: Data Boost helps ensure that analytical workloads are offloaded from your primary Spanner compute resources, preventing impact on transactional operations.
Simplified data architecture: You can get the best of both worlds – Spanner’s transactional consistency and BigQuery’s analytical prowess – without the need for complex ETL pipelines to duplicate data.
This integration empowers data analysts and scientists to combine Spanner’s live operational data with other datasets in BigQuery for richer, more timely insights and decision-making.
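As a rough sketch of what this looks like from the BigQuery side, a federated query uses EXTERNAL_QUERY with a Spanner connection. The connection name and tables below are hypothetical, and Data Boost is enabled in the connection’s configuration rather than in the SQL itself.

SQL
-- Hypothetical BigQuery connection to a Spanner database, with Data Boost enabled
-- on the connection. The inner query runs on Spanner; the outer query joins the
-- live operational rows with warehouse data in BigQuery.
SELECT
  w.customer_segment,
  SUM(o.amount) AS live_order_total
FROM EXTERNAL_QUERY(
  'my-project.us.spanner_orders_connection',
  'SELECT customer_id, amount FROM orders WHERE order_date = CURRENT_DATE()'
) AS o
JOIN demo_dataset.customer_segments AS w
  ON w.customer_id = o.customer_id
GROUP BY w.customer_segment;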
Columnar engine in action: Accelerating your analytical queries
Let’s look at some sample queries that should see significant acceleration with the Spanner columnar engine. These types of queries, common in analytical and graph workloads, benefit from columnar scans and vectorized processing.
Scenario: Imagine a large e-commerce database; for demonstration purposes, we’ll use the same schema as the TPC-H benchmark.
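For reference, an abbreviated Spanner schema for the lineitem table, trimmed to just the columns the sample queries below touch, might look like the following sketch; the full TPC-H schema has many more columns.

SQL
-- Abbreviated TPC-H lineitem table (only the columns used in the queries below).
CREATE TABLE lineitem (
  l_orderkey      INT64 NOT NULL,
  l_linenumber    INT64 NOT NULL,
  l_quantity      NUMERIC,
  l_extendedprice NUMERIC,
  l_discount      NUMERIC,
  l_tax           NUMERIC,
  l_shipdate      DATE
) PRIMARY KEY (l_orderkey, l_linenumber);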
Query 1: Revenue from discounted shipments in a given year
SQL
@{scan_method=columnar}
SELECT
  sum(l.l_extendedprice * l.l_discount) AS revenue
FROM
  lineitem l
WHERE
  l.l_shipdate >= date '1994-01-01'
  AND l.l_shipdate < date_add(date '1994-01-01', INTERVAL 1 year)
  AND l.l_discount BETWEEN 0.08 - 0.01 AND 0.08 + 0.01
  AND l.l_quantity < 25;
Acceleration: This query heavily benefits from scanning only the l_shipdate, l_extendedprice, l_discount, and l_quantity columns from the lineitem table. Vectorized execution rapidly applies the date, discount, and quantity filters to identify qualifying rows.
Acceleration: A similar query that filters on l_discount = 0 benefits heavily from scanning only the l_discount and l_quantity columns from the lineitem table. Vectorized execution rapidly applies the equality filter to identify matching rows.
Query 3: Item count and discount range for specific tax brackets
SQL
code_block
@{scan_method=columnar}
SELECT
  count(*),
  min(l_discount),
  max(l_discount)
FROM
  lineitem
WHERE
  l_tax IN (0.01, 0.02);
Acceleration: This query benefits heavily from scanning only the l_tax and l_discount columns from the lineitem table. Vectorized execution rapidly applies the IN filter on the l_tax column to identify all matching rows.
Query 4: Scan friend relationships to find the N most connected people in the graph using Spanner Graph
GQL
@{scan_method=columnar}
GRAPH social_graph
MATCH (p:Person)-[k:Knows]->(:Person)
RETURN COUNT(k) AS friend_count GROUP BY p ORDER BY friend_count DESC LIMIT 10;
Acceleration: This query benefits heavily from scanning subgraphs and loading only the relevant columns from the graph.
Query 5: Perform a K-nearest neighbor vector similarity search to retrieve the top 10 most semantically similar embeddings with perfect recall
SQL
@{scan_method=columnar}
SELECT e.Id AS key, COSINE_DISTANCE(@vector_param, e.Embedding) AS distance
FROM Embeddings e
ORDER BY distance LIMIT 10;
Acceleration: This query benefits heavily from scanning contiguously stored vector embeddings and loading only the relevant columns from the table.
Get started with Spanner columnar engine today!
The Spanner columnar engine is designed for businesses looking to unlock faster, deeper, real-time insights from their operational data without compromising Spanner’s foundational strengths. We are incredibly excited about the possibilities this opens up for developers and data analysts alike. We invite you to be among the first to try the Spanner columnar engine. Request access to the Preview of Spanner columnar engine today by signing up at bit.ly/spannercolumnar. We look forward to seeing what you build!
At Google I/O 2025, we announced a new, reimagined AI-first Colab with agentic capabilities, making it a true coding partner that understands your current code, actions, intentions, and goals. Today, we are excited to bring these capabilities to Google Cloud BigQuery and Vertex AI via the Colab Enterprise notebook. Designed to simplify and transform data science and analytics workflows for organizations, the new capabilities in the Colab Enterprise notebook can:
Automate end-to-end data science workflows through the built-in Data Science Agent (DSA), which creates multi-step plans, generates and executes code, reasons about the results, and presents its findings.
Generate, explain, and transform code, as well as explain errors and fix them automatically. The notebook can also provide code assistance while you type.
Create visualizations from simple prompts.
Let’s take a closer look.
Simplify workflows with Data Science Agent
Data science can be complex, iterative, and time-consuming. You must first translate your business problem into a machine learning task, identify and clean raw data, transform it, train a model, evaluate it and then repeat the loop to optimize it. This requires skill and time. The Data Science Agent (DSA) in Colab accelerates data science development with agentic capabilities that facilitate data exploration, transformation and ML modeling.
You start with a simple prompt such as “Train a model to predict ‘income bracket’ from table bigquery-public-data.ml_datasets.census_adult_income” in the notebook chat. The Data Science Agent then generates a detailed plan covering all aspects of data science modeling: data loading, exploration, cleaning, visualization, feature engineering, data splitting, model training and optimization, and evaluation.
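If you want to see what the agent will be working with, the source table in the prompt is a public BigQuery dataset that you can inspect directly; the agent itself generates and runs Python in the notebook, so this query is just a quick way to preview the data.

SQL
-- Preview the public census table referenced in the prompt above.
SELECT *
FROM `bigquery-public-data.ml_datasets.census_adult_income`
LIMIT 5;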
You can accept, cancel, or modify this plan. The generated code is executed on the Colab runtime. If the agent makes an error, it can autocorrect and generate new code to rectify it. You maintain full control, approving each step, and can make manual edits if desired. This iterative approach ensures transparency and trust.
The agent also has full contextual awareness of your notebook, understanding existing code, outputs, and variables to provide tailored code for each step of the plan, allowing you to also make iterative changes to your existing code.
Data Science Agent helps simplify workflows
Once you are satisfied with the notebook you’ve co-developed with AI, you can then schedule it for automated runs, or use it in a multi-step DAG with BigQuery Pipelines.
Multi-cell code generation for anything you want to do with data
AI-first Colab Enterprise notebooks also support code generation for a wide range of tasks and follow the same interaction pattern as the Data Science Agent mentioned above. For example, using the chat interface you can prompt it to:
Generate code for arbitrary Python-based data transformation, visualization, and analytics (e.g., running a causal analysis).
Manage the Colab environment (e.g., install new libraries).
Generate code for interacting with other Google Cloud services (e.g., manage a function deployment to Cloud Run).
The human-in-the-loop interaction design allows for approval, changes and editing of the generated code.
Code generation using the chat interface
You can also transform your existing code. Simply describe a change in natural language (e.g., “add error handling to this data loading function” or “refactor this monolithic function into smaller, more modular parts”) and the agent will identify and modify the relevant code for you.
Easy visualizations
The Python visualization ecosystem is rich, with many choices such as Matplotlib, Seaborn, and Plotly. While these already work well in Colab notebooks, using them requires writing boilerplate code and a high degree of familiarity with the library to get a chart with good fit and finish.
AI-first Colab Enterprise notebooks excel at generating Python code for such visualizations. Simply start with a prompt like “Generate a chart displaying…”, referencing your data source, which can be a BigQuery table, a local DataFrame in Colab, or even an uploaded file. Then approve and run the code to have the visualization generated for you. To modify the visualization, for example to switch an axis to a log scale or change a chart’s colors, simply prompt for the incremental changes and the agent will adjust the code to your needs.
Generate Python code for easy visualizations
Explaining and fixing errors
Colab has a built-in error explanation and fixing flow. If an AI-generated or user-authored code cell runs into an error, you can click the ‘Explain Error’ shortcut to open the notebook chat, which explains the error and generates remediation code in a diff view for your approval.
Explain and fix errors
Fast and intelligent code completion
Code completion in Colab Enterprise offers implicit suggestions as you type, accelerating your workflow by reducing keystrokes. Accept suggestions with a tab or modify them.
Code completion in Colab Enterprise
Get started today
The AI-first Colab Enterprise with its Data Science Agent is transforming how data professionals work. Across BigQuery and Vertex AI, the Colab Enterprise experience is seamless, and notebooks are interoperable regardless of where they were created.
The AI-first notebook experience with Data Science Agent is currently available in US and Asia regions in Preview and will be rolled out to other Google Cloud regions in the coming days.
If you have a feature request, a question on availability in your region or feedback, reach us at vertex-notebooks-previews-external@google.com or fill out this form.