Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M7i instances powered by custom 4th Gen Intel Xeon Scalable processors (code-named Sapphire Rapids) are available in the Middle East (UAE) region. These custom processors, available only on AWS, offer up to 15% better performance over comparable x86-based Intel processors utilized by other cloud providers.
M7i instances deliver up to 15% better price-performance compared to M6i. They are a great choice for workloads that need the largest instance sizes or continuous high CPU usage, such as gaming servers, CPU-based machine learning (ML), and video streaming. M7i instances offer larger sizes, up to 48xlarge, and two bare-metal sizes (metal-24xl, metal-48xl). These bare-metal sizes support built-in Intel accelerators (Data Streaming Accelerator, In-Memory Analytics Accelerator, and QuickAssist Technology) that offload and accelerate data operations to optimize workload performance.
Storage buckets are where your data lives in the cloud. Much like digital real estate, these buckets are your own plot of land on the internet. When you move away and no longer need a specific bucket, someone else can reuse the plot of land it refers to — if the old address is still accessible to the public.
This is the core idea behind a dangling bucket attack. It happens when you delete a storage bucket, but references to it still exist in your application code, mobile apps, and public documentation. An attacker can then simply claim the same bucket name in their own project, effectively hijacking your old address to potentially serve malware and steal data from users who unknowingly still rely on a bucket that is no longer officially in use.
Fortunately, you can protect your applications and users from this threat with the following four steps.
First, implement a safe decommissioning plan
When you delete a bucket, do so carefully. A deliberate decommissioning process is critical. Before you type gcloud storage rm, follow these steps:
Step 1: Audit and learn
Before deleting anything, take the time to understand who and what are still accessing the bucket. Use logs to check for recent traffic. If you see active requests coming from old versions of your app, third-party services, or users, investigate them. Pay extra attention to requests attempting to pull executable code, machine learning models, dynamic web content (such as JavaScript), and sensitive configuration files.
By checking the requester's user agent, you might find that many requests come from bots, data crawlers, and scanners. Their requests are essentially background noise: they don't represent legitimate traffic from your applications and users, don't indicate that the bucket is actively required for your systems to function correctly, and can be safely disregarded.
Step 2: Delete with confidence
Many automated processes and user activities don't happen every day, so it's important to wait at least a week before deleting the bucket. Waiting this long increases your confidence that you've observed a full cycle of activity, including:
Weekly reports: Scripts that generate reports and perform data backups on a weekly schedule.
Batch jobs: Automated tasks that might only run over the weekend or on a specific day of the week.
Infrequent user access: Users who may only use a feature that relies on the bucket’s data once a week.
After you've verified that no legitimate traffic has hit the bucket for at least a week and you've updated all of your legacy code, you can proceed with deleting the bucket. Keep in mind that deleting a Google Cloud project effectively deletes all resources associated with it, including all Google Cloud Storage buckets.
Next, find and fix code that references dangling buckets
Preventing future issues is key, but you may have references to dangling buckets in your environment right now. Here’s a plan to hunt them down and fix them.
Step 3: Proactive discovery
Analyze your logs: This is one of your most powerful tools. Query your Cloud Logging data for server and application logs showing repeated 404 Not Found errors for storage URLs. For example, a high volume of failed requests to the same non-existent bucket name is a major red flag (to remediate it, continue with Step 3 and then proceed to Step 4).
Scan your codebase and documentation: Perform a comprehensive scan of all your private and open-source code repositories (including old and archived ones), configurations, and documentation for any references to storage bucket names that may no longer be in use. One way to find them is to look for common bucket URL patterns such as gs://{bucket-name}, {bucket-name}.storage.googleapis.com, and storage.googleapis.com/{bucket-name}.
You can check whether a bucket still exists by querying https://storage.googleapis.com/{your-bucket-name}. If the response contains NoSuchBucket, you have identified a dangling bucket reference:
```xml
<Error>
  <Code>NoSuchBucket</Code>
  <Message>The specified bucket does not exist.</Message>
</Error>
```
If the bucket exists (and you do not get a NoSuchBucket error), you should verify that it actually belongs to your organization — a threat actor may have already claimed the name.
The easiest way to check for ownership is to try to read the bucket’s Identity and Access Management (IAM) permissions.
If you run a command like gcloud storage buckets get-iam-policy gs://{bucket-name} and receive an Access Denied or 403 Forbidden error, this is a sign that the bucket is claimed by someone else. It proves the bucket exists, but your account doesn't have permission to manage it — indicating it has been taken over. Treat this reference as a risk and remove it.
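To make this concrete, here is a minimal Python sketch of that two-step check (existence, then ownership). It assumes the requests library is installed and that the gcloud CLI is installed and authenticated; the bucket name is a placeholder.

```python
import subprocess

import requests

BUCKET = "your-old-bucket-name"  # placeholder

# Step 1: does the bucket still exist at all?
resp = requests.head(f"https://storage.googleapis.com/{BUCKET}")
if resp.status_code == 404:
    print("Dangling reference: the bucket no longer exists and its name can be claimed.")
else:
    # Step 2: it exists, but is it yours? Reading the IAM policy only succeeds
    # for accounts the owning project has authorized.
    result = subprocess.run(
        ["gcloud", "storage", "buckets", "get-iam-policy", f"gs://{BUCKET}"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        print("Bucket exists but this account cannot manage it; treat the reference as a risk.")
    else:
        print("Bucket exists and belongs to a project you can manage.")
```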
For your convenience, we provide a script below that can find dangling references in a given file.
```python
import re
import sys
from typing import Optional, Set

import requests
from requests.exceptions import RequestException


def check_bucket(bucket_name: str) -> Optional[requests.Response]:
    try:
        with requests.Session() as session:
            response = session.head(f"https://storage.googleapis.com/{bucket_name}")
            return response
    except RequestException as e:
        print(f"An error occurred while checking bucket {bucket_name}: {e}")
        return None


def sanitize_bucket_name(bucket_name: str) -> Optional[str]:
    # Remove common prefixes and quotes
    bucket_name = bucket_name.replace("gs://", "")
    bucket_name = bucket_name.replace("\"", "")
    bucket_name = bucket_name.replace("'", "")
    bucket_name = bucket_name.split("/")[0]

    # Validate the bucket name format according to GCS naming conventions
    if re.match("^[a-z0-9-._]+$", bucket_name) is None:
        return None
    return bucket_name


def extract_bucket_names(line: str) -> Set[str]:
    all_buckets: Set[str] = set()

    pattern = re.compile(
        r'gs://([a-z0-9-._]+)|'
        r'([a-z0-9-._]+)\.storage\.googleapis\.com|'
        r'storage\.googleapis\.com/([a-z0-9-._]+)|'
        r'([a-z0-9-._]+)\.commondatastorage\.googleapis\.com|'
        r'commondatastorage\.googleapis\.com/([a-z0-9-._]+)',
        re.IGNORECASE
    )

    for match in pattern.finditer(line):
        # The first non-empty group is the bucket name
        if raw_bucket := next((g for g in match.groups() if g is not None), None):
            if sanitized_bucket := sanitize_bucket_name(raw_bucket):
                all_buckets.add(sanitized_bucket)

    return all_buckets


def main(filename: str) -> None:
    with open(filename, 'r') as f:
        for i, line in enumerate(f, 1):
            bucket_names = extract_bucket_names(line)
            for bucket_name in bucket_names:
                response = check_bucket(bucket_name)
                # check_bucket may return None on network errors
                if response is not None and response.status_code == 404:
                    print(f"Dangling bucket found: {bucket_name} (line {i}), {line}")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python find_dangling_buckets.py <filename>")
        sys.exit(1)

    main(sys.argv[1])
```
Please be aware that this script and these recommendations can only find hardcoded references, not those generated dynamically at runtime. Also, your codebase might contain hardcoded bucket names that don't match these patterns but are still used by Google Cloud Storage clients.
Step 4: Reclaim and secure
If you find a dangling bucket name that might represent a security risk to you or your clients, act fast.
If you do not own the dangling bucket:
Use all available data from the previous step to find dangling buckets and remove any hardcoded references in your code or documentation. Deploy the fix to your users to permanently resolve the issue.
If you own the dangling bucket:
Reclaim the name: Create a new storage bucket with the exact same name in a secure project you control. This prevents an attacker from claiming it.
Lock it down: Apply a restrictive IAM policy to the reclaimed bucket. Deny all access to allUsers and allAuthenticatedUsers and enable Uniform Bucket-Level Access. Enable the Public Access Prevention control to turn the bucket into a private "sinkhole," as sketched below.
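As a rough illustration, the following Python sketch uses the google-cloud-storage client library to reclaim the name and apply these lockdown settings at creation time; the project ID, bucket name, and location are placeholders.

```python
from google.cloud import storage

PROJECT_ID = "my-secure-sinkhole-project"  # placeholder
BUCKET_NAME = "your-old-bucket-name"       # placeholder

client = storage.Client(project=PROJECT_ID)

# Configure the bucket before creation: uniform bucket-level access plus enforced
# public access prevention means allUsers / allAuthenticatedUsers can never be granted access.
bucket = client.bucket(BUCKET_NAME)
bucket.iam_configuration.uniform_bucket_level_access_enabled = True
bucket.iam_configuration.public_access_prevention = "enforced"

# Reclaim the name before an attacker can, turning it into a private "sinkhole".
client.create_bucket(bucket, location="us-central1")  # example location
print(f"Reclaimed and locked down gs://{BUCKET_NAME}")
```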
By building these practices into your development lifecycle and operational procedures, you can effectively close the door on dangling bucket takeovers. Securing your cloud environment is a continuous process, and these steps will add powerful layers of protection for you and your users.
To learn more about managing storage buckets, you can review our documentation here.
AWS announces support for Billing View in AWS Budgets, enabling organizations to create budgets that span multiple member accounts without requiring access to the management account. This integration helps organizations better align spend monitoring with their business structure and operational needs.
With this enhancement, you can create budgets based on filtered views of cost management data, scoped by cost allocation tags or by specific AWS accounts in your organization. For example, engineering leaders can create budgets for applications that span multiple accounts using views filtered by cost allocation tags, while FinOps teams can create organization-wide budgets using unfiltered views – all without requiring management account access. This streamlines budget management while maintaining security best practices by minimizing management account access.
This feature is available in all AWS Regions where AWS Budgets and Billing View are available, except the AWS GovCloud (US) Regions and the China Regions.
To learn more about AWS Budgets and Billing View integration, refer to AWS Budgets and Billing View in the AWS Cost Management User Guide.
As NASA embarks on a new era of human spaceflight, beginning with the Artemis campaign’s aim to return to the Moon, preparations are underway to ensure crew health and wellness. This includes exploring whether remote care capabilities can deliver detailed diagnoses and treatment options if a physician is not onboard or if real-time communication with Earth is limited. Supporting crew health through space-based medical care is becoming increasingly important as NASA missions venture deeper into space.
To address this challenge, Google and NASA collaborated on an innovative proof-of-concept for an automated Clinical Decision Support System (CDSS) known as the "Crew Medical Officer Digital Assistant" (CMO-DA). This AI-powered, multi-modal interface is designed to provide astronauts with medical assistance during extended space missions. The goal is to potentially support human exploration of the Moon, Mars, and beyond.
The CMO-DA tool could help astronauts autonomously diagnose and treat symptoms when crews are not in direct contact with Earth-based medical experts. Trained on spaceflight literature, the AI system uses cutting-edge natural language processing and machine learning techniques to safely provide real-time analyses of crew health and performance. The tool is designed to support a designated crew medical officer or flight surgeon in maintaining crew health and making medical decisions driven by data and predictive analytics.
Initial trials tested CMO-DA on a wide range of medical scenarios. Outputs were measured using the Objective Structured Clinical Examination framework, a tool used to evaluate the clinical skills of medical students and working healthcare professionals. Early results showed promise for reliable diagnoses based on reported symptoms. Google and NASA are now collaborating with medical doctors to test and refine the model, aiming to enhance autonomous crew health and performance during future space exploration missions.
This innovative system isn't just about supporting space exploration; it's about pushing the boundaries of what's possible with AI to provide essential care in the most remote and demanding environments. This tool represents an important milestone for AI-assisted medical care and our continued exploration of the cosmos. It holds potential for advancing space missions and could also benefit people here on Earth by providing early access to quality medical care in remote areas.
At Google Public Sector, we’re passionate about supporting your mission. Learn more about how Google’s AI solutions can empower your agency and register to attend our Google Public Sector Summit taking place October 29, 2025, in Washington, D.C. There, you will have an opportunity to hear from public sector leaders and industry experts, and get hands-on with Google’s latest AI technologies.
Amazon OpenSearch Serverless announces support for Neural Search, Hybrid Search, Workflow API, and AI connectors. This new set of APIs facilitates use cases such as retrieval augmented generation (RAG) and semantic search.
Neural search enables semantic queries through text and images instead of vectors. Neural search uses a high-level API with connectors to Amazon SageMaker, Amazon Bedrock, and other AI services to generate enrichments like dense or sparse vectors during query and ingestion. Hybrid search combines lexical, neural, and k-NN (vector) queries to deliver higher search relevancy. The Workflow API lets you package OpenSearch AI resources like models, connectors, and pipelines into templates that automate the multi-step configuration required to enable AI features such as neural search, and that simplify integration with specific model providers like Amazon Bedrock, Cohere, OpenAI, or DeepSeek.
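As a rough, hypothetical sketch of what a neural query can look like from Python with the opensearch-py client, the snippet below sends a text query that the configured model embeds at search time. The collection endpoint, index name, vector field, and model ID are placeholders, and it assumes the connector, model, and ingest pipeline have already been set up (for example, via the Workflow API).

```python
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

HOST = "my-collection-id.us-east-1.aoss.amazonaws.com"  # placeholder collection endpoint
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, "us-east-1", "aoss")

client = OpenSearch(
    hosts=[{"host": HOST, "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# A neural query: the query text is converted to a vector by the connected model,
# so no manual embedding step is needed before searching.
response = client.search(
    index="products",  # placeholder index
    body={
        "query": {
            "neural": {
                "description_embedding": {      # assumed vector field name
                    "query_text": "waterproof hiking boots",
                    "model_id": "my-model-id",  # placeholder
                    "k": 10,
                }
            }
        }
    },
)
print(response["hits"]["hits"])
```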
Neural Search, Hybrid Search, Workflow API, and AI connectors are enabled for all serverless collections in the following regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (Spain), Europe (Stockholm). Check the AWS Regional Services List for availability in your region.
For more information about these features, please see the documentation for Neural Search, Hybrid Search, Workflow API, and AI connectors. To learn more about Amazon OpenSearch Serverless, please visit the product page.
Amazon Aurora Serverless v2 now offers up to 30% improved performance for databases running on the latest serverless platform version (version 3), and also supports scaling from 0 up to 256 Aurora Capacity Units (ACUs). Aurora Serverless v2 measures capacity in ACUs where each ACU is a combination of approximately 2 gibibytes (GiB) of memory, corresponding CPU, and networking. You specify the capacity range and the database scales within this range to support your application’s needs.
With improved performance, you can now use Aurora Serverless for even more demanding workloads. All new clusters, database restores, and new clones will launch on the latest platform version. Existing clusters can be upgraded by stopping and restarting the cluster or by using Blue/Green Deployments. You can determine the cluster’s platform version in the AWS Console’s instance configuration section or via the RDS API’s ServerlessV2PlatformVersion parameter for a DB cluster.
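For illustration, here is a minimal boto3 sketch that reads the platform version field named in this announcement (assuming it appears in the DescribeDBClusters response) and sets a 0 to 256 ACU capacity range on a cluster; the cluster identifier is a placeholder.

```python
import boto3

rds = boto3.client("rds")

# Check which serverless platform version the cluster is running on.
cluster = rds.describe_db_clusters(DBClusterIdentifier="my-aurora-cluster")["DBClusters"][0]
print("Platform version:", cluster.get("ServerlessV2PlatformVersion"))

# Set the capacity range; with this launch the minimum can be 0 ACUs and the maximum 256 ACUs.
rds.modify_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",
    ServerlessV2ScalingConfiguration={"MinCapacity": 0, "MaxCapacity": 256},
    ApplyImmediately=True,
)
```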
The latest platform version is available in all AWS Regions including the AWS GovCloud (US) Regions. Aurora Serverless is an on-demand, automatic scaling configuration for Amazon Aurora. For pricing details and Region availability, visit Amazon Aurora Pricing. To learn more, read the documentation, and get started by creating an Aurora Serverless v2 database using only a few steps in the AWS Management Console.
The age of AI is here, and organizations everywhere are racing to deploy powerful models to drive innovation, enhance products, and create entirely new user experiences. But moving from a trained model in a lab to a scalable, cost-effective, and production-grade inference service is a significant engineering challenge. It requires deep expertise in infrastructure, networking, security, and all of the Ops (MLOps, LLMOps, DevOps, etc.).
Today, we’re making it dramatically simpler. We’re excited to announce the GKE inference reference architecture: a comprehensive, production-ready blueprint for deploying your inference workloads on Google Kubernetes Engine (GKE).
This isn’t just another guide; it’s an actionable, automated, and opinionated framework designed to give you the best of GKE for inference, right out of the box.
Start with a strong foundation: The GKE base platform
Before you can run, you need a solid place to stand. This reference architecture is built on the GKE base platform. Think of this as the core, foundational layer that provides a streamlined and secure setup for any accelerated workload on GKE.
Built on infrastructure-as-code (IaC) principles using Terraform, the base platform establishes a robust foundation with the following:
Automated, repeatable deployments: Define your entire infrastructure as code for consistency and version control.
Built-in scalability and high availability: Get a configuration that inherently supports autoscaling and is resilient to failures.
Security best practices: Implement critical security measures like private clusters, Shielded GKE Nodes, and secure artifact management from the start.
Integrated observability: Seamlessly connect to Google Cloud Observability for deep visibility into your infrastructure and applications.
Starting with this standardized base ensures you’re building on a secure, scalable, and manageable footing, accelerating your path to production.
Why the inference-optimized platform?
The base platform provides the foundation, and the GKE inference reference architecture is the specialized, high-performance engine that’s built on top of it. It’s an extension that’s tailored specifically to solve the unique challenges of serving machine learning models.
Here’s why you should start with our accelerated platform for your AI inference workloads:
1. Optimized for performance and cost
Inference is a balancing act between latency, throughput, and cost. This architecture is fine-tuned to master that balance.
Intelligent accelerator use: It streamlines the use of GPUs and TPUs, so you can use custom compute classes to ensure that your pods land on the exact hardware they need. With node auto-provisioning (NAP), the cluster automatically provisions the right resources when you need them.
Smarter scaling: Go beyond basic CPU and memory scaling. We integrate a custom metrics adapter that allows the Horizontal Pod Autoscaler (HPA) to scale your models based on real-world inference metrics like queries per second (QPS) or latency, ensuring you only pay for what you use.
Faster model loading: Large models mean large container images. We leverage the Container File System API and Image streaming in GKE along with Cloud Storage FUSE to dramatically reduce pod startup times. Your containers can start while the model data streams in the background, minimizing cold-start latency.
2. Built to scale any inference pattern
Whether you’re doing real-time fraud detection, batch processing analytics, or serving a massive frontier model, this architecture is designed to handle it. It provides a framework for the following:
Real-time (online) inference: Prioritizes low-latency responses for interactive applications.
Batch (offline) inference: Efficiently processes large volumes of data for non-time-sensitive tasks.
Streaming inference: Continuously processes data as it arrives from sources like Pub/Sub.
The architecture leverages GKE features like the cluster autoscaler and the Gateway API for advanced, flexible, and powerful traffic management that can handle massive request volumes gracefully.
3. Simplified operations for complex models
We’ve baked in features to abstract away the complexity of serving modern AI models, especially LLMs. The architecture includes guidance and integrations for advanced model optimization techniques such as quantization (INT8/INT4), tensor and pipeline parallelism, and KV Cache optimizations like Paged and Flash Attention.
Furthermore, with GKE in Autopilot mode, you can offload node management entirely to Google, so you can focus on your models, not your infrastructure.
We've included examples for deploying popular workloads like ComfyUI, as well as general-purpose online inference with GPUs and TPUs, to help you get started quickly.
By combining the rock-solid foundation of the GKE base platform with the performance and operational enhancements of the inference reference architecture, you can deploy your AI workloads with confidence, speed, and efficiency. Stop reinventing the wheel and start building the future on GKE.
The future of AI on GKE
The GKE inference reference architecture is more than just a collection of tools; it's a reflection of Google's commitment to making GKE the best platform for running your inference workloads. By providing a clear, opinionated, and extensible architecture, we are empowering you to accelerate your AI journey and bring your innovative ideas to life.
We’re excited to see what you’ll build with the GKE inference reference architecture. Your feedback is welcome! Please share your thoughts in the GitHub repository.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C7g instances are available in the AWS Middle East (Bahrain), AWS Africa (Cape Town), and AWS Asia Pacific (Jakarta) regions. These instances are powered by AWS Graviton3 processors that provide up to 25% better compute performance compared to AWS Graviton2 processors, and are built on the AWS Nitro System, a collection of AWS-designed innovations that deliver efficient, flexible, and secure cloud services with isolated multi-tenancy, private networking, and fast local storage.
Amazon EC2 Graviton3 instances also use up to 60% less energy than comparable EC2 instances for the same performance, reducing your cloud carbon footprint. For increased scalability, these instances are available in 9 different instance sizes, including bare metal, and offer up to 30 Gbps of networking bandwidth and up to 20 Gbps of bandwidth to Amazon Elastic Block Store (Amazon EBS).
Amazon DynamoDB is announcing support for Console-to-Code, powered by Amazon Q Developer. Console-to-Code makes it simple, fast, and cost-effective to create DynamoDB resources at scale by getting you started with your automation code.
DynamoDB is a serverless, NoSQL, fully managed database with single-digit millisecond performance at any scale. Customers use the DynamoDB console to learn and prototype cloud solutions. Console-to-Code helps you record those actions and uses generative AI to suggest code in your preferred infrastructure-as-code (IaC) format for the actions you want. You can use this code as a starting point for infrastructure automation and customize it further for your production workloads. For example, with Console-to-Code, you can record creating an Amazon DynamoDB table and choose to generate code for the AWS CDK (TypeScript, Python, or Java) or CloudFormation (YAML or JSON).
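The exact code Console-to-Code generates depends on the actions you record; as a rough illustration of the kind of AWS CDK (Python) starting point you might get for a table-creation action, here is a minimal, hypothetical sketch (stack, table, and key names are placeholders).

```python
from aws_cdk import App, RemovalPolicy, Stack
from aws_cdk import aws_dynamodb as dynamodb
from constructs import Construct


class OrdersTableStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # On-demand table with a composite primary key, retained if the stack is deleted.
        dynamodb.Table(
            self,
            "OrdersTable",
            partition_key=dynamodb.Attribute(name="pk", type=dynamodb.AttributeType.STRING),
            sort_key=dynamodb.Attribute(name="sk", type=dynamodb.AttributeType.STRING),
            billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
            removal_policy=RemovalPolicy.RETAIN,
        )


app = App()
OrdersTableStack(app, "OrdersTableStack")
app.synth()
```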
We’re excited to announce the general availability of two new Amazon CloudWatch metrics for AWS Outposts racks: VifConnectionStatus and VifBgpSessionState. These metrics provide you with greater visibility into the connectivity status of your Outposts racks’ Local Gateway (LGW) and Service Link Virtual Interfaces (VIFs) with your on-premises devices.
These metrics provide you with the ability to monitor Outposts VIF connectivity status directly within the CloudWatch console, without having to rely on external networking tools or coordination with other teams. You can use these metrics to set alarms, troubleshoot connectivity issues, and ensure your Outposts racks are properly integrated with your on-premises infrastructure. The VifConnectionStatus metric indicates whether an Outposts VIF is successfully connected, configured, and ready to forward traffic. A value of “1” means that the VIF is operational, while “0” means that it is not ready. The VifBgpSessionState metric shows the current state of the BGP session between the Outposts VIF and the on-premises device, with values ranging from 1 (IDLE) to 6 (ESTABLISHED).
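For example, a minimal boto3 sketch that alarms when VifConnectionStatus drops below 1 might look like the following; the metric namespace and dimension names are assumptions for illustration, so confirm them against how the metric appears in your CloudWatch console, and the Outpost ID is a placeholder.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="outposts-lgw-vif-connection-down",
    Namespace="AWS/Outposts",                 # assumed namespace
    MetricName="VifConnectionStatus",
    Dimensions=[{"Name": "OutpostId", "Value": "op-0123456789abcdef0"}],  # placeholder
    Statistic="Minimum",
    Period=300,
    EvaluationPeriods=2,
    Threshold=1,
    ComparisonOperator="LessThanThreshold",   # alarm when the VIF is not ready (value 0)
    TreatMissingData="breaching",
)
```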
The VifConnectionStatus and VifBgpSessionState metrics are available for all Outposts VIFs in all commercial AWS Regions where Outposts racks are available.
Amazon Relational Database Service (Amazon RDS) for SQL Server now supports Cumulative Update CU20 for SQL Server 2022 (RDS version 16.00.4205.1.v1), and General Distribution Releases (GDR) for SQL Server 2016 SP3 (RDS version 13.00.6460.7.v1), SQL Server 2017 (RDS version 14.00.3495.9.v1), and SQL Server 2019 (RDS version 15.00.4435.7.v1). The new CU20 and GDR releases address the vulnerabilities described in CVE-2025-49717, CVE-2025-49718, and CVE-2025-49719. Additionally, CU20 includes important security fixes, performance improvements, and bug fixes. For additional information, see the Microsoft SQL Server 2022 CU20 documentation and GDR release notes KB5058717, KB5058714, KB5058722, and KB5058721.
We recommend that you upgrade your Amazon RDS for SQL Server instances to these latest versions using the Amazon RDS Management Console, the AWS SDK, or the AWS CLI. You can learn more about upgrading your database instances in the Amazon RDS User Guide.
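As a quick illustration, a boto3 sketch for moving an existing instance to the SQL Server 2022 CU20 version listed above might look like this; the instance identifier is a placeholder.

```python
import boto3

rds = boto3.client("rds")

# Upgrade to the SQL Server 2022 CU20 engine version from this announcement.
rds.modify_db_instance(
    DBInstanceIdentifier="my-sqlserver-instance",  # placeholder
    EngineVersion="16.00.4205.1.v1",
    ApplyImmediately=False,  # apply during the next maintenance window
)
```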
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M7i and M7i-flex instances powered by custom 4th Gen Intel Xeon Scalable processors (code-named Sapphire Rapids) are available in the Asia Pacific (Osaka) Region. These custom processors, available only on AWS, offer up to 15% better performance over comparable x86-based Intel processors utilized by other cloud providers.
M7i-flex instances are the easiest way for you to get price-performance benefits for a majority of general-purpose workloads. They deliver up to 19% better price-performance compared to M6i. M7i-flex instances offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don't fully utilize all compute resources, such as web and application servers, virtual desktops, batch processing, and microservices.
M7i instances deliver up to 15% better price-performance compared to M6i. They are a great choice for workloads that need the largest instance sizes or continuous high CPU usage, such as gaming servers, CPU-based machine learning (ML), and video streaming. M7i instances offer larger sizes, up to 48xlarge, and two bare-metal sizes (metal-24xl, metal-48xl). These bare-metal sizes support built-in Intel accelerators (Data Streaming Accelerator, In-Memory Analytics Accelerator, and QuickAssist Technology) that offload and accelerate data operations to optimize workload performance.
AWS customers can now view and manage their support cases from the AWS Console Mobile App. Customers can view and reply to case correspondence, and resolve, reopen, or create support cases while on the go and away from their workstations. Visit the Services tab and select “Support” to get started.
The AWS Console Mobile App enables AWS customers to monitor and manage a select set of resources and receive push notifications to stay informed and connected with their AWS resources while on the go. The sign-in process supports biometric authentication, making access to AWS resources simple, secure, and quick. For AWS services not available natively, customers can access the AWS Management Console via an in-app browser to reach service pages without additional authentication, manual navigation, or the need to switch from the app to a browser.
With this launch, Amazon VPC Reachability Analyzer and Amazon VPC Network Access Analyzer are now available in Asia Pacific (Jakarta), Asia Pacific (Malaysia), Asia Pacific (Thailand), Europe (Zurich), and Middle East (UAE).
VPC Reachability Analyzer allows you to diagnose network reachability between a source resource and a destination resource in your virtual private clouds (VPCs) by analyzing your network configurations. For example, Reachability Analyzer can help you identify a missing entry in your VPC route table that is blocking network reachability between an EC2 instance in Account A and an EC2 instance in Account B in your AWS Organization.
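As a rough sketch, you could automate such a check with boto3 by creating a network insights path between the two instances and starting an analysis; the instance IDs below are placeholders, and cross-account analysis additionally requires the appropriate sharing and permissions setup.

```python
import boto3

ec2 = boto3.client("ec2")

# Define the path to analyze: instance in Account A -> instance in Account B.
path = ec2.create_network_insights_path(
    Source="i-0aaaaaaaaaaaaaaaa",       # placeholder: source EC2 instance
    Destination="i-0bbbbbbbbbbbbbbbb",  # placeholder: destination EC2 instance
    Protocol="tcp",
    DestinationPort=443,
)["NetworkInsightsPath"]

analysis = ec2.start_network_insights_analysis(
    NetworkInsightsPathId=path["NetworkInsightsPathId"]
)["NetworkInsightsAnalysis"]

# Poll for the result; a False NetworkPathFound plus the returned explanations
# points at the blocking component (for example, a missing route table entry).
result = ec2.describe_network_insights_analyses(
    NetworkInsightsAnalysisIds=[analysis["NetworkInsightsAnalysisId"]]
)["NetworkInsightsAnalyses"][0]
print(result["Status"], result.get("NetworkPathFound"))
```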
VPC Network Access Analyzer allows you to identify unintended network access to your AWS resources, helping you meet your security and compliance guidelines. For example, you can create a scope to verify that all paths from your web applications to the internet traverse the firewall, and to detect any paths that bypass it.
Today, Amazon QuickSight is announcing the general availability of a native Apache Impala connector.
Apache Impala is a massively parallel processing (MPP) SQL query engine that runs natively on Apache Hadoop. QuickSight customers can now connect using their Impala username and password credentials and import their data into SPICE.
Apache Impala connector for Amazon QuickSight is now available in the following regions: US East (N. Virginia and Ohio), US West (Oregon), Canada (Central), South America (Sao Paulo), Africa (South Africa), Europe (Frankfurt, Zurich, Stockholm, Milan, Spain, Ireland, London, Paris), Asia Pacific (Mumbai, Singapore, Tokyo, Seoul, Jakarta, Sydney). For more details, click here.
Google is committed to helping federal agencies meet their mission, more securely and more efficiently, with innovative cloud technologies. Today, we’re reinforcing our commitment to FedRAMP 20x, an innovative pilot program that marks a paradigm shift in federal cloud authorization. FedRAMP 20x is a new assessment process designed to move away from traditional narrative-based requirements towards continuous compliance and automated validation of machine-readable evidence. Our approach is built around Google Cloud Compliance Manager (now available for public preview) and is designed to transform the path to FedRAMP authorization for our partners and customers.
Compliance Manager accelerates the FedRAMP authorization process by automating end-to-end compliance management for partners and customers building on Google Cloud. By providing automated, externally validated cloud controls to demonstrate compliance with FedRAMP 20x Key Security Indicators (KSIs), Compliance Manager allows partners to spend fewer resources manually collecting evidence and is designed to reduce the time required to achieve FedRAMP authorization. Compliance Manager will natively support FedRAMP 20x compliance with general availability later this year.
During a recent proof of concept demonstration to the FedRAMP Program Management Office (PMO), Google showcased how Compliance Manager enables strategic Google Cloud partners such as stackArmor to submit applications for 20x Phase One authorization and beyond.
Google Cloud’s latest capabilities are an exciting step forward in accelerating the FedRAMP 20x cloud-native approach to security assessment and validation. We need true innovation from industry to realize this vision of automated security and Google Cloud is leading the way by building it natively into their platform. As Google goes to market in support of FedRAMP 20x, we can’t help but wonder who’s next?
Pete Waterman
Director, FedRAMP
Compliance Manager’s ability to automate KSI compliance is also being assessed by Coalfire, a FedRAMP recognized Third Party Assessment Organization (3PAO). Coalfire is providing independent validation that agencies can benefit from a much faster, more automated path to deploying secure Google Cloud solutions, directly accelerating their access to critical cloud technologies.
Google is dedicated to accelerating federal compliance through both the existing FedRAMP Rev5 authorization path and the pilot FedRAMP 20x process. Recent Rev5 High authorizations include Google Cloud services such as Agent Assist, Looker (Google Cloud core), and Vertex AI Vector Search.
If you are spending more effort than expected on compliance and audits, you can get started with Compliance Manager and streamline compliance and audits for your organization. Want to learn more? Register for the Google Public Sector Summit on October 29, 2025, in Washington, D.C., where you will gain crucial insights and skills to navigate this new era of innovation and harness the latest cloud technologies.
In a prior blog post, we introduced BigQuery’s advanced runtime, where we detailed enhanced vectorization and discussed techniques like dictionary and run-length encoded data, vectorized filter evaluation and compute pushdown, and parallelizable execution.
This blog post dives into short query optimizations, a key component of BigQuery's advanced runtime. These optimizations can significantly speed up the "short" queries commonly issued by business intelligence (BI) tools such as Looker Studio or by custom applications powered by BigQuery, all while using fewer BigQuery "slots" (our term for computational capacity).
Similar to other BigQuery optimization techniques, the system uses a set of internal rules to determine if it should consolidate a distributed query plan into a single, more efficient step for short queries. These rules consider factors like:
The estimated amount of data to be read
How effectively the filters are reducing the data size
The type and physical arrangement of the data in storage
The overall query structure
The runtime statistics of past query executions
Along with enhanced vectorization, short query optimizations are an example of how we work to continuously improve performance and efficiency for BigQuery users.
Optimizations specific to short query execution
BigQuery's short query optimizations dramatically speed up short, eligible queries, significantly reducing slot usage and improving query latency. Normally, BigQuery breaks down queries into multiple stages, each with smaller tasks processed in parallel across a distributed system. For suitable queries, however, short query optimizations skip this multi-stage distributed execution, leading to substantial gains in performance and efficiency. When possible, the runtime also uses multithreaded execution, including for queries with joins and aggregations. BigQuery automatically determines whether a query is eligible and dispatches it to a single stage. BigQuery also employs history-based optimization (HBO), which learns from past query executions. HBO helps BigQuery decide whether a query should run in a single stage or multiple stages based on its historical performance, ensuring the single-stage approach remains beneficial even as workloads evolve.
Full view of data
Short query optimizations process the entire query as a single stage, giving the runtime complete visibility into all tables involved. This allows the runtime to read both sides of a join and gather precise metadata, including column cardinalities. For instance, the join columns in the query below have low cardinality, so they are efficiently stored using dictionary and run-length encoding (RLE). Consequently, the runtime devises a much simpler query execution plan by leveraging these encodings during execution.
The following query calculates top-ranked products and their performance metrics for an e-commerce scenario. It’s based on a type of query observed in Google’s internal data pipelines that benefited from short query optimizations. The following query uses this BigQuery public dataset, allowing you to replicate the results. Metrics throughout this blog were captured during internal Google testing.
```sql
WITH
  AllProducts AS (
    SELECT
      id,
      name AS product_name
    FROM
      `bigquery-public-data.thelook_ecommerce.products`
  ),
  AllUsers AS (
    SELECT
      id
    FROM
      `bigquery-public-data.thelook_ecommerce.users`
  ),
  AllSales AS (
    SELECT
      oi.user_id,
      oi.sale_price,
      ap.product_name
    FROM
      `bigquery-public-data.thelook_ecommerce.order_items` AS oi
      INNER JOIN AllProducts AS ap
        ON oi.product_id = ap.id
      INNER JOIN AllUsers AS au
        ON oi.user_id = au.id
  ),
  ProductPerformanceMetrics AS (
    SELECT
      product_name,
      ROUND(SUM(sale_price), 2) AS total_revenue,
      COUNT(*) AS units_sold,
      COUNT(DISTINCT user_id) AS unique_customers
    FROM
      AllSales
    GROUP BY
      product_name
  ),
  RankedProducts AS (
    SELECT
      product_name,
      total_revenue,
      units_sold,
      unique_customers,
      RANK() OVER (ORDER BY total_revenue DESC) AS revenue_rank
    FROM
      ProductPerformanceMetrics
  )
SELECT
  revenue_rank,
  product_name,
  total_revenue,
  units_sold,
  unique_customers
FROM
  RankedProducts
ORDER BY
  revenue_rank
LIMIT 25;
```
Figure 1: Top-ranked products query used for internal testing.
By skipping the shuffle layer, overall query execution requires less CPU, memory, and network bandwidth. In addition to that, short query optimizations take full advantage of enhanced vectorization described in the Understanding BigQuery enhanced vectorization blog post.
Figure 2: Internal Google testing: one-stage plan for the query from Figure 1.
Queries with joins and aggregations
In data analytics, it’s common to join data from several tables and then calculate aggregate results. Typically, a query performing this distributed operation will go through many stages. Each stage can involve shuffling data around, which adds overhead and slows things down. BigQuery’s short query optimizations can dramatically improve this process. When enabled, BigQuery intelligently recognizes if the amount of data being queried is small enough to be handled by a much simpler plan. This optimization leads to substantial improvements: for the query described in Figure 3, during internal Google testing we observed 2x to 8x faster execution times and an average of 9x reduction in slot-seconds.
```sql
SELECT
  p.category,
  dc.name AS distribution_center_name,
  u.country AS user_country,
  SUM(oi.sale_price) AS total_sales_amount,
  COUNT(DISTINCT o.order_id) AS total_unique_orders,
  COUNT(DISTINCT o.user_id) AS total_unique_customers_who_ordered,
  AVG(oi.sale_price) AS average_item_sale_price,
  SUM(CASE WHEN oi.status = 'Complete' THEN 1 ELSE 0 END) AS completed_order_items_count,
  COUNT(DISTINCT p.id) AS total_unique_products_sold,
  COUNT(DISTINCT ii.id) AS total_unique_inventory_items_sold
FROM
  `bigquery-public-data.thelook_ecommerce.orders` AS o,
  `bigquery-public-data.thelook_ecommerce.order_items` AS oi,
  `bigquery-public-data.thelook_ecommerce.products` AS p,
  `bigquery-public-data.thelook_ecommerce.inventory_items` AS ii,
  `bigquery-public-data.thelook_ecommerce.distribution_centers` AS dc,
  `bigquery-public-data.thelook_ecommerce.users` AS u
WHERE
  o.order_id = oi.order_id
  AND oi.product_id = p.id
  AND ii.product_distribution_center_id = dc.id
  AND oi.inventory_item_id = ii.id
  AND o.user_id = u.id
GROUP BY
  p.category,
  dc.name,
  u.country
ORDER BY
  total_sales_amount DESC
LIMIT 1000;
```
Figure 3: Join-aggregate query used for internal testing.
We can see how the execution graph changes when short query optimizations are applied to the query in Figure 3.
Figure 4: Internal Google testing: execution of the join-aggregate query in 9 stages using BigQuery distributed execution.
Figure 5: Internal Google testing: with advanced runtime short query optimizations, the join-aggregate query completes in 1 stage.
Optional job creation
Short query optimizations and optional job creation mode are two independent yet complementary features that enhance query performance. While optional job creation mode contributes significantly to the efficiency of short queries regardless of short query optimizations, they work even better together. When both are enabled, the advanced runtime streamlines internal operations and utilizes the query cache more efficiently, which leads to even faster delivery of the results.
Better throughput
By reducing the resources required for queries, short query optimizations not only deliver performance gains but also significantly improve overall throughput. This efficiency means that more queries can be executed concurrently within the same resource allocation.
The following graph, captured from a Google internal data pipeline, shows an example query that benefits from short query optimizations. The blue line shows the maximum QPS (throughput) that can be sustained. The red line shows QPS on the same reservation after the advanced runtime was enabled. In addition to better latency, the same reservation can now handle over 3x higher throughput.
Figure 6: Internal Google testing: throughput comparison; the red line shows improvements from short query optimizations in the advanced runtime.
Optimal impact
BigQuery's short query optimizations feature is designed primarily for BI queries intended for human-readable output. BigQuery uses a dynamic algorithm to determine eligible queries, and the feature works alongside other performance-enhancing features like history-based optimizations, BI Engine, and optional job creation. Nevertheless, some workload patterns will benefit from short query optimizations more than others.
This optimization may not significantly improve queries that read or produce a lot of data. To optimize for short query performance, it is crucial to keep the query working set and result size small through pre-aggregation and filtering. Implementing partitioning and clustering strategies appropriate to your workload can also significantly reduce the amount of data processed, and utilizing the optional job creation mode is beneficial for short-lived queries that can be easily retried.
Short query optimizations in action
Let’s see how these optimizations actually impact our test query by looking closer at the query in Figure 1 and its query plan in Figure 2. The query shape is based on actual workloads observed in production and made to work against a BigQuery public dataset, so you can test it for yourself.
Despite the query scanning only 6.5 MB, running this query without advanced runtime takes over 1 second and consumes about 20 slot-seconds (execution time may vary depending on available resources in the project).
Figure 7: Internal Google testing: Sample query execution details without Advanced Runtime
With BigQuery's enhanced vectorization in the advanced runtime, during internal Google testing this query finishes in 0.5 seconds while consuming 50x fewer resources.
Figure 8: Internal Google testing: Sample query execution details with Advanced Runtime Short Query Optimizations
A gain of this magnitude is on the high end and less common, but it reflects a real workload improvement from Google internal pipelines. We have also seen classic BI queries with several aggregations, filters, group by and sort, or snowflake joins achieve faster performance and better slot utilization.
Try it for yourself
Short query optimizations boost query price-performance, allowing higher throughput and lower latency for common small BI queries. They achieve this by combining cutting-edge algorithms with Google's latest innovations across storage, compute, and networking. This is just one of many performance improvements that we're continually delivering to BigQuery customers, like enhanced vectorization, history-based optimizations, optional job creation mode, and the column metadata index (CMETA).
Now that both key pillars of advanced runtime are in public preview, all you have to do to test it with your workloads is to enable it using the single ALTER PROJECT command as documented here. This enables both enhanced vectorization and short query optimizations. If you already did that earlier for enhanced vectorization, your project is automatically also enabled for short query optimizations.
Try it now with your own workload following steps in BigQuery advanced runtime documentation here, and share your feedback and experience with us at bqarfeedback@google.com.
In a world where data is your most valuable asset, protecting it isn’t just a nice-to-have — it’s a necessity. That’s why we are thrilled to announce a significant leap forward in protecting the data in your Cloud SQL instances, with Enhanced Backups for Cloud SQL.
This powerful new capability integrates the Google Cloud Backup and DR Service directly into Cloud SQL, providing a robust, centralized, and secure solution to help ensure business continuity for your database workloads. The Backup and DR Service already protects Compute Engine VMs, Persistent Disk, and Hyperdisk; this integration extends its coverage to more of your workloads.
Modern defense for modern threats
Enhanced Backups for Cloud SQL provides advanced protection by storing database backups in logically air-gapped and immutable backup vaults. Managed by Google and completely separate from your source project, these vaults provide a critical defense against threats that could compromise your entire environment.
For customers like JFrog, Cloud SQL Enhanced Backup with Google Cloud Backup and DR is proving to be a superior and robust alternative:
“Using this integration will help us significantly bolster our security posture by offering logically air-gapped and immutable backup vaults, creating a vital defense layer against diverse data-loss scenarios.” – Shiran Melamed, DevOps Group Leader, JFrog
Control, compliance, and peace of mind
We designed Enhanced Backups to be both powerful and easy to use, giving you fine-grained control over your data protection strategy. These capabilities are now available in Preview for both Cloud SQL Enterprise and Enterprise Plus editions, and offer key features to help ensure your data is always secure and recoverable:
Immutable, air-gapped vaults: Protect your data with immutable backups stored in a secure, logically air-gapped vault. Setting minimum enforced retention and retention locks ensures backups cannot be deleted or changed for a predefined period, while a zero-trust access policy provides granular control.
Business continuity: Your data is safeguarded against both source-instance and source-project deletion, so you can recover your data even if the source project itself becomes unavailable.
Flexible policies that fit your needs: Your business isn’t one-size-fits-all, and your backup strategy shouldn’t be either. We offer highly customizable backup schedules, including hourly, daily, weekly, monthly, and yearly options. You can store backups for periods ranging from days to decades.
Centralized command and control: Manage everything from a single, unified dashboard in the Google Cloud console. Monitor job status, identify unprotected resources, and generate reports, all in one place.
But you don’t have to take our word for it. See how customers like SQUARE ENIX and Rotoplas are already benefiting from Enhanced Backups for Cloud SQL:
“At SQUARE ENIX, protecting our users’ data is paramount. Google Cloud SQL’s Enhanced Backup integrated with the Backup and DR service is essential to our resiliency strategy. Its robust protection against instance- and even project-level deletion, combined with a secure, isolated vault and long-term retention, provides a critical safeguard for our most valuable asset. This capability will give us confidence in our data’s integrity and recoverability, allowing our teams to focus on creating the unforgettable experiences our users expect.” – Kazutaka Iga, SRE, SQUARE ENIX
“Google Cloud SQL’s Enhanced Backup feature along with Google Professional Services support is a value add to our backup strategy at Rotoplas. The ability to centralize management, flexibly schedule backups, and store them independent of the source project gives us unprecedented control. This streamlined approach simplifies our operations and enhances security, ensuring our data is always protected and easily recoverable.” – Agustín Chávez Cabrera, DevOps Manager, Rotoplas
Get started with Enhanced Backups
Getting started with Enhanced Backups is simple. Here’s how you can enable this enhanced protection for your Cloud SQL instances:
1. Create or select a backup vault: In the Backup and DR service, either create a new backup vault or use an existing one.
2. Create a backup plan: Define a backup plan for Cloud SQL within your chosen backup vault, setting your desired backup frequency and retention rules.
3. Apply the backup plan to the Cloud SQL instances: Apply your new backup plan to existing or new Cloud SQL instances.
Once you apply a backup plan, your backups will automatically be scheduled and moved to the secure backup vault based on the rules you defined. The entire experience can be managed through the tools you already use — whether it’s the Google Cloud console, gcloud command-line tool, or APIs — so there’s no additional infrastructure for you to deploy or manage.
Protect your data now
With Enhanced Backups for Cloud SQL, you can build a superior data protection strategy that enhances security, simplifies operations, and strengthens your overall data resilience for Cloud SQL instances.
Get started and use it yourself. The new features are available now in supported regions.
Experience the new management solution in the console.
Amazon OpenSearch Serverless now offers automatic semantic enrichment, a breakthrough feature that simplifies semantic search implementation. You can now boost your search relevance with minimal effort, eliminating complex manual configurations through an automated setup process.
Semantic search goes beyond keyword matching by understanding the context and meaning of search queries. For example, when searching for “how to treat a headache,” semantic search intelligently returns relevant results about “migraine remedies” or “pain management techniques” even when these exact terms aren’t present in the query.
Previously, implementing semantic search required machine learning (ML) expertise, model hosting, and OpenSearch integration. Automatic semantic enrichment simplifies this process dramatically: you simply specify which fields need semantic search capabilities, and OpenSearch Service handles all semantic enrichment automatically during data ingestion.
The feature launches with support for two language variants: English-only and Multi-lingual, covering 15 languages including Arabic, Chinese, Finnish, French, Hindi, Japanese, Korean, Spanish, and more. You pay only for actual usage during data ingestion, with no ongoing costs for storage or search queries.
This new feature is automatically enabled for all serverless collections and is now available in the following regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (Spain), Europe (Stockholm). To get started, visit our technical documentation, read our blog post, and see Amazon OpenSearch Service semantic search pricing. Check the AWS Regional Services List for availability in your region.
AWS Private Certificate Authority (AWS Private CA) now supports AWS PrivateLink with all AWS Private CA Federal Information Processing Standard (FIPS) endpoints that are available in commercial AWS Regions and the AWS GovCloud (US) Regions. With this launch, you can establish a private connection between your virtual private cloud (VPC) and AWS Private CA FIPS endpoints instead of connecting over the public internet, helping you meet your organization’s business, compliance, and regulatory requirements to limit public internet connectivity.
AWS Private CA offers FIPS endpoints in the following AWS Regions: US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Canada (Central), Canada West (Calgary), AWS GovCloud (US-East), and AWS GovCloud (US-West).
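As a rough boto3 sketch, you could look up the AWS Private CA FIPS endpoint service name in your Region and create an interface VPC endpoint for it; the VPC, subnet, and security group IDs are placeholders, and the service-name filter is an assumption to verify against the returned list.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Find the AWS Private CA FIPS endpoint service name in this Region (assumed to contain
# "acm-pca" and "fips"; confirm against the actual list before relying on it).
names = ec2.describe_vpc_endpoint_services()["ServiceNames"]
candidates = [n for n in names if "acm-pca" in n and "fips" in n]
if not candidates:
    raise SystemExit("No AWS Private CA FIPS endpoint service found in this Region.")

ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",              # placeholder
    ServiceName=candidates[0],
    SubnetIds=["subnet-0123456789abcdef0"],     # placeholder
    SecurityGroupIds=["sg-0123456789abcdef0"],  # placeholder
    PrivateDnsEnabled=True,
)
```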