Amazon Relational Database Service (Amazon RDS) for SQL Server now supports encrypting native backups in Amazon S3 using server-side encryption with AWS KMS keys (SSE-KMS). When customers create database backup files (.bak files) in their Amazon S3 buckets, the backup files are automatically encrypted using server-side encryption with Amazon S3-managed keys (SSE-S3). Now, customers also have the option to encrypt their native backup files in Amazon S3 using their own AWS KMS key for added protection.
To use SSE-KMS encryption for native backups, customers must update their KMS key policies to provide access to the RDS backup service, and specify the parameter @enable_bucket_default_encryption in their native backup stored procedure. For detailed instructions on how to use SSE-KMS with native backups, please refer to the Amazon RDS for SQL Server User Guide. This feature is available in all AWS Regions where Amazon RDS for SQL Server is available.
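To make the moving parts concrete, here is a minimal sketch in Python, assuming a pyodbc connection to the instance; @source_db_name and @s3_arn_to_backup_to are the long-standing rds_backup_database arguments, while the accepted values for @enable_bucket_default_encryption and the required KMS key policy changes should be confirmed in the User Guide.

```python
# Minimal sketch: invoke the native backup procedure with the new SSE-KMS option.
import pyodbc

# Placeholder connection string for the RDS for SQL Server instance.
CONNECTION_STRING = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myinstance.example.us-east-1.rds.amazonaws.com;UID=admin;PWD=secret"
)

conn = pyodbc.connect(CONNECTION_STRING, autocommit=True)
conn.execute(
    """
    exec msdb.dbo.rds_backup_database
        @source_db_name = ?,
        @s3_arn_to_backup_to = ?,
        @enable_bucket_default_encryption = 1;  -- opt in to SSE-KMS (see User Guide)
    """,
    "mydatabase",
    "arn:aws:s3:::amzn-s3-demo-bucket/mydatabase.bak",
)
```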
Amazon Relational Database Service (Amazon RDS) for SQL Server now allows maintaining Change Data Capture (CDC) settings and metadata when restoring native database backups. CDC is a Microsoft SQL Server feature that customers can use to record insert, update, and delete operations occurring in a database table, and make these changes accessible to applications. When a database is restored from a backup, CDC configurations and data are not preserved by default, which can result in gaps in data capture. With this new feature, customers can preserve their database CDC settings when restoring a database backup to a new instance, or a different database name.
To retain CDC configurations, customers can specify the KEEP_CDC option when restoring a database backup. This option ensures that the CDC metadata and any captured change data are kept intact. Refer to the Amazon RDS for SQL Server User Guide to learn more about KEEP_CDC. This feature is available in all AWS Regions where Amazon RDS for SQL Server is available.
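As a companion sketch under the same assumptions, a restore call might look like the following; note that @keep_cdc is a hypothetical rendering of the KEEP_CDC option described above, so confirm the exact parameter name in the User Guide.

```python
# Illustrative sketch: restore a native backup while retaining CDC metadata.
import pyodbc

CONNECTION_STRING = "..."  # placeholder, as in the backup example above

conn = pyodbc.connect(CONNECTION_STRING, autocommit=True)
conn.execute(
    """
    exec msdb.dbo.rds_restore_database
        @restore_db_name = ?,
        @s3_arn_to_restore_from = ?,
        @keep_cdc = 1;  -- hypothetical name for the KEEP_CDC option
    """,
    "mydatabase_restored",
    "arn:aws:s3:::amzn-s3-demo-bucket/mydatabase.bak",
)
```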
AWS Parallel Computing Service (PCS) now supports rotation of cluster secret keys using AWS Secrets Manager, enabling you to update the secure credentials used for authentication between the Slurm controller and compute nodes without creating a new cluster. Regularly rotating your Slurm cluster secret keys strengthens your security posture by reducing the risk of credential compromise and ensuring compliance with best practices. This helps keep your HPC workloads and accounting data safe from unauthorized access.
PCS is a managed service that makes it easier to run and scale high performance computing (HPC) workloads on AWS using Slurm. With support for cluster secret rotation in PCS, you can strengthen your security controls and maintain operational efficiency. You can now implement secret rotation as part of your security best practices while maintaining cluster continuity.
This feature is available in all AWS Regions where PCS is available. You can rotate cluster secrets using either the AWS Secrets Manager console or API after preparing your cluster for the rotation process. Read more about PCS support for cluster secret rotation in the PCS User Guide.
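For example, once a cluster has been prepared for rotation as described in the PCS User Guide, an on-demand rotation is a single Secrets Manager call; in this boto3 sketch, the secret ARN is a placeholder, and it assumes rotation has already been configured for the secret.

```python
# Sketch: trigger an on-demand rotation of a PCS cluster secret.
import boto3

secrets = boto3.client("secretsmanager")
secrets.rotate_secret(
    SecretId="arn:aws:secretsmanager:us-east-1:111122223333:secret:pcs-cluster-secret-AbCdEf"
)
```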
AWS Graviton4-based R8g database instances are now generally available for Amazon DocumentDB (with MongoDB compatibility). R8g instances are powered by AWS Graviton4 processors and feature the latest DDR5 memory, making them ideal for memory-intensive workloads. These instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software to enhance the performance and security of your workloads.
Customers can get started with R8g instances through the AWS Management Console, CLI, and SDK by modifying their existing Amazon DocumentDB database cluster or creating a new one. R8g instances are available for Amazon DocumentDB 5.0 on both Standard and IO-Optimized cluster storage configurations. For more information, including Region availability, visit our pricing page and documentation.
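For instance, moving an existing instance onto R8g is one modify call; in this boto3 sketch, the identifier and instance size are placeholders.

```python
# Sketch: switch a DocumentDB 5.0 instance to a Graviton4-based R8g class.
import boto3

docdb = boto3.client("docdb")
docdb.modify_db_instance(
    DBInstanceIdentifier="my-docdb-instance",
    DBInstanceClass="db.r8g.2xlarge",
    ApplyImmediately=True,  # set False to apply during the next maintenance window
)
```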
Today, AWS’ Customer Carbon Footprint Tool (CCFT) has been updated to include Scope 3 emissions data, as well as Scope 1 emissions from natural gas and refrigerants, providing AWS customers more complete visibility into their cloud carbon footprint. This update expands the CCFT to cover all three industry-standard emission scopes as defined by the Greenhouse Gas Protocol.
The CCFT Scope 3 update gives AWS customers full visibility into the lifecycle carbon impact of their AWS usage, including emissions from manufacturing the servers that run their workloads, powering AWS facilities, and transporting equipment to data centers. Historical data is available back to January 2022, allowing organizations to track their progress over time and make informed decisions about their cloud strategy to meet their sustainability goals. This data is available through the CCFT dashboard and AWS Billing and Cost Management Data Exports, enabling customers to easily incorporate carbon insights into their operational workflows, sustainability planning, and reporting processes.
AWS Nitro Enclaves is an Amazon EC2 capability that enables customers to create isolated compute environments (enclaves) to further protect and securely process highly sensitive data within their EC2 instances. Nitro Enclaves helps customers reduce the attack surface area for their most sensitive data processing applications.
There is no additional cost other than the cost of the Amazon EC2 instances and any other AWS services that are used with Nitro Enclaves.
Nitro Enclaves is now available across all AWS Regions, expanding to include new regions in Asia Pacific (New Zealand, Thailand, Jakarta, Hyderabad, Malaysia, Melbourne, and Taipei), Europe (Spain and Zurich), Middle East (UAE and Tel Aviv), and North America (Central Mexico and Calgary).
To learn more about AWS Nitro Enclaves and how to get started, visit the AWS Nitro Enclaves page.
Editor’s note: Today we hear about SmarterX, which helps retailers, manufacturers, and logistics companies minimize regulatory risk, maximize sales, and protect consumers and the environment by giving them AI-driven tools to safely and compliantly sell, ship, store, and dispose of their products. SmarterX uses BigQuery, Gemini, and Vertex AI to collect, process, and analyze vast amounts of unstructured regulatory and product data from across the web, using it to train custom, highly accurate large language models (LLMs) that help large consumer packaged goods brands and retailers handle regulated products compliantly. Read on to learn how Google Cloud’s integrated, easy-to-use toolset is helping them accelerate product development.
Russell Foltz-Smith, EVP for product and technology at SmarterX, views the world of retail through search-colored glasses.
“If universal product codes were really universal, looking for a product and all the information directly related to it would be a one-step process,” he proposes. “But in the real world, the ideal of universality just doesn’t exist.”
It’s a reality we all deal with dozens of times a day when we type something into a browser’s search bar: There are very few queries guaranteed to return a single answer. Thus the need for what data scientists call “probabilistic search backed by algorithmic indexing and ranking strategies” (what most of us call “googling”) was born.
“In many ways, all data science and LLM-building boils down to accurate information retrieval,” adds Foltz-Smith. And he’s well-positioned to understand why.
SmarterX customers — consumer packaged goods brands, third-party retailers, distributors, and logistics companies — rely on SmarterX to make sense of the overwhelming volume of regulatory product data online. The platform helps ensure the way products are sold, shipped, stored, and disposed of complies with all applicable laws and regulations.
“SmarterX collects and indexes data, triangulates for missing data points, and provides a queryable interface that helps our customers minimize regulatory risk while maximizing sales,” Foltz-Smith explains. To do so, SmarterX hunts down regulatory information, using crawlers enabled by machine learning and natural language processing to locate, scrape, and parse data from websites, research papers, safety data sheets, and other nooks and crannies of the web where it may be tucked away.
“Google Cloud technologies are a perfect fit for our needs,” Foltz-Smith states. “At their core is the ability to surface the right search results from an inconceivably vast expanse of data where the inputs and outputs are not predetermined and the data itself is unstructured.”
Real-time data processing and fast, accurate model-building
To collect and store all that data, SmarterX employs BigQuery and Cloud Storage. “Our data sources are disparate and the formats unpredictable,” he continues. “BigQuery accommodates unstructured and semi-structured data, then functions as a job engine, recursively cleansing, normalizing, schematizing, and classifying that data at runtime.”
Google Cloud’s scalable compute resources and storage also enable real-time data processing. “We never have to worry about whether we have enough servers in a data center or adequate bandwidth,” Foltz-Smith adds. “Google Cloud hides all that complexity, so it’s handled automatically and cost-effectively.”
Further accelerating data processing is BigQuery’s integration with Gemini, which manages data-processing job queues and also forms the basis of many of the large language models, or LLMs, SmarterX builds for its clients. “Gemini is in part a collection of everything Google has already crawled, so we don’t need to re-crawl it ourselves,” Foltz-Smith notes. That makes model-building faster.
Built-in grounding — the ability to connect model output to verifiable information sources — makes Gemini a safer, more conscientious way to assemble data for SmarterX customers. And retrieval-augmented generation, or RAG, allows SmarterX to connect Gemini with customers’ proprietary databases, enhancing the LLMs’ accuracy and relevance while helping ensure the security of its customers’ data.
We never have to worry about whether we have enough servers in a data center or adequate bandwidth. Google Cloud hides all that complexity, so it’s handled automatically and cost-effectively.
Russell Foltz-Smith
Executive Vice President for Product & Technology, SmarterX
Keeping up with ecommerce and regulatory compliance
For each of its clients, SmarterX builds several discrete LLMs on Vertex AI, many of which are updated as a customer’s business requirements change.
“Vertex AI not only enables us to access Gemini directly but also provides links to smaller, publicly available AI models specific to narrowly defined topics like chemical formulas,” he says. SmarterX’s Gemini-based models can even perform complex computations such as chemistry calculations to determine flashpoints, boiling points, and pH levels. This data is then used to automatically triangulate missing data, augment existing data, or update out-of-date information.
Vertex AI also operates at scale, a necessity for a company whose clients include eight major retailers, each of which has thousands of suppliers of regulated consumer packaged goods. SmarterX’s customers include those same suppliers, each of which sells their products on third-party marketplaces like Amazon and TikTok.
“Gone are the days when a brand sold its merchandise exclusively in brick-and-mortar stores they owned,” Foltz-Smith explains. “The proliferation of retail websites and marketplace-specific product variations adds tremendous complexity to our work.” On any given day, SmarterX is processing millions of SKUs and must update each customer-specific LLM with any new compliance data, which can affect its customers’ entire supply chain — from product formulation to sales and marketing to product disposal.
It’s the integration of SQL into BigQuery and the interoperability of the entire Google Cloud technology constellation that Foltz-Smith credits with allowing SmarterX to keep pace with that volume.
“We no longer have to maintain separate workflows, learn multiple tools, and constantly jump between them,” he notes. “We can crawl the web, land the data in BigQuery, process it, write code programmatically or in SQL statements, massage training data, build new LLMs, and evaluate, deploy, and update them all within one coherent, well-orchestrated system with the same familiar interfaces throughout. Google Cloud workflows were built for high-volume data science.”
Empowering subject matter experts
Google Cloud workflows were built for democratized data science as well, with features that enable subject matter experts who are not trained data scientists to work with data directly, and even to deploy models on their own.
Among those features, according to Foltz-Smith, are the ability to easily swap in and out new sets of training data, an assistive decision-making feature for parameterization, easy-to-understand out-of-the-box visualizations for model evaluation, and templates for formatting evaluation frameworks.
“In the past, you’d need to know how to use a modeling tool, a database tool, and an API deployment tool, as well as understand the math underlying a particular model and how to write code in order to build and deploy a model,” he says. “Having it all in a single environment with familiar user interfaces enables people without a data science background to be much more productive. It’s incredibly freeing and empowering for them.”
That freedom translates into accelerated product development.
SmarterX team members with industry-specific knowledge of regulatory requirements can now evaluate, correct, and deploy the models that provide that knowledge to SmarterX customers; previously, they had to wait for a data scientist to help translate that know-how into a model for them.
“Google’s mission to organize all the information in the world and make it universally available is apparent in the tools it offers today, and that mission dovetails precisely with the way SmarterX employs data science in service to our customers,” Foltz-Smith concludes. “I’ve been a data scientist for over two decades, and the tools in Google Cloud continually exceed my expectations.”
In today’s complex threat landscape, effectively managing network security is crucial — especially across diverse environments. Organizations are looking to advanced capabilities to strengthen security, enhance threat protection, and simplify network security operations for hybrid and multicloud deployments.
We’re excited to announce new capabilities in Cloud Armor, featuring more comprehensive security policies and more granular network configuration controls, so you can more easily manage network security operations across hybrid and multicloud environments.
Improving your security posture with hierarchical security policies and organization-scoped address groups
Hierarchical security policies, now generally available, can extend Google Cloud Armor’s web application firewall (WAF) and DDoS protection by allowing security policies to be configured at the organization, folder, and project levels. This update can help large organizations manage security policies across projects with centralized control, supporting a consistent security posture and streamlined deployment of updates and mitigations.
Organization-scoped address groups, now generally available, can help manage IP range lists across multiple Cloud Armor security policies. Organization-scoped address groups can enhance scalability and manageability by enabling the definition and reuse of IP range lists for both hierarchical and project-level configurations.
You can reduce the complexity of cloud network security configurations by using organization-scoped address groups to eliminate duplicate rules and policies across multiple backends, as well as share them across products such as Cloud Next Generation Firewall for a unified and consolidated security posture.
Security Policies overview.
Enhancing threat protection with granular network policy controls
Threat actors frequently conceal malicious content in larger request bodies to circumvent detection. Our enhanced WAF inspection capability, now in preview, expands request body inspection from 8 KB to 64 KB for all preconfigured WAF rules. This deeper inspection significantly improves the ability to detect and mitigate sophisticated malicious content.
JA4 network fingerprinting support, now generally available, elevates SSL/TLS client fingerprinting with more detailed and precise client identification and profiling, while building on the foundational principles of JA3.
JA4 incorporates additional fields and metadata, and can yield deeper insights into client behavior. This advanced telemetry can provide security analysts with richer contextual information, facilitating more sophisticated security analysis, more thorough threat hunting, and the ability to differentiate legitimate traffic from malicious actors.
This new ASN-based filtering capability can strengthen security against known malicious IP addresses and traffic patterns by permitting and blocking traffic from specific autonomous system numbers (ASNs) directly at the network edge. Effectively, this can preempt the impact of known malicious entities on your services, and it is a potent instrument for safeguarding media assets and ensuring a secure user experience.
The global front end: Your unified defense strategy
Google Cloud’s global front end (GFE) provides comprehensive protection for your workloads no matter where they’ve been deployed — on Google Cloud, in other public cloud environments, in co-location facilities, or in on-premises data centers. The GFE integrates Cloud Load Balancing, Cloud Armor, and Cloud CDN into a singular, end-to-end solution at the perimeter of the Google Cross-Cloud Network.
Our GFE offering can help ensure the secure, reliable, and high-performance delivery of your services to the internet. Functioning as the dedicated security component of the GFE, Cloud Armor is your primary line of defense, protecting applications and APIs from a broad spectrum of web and DDoS attacks. It can also manage your network security posture, safeguarding against the OWASP Top 10 vulnerabilities and mitigating bot and fraud risks with reCAPTCHA Enterprise integration.
Google Cloud global front end.
Industry recognition and sustained customer confidence
Google Cloud Armor’s commitment to innovation and client success has garnered significant recognition. We are honored that Cloud Armor was acknowledged as a “Strong Performer” in The Forrester Wave™: Web Application Firewall Solutions, Q1 2025.
Forrester’s rigorous evaluation cited Google Cloud Armor’s vision and roadmap that emphasize protection and automation, with a strong focus on AI. The report also recognized Google’s streamlined operations facilitated by Gemini, and differentiated custom reporting.
The report cited Cloud Armor’s threat intelligence feeds and DevOps integrations, enabling robust security in your development pipelines. The report also noted Cloud Armor’s flexible pricing and the Cloud Armor Enterprise tier that includes threat intelligence and DDoS protection as a bundled solution.
The Forrester Wave™: Web Application Firewall Solutions, Q1 2025
Get started with Cloud Armor
With these advanced capabilities, Google Cloud Armor can empower organizations to significantly enhance their security posture and threat protection while embracing a proactive, intelligent, and unified approach to safeguarding their assets.
Forrester does not endorse any company, product, brand, or service included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change. For more information, read about Forrester’s objectivity here.
Effective AI systems operate on a foundation of context and continuous trust. When you use Dataplex Universal Catalog, Google Cloud’s unified data governance platform, the metadata that describes your data is no longer static — it’s where your AI applications can go to know where to find data and what to trust.
But when you have complex data pipelines, it’s easy for your data’s journey to become obscured, making it difficult to trace information from its origin to its eventual impact. To solve this, we are extending Dataplex lineage capabilities from object-level to column-level, starting with support for BigQuery.
“To power our AI strategy, we need absolute trust in our data. Column-level lineage provides that. It’s the foundation for governing our data responsibly and confidently.” – Latheef Syed – AVP, Data & AI Governance Engineering at Verizon
While object-level lineage tracks the top-level connections between entire tables, column-level lineage charts the specific, granular path of a single data column as it moves and transforms. With that, we are now providing a dynamic and granular map to govern your data-to-AI ecosystem, so you can ground your agentic AI applications in context. Lineage is upgraded to column-level at no extra cost.
Answering critical questions about your data
Data professionals often need precise answers about the complex relationships in their BigQuery datasets. Column-level lineage provides a graph of data flows that you can trace to find these answers quickly (a minimal API sketch follows the list below). Now you can:
Confirm that a column used in your AI models originates from an authoritative source
Understand how changes to one column affect other columns downstream before you make a modification
Trace the root cause of an issue with a column by examining its upstream transformations
Verify that sensitive data at the column level is used correctly throughout your organization
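For programmatic exploration, the sketch below uses the Data Lineage API's Python client (from the google-cloud-datacatalog-lineage package) to walk the links for a BigQuery table; the project, location, and table names are placeholders, and note that this call returns table-level links, with column-level detail layered on top per the Dataplex documentation.

```python
# Minimal sketch: list lineage links touching a BigQuery table.
from google.cloud import datacatalog_lineage_v1 as lineage

client = lineage.LineageClient()
links = client.search_links(
    request=lineage.SearchLinksRequest(
        parent="projects/my-project/locations/us",
        source=lineage.EntityReference(
            fully_qualified_name="bigquery:my-project.sales.orders"
        ),
    )
)
for link in links:  # each link is one source -> target data flow
    print(link.source.fully_qualified_name, "->", link.target.fully_qualified_name)
```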
“Column-level lineage takes the trusted map of our data ecosystem to the next level. It’s the precision tool we need to fully understand the impact of a change, trace a problem to its source, and ensure compliance down to the most granular detail.” – Arvind Rajagopalan – AVP, Data / AI & Product Engineering at Verizon
Explore lineage visually
Dataplex now provides an interactive, visual representation of column-level lineage relationships. You can select a single column in a table to see a graph of all its upstream and downstream connections. As you navigate the graph at the asset level, you can drill down to the column level to verify which specific columns are affected by a process. You can also visualize the direct lineage paths between the columns of two different assets, giving you a focused view of their relationship.
Column-level tracing for AI models
Tables used for AI and ML model training often have data coming from different sources and taking different paths, and it’s important to have granular visibility into the data’s journey. For example, in complex AI/ML feature tables, a single table for model training may contain many columns. Column-level lineage can verify that one column originates from a trusted, audited financial system, while another comes from ephemeral web logs. Table-level lineage would obscure this critical distinction, treating all features with the same level of trust.
Powering context-aware AI agents
More companies are developing AI agents to automate tasks and answer complex questions about their data, and these agents require a deep understanding of business and organizational context to be effective. The granular metadata provided by column-level lineage supplies this necessary context. For example, it can allow an agent to distinguish between similarly named metrics. By tracing each column’s path, including its usage frequency and freshness, lineage gives the agent context on how important a column is when it is affected by a change, or how severe the impact is when troubleshooting. By grounding AI agents in a rich, factual map of your data assets and their relationships, you can build more accurate and reliable agentic workflows.
Google Axion processors, our first custom Arm®-based CPUs, mark a major step in delivering both performance and energy efficiency for Google Cloud customers and our first-party services, providing up to 65% better price-performance and up to 60% better energy efficiency than comparable instances on Google Cloud.
We put Axion processors to the test: running Google production services. Now that our clusters contain both x86 and Axion Arm-based machines, Google’s production services are able to run tasks simultaneously on multiple instruction-set architectures (ISAs). Today, this means most binaries that compile for x86 now need to compile to both x86 and Arm at the same time — no small thing when you consider that the Google environment includes over 100,000 applications!
We recently published a preprint of a paper called “Instruction Set Migration at Warehouse Scale” about our migration process, in which we analyze 38,156 commits we made to Google’s giant monorepo, Google3. To make a long story short, the paper describes the combination of hard work, automation, and AI we used to get to where we are today. We currently serve Google services in production on Arm and x86 simultaneously including YouTube, Gmail, and BigQuery, and we have migrated more than 30,000 applications to Arm, with Arm hardware fully-subscribed and more servers deployed each month.
Let’s take a brief look at two steps on our journey to make Google multi-architecture, or ‘multiarch’: an analysis of migration patterns, and exploring the use of AI in porting the code. For more, be sure to read the entire paper.
Migrating all of Google’s services to multiarch
Going into a migration from x86-only to Arm and x86, both the multiarch team and the application owners assumed that we would be spending time on architectural differences such as floating point drift, concurrency, intrinsics such as platform-specific operators, and performance.
At first, we migrated some of our top jobs like F1, Spanner, and Bigtable using typical software practices, complete with weekly meetings and dedicated engineers. In this early period, we found evidence of the above issues, but not nearly as many as we expected. It turns out modern compilers and tools like sanitizers have shaken out most of the surprises. Instead, we spent the majority of our time working on issues like:
fixing tests that broke because they overfit to our existing x86 servers
updating intricate build and release systems, usually for our oldest and highest-traffic services
resolving rollout issues in production configurations
taking care to avoid destabilizing critical systems
Moving a dozen applications to Arm this way absolutely worked, and we were proud to get things running on Borg, our cluster management system. As one engineer remarked, “Everyone fixated on the totally different toolchain, and [assumed] surely everything would break. The majority of the difficulty was configs and boring stuff.”
And yet, it’s not sufficient to migrate a few big jobs and be done. Although ~60% of our running compute is in our top 50 applications, the curve of usage across the remaining applications in Google’s monorepo is relatively flat. The more jobs that can run on multiple architectures, the easier it is for Borg to fit them efficiently into cells. For good utilization of our Arm servers, then, we needed to address this long list of the remaining 100,000+ applications.
The multiarch team could not effectively reach out to so many application owners; just setting up the meetings would have been cost-prohibitive! Instead, we have relied on automation, helping to minimize involvement from the application teams themselves.
Automation tools

We had many sources of automation to help us, some of which we already used widely at Google before we started the multiarch migration. These include:
Rosie, which lets us programmatically generate large numbers of commits and shepherd them through the code review process. For example, the commit could be one line to enable Arm in a job’s Blueprint: "arm_variant_mode = ::blueprint::VariantMode::VARIANT_MODE_RELEASE"
Sanitizers and fuzzers, which catch common differences in execution between x86 and Arm (e.g., data races that are hidden by x86’s TSO memory model). Catching these kinds of issues ahead of time avoids non-deterministic, hard-to-debug behavior when recompiling to a new ISA.
Continuous Health Monitoring Platform (CHAMP), which is a new automated framework for rolling out and monitoring multiarch jobs. It automatically evicts jobs that cause issues on Arm, such as crash-looping or exhibiting very slow throughput, for later offline tuning and debugging.
We also began using an AI-based migration tool called CogniPort — more on that below.
Analysis

The 38,156 commits to our code monorepo constituted most of the commits across the entire ISA migration project, from huge jobs like Bigtable to myriad tiny ones. To analyze these commits, we passed the commit messages and code diffs into the Gemini Flash LLM’s 1M-token context window in groups of 100, generating 16 categories of commits in four overarching groups.
Figure 1: Commits fall into four overarching groups.
Once we had a final list, we ran commits again through the model and had it assign one of these 16 categories to each of them (as well as an additional “Uncategorized” category, which improved stability of the categorization by catching outliers).
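The categorization step can be sketched as follows, assuming the google-genai SDK; the real pipeline, prompt, and 16-category list are internal, so the names below are illustrative.

```python
# Illustrative sketch: batched commit categorization with a long-context model.
from google import genai

client = genai.Client()  # assumes API credentials in the environment

CATEGORIES = ["Tooling adaptation", "Test adaptation", "Code adaptation", "..."]  # stand-in list

def categorize(commits: list[str]) -> str:
    """Send a batch of up to 100 commit messages and diffs in one request."""
    prompt = (
        f"Assign each commit below to one of {CATEGORIES} or 'Uncategorized':\n\n"
        + "\n---\n".join(commits)
    )
    response = client.models.generate_content(
        model="gemini-1.5-flash",  # a Gemini Flash model with a 1M-token context window
        contents=prompt,
    )
    return response.text
```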
Figure 2: Code examples in the first two categories. More examples are available in the paper.
Altogether, this analysis covered about 700K changed lines of code. We plotted the timeline of our ISA migration, normalized, as lines of code per day or month changed over time.
Figure 3: CLs by category by time, normalized.
As you can see, as we started on our multiarch toolchain, the largest set of commits was in tooling and test adaptation. Over time, a larger fraction of commits were around code adaptation, aligned with the first few large applications that we migrated. During this phase, the focus was on updating code in shared dependencies and addressing common issues in code and tests as we prepared for scale. In the final phase of the process, almost all commits were configuration files and supporting processes. We also saw that, in this later phase, the number of merged commits rapidly increased, capturing the scale-up of the migration to the whole repository.
Figure 4: CLs by category by time, in raw counts.
It’s worth noting that, overall, most commits related to migration are small. The largest commits are often to very large lists or configurations, as opposed to signaling more inherent complexity or intricate changes to single files.
Automating ISA migrations with AI
Modern generative AI techniques represent an opportunity to automate the remainder of the ISA migration process. We built an agent called CogniPort, which aims to close this gap. CogniPort operates on build and test errors: if at any point in the process an Arm library, binary, or test does not build, or a test fails with an error, the agent steps in and aims to fix the problem automatically. As a first step, we have already used CogniPort’s Blueprint editing mode to generate migration commits that do not lend themselves to simple changes.
The agent consists of three nested agentic loops, shown below. Each loop executes an LLM to produce one step of reasoning and a tool invocation. The tool is executed and the outputs are attached to the agent’s context.
Figure 5: CogniPort
The outermost agent loop is an orchestrator that repeatedly calls the two other agents, the build-fixer agent and the test-fixer agent. The build-fixer agent tries to build a particular target and makes modifications to files until the target builds successfully or the agent gives up. The test-fixer agent tries to run a particular test and makes modifications until the test succeeds or the agent gives up (and in the process, it may use the build-fixer agent to address build failures in the test).
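The control flow can be sketched as nested loops; this is a purely illustrative pseudostructure, and the four helpers (llm_step, run_tool, builds, passes) are hypothetical stand-ins for the agent's real tooling.

```python
# Purely illustrative pseudostructure of CogniPort's three nested agent loops.
MAX_STEPS = 20  # arbitrary give-up threshold for this sketch

def llm_step(context):    # hypothetical: one step of LLM reasoning plus a tool choice
    raise NotImplementedError

def run_tool(tool_call):  # hypothetical: execute the chosen tool, return its output
    raise NotImplementedError

def builds(target) -> bool: ...   # hypothetical build check
def passes(test) -> bool: ...     # hypothetical test check

def agent_loop(goal: str, done) -> bool:
    """One agent: each iteration is one LLM reasoning step plus one tool call."""
    context = [goal]
    for _ in range(MAX_STEPS):
        if done():
            return True
        thought, tool_call = llm_step(context)
        context += [thought, run_tool(tool_call)]  # attach tool output to context
    return False  # the agent gives up

def build_fixer(target: str) -> bool:
    """Modify files until the target builds, or give up."""
    return agent_loop(f"make {target} build", lambda: builds(target))

def test_fixer(test: str) -> bool:
    """Modify files until the test passes; uses build_fixer to clear build breaks."""
    return build_fixer(test) and agent_loop(f"make {test} pass", lambda: passes(test))

def orchestrator(targets, tests) -> bool:
    """Outermost loop: repeatedly dispatches the build-fixer and test-fixer agents."""
    return all(map(build_fixer, targets)) and all(map(test_fixer, tests))
```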
Testing CogniPort
While we only recently scaled up CogniPort usage to high levels, we had the opportunity to more formally test its behavior by taking historic commits from the dataset above that were created without AI assistance. Focusing on Code & Test Adaptation (categories 1-8) commits that we could cleanly roll back (not all of the other categories were suitable for this approach), we generated a benchmark set of 245 commits. We then rolled the commits back and evaluated whether the agent was able to fix them.
Figure 6: CogniPort results
Despite no special prompts or other optimizations, early tests were very encouraging: CogniPort successfully fixed failed tests 30% of the time. It was particularly effective for test fixes, platform-specific conditionals, and data representation fixes. We’re confident that as we invest in further optimizations of this approach, we will be even more successful.
A multiarch future
From here, we still have tens of thousands more applications to address with automation. To cover future code growth, all new applications are designed to be multiarch by default. We will continue to use CogniPort to fix tests and configurations, and we will also work with application owners on trickier changes. (One lesson of this project is how well owners tend to know their code!)
Yet, we’re increasingly confident in our goal of driving Google’s monorepo towards architecture neutrality for production services, for a variety of reasons:
All of the code used for production services is still visible in a single, vast monorepo.
Most of the structural changes we need to build, run, and debug multiarch applications are done.
Existing automation like Rosie and the recently developed CHAMP allows us to keep expanding release and rollout targets without much intervention on our part.
Last but not least, LLM-based automation will allow us to address much of the remaining long tail of applications for a multi-ISA Google fleet.
To read even more about what we learned, don’t miss the paper itself. And to learn about our chip designs and how we’re operating a more sustainable cloud, you can read about Axion at g.co/cloud/axion.
This blog post and the associated paper represent the work of a very large team. The paper authors are Eric Christopher, Kevin Crossan, Wolff Dobson, Chris Kennelly, Drew Lewis, Kun Lin, Martin Maas, Parthasarathy Ranganathan, Emma Rapati, and Brian Yang, in collaboration with dozens of other Googlers working on our Arm porting efforts.
Google Threat Intelligence Group (GTIG) observed multiple instances of pro-Russia information operations (IO) actors promoting narratives related to the reported incursion of Russian drones into Polish airspace that occurred on Sept. 9-10, 2025. The identified IO activity, which mobilized in response to this event and the ensuing political and security developments, appeared consistent with previously observed instances of pro-Russia IO targeting Poland—and more broadly the NATO Alliance and the West. Information provided in this report was derived from GTIG’s tracking of IO beyond Google surfaces. Google is committed to information transparency, and we will continue tracking these threats and blocking their inauthentic content on Google’s platforms. We regularly disclose our latest enforcement actions in the TAG Bulletin.
Observed messaging surrounding the Russian drone incursion into Polish airspace advanced multiple, often intersecting, influence objectives aligned with historic pro-Russia IO threat activity:
Promoting a Positive Russian Image: Concerted efforts to amplify messaging denying Russia’s culpability for the incursion.
Blaming NATO and the West: The reframing of the events to serve Russian strategic interests, effectively accusing either Poland or NATO of manufacturing pretext to serve their own political agendas.
Undermining Domestic Confidence in Polish Government: Messaging designed to negatively influence Polish domestic support for its own government, by insinuating that its actions related to both the event itself and the broader conflict in Ukraine are detrimental to Poland’s domestic stability.
Undermining International Support to Ukraine: Messaging designed to undercut Polish domestic support for its government’s foreign policy position towards Ukraine.
Notably, Russia-aligned influence activities have long prioritized Poland, frequently leveraging a combination of Poland-focused operations targeting the country domestically, as well as operations that have promoted Poland-related narratives more broadly to global audiences. However, the mobilization of covert assets within Russia’s propaganda and disinformation ecosystem in response to this most recent event is demonstrative of how established pro-Russia influence infrastructure—including both long-standing influence campaigns and those which more recently emerged in response to Russia’s full-scale invasion of Ukraine in 2022—can be flexibly leveraged by operators to rapidly respond to high-profile, emerging geopolitical stressors.
Examples highlighted in this report are designed to provide a representative snapshot of pro-Russia influence activities surrounding the Russian drone incursion into Polish airspace; they are not intended to be a comprehensive account of all pro-Russia activity that may have leveraged these events.
Multiple IO actors that GTIG tracks rapidly promoted related narratives in the period immediately following the drone incursion. While this by itself is not evidence of coordination across these groups, it does highlight how influence actors throughout the pro-Russia ecosystem have honed their activity to be responsive to major geopolitical developments. This blog post contains examples that we initially observed as part of this activity.
Portal Kombat
The actor publicly referred to as Portal Kombat (aka the “Pravda Network”) has been publicly reported on since at least 2024 as operating a network of domains that act as amplifiers of content seeded within the broader pro-Russia ecosystem, primarily focused on Russia’s invasion of Ukraine. These domains share near identical characteristics while each targeting different geographic regions. As has likewise been documented in public reporting, over time Portal Kombat has developed new infrastructure to expand its targeting of the West and other countries around the world via subdomains stemming from a single actor-controlled domain. Some examples of Portal Kombat’s promoted narratives related to the incursion of Russian drones into Polish airspace include the following:
One article, ostensibly reporting on the crash of one of the drones, called into question whether the drones could have come from Russia, noting that the type of drones purportedly involved are not capable of reaching Poland.
Another article claimed that officials from Poland and the Baltic States politicized the issue, intentionally reframing it as a threat to NATO as a means to derail possible Russia-U.S. negotiations regarding the conflict in Ukraine out of a fear that the U.S. would deprioritize the region to focus on China. The article further claimed that videos of the drones shown in the Polish media are fake, and that the Russian military does not have a real intention of attacking Poland.
A separate article promoted a purported statement made by a Ukrainian military expert, claiming that the result of the drone incursion was that Europe will focus its spending on defense at home, rather than on support for Ukraine—the purported statement speculated as to whether this was the intention of the incursion itself.
Figure 1: Example of an English-language article published by the Portal Kombat domain network, which promoted a narrative alleging that Polish and Baltic State officials were using news of the Russian drone incursion to derail U.S.-Russia negotiations related to the war in Ukraine
Doppelganger
The “Doppelganger” pro-Russia IO actor has created a network of inauthentic custom media brands that it leverages to target Europe, the U.S., and elsewhere. These websites often have a specific topical and regional focus and publish content in the language of the target audience. GTIG identified at least two instances in which Polish-language and German-language inauthentic custom media brands that we track disseminated content that leveraged the drone incident (Figure 2).
A Polish-language article published to the domain of the Doppelganger custom media brand “Polski Kompas” promoted a narrative that leveraged the drone incursions as a means to claim that the Polish people do not support the government’s Ukraine policy. The article claimed that such support not only places a burden on Poland’s budget, but also risks the security and safety of the Polish people.
A German-language article published to the domain of the Doppelganger custom media brand “Deutsche Intelligenz” claimed that the European reaction to the drone incident was hyperinflated by officials as part of an effort to intimidate Europeans into entering conflict with Russia. The article claimed that Russia provided warning about the drones, underscoring that they were not threatening, and that NATO used this as pretext to increase its regional presence—steps that the article claimed pose a risk to Russia’s security and could lead to war.
Figure 2: Examples of articles published to the domains of two Doppelganger inauthentic media brands: Polski Kompas (left) and Deutsche Intelligenz (right)
Niezależny Dziennik Polityczny (NDP)
The online publication “Niezależny Dziennik Polityczny” is a self-proclaimed “independent political journal” focused on Polish domestic politics and foreign policy and is the primary dissemination vector leveraged by the eponymously named long-standing, pro-Russia influence campaign, which GTIG refers to as “NDP”. The publication has historically leveraged a number of suspected inauthentic personas as editors or contributing authors, most of whom have previously maintained accounts across multiple Western social media platforms and Polish-language blogging sites. NDP has been characterized by multiple sources as a prolific purveyor of primarily anti-NATO disinformation and has recently been a significant amplifier within the Polish information space of pro-Russia disinformation surrounding Russia’s ongoing invasion of Ukraine.
Examples of NDP promoted narratives related to the incursion of Russian drones into Polish airspace:
GTIG observed an article published under the name of a previously attributed NDP persona, which referenced the recent Polish response to the Russian drone incursion as a component of ongoing “war hysteria” artificially constructed to distract the Polish people from domestic issues. The article further framed other NATO activity in the region as disproportionate and potentially destabilizing (Figure 3).
Additionally, GTIG observed content promoted by NDP branded social media assets that referenced the drone incursion in the days following these events. This included posts that alleged that Poland had been pre-warned about the drones, that Polish leadership was cynically and disproportionately responding to the incident, and that a majority of Poles blame Ukraine, NATO, or the Polish Government for the incident.
Figure 3: Examples of narratives related to the Russian drone incursion into Polish airspace promoted by the NDP campaign’s “political journal” (left) and branded social media asset (right)
Outlook
Covert information operations and the spread of disinformation are increasingly key components of Russian state-aligned actors’ efforts to advance their interests in the context of conflict. Enabled by an established online ecosystem, these actors seek to manipulate audiences to achieve ends like the exaggeration of kinetic military action’s efficacy and the incitement of fear, uncertainty, and doubt within vulnerable populations. The use of covert influence tactics in these instances is manifold: at minimum, it undermines society’s ability to establish a fact-based understanding of potential threats in real-time by diluting the information environment with noise; in tandem, it is also used to both shape realities on the ground and project messaging strategically aligned with one’s interests—both domestically and to international audiences abroad.
While the aforementioned observations highlight tactics leveraged by specifically Russia-aligned threat actors within the context of recent Russian drone incursions into Polish airspace, these observations are largely consistent with historical expectations of various ideologically-aligned threat actors tracked by GTIG and their respective efforts to saturate target information environments during wartime. Understanding both how and why malicious threat actors exploit high-profile, and often emerging, geopolitical stressors to further their political objectives is critical in identifying both how the threats themselves manifest and how to mitigate their potential impact. Separately, we note that the recent mobilization of covert assets within Russia’s propaganda and disinformation ecosystem in response to Russia’s drone incursion into Polish airspace is yet another data point suggesting Poland—and NATO allied countries, more broadly—will remain a high priority target of Russia-aligned influence activities.
Today, Amazon Simple Email Service (SES) added visibility into the IP addresses used by Dedicated IP Addresses – Managed (DIP-M) pools. Customers can now find out the exact addresses in use when sending emails through DIP-M pools to mailbox providers. Customers can also see Microsoft Smart Network Data Services (SNDS) metrics for these IP addresses, giving them more insight into their sending reputation with Microsoft mailbox providers. This gives customers more transparency into the IP activities in DIP-M pools.
Previously, customers could configure DIP-M pools to perform automatic IP allocation and warm-up in response to changes in email sending volumes. This reduced the operational overhead of managing dedicated sending channels, but customers could not easily see which IP addresses were in use by DIP-M pools. This also made it difficult to find SNDS feedback, which customers use to improve their reputation. Now, customers can see the IPs in DIP-M pools through the console, CLI, or SES API. SES also automatically creates CloudWatch Metrics for SNDS information on each IP address, which customers can access through the CloudWatch console or APIs. This gives customers more tools to monitor their sending reputation.
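A quick sketch of the new visibility, using the SES v2 API via boto3; the pool name is a placeholder, and the CloudWatch namespace and metric names for the SNDS data are listed in the SES documentation.

```python
# Sketch: list the dedicated IP addresses currently in a DIP-M pool.
import boto3

sesv2 = boto3.client("sesv2")
page = sesv2.get_dedicated_ips(PoolName="my-managed-pool")
for ip in page["DedicatedIps"]:
    print(ip["Ip"], ip["WarmupStatus"])
# SNDS metrics for each address land in CloudWatch automatically; query them
# through the CloudWatch console or APIs per the SES documentation.
```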
SES supports DIP-M IP observability in all AWS Regions where SES is available.
For more information about DIP-M pools, see the Amazon SES documentation.
On October 21, 2025, Amazon announced quarterly security and critical updates for Amazon Corretto Long-Term Supported (LTS) versions of OpenJDK. Corretto 25.0.1, 21.0.9, 17.0.17, 11.0.29, and 8u472 are now available for download. Amazon Corretto is a no-cost, multi-platform, production-ready distribution of OpenJDK.
This release of Corretto JDK binaries for Generic Linux, Alpine, and macOS includes Async-Profiler, a low-overhead sampling profiler for Java supported by the Amazon Corretto team. Async-Profiler is designed to provide profiling data for CPU time; allocations in the Java heap; native memory allocations and leaks; contended locks; hardware and software performance counters like cache misses, page faults, and context switches; Java method profiling; and much more.
Visit the Corretto home page to download Corretto 25, Corretto 21, Corretto 17, Corretto 11, or Corretto 8. You can also get the updates on your Linux system by configuring a Corretto Apt, Yum, or Apk repository.
Starting today, Amazon EC2 High Memory U7i instances with 6TB of memory (u7i-6tb.112xlarge) are available in the Europe (London) Region. U7i-6tb instances are part of the AWS 7th generation of EC2 instances and are powered by custom 4th Generation Intel Xeon Scalable processors (Sapphire Rapids). U7i-6tb instances offer 6TB of DDR5 memory, enabling customers to scale transaction processing throughput in a fast-growing data environment.
U7i-6tb instances offer 448 vCPUs, support up to 100 Gbps of Amazon Elastic Block Store (EBS) bandwidth for faster data loading and backups, deliver up to 100 Gbps of network bandwidth, and support ENA Express. U7i instances are ideal for customers running mission-critical in-memory databases like SAP HANA, Oracle, and SQL Server.
Amazon CloudWatch Database Insights expands the availability of its on-demand analysis experience to the RDS for SQL Server database engine. CloudWatch Database Insights is a monitoring and diagnostics solution that helps database administrators and developers optimize database performance by providing comprehensive visibility into database metrics, query analysis, and resource utilization patterns. This feature leverages machine learning models to help identify performance bottlenecks during the selected time period, and gives advice on what to do next.
Previously, database administrators had to manually analyze performance data, correlate metrics, and investigate root cause. This process is time-consuming and requires deep database expertise. With this launch, you can now analyze database performance monitoring data for any time period with automated intelligence. The feature automatically compares your selected time period against normal baseline performance, identifies anomalies, and provides specific remediation advice. Through intuitive visualizations and clear explanations, you can quickly identify performance issues and receive step-by-step guidance for resolution. This automated analysis and recommendation system reduces mean-time-to-diagnosis from hours to minutes.
You can get started with this feature by enabling the Advanced mode of CloudWatch Database Insights on your RDS for SQL Server databases using the RDS service console, AWS APIs, the AWS SDK, or AWS CloudFormation. Please refer to the RDS documentation and Aurora documentation for information regarding the availability of Database Insights across different Regions, engines, and instance classes.
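As a sketch, the boto3 call below enables Advanced mode on an instance; the DatabaseInsightsMode parameter follows the RDS ModifyDBInstance API, and the Performance Insights retention shown is an assumption to confirm against the RDS documentation.

```python
# Sketch: switch an RDS for SQL Server instance to Database Insights Advanced mode.
import boto3

rds = boto3.client("rds")
rds.modify_db_instance(
    DBInstanceIdentifier="my-sqlserver-instance",
    DatabaseInsightsMode="advanced",
    EnablePerformanceInsights=True,          # Advanced mode builds on Performance Insights
    PerformanceInsightsRetentionPeriod=465,  # assumed retention prerequisite, in days
    ApplyImmediately=True,
)
```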
Amazon Connect can now automatically initiate follow-up evaluations to analyze specific situations identified during initial evaluations. For example, when an initial customer service evaluation detects customer interest in a product, Amazon Connect can automatically trigger a follow-up evaluation focused on the agent’s sales performance. This enables managers to maintain consistent evaluation standards across agent cohorts and over time, while capturing deeper insights on specific scenarios such as sales opportunities, escalations, and other critical interaction moments.
This feature is available in all regions where Amazon Connect is offered. To learn more, please visit our documentation and our webpage.
Amazon Bedrock Data Automation (BDA) now supports AVI, MKV, and WEBM file formats along with the AV1 and MPEG-4 Visual (Part 2) codecs, enabling you to generate structured insights across a broader range of video content. Additionally, BDA delivers up to 50% faster image processing.
BDA automates the generation of insights from unstructured multimodal content such as documents, images, audio, and videos for your GenAI-powered applications. With support for AVI, MKV, and WEBM formats, you can now analyze content from archival footage, high-quality video archives with multiple audio tracks and subtitles, and web-based and open-source video content. This expanded video format and codec support enables you to process video content directly in the formats your organization uses, streamlining your workflows and accelerating time-to-insight. With faster image processing on BDA, you can extract insights from visual content faster than ever before. You can now analyze larger volumes of images in less time, helping you scale your AI applications and deliver value to your customers more quickly.
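A sketch of submitting one of the newly supported formats, assuming the bedrock-data-automation-runtime client in boto3; all ARNs and bucket names are placeholders, and the exact required fields are in the BDA API reference.

```python
# Sketch: analyze a WEBM video asynchronously with Bedrock Data Automation.
import boto3

bda = boto3.client("bedrock-data-automation-runtime")
bda.invoke_data_automation_async(
    inputConfiguration={"s3Uri": "s3://amzn-s3-demo-bucket/input/video.webm"},
    outputConfiguration={"s3Uri": "s3://amzn-s3-demo-bucket/output/"},
    dataAutomationConfiguration={
        "dataAutomationProjectArn": "arn:aws:bedrock:us-west-2:111122223333:data-automation-project/my-project"
    },
    dataAutomationProfileArn="arn:aws:bedrock:us-west-2:111122223333:data-automation-profile/us.data-automation-v1",
)
```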
Amazon Bedrock Data Automation is available in 8 AWS Regions: Europe (Frankfurt), Europe (London), Europe (Ireland), Asia Pacific (Mumbai), Asia Pacific (Sydney), US West (Oregon), US East (N. Virginia), and AWS GovCloud (US-West).
Amazon Nova models now support the customization of content moderation settings for approved business use cases that require processing or generating sensitive content.
Organizations with approved business use cases can adjust content moderation settings across four domains: safety, sensitive content, fairness, and security. These controls allow customers to adjust the specific settings relevant to their business requirements. Amazon Nova enforces essential, non-configurable controls to ensure responsible use of AI, such as controls to prevent harm to children and preserve privacy.
Customization of content moderation settings is available for Amazon Nova Lite and Amazon Nova Pro in the US East (N. Virginia) region.
To learn more about Amazon Nova, visit the Amazon Nova product page; to learn about Amazon Nova’s responsible use of AI, visit the AWS AI Service Cards or see the User Guide. To see if your business model is appropriate to customize content moderation settings, contact your AWS Account Manager.
Amazon Elastic Container Service (Amazon ECS) now supports AWS CloudTrail data events, providing detailed visibility into Amazon ECS Agent API activities. This new capability enables customers to monitor, audit, and troubleshoot container instance operations.
With CloudTrail data event support, security and operations teams can now maintain comprehensive audit trails of ECS Agent API activities, detect unusual access patterns, and troubleshoot agent communication issues more effectively. Customers can opt in to receive detailed logging through the new data event resource type AWS::ECS::ContainerInstance for ECS agent activities, including when the ECS agent polls for work (ecs:Poll), starts telemetry sessions (ecs:StartTelemetrySession), and submits ECS Managed Instances logs (ecs:PutSystemLogEvents). This enhanced visibility enables teams to better understand how container instance roles are utilized, meet compliance requirements for API activity monitoring, and quickly diagnose operational issues related to agent communications.
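Opting in is a standard CloudTrail advanced event selector on the new resource type, as in this boto3 sketch; the trail name is a placeholder.

```python
# Sketch: enable ECS Agent API data events on an existing trail.
import boto3

cloudtrail = boto3.client("cloudtrail")
cloudtrail.put_event_selectors(
    TrailName="my-trail",
    AdvancedEventSelectors=[
        {
            "Name": "ECS agent API activity",
            "FieldSelectors": [
                {"Field": "eventCategory", "Equals": ["Data"]},
                {"Field": "resources.type", "Equals": ["AWS::ECS::ContainerInstance"]},
            ],
        }
    ],
)
```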
This new feature is available for Amazon ECS on EC2 in all AWS Regions and ECS Managed Instances in select regions. Standard CloudTrail data event charges apply. To learn more, visit the Developer Guide.
AI agents are now a reality, moving beyond chatbots to understand intent, collaborate, and execute complex workflows. This leads to increased efficiency, lower costs, and improved customer and employee experiences. It is a key opportunity for System Integrator (SI) Partners to deliver Google Cloud’s advanced AI to more customers. This post details how SI Partners can use Google Cloud AI products to build, scale, and manage enterprise-grade agentic systems and offer these transformative solutions to enterprise clients.
Enterprise challenges
The limitations of traditional, rule-based automation are becoming increasingly apparent in the face of today’s complex business challenges. Its inherent rigidity often leads to protracted approval processes, outdated risk models, and a critical lack of agility, thereby impeding the ability to seize new opportunities and respond effectively to operational demands.
These challenges are further compounded in modern enterprises by fragmented IT landscapes, characterized by legacy systems and siloed data, which collectively hinder seamless integration and scalable growth. Furthermore, static systems are ill-equipped to adapt instantaneously to market volatility or unforeseen “black swan” events. They also fall short in delivering the personalization and operational optimization required to manage escalating complexity—such as in cybersecurity and resource allocation—at scale. In this dynamic environment, AI agents offer the necessary paradigm shift to overcome these persistent limitations.
How SI Partners are solving business challenges with AI agents
Let’s discuss how SIs are working with Google Cloud to solve some of these business challenges:
Deloitte: A major retail client sought to enhance inventory accuracy and streamline reconciliation across its diverse store locations. The client needed various users—Merchants, Supply Chain, Marketing, and Inventory Controls—to interact with inventory data through natural language prompts. This interaction would enable them to check inventory levels, detect anomalies, research reconciliation data, and execute automated actions.
Deloitte leveraged Google Cloud AI Agents and Gemini Enterprise to create a solution that generates insights, identifies discrepancies, and offers actionable recommendations based on inventory data. This solution utilizes Agentic AI to integrate disparate data sources and deliver real-time recommendations, ultimately aiming to foster trust and confidence in the underlying inventory data.
Quantiphi: To improve customer experience and optimize sales operations, a furniture manufacturer partnered with Quantiphi to deploy generative AI and create a dynamic, intelligent assistant on Google Cloud. The multi-agent system automates quotation response creation, significantly accelerating the process. At its core is an orchestrator, built with the Agent Development Kit (ADK) and an Agent-to-Agent (A2A) framework, that seamlessly coordinates between agents to compose the right response, whether the user is researching market trends, asking about product details, or analyzing sales data. Leveraging the capabilities of Google Cloud’s Gemini models and BigQuery, the assistant delivers insights that transform how users access data and make decisions.
These examples represent just a fraction of the numerous use cases spanning diverse industry verticals, including healthcare, manufacturing, and financial services, that are being deployed in the field by SIs working in close collaboration with Google Cloud.
Architecture and design patterns used by SIs
The strong partnership between Google Cloud and SIs is instrumental in delivering true business value to customers. Let’s examine the scalable architecture patterns employed by Google Cloud SIs in the field to tackle Agentic AI challenges.
To comprehend Agentic AI architectures, it’s crucial to first understand what an AI agent is. An AI agent is a software entity with the capacity to plan, reason, and execute complex actions for users with minimal human intervention. AI agents leverage advanced AI models for reasoning and informed decision-making, while utilizing tools to fetch data from external sources for real-time, grounded information. Agents typically operate within a compute runtime. The visual diagram illustrates the basic components of an agent:
Base AI Agent Components
The snippet below demonstrates how an agent’s code appears in the Python programming language:
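(A minimal, illustrative sketch assuming ADK’s Python Agent API; the model name and the tool function are placeholders rather than values from the original snippet.)

```python
from google.adk.agents import Agent

def get_inventory_levels(product: str) -> dict:
    """Placeholder tool: returns inventory data for a product."""
    return {"product": product, "units_in_stock": 120}

root_agent = Agent(
    name="inventory_agent",                     # Name
    model="gemini-2.0-flash",                   # underlying LLM
    description="Answers questions about inventory levels.",
    instruction="Use the provided tools to answer inventory questions.",
    tools=[get_inventory_levels],               # Tools
)
```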
Code snippet of an AI Agent
This agent code snippet showcases the components depicted in the first diagram: the agent has a Name, a Large Language Model (LLM), a Description, an Instruction, and Tools, all of which enable it to perform its designated functions.
To build enterprise-grade agents at scale, several factors must be considered during their ground-up development. Google Cloud has collaborated closely with its Partner ecosystem to employ cutting-edge Google Cloud products to build scalable and enterprise-ready agents.
A key consideration in agent development is the framework. Without it, developers would be compelled to build everything from scratch, including state management, tool handling, and workflow orchestration. This often results in systems that are complex, difficult to debug, insecure, and ultimately unscalable. Google Cloud Agent Development Kit (ADK) provides essential scaffolding, tools, and patterns for efficient and secure enterprise agent development at scale. It offers developers the flexibility to customize agents to suit nearly every applicable use case.
Agent development with any framework, especially multi-agent architectures in enterprises, necessitates robust compute resources and scalable infrastructure. This includes strong security measures, comprehensive tracing, logging, and monitoring capabilities, as well as rigorous evaluation of the agent’s decisions and output.
Furthermore, agents typically lack inherent memory, meaning they cannot recall past interactions or maintain context for effective operation. While frameworks like ADK offer ephemeral memory storage for agents, enterprise-grade agents demand persistent memory. This persistent memory is vital for equipping agents with the necessary context to enhance their performance and the quality of their output.
Google Cloud’s Vertex AI Agent Engine provides a secure runtime for agents that manages their lifecycle, orchestrates tools, and drives reasoning. It features built-in security, observability, and critical building blocks such as a memory bank, session service, and sandbox. Agent Engine is accessible to SIs and customers on Google Cloud. Alternative options for running agents at scale include Cloud Run or GKE.
Customers often opt for these alternatives when they already have existing investments in Cloud Run or GKE infrastructure on Google Cloud, or when they require configuration flexibility concerning compute, storage, and networking, as well as flexible cost management. However, when choosing Cloud Run or GKE, functions like memory and session management must be built and managed from the ground up.
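For orientation, deploying an ADK agent to Agent Engine with the Vertex AI SDK can look roughly like the sketch below; module paths and arguments have shifted across SDK releases, so treat the specifics as assumptions to verify against current documentation:

```python
import vertexai
from vertexai import agent_engines
from vertexai.preview import reasoning_engines
from google.adk.agents import Agent

# Project, location, and staging bucket are placeholders.
vertexai.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

root_agent = Agent(
    name="hello_agent",
    model="gemini-2.0-flash",
    instruction="Answer user questions concisely.",
)

# Wrap the ADK agent and deploy it as a managed Agent Engine runtime.
app = reasoning_engines.AdkApp(agent=root_agent, enable_tracing=True)
remote_app = agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[adk,agent_engines]"],
)
```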
Model Context Protocol (MCP) is a crucial element for modern AI agent architectures. This open protocol standardizes how applications provide context to LLMs, thereby improving agent responses by connecting agents and underlying AI models to various data sources and tools. It’s important to note that Agents also communicate with enterprise systems using APIs, which are referred to as Tools when employed with agents. MCP enables agents to access fresh external data.
When developing enterprise agents at scale, it is recommended to deploy the MCP servers separately on a platform such as Cloud Run or GKE on Google Cloud, with agents running on Agent Engine configured as clients. The sample architecture illustrates the recommended deployment model for MCP integration with ADK agents:
AI agent tool integration with MCP
The reference architecture demonstrates how ADK-built agents can integrate with MCP to connect data sources and provide context to underlying LLM models. MCP utilizes Get, Invoke, List, and Call functions to enable tools to connect agents to external data sources. In this scenario, the agent can interact with a graph database through application APIs using MCP, allowing the agent and the underlying LLM to access up-to-date data for generating meaningful responses.
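On the client side, an ADK agent can attach an MCP toolset pointing at a remotely hosted MCP server, along the lines of the sketch below; the server URL is a placeholder, and the exact import path and connection-parameter class names vary across ADK releases:

```python
from google.adk.agents import Agent
# Import path and class names vary by ADK version; verify against the
# release you use (SseServerParams may be named differently in newer ones).
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, SseServerParams

# The MCP server is assumed to run separately, e.g. on Cloud Run.
inventory_tools = MCPToolset(
    connection_params=SseServerParams(url="https://mcp-server.example.com/sse"),
)

agent = Agent(
    name="data_agent",
    model="gemini-2.0-flash",
    instruction="Use the MCP tools to look up inventory data.",
    tools=[inventory_tools],
)
```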
Furthermore, when building multi-agent architectures that demand interoperability and communication among agents from different systems, a key consideration is how to facilitate Agent-to-Agent communication. This addresses complex use cases that require workflow execution across various agents from different domains.
Google Cloud launched the Agent-to-Agent Protocol (A2A) with native support within Agent Engine to tackle the challenge of inter-agent communication at scale. Learn how to implement A2A from this blog.
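For context, an A2A agent card is the machine-readable document an agent publishes to advertise its capabilities. A simplified, illustrative card, expressed here as a Python dict with placeholder values, might look like this:

```python
# Simplified, illustrative A2A agent card; only a subset of the fields
# defined by the A2A specification is shown, and all values are placeholders.
agent_card = {
    "name": "order_agent",
    "description": "Handles inventory checks and order placement.",
    "url": "https://agents.example.com/order",
    "version": "1.0.0",
    "capabilities": {"streaming": True},
    "skills": [
        {
            "id": "check_inventory",
            "name": "Check inventory",
            "description": "Returns current stock levels for a product.",
        }
    ],
}
```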
Google Cloud has collaborated with SIs on agentic architecture and design considerations to build multiple agents, assisting clients in addressing various use cases across industry domains such as Retail, Manufacturing, Healthcare, Automotive, and Financial Services. The reference architecture below consolidates these considerations.
Reference architecture – Agentic AI system with ADK, MCP, A2A and Agent Engine
This reference architecture depicts an enterprise-grade agent built on Google Cloud to address a supply chain use case. In this architecture, all agents are built with the ADK framework and deployed on Agent Engine. Agent Engine provides a secure compute runtime with authentication, context management using managed sessions and memory, and quality assurance through Example Store and Evaluation Services, while also offering observability into the deployed agents. Agent Engine delivers all these features and many more as a managed service at scale on Google Cloud.
This architecture outlines an agentic supply chain featuring an orchestration agent (Root) and three dedicated sub-agents: Tracking, Distributor, and Order agents. Each of these agents is powered by Gemini. For optimal performance and tailored responses, especially in specific use cases, we recommend tuning your model with domain-specific data before integrating it with an agent. Model tuning can also help optimize responses for conciseness, potentially reducing token size and lowering operational costs.
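A minimal sketch of how this hierarchy could be expressed with ADK sub-agents within a single deployment (when agents are deployed separately, A2A handles the routing instead, as described next); names, models, and instructions are placeholders:

```python
from google.adk.agents import Agent

# Sub-agents (tools elided); descriptions let the Root agent route requests.
tracking_agent = Agent(name="tracking_agent", model="gemini-2.0-flash",
                       description="Monitors placed orders until delivery.")
distributor_agent = Agent(name="distributor_agent", model="gemini-2.0-flash",
                          description="Places orders with suppliers.")
order_agent = Agent(name="order_agent", model="gemini-2.0-flash",
                    description="Handles inventory and order operations.")

root_agent = Agent(
    name="root_agent",
    model="gemini-2.0-flash",
    instruction="Route each supply chain request to the most suitable sub-agent.",
    sub_agents=[tracking_agent, distributor_agent, order_agent],
)
```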
For instance, a user might send a request such as “show me the inventory levels for men’s backpacks.” The Root agent receives this request and routes it to the Order agent, which is responsible for inventory and order operations. This routing is seamless because the A2A protocol utilizes agent cards to advertise the capabilities of each respective agent. A2A is configured with a few steps as a wrapper for your agents for Agent Engine deployment.
In this example, inventory and order details are stored in BigQuery. Therefore, the agent uses its tool configuration to leverage the MCP server to fetch the inventory details from the BigQuery data warehouse. The response is then returned to the underlying LLM, which generates a formatted natural language response and provides the inventory details for men’s backpacks to the Root agent and subsequently to the user. Based on this response, the user can, for example, place an order to replenish the inventory.
When such a request is made, the Root agent routes it to the Distributor agent. This agent possesses knowledge of all suppliers who provide stock to the business. Depending on the item being requested, the agent will use its tools to initiate an MCP server connection to the correct external API endpoints for the respective supplier to place the order. If the suppliers have agents configured, the A2A protocol can also be utilized to send the request to the supplier’s agent for processing. Any acknowledgment of the order is then sent back to the Distributor agent.
In this reference architecture, when the Distributor agent receives acknowledgment, A2A enables the agent to detect the presence of a Tracking agent that monitors new orders until delivery. The Distributor agent will pass the order details to the Tracking agent and also send updates back to the user. The Tracking agent will then send order updates to the user via messaging, utilizing the public API endpoint of the supplier. This is merely one example of a workflow that could be built with this reference architecture.
This modular architecture can be adapted to solve various use cases with Agentic AI built with ADK and deployed to Agent Engine.
The reference architecture allows this multi-agent system to be consumed via a chat interface through a website or a custom-built user interface. It is also possible to integrate this agentic AI architecture with Google Cloud Gemini Enterprise.
Learn how enterprises can start by using Gemini Enterprise as the front door to Google Cloud AI in this blog from Alphabet CEO Sundar Pichai. This approach helps enterprises start small with low-code, out-of-the-box agents. As they mature, they can implement complex use cases with advanced, high-code AI agents using this reference architecture.
Getting started
This blog post has explored the design patterns for building intelligent enterprise AI agents. For enterprise decision makers, use the five essential elements for implementing agentic solutions to guide your strategy and decision-making when running enterprise agents at scale.
We encourage you to embark on this journey today by collaborating with the Google Cloud Partner Ecosystem to understand your enterprise landscape and identify complex use cases that can be effectively addressed with AI agents. Use these design patterns as your guide, and leverage the ADK to transform your enterprise use case into a powerful, scalable solution that delivers tangible business value on Agent Engine with Google Cloud.