Amazon S3 Express One Zone now supports Internet Protocol version 6 (IPv6) addresses for gateway Virtual Private Cloud (VPC) endpoints. S3 Express One Zone is a high-performance storage class designed for latency-sensitive applications.
Organizations are adopting IPv6 networks to mitigate IPv4 address exhaustion in their private networks or to comply with regulatory requirements. You can now access your data in S3 Express One Zone over IPv6 or DualStack VPC endpoints. You don’t need additional infrastructure to handle IPv6 to IPv4 address translation.
S3 Express One Zone support for IPv6 is available in all AWS Regions where the storage class is available, at no additional cost. You can set up IPv6 for new and existing VPC endpoints using the AWS Management Console, AWS CLI, AWS SDK, or AWS CloudFormation. To get started using IPv6 on S3 Express One Zone, visit the S3 User Guide.
Written by: Stallone D’Souza, Praveeth DSouza, Bill Glynn, Kevin O’Flynn, Yash Gupta
Welcome to the Frontline Bulletin Series
Straight from Mandiant Threat Defense, the “Frontline Bulletin” series brings you the latest on the threats we are seeing in the wild right now, equipping our community to understand and respond.
Introduction
Mandiant Threat Defense has uncovered exploitation of an unauthenticated access vulnerability within Gladinet’s Triofox file-sharing and remote access platform. This now-patched n-day vulnerability, assigned CVE-2025-12480, allowed an attacker to bypass authentication and access the application configuration pages, enabling the upload and execution of arbitrary payloads.
As early as Aug. 24, 2025, a threat cluster tracked by Google Threat Intelligence Group (GTIG) as UNC6485 exploited the unauthenticated access vulnerability and chained it with the abuse of the built-in anti-virus feature to achieve code execution.
The activity discussed in this blog post leveraged a vulnerability in Triofox version 16.4.10317.56372, which was mitigated in release 16.7.10368.56560.
Gladinet engaged with Mandiant on our findings, and Mandiant has validated that this vulnerability is resolved in new versions of Triofox.
Initial Detection
Mandiant leverages Google Security Operations (SecOps) for detecting, investigating, and responding to security incidents across our customer base. As part of Google Cloud Security’s Shared Fate model, SecOps provides out-of-the-box detection content designed to help customers identify threats to their enterprise. Mandiant uses SecOps’ composite detection functionality to enhance our detection posture by correlating the outputs from multiple rules.
For this investigation, Mandiant received a composite detection alert identifying potential threat actor activity on a customer's Triofox server. The alert identified the deployment and use of remote access utilities (using PLINK to tunnel RDP externally) and file activity in potential staging directories (file downloads to C:\Windows\Temp).
Within 16 minutes of beginning the investigation, Mandiant confirmed the threat and initiated containment of the host. The investigation revealed an unauthenticated access vulnerability that allowed access to configuration pages. UNC6485 used these pages to run the initial Triofox setup process to create a new native admin account, Cluster Admin, and used this account to conduct subsequent activities.
Triofox Unauthenticated Access Control Vulnerability
Figure 1: CVE-2025-12480 exploitation chain
During the Mandiant investigation, we identified an anomalous entry in the HTTP log file – a suspicious HTTP GET request with an HTTP Referer URL containing localhost. The presence of the localhost host header in a request originating from an external source is highly irregular and typically not expected in legitimate traffic.
Within a test environment, Mandiant noted that standard HTTP requests issued to AdminAccount.aspx result in a redirect to the Access Denied page, indicative of access controls being in place on the page.
Figure 3: Redirection to AccessDenied.aspx when attempting to browse AdminAccount.aspx
Access to the AdminAccount.aspx page is granted as part of setup from the initial configuration page at AdminDatabase.aspx. The AdminDatabase.aspx page is automatically launched after first installing the Triofox software. This page allows the user to set up the Triofox instance, with options such as database selection (Postgres or MySQL), connecting LDAP accounts, or creating a new native cluster admin account, in addition to other details.
Attempts to browse to the AdminDatabase.aspx page resulted in a similar redirect to the Access Denied page.
Figure 4: Redirection to AccessDenied.aspx when attempting to browse AdminDatabase.aspx
Mandiant validated the vulnerability by testing the workflow of the setup process. The Host header field is provided by the web client and can be easily modified by an attacker. This technique is referred to as an HTTP host header attack. Changing the Host value to localhost grants access to the AdminDatabase.aspx page.
Figure 5: Access granted to AdminDatabase.aspx by changing Host header to localhost
By following the setup process and creating a new database via the AdminDatabase.aspx page, access is granted to the admin initialization page, AdminAccount.aspx, which then redirects to the InitAccount.aspx page to create a new admin account.
Figure 6: Successful access to the AdminCreation page InitAccount.aspx
Figure 7: Admin page
Analysis of the code base revealed that the main access control check to the AdminDatabase.aspx page is controlled by the function CanRunCriticalPage(), located within the GladPageUILib.GladBasePage class found in C:\Program Files (x86)\Triofox\portal\bin\GladPageUILib.dll.
public bool CanRunCriticalPage()
{
    Uri url = base.Request.Url;
    string host = url.Host;
    // Access to the page is granted if Request.Url.Host equals 'localhost',
    // immediately skipping all other checks if true
    bool flag = string.Compare(host, "localhost", true) == 0;
    bool result;
    if (flag)
    {
        result = true;
    }
    else
    {
        // Check for a pre-configured trusted IP in the web.config file.
        // If configured, compare the client IP with the trusted IP to grant access
        string text = ConfigurationManager.AppSettings["TrustedHostIp"];
        bool flag2 = string.IsNullOrEmpty(text);
        if (flag2)
        {
            result = false;
        }
        else
        {
            string ipaddress = this.GetIPAddress();
            bool flag3 = string.IsNullOrEmpty(ipaddress);
            if (flag3)
            {
                result = false;
            }
            else
                ...
Figure 8: Vulnerable code in the function CanRunCriticalPage()
As noted in the code snippet, the code presents several vulnerabilities:
Host Header attack – ASP.NET builds Request.Url from the HTTP Host header, which can be modified by an attacker.
No Origin Validation – No check for whether the request came from an actual localhost connection versus a spoofed header.
Configuration Dependence – If TrustedHostIp isn't configured, the only protection is the Host header check.
Triofox Anti-Virus Feature Abuse
To achieve code execution, the attacker logged in using the newly created Admin account and uploaded malicious files, which they then executed by abusing the built-in anti-virus feature. When configuring the anti-virus feature, the user can supply an arbitrary path for the selected anti-virus scanner. The file configured as the anti-virus scanner location inherits the Triofox parent process account privileges, running under the context of the SYSTEM account.
The attacker ran their malicious batch script by configuring the path of the anti-virus engine to point to the script. The on-disk folder path of any shared folder is displayed when publishing a new share within the Triofox application, and uploading an arbitrary file to any published share within the Triofox instance then triggers execution of the configured script.
Figure 9: Anti-virus engine path set to a malicious batch script
SecOps telemetry recorded command-line execution of the attacker script, which was used to:
Download a payload from http://84.200.80[.]252/SAgentInstaller_16.7.10368.56560.zip, which hosted a disguised executable despite the ZIP extension
Save the payload to C:\Windows\appcompat\SAgentInstaller_16.7.10368.56560.exe
Execute the payload silently
The executed payload was a legitimate copy of the Zoho Unified Endpoint Management System (UEMS) software installer. The attacker used the UEMS agent to then deploy the Zoho Assist and Anydesk remote access utilities on the host.
Reconnaissance and Privilege Escalation
The attacker used Zoho Assist to run various commands to enumerate active SMB sessions and specific local and domain user information.
Additionally, they attempted to change passwords for existing accounts and add the accounts to the local administrators and the “Domain Admins” group.
Defense Evasion
The attacker downloaded sihosts.exe and silcon.exe (sourced from the legitimate domain the.earth[.]li) into the directory C:\Windows\Temp.
| Filename | Original Filename | Description |
| --- | --- | --- |
| sihosts.exe | Plink (PuTTY Link) | A common command-line utility for creating SSH connections |
| silcon.exe | PuTTY | An SSH and Telnet client |
These tools were used to set up an encrypted tunnel, connecting the compromised host to their command-and-control (C2 or C&C) server over port 433 via SSH. The C2 server could then forward all traffic over the tunnel to the compromised host on port 3389, allowing inbound RDP traffic. The commands were run with the following parameters:
While this vulnerability is patched in Triofox version 16.7.10368.56560, Mandiant recommends upgrading to the latest release. In addition, Mandiant recommends auditing admin accounts and verifying that Triofox's Anti-virus Engine is not configured to execute unauthorized scripts or binaries. Security teams should also hunt for attacker tools using our hunting queries listed at the bottom of this post, and monitor for anomalous outbound SSH traffic.
Acknowledgements
Special thanks to Elvis Miezitis, Chris Pickett, Moritz Raabe, Angelo Del Rosario, and Lampros Noutsos
Detection Through Google SecOps
Google SecOps customers have access to these broad category rules and more under the Mandiant Windows Threats rule pack. The activity discussed in this blog post is detected in Google SecOps under the following rule names:
Gladinet or Triofox IIS Worker Spawns CMD
Gladinet or Triofox Suspicious File or Directory Activity
Gladinet Cloudmonitor Launches Suspicious Child Process
Powershell Download and Execute
File Writes To AppCompat
Suspicious Renamed Anydesk Install
Suspicious Activity In Triofox Directory
Suspicious Execution From Appcompat
RDP Protocol Over SSH Reverse Tunnel Methodology
Plink EXE Tunneler
Net User Domain Enumeration
SecOps Hunting Queries
The following UDM queries can be used to identify potential compromises within your environment.
GladinetCloudMonitor.exe Spawns Windows Command Shell
Identify the legitimate GladinetCloudMonitor.exe process spawning a Windows Command Shell.
Identify the execution of a renamed Plink executable (sihosts.exe) or a renamed PuTTy executable (silcon.exe) attempting to establish a reverse SSH tunnel.
metadata.event_type = "PROCESS_LAUNCH"
target.process.command_line = /-Rb/
(
target.process.file.full_path = /(silcon.exe|sihosts.exe)/ nocase or
(target.process.file.sha256 = "50479953865b30775056441b10fdcb984126ba4f98af4f64756902a807b453e7" and target.process.file.full_path != /plink.exe/ nocase) or
(target.process.file.sha256 = "16cbe40fb24ce2d422afddb5a90a5801ced32ef52c22c2fc77b25a90837f28ad" and target.process.file.full_path != /putty.exe/ nocase)
)
Google Public Sector is committed to supporting the critical missions of the U.S. Department of Defense (DoD) by delivering cutting-edge cloud, AI, and data services securely. Today, we are marking an important milestone in that commitment: we have successfully achieved Cybersecurity Maturity Model Certification (CMMC) Level 2 certification under the DoD’s CMMC program.
This certification, validated by a certified third-party assessment organization (C3PAO), affirms that Google Public Sector’s internal systems used to handle Controlled Unclassified Information (CUI) meet the DoD’s rigorous cybersecurity standards for protecting CUI.
Enabling a secure partnership
This CMMC Level 2 certification is a key enabler for our partnership with the DoD. It ensures our teams can operate and collaborate within the defense ecosystem, fully supporting the new DoD requirements, allowing us to serve as a trusted partner and support the mission without compromise.
Helping the Defense Industrial Base on their CMMC journey
While this certification does not extend to customer environments, we are also dedicated to helping our partners and customers across the Defense Industrial Base (DIB) on their own CMMC journeys.
Our FedRAMP-authorized cloud services, including Google Workspace, are designed to support DIB suppliers in building their CMMC-compliant solutions with secure, cutting-edge cloud, AI, and data capabilities. You can find all of our compliance resources, including guides for both Google Cloud and Google Workspace, on our central CMMC compliance page. As an example, our Google Workspace CMMC Implementation Guide provides specific configuration details and control mappings and our recent blog details how Google Workspace can help you achieve CMMC 2.0 compliance. These resources are designed to help DIB companies accelerate their own assessments and build their CMMC-compliant solutions on a secure, verified foundation.
Understanding CMMC and the DFARS connection
The CMMC program is a DoD initiative to enhance cybersecurity across the DIB. Its purpose is to verify that contractors have implemented the required security controls, based heavily on NIST Special Publication (SP) 800-171, to protect CUI and Federal Contract Information (FCI).
Many contractors are already familiar with DFARS 252.204-7012, which has long required the implementation of NIST SP 800-171. The new CMMC program is being implemented into contracts via the clause DFARS 252.204-7021. When this clause appears in a solicitation, it makes having achieved a specific CMMC level a mandatory condition for contract award.
A continued commitment to the mission
Our CMMC Level 2 certification is a direct reflection of our commitment to meeting the DoD’s stringent security requirements. It ensures we can continue to support the Department’s mission responsibly and compliantly. We remain committed to our partnership with the DoD, empowering the Defense Industrial Base with cutting-edge cloud, AI, and data services to build a more secure and resilient future.
Catch the highlights from our recent Google Public Sector Summit where we shared how Google Cloud’s AI and security technologies can help advance your mission.
Amazon CloudWatch agent now supports collection of shared memory utilization metrics from Linux hosts running on Amazon EC2 or on-premises environments. This new capability enables you to monitor total shared memory usage in CloudWatch, alongside existing memory metrics like free memory, used memory, and cached memory.
Enterprise applications such as SAP HANA and Oracle RDBMS make extensive use of shared memory segments that were previously not captured in standard memory metrics. By enabling shared memory metric collection in your CloudWatch agent configuration file, you can now accurately assess total memory utilization across your hosts, helping you optimize host and application configurations and make informed decisions about instance sizing.
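To illustrate the shape of that configuration, here is a minimal sketch that writes a CloudWatch agent config enabling shared memory collection. The overall amazon-cloudwatch-agent.json structure, the mem_used_percent and mem_cached measurements, and the typical Linux config path are standard; the mem_shared measurement name is an assumption to verify against the CloudWatch agent documentation.

import json

# Minimal sketch of a CloudWatch agent config that adds a shared-memory
# measurement alongside the usual memory metrics. "mem_shared" is an assumed
# measurement name; confirm the exact name in the CloudWatch agent docs.
agent_config = {
    "metrics": {
        "namespace": "CWAgent",
        "metrics_collected": {
            "mem": {
                "measurement": [
                    "mem_used_percent",
                    "mem_cached",
                    "mem_shared",  # assumed name for the new shared-memory metric
                ],
                "metrics_collection_interval": 60,
            }
        },
    }
}

# Typical config location on Linux; adjust for your installation.
with open("/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json", "w") as f:
    json.dump(agent_config, f, indent=2)

After updating the file, restart the agent so the new measurement takes effect.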
Amazon CloudWatch agent is supported in all commercial AWS Regions and AWS GovCloud (US) Regions. For Amazon CloudWatch custom metrics pricing, see the CloudWatch Pricing page.
Amazon SageMaker Unified Studio now provides real-time notifications for data catalog activities, enabling data teams to stay informed of subscription requests, dataset updates, and access approvals. With this launch, customers receive real-time notifications for catalog events including new dataset publications, metadata changes, and access approvals directly within the SageMaker Unified Studio notification center. This launch streamlines collaboration by keeping teams updated as datasets are published or modified.
The new notification experience in SageMaker Unified Studio is accessible from a “bell” icon in the top right corner of the project home page. From here, you can access a short list of recent notifications including subscription requests, updates, comments, and system events. To see the full list of all notifications, you can click on “notification center” to see all notifications in a tabular view that can be filtered based on your preferences for data catalogs, projects and event types.
Amazon Web Services (AWS) now supports Internet Protocol version 6 (IPv6) addresses for AWS PrivateLink Gateway and Interface Virtual Private Cloud (VPC) endpoints for Amazon S3.
The continued growth of the internet is exhausting available Internet Protocol version 4 (IPv4) addresses. IPv6 increases the number of available addresses by several orders of magnitude, so customers no longer need to manage overlapping address spaces in their VPCs. To get started with IPv6 connectivity on a new or existing S3 gateway or interface endpoint, configure the endpoint's IP address type as IPv6 or dual-stack. When enabled, Amazon S3 automatically updates the routing tables with IPv6 addresses for gateway endpoints and sets up an elastic network interface (ENI) with IPv6 addresses for interface endpoints.
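As a rough boto3 sketch, creating a dual-stack interface endpoint for S3 might look like the following; all IDs and the Region are placeholders, and parameter support should be verified against the current EC2 API reference for your endpoint type.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder Region

# Create a dual-stack interface endpoint for Amazon S3 (IDs are placeholders).
response = ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",
    VpcEndpointType="Interface",
    ServiceName="com.amazonaws.us-east-1.s3",
    SubnetIds=["subnet-0123456789abcdef0"],
    IpAddressType="dualstack",
    DnsOptions={"DnsRecordIpType": "dualstack"},
)
print(response["VpcEndpoint"]["VpcEndpointId"])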
IPv6 support for VPC endpoints for Amazon S3 is now available in all AWS Commercial Regions and the AWS GovCloud (US) Regions, at no additional cost. You can set up IPv6 for new and existing VPC endpoints using the AWS Management Console, AWS CLI, AWS SDK, or AWS CloudFormation. To learn more, please refer to the service documentation.
AWS Private Certificate Authority (AWS Private CA) now enables you to create certificate authorities (CAs) and issue certificates that use Module Lattice-based Digital Signature Algorithm (ML-DSA). This feature enables you to begin transitioning your public key infrastructure (PKI) towards post-quantum cryptography, allowing you to put protections in place now to protect the security of your data against future quantum computing threats. ML-DSA is a post-quantum digital signature algorithm standardized by National Institute of Standards and Technology (NIST) as Federal Information Processing Standards (FIPS) 204.
With this feature, you can now test ML-DSA in your environment for certificate issuance, identity verification, and code signing. You can create CAs, issue certificates, create certificate revocation lists (CRLs), and configure Online Certificate Status Protocol (OCSP) responders using ML-DSA. A cryptographically relevant quantum computer (CRQC) would be able to break current digital signature algorithms, such as Rivest–Shamir–Adleman (RSA) and the Elliptic Curve Digital Signature Algorithm (ECDSA), which are expected to be phased out over the next decade.
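For illustration only, a boto3 sketch of creating an ML-DSA-backed CA might look like the following. The create_certificate_authority call and its parameter structure are standard, but the ML_DSA_65 key algorithm and ML_DSA_65_SIGNATURE signing algorithm values are assumed names; check the AWS Private CA API reference for the exact identifiers.

import boto3

client = boto3.client("acm-pca", region_name="us-east-1")  # placeholder Region

# Sketch of creating a private root CA that signs with ML-DSA.
# "ML_DSA_65" and "ML_DSA_65_SIGNATURE" are assumed enum values.
response = client.create_certificate_authority(
    CertificateAuthorityType="ROOT",
    CertificateAuthorityConfiguration={
        "KeyAlgorithm": "ML_DSA_65",                # assumed value
        "SigningAlgorithm": "ML_DSA_65_SIGNATURE",  # assumed value
        "Subject": {"CommonName": "example-pq-root-ca"},
    },
)
print(response["CertificateAuthorityArn"])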
AWS Private CA support for ML-DSA is available in all commercial AWS Regions, the AWS GovCloud (US) Regions, and the China Regions.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C7i-flex instances, which deliver up to 19% better price performance than C6i instances, are available in the Middle East (UAE) Region. C7i-flex instances provide the easiest way for you to get price-performance benefits for a majority of compute-intensive workloads. The new instances are powered by 4th generation Intel Xeon Scalable custom processors (Sapphire Rapids) that are available only on AWS, and offer 5% lower prices than C7i.
C7i-flex instances offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don't fully utilize all compute resources. With C7i-flex instances, you can seamlessly run web and application servers, databases, caches, Apache Kafka, Elasticsearch, and more. For compute-intensive workloads that need larger instance sizes (up to 192 vCPUs and 384 GiB of memory) or continuous high CPU usage, you can leverage C7i instances.
Starting today, VPC Lattice allows you to specify a custom domain name for a resource configuration. Resource configurations enable layer-4 access to resources such as databases, clusters, and domain names across VPCs and accounts. With this feature, you can use resource configurations for cluster-based and TLS-based resources.
Resource owners can use this feature by specifying a custom domain for a resource configuration and sharing the resource configuration with consumers. Consumers can then access the resource using the custom domain, with VPC Lattice managing a private hosted zone in the consumer's VPC.
This feature also gives resource owners and consumers control and flexibility over the domains they want to use. Resource owners can use a custom domain that they own, or one owned by AWS or a third party. Consumers can use granular controls to choose which domains they want VPC Lattice to manage private hosted zones for.
Data is the engine of modern telecommunications. For Ericsson’s Managed Services, which operates a global network of more than 710,000 sites, harnessing this data is not just an advantage, it’s essential for business growth and leadership. To power the future of its autonomous network operations and deliver on its strategic priorities, Ericsson has been on a transformative data journey with governance at the center of its strategy.
Ericsson moved from foundational practices to a sophisticated, business-enabling data governance framework using Google Cloud’s Dataplex Universal Catalog, turning data from a simple resource into a strategic asset.
From a new operating model to a new data mindset
Ericsson's journey began in 2019 with the launch of the Ericsson Operations Engine (EOE), a groundbreaking, AI-powered operating model for managing complex, multi-vendor telecom networks. The EOE made one thing clear: to succeed, data had to be at the core of everything.
This realization led Ericsson to develop its first enterprise data strategy, which established the core principles for how data is collected, managed and governed. However, building a strategy is one thing — operationalizing it at scale is another.
To move beyond theory to address real-world challenges, Ericsson needed to:
Build trust: Provide discoverable, clean, reliable, and well-understood data to the teams deploying analytics, AI, and automation.
Balance defense and offense: Ensure compliance with contracts and regulations (defensive governance) while empowering teams to innovate and create value from data (offensive governance).
Ensure data integrity: Ericsson users see data integrity as the core principle for effective data management. Data quality, which is essential for reliable, trustworthy data throughout its lifecycle, is a key quality indicator (KQI) for measuring effectiveness. Any quality deviations must be managed like a high-priority incident with clear Service Level Agreements (SLA) for restoration and resolution.
To realize this vision, Ericsson sought a platform that could match its ambition for global-scale governance and innovation — and Dataplex Universal Catalog emerged as the ideal choice.
Ericsson made its selection based on four key criteria.
First, its capabilities aligned perfectly with Ericsson’s requirements for cloud-native transformation, business principles, and a long-term governance vision, underpinned by Ericsson’s strategic partnership with Google Cloud. Second, from a technical standpoint, Dataplex provided a tightly integrated, end-to-end ecosystem as a native Google Cloud solution, translating to faster time-to-market for use cases and reduced integration overhead.
Third, the platform offered a practical operating model that enabled quick learning, adaptation, and self-sufficiency, supporting an agile approach where Ericsson could fail fast and iterate. Finally, because Ericsson was already a Google Cloud customer, Dataplex presented a clear and manageable Total Cost of Ownership (TCO), serving as a natural extension of Ericsson's existing environment and providing a clear, manageable cost profile for extending storage and compute with governance capabilities.
Putting governance into practice: Key capabilities in action
With Dataplex Universal Catalog as the governance foundation, Ericsson began implementing the core pillars of its governance program, moving from manual processes to an automated, intelligent data fabric.
More specifically, Ericsson established a unified business vocabulary within Dataplex. This transformative first step eliminated ambiguity and ensured their teams — from data scientists to data analysts — were speaking the same language. These glossaries also captured tribal knowledge and became the foundation for creating trusted data products.
In addition, Dataplex’s catalog is at the heart of the data governance solution, making data discovery simple and intuitive for authorized users. Ericsson uses its tagging capabilities to enrich the data assets with critical metadata, including data classification, ownership, retention policies, and sensitivity labels. Dataplex’s ability to automatically visualize data lineage, down to the column level, is another game-changer. Different data personas can instantly understand a dataset’s origin and its downstream impact, dramatically increasing trust and reducing investigation time. Furthermore, trustworthy AI models are built on high-quality data. For proactive data quality, Ericsson uses Dataplex to run automated quality checks and profiles on its data pipelines. When a quality rule is breached, an alert is automatically triggered, creating an incident in its service management platform to ensure data issues are treated with the urgency they deserve.
These capabilities are all underpinned by Ericsson's Data Operating Model (DOM), a framework that defines the policies, people, processes, and technology needed to translate its data strategy into tangible value. The DOM comprises several facets of working with data:
Enterprise data architecture: Managing data flow, enterprise data modeling, and best practices from data collection through to consumption
Technology and tools: Business glossary, master, reference and metadata management, data modeling, and data quality management
Roles and responsibilities: Roles to manage and govern data (i.e., end-to-end data lifecycle and stewardship)
Data and model assurance: Data pipelines monitoring, data observability, and data quality monitoring
Governance: Managing data compliance, risk and security management, operational-level agreements, objectives and key results (OKRs), and audit management
Processes: Data governance, data quality, data management, and data consent related processes
Looking ahead: The future is integrated and intelligent
As a global technology leader, Ericsson is committed to shaping the future of AI-powered data governance. Technology, especially in the AI space, is evolving at a breathtaking pace and both the data and AI governance practices must keep up.
These developments are guiding Ericsson’s future priorities, which include bridging the gap between data and AI governance, especially with the rise of generative and agentic AI. These plans include evaluating using generative AI capabilities in BigQuery and Dataplex to simplify governance and pursuing solutions that ensure transparency, explainability, fairness and manage risk in the deployment of AI models.
In addition to harnessing the power of AI for at-scale governance, Ericsson will also adopt governance workflows, glossary-driven data quality policies, at-scale assignment of terms to assets, bulk import and export of glossaries, AI-powered glossary recommendations, and data quality re-usability capabilities. Ericsson is also aligning its architecture with data fabric and data mesh principles, empowering teams with self-service access to high-quality, trusted data products. Finally, Ericsson will be assessing the use of more granular, policy-based access controls to complement existing role-based access, further strengthening its data security, protection, and privacy.
For any organization embarking on a similar path, Ericsson’s experience offers several key lessons:
Governance is a value enabler, not a blocker: A modern data governance program is focused on business enablement first, driving value and innovation, to complement policies, rules and risk management.
It’s a journey, not a destination: Be prepared to fail fast, learn, and adapt. The landscape is constantly changing at breakneck speed.
Focus on business outcomes, not tools: Technology is a critical enabler, but the conversation is about the business value you’re creating. Simplify the story, speak the language of the business, and unpack the hype.
Culture is everything: For governance to be effective, it’s the responsibility of everyone. This requires strong leadership, sponsorship, and a “data-first” mindset embedded throughout the organization.
By partnering with Google Cloud and tapping into the power of Dataplex Universal Catalog, Ericsson is building a data foundation that is not only compliant and secure but agile and intelligent — ready to power the next generation of autonomous networks.
At its simplest, an agent is an application that reasons on how to best achieve a goal based on inputs and tools at its disposal.
As you build sophisticated multi-agent AI systems with the Agent Development Kit (ADK), a key architectural decision involves choosing between a sub-agent and an agent as a tool. This choice fundamentally impacts your system’s design, how well it scales, and its efficiency. Choosing the wrong pattern can lead to massive overhead — either by constantly passing full conversational history to a simple function or by under-utilizing the context-sharing capabilities of a more complex system.
While both sub-agents and tools help break down complex problems, they serve different purposes. The key difference is how they handle control and context.
Agents as tools: The specialist on call
An agent as a tool is a self-contained expert agent packaged for a specific, discrete task, like a specialized function call. The main agent calls the tool with a clear input and gets a direct output, operating like a transactional API. The main agent doesn’t need to worry about how the tool works; it only needs a reliable result. This pattern is ideal for independent and reusable tasks.
Key characteristics:
Encapsulated and reusable: The internal logic is hidden, making the tool easy to reuse across different agents.
Isolated context: The tool runs in its own session and cannot access the calling agent’s conversation history or state.
Stateless: The interaction is stateless. The tool receives all the information it needs in a single request.
Strict input/output: It operates based on a well-defined contract.
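As a minimal sketch of this pattern, assuming the Python ADK's LlmAgent and AgentTool classes and a placeholder Gemini model name:

from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool

# Specialist agent packaged as a tool: clear input in, direct output back.
summarizer = LlmAgent(
    name="summarizer",
    model="gemini-2.0-flash",  # placeholder model name
    description="Summarizes a block of text into three bullet points.",
    instruction="Summarize the provided text in exactly three bullet points.",
)

# The calling agent treats the specialist like a function call; the tool runs
# in its own session and never sees the caller's conversation history.
root_agent = LlmAgent(
    name="assistant",
    model="gemini-2.0-flash",
    instruction="Answer the user; call the summarizer tool for long passages.",
    tools=[AgentTool(agent=summarizer)],
)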
Sub-agents: The delegated team member
A sub-agent is a delegated team member that handles a complex, multi-step process. This is a hierarchical and collaborative relationship where the sub-agent works within the broader context of the parent agent’s mission. Use sub-agents for tasks that require a chain of reasoning or a series of interactions.
Key characteristics:
Tightly coupled and integrated: Sub-agents are part of a larger, defined workflow.
Shared context: They operate within the same session and can access the parent’s conversation history and state, allowing for more nuanced collaboration.
Stateful processes: They are ideal for managing processes where the task requires several steps to complete.
Hierarchical delegation: The parent agent explicitly delegates a high-level task and lets the sub-agent manage the process.
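A comparable sketch of the sub-agent pattern, again assuming the Python ADK's LlmAgent class and a placeholder model name:

from google.adk.agents import LlmAgent

# Sub-agent that owns a multi-step, conversational piece of the workflow.
report_writer = LlmAgent(
    name="report_writer",
    model="gemini-2.0-flash",  # placeholder model name
    description="Drafts and iteratively refines a report with the user.",
    instruction="Draft the report, then revise it based on user feedback.",
)

# The parent delegates the whole sub-problem; because the sub-agent shares the
# parent's session, it can read the conversation history and state directly.
coordinator = LlmAgent(
    name="coordinator",
    model="gemini-2.0-flash",
    instruction="Gather requirements, then delegate drafting to report_writer.",
    sub_agents=[report_writer],
)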
Here is a simple decision matrix that you can use to guide your architectural decision based on the task:
| Criterion | Agent as a tool | Sub-agent | Decision |
| --- | --- | --- | --- |
| Task complexity | Low to Medium | High | Use a tool for atomic functions. Use a sub-agent for complex workflows. |
| Context & state | Isolated/None | Shared | If the task is stateless, use a tool. If it requires conversational context, use a sub-agent. |
| Reusability | High | Low to Medium | For generic, widely applicable capabilities, build a tool. For specialized roles in a specific process, use a sub-agent. |
| Autonomy & control | Low | High | Use a tool for a simple request-response. Use a sub-agent for delegating a whole sub-problem. |
Use cases in action
Let’s apply this framework to some real-world scenarios.
Use case 1: The data agent (NL2SQL and visualization)
A business user asks for the top 5 product sales in Q2 by region and wants a bar chart.
Root Agent: Receives the business user's request (NL), determines the necessary steps (SQL generation → Execution → Visualization), and delegates/sequences the tasks before returning the response to the user.
NL2SQL Agent: Use a tool. The task is a single, reusable function: convert natural language to a SQL string, using metadata & schema for grounding.
Database Executor: Use a tool. This is a simple, deterministic function to execute the query and return data.
Data Visualization Agent: Use a sub-agent. The task is complex and multi-step. It involves analyzing the data returned by the database tool, and the original user query, selecting the right chart type, generating the visualization code, and executing it. Delegating this to a sub-agent allows the main orchestrator agent to maintain a high-level view while the sub-agent independently manages its complex internal workflow.
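Putting this use case together, a minimal ADK sketch of the wiring might look like the following; the execute_sql body and the model name are placeholders.

from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool

def execute_sql(query: str) -> list[dict]:
    """Placeholder database executor tool: run the query and return rows."""
    # Swap in your actual database client here.
    return [{"product": "example", "sales": 0}]

nl2sql_agent = LlmAgent(
    name="nl2sql",
    model="gemini-2.0-flash",  # placeholder model name
    description="Converts a natural-language question into a SQL string.",
    instruction="Return only a SQL statement grounded in the table schema.",
)

visualization_agent = LlmAgent(
    name="data_visualization",
    model="gemini-2.0-flash",
    description="Chooses a chart type and produces visualization code.",
    instruction="Analyze the returned rows and the original question, pick a "
                "chart type, and generate the visualization code.",
)

root_agent = LlmAgent(
    name="data_agent",
    model="gemini-2.0-flash",
    instruction=(
        "Turn the user's question into SQL with the nl2sql tool, run it with "
        "execute_sql, then delegate charting to the data_visualization sub-agent."
    ),
    tools=[AgentTool(agent=nl2sql_agent), execute_sql],
    sub_agents=[visualization_agent],
)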
Use case 2: The sophisticated travel planner
A user asks to plan a 5-day anniversary trip to Paris, with specific preferences for flights, hotels, and activities. This is an ambiguous, high-level goal that requires continuous context and planning.
Travel planner: Use a root agent to maintain the overall goal ("5-day anniversary trip to Paris"), manage the flow between sub-agents, and aggregate the final itinerary.
Note: You could implement a Context/Memory Manager Tool accessible to all agents, potentially using a simple key-value store (like Redis or a simple database) to delegate the storage of immutable decisions.
Flight search: Use a sub-agent. The task is not a simple search; it involves multiple back-and-forth interactions with the user (e.g., "Is a layover in Dubai okay?") while managing the overall trip context (dates, destination, class).
Hotel booking: Use a sub-agent. It needs to maintain state and context (dates, location preference, 5-star rating) as it searches for and presents options.
Itinerary generation: Use a sub-agent to generate a logical, day-by-day itinerary. The agent must combine confirmed flights/hotels with user interests (e.g., art museums, fine dining), potentially using its own booking tools.
Using tools here would be inefficient: each call would require passing the full trip context, leading to redundancy and state loss. Sub-agents are better suited to these stateful, collaborative processes because they share session context.
Get started
The decision between sub-agents and agents as tools is fundamental to designing an effective and scalable agentic system in ADK. As a guiding principle, remember:
Use tools for discrete, stateless, and reusable capabilities.
Use sub-agents to manage complex, stateful, and context-dependent processes.
By mastering this architectural pattern, you can design multi-agent systems that are modular and capable of solving complex, real-world problems.
Check out these examples on GitHub to start building using ADK.
Here is a fantastic blogpost that will help you build your first multi-agent workflow.
Modern applications store their most valuable data such as product catalogs or user profiles in operational databases. These data stores are excellent for applications that need to handle real-time transactions — and with their support for vector operations, they’ve also become an excellent foundation for modern search or gen AI application serving.
AlloyDB AI provides high-performance vector capabilities that let you generate embeddings inline and manually tune powerful vector indexes. While you can generate embeddings out of the box for inline search use cases, we also wanted AlloyDB to address the complexity of creating and maintaining huge numbers of vector embeddings.
To make this possible, we’re introducing two new features for AlloyDB AI, available in preview, that will empower you to transform your existing operational database into a powerful, AI-native database with just a few lines of SQL:
Auto vector embeddings
Auto vector index
Auto vector embeddings transform operational data into vector-search-ready data by vectorizing data stored inside AlloyDB at scale. The auto vector index self-configures vector indexes optimized for your workload, ensuring high quality and performance.
Compare this to the traditional approach of creating the vectors and loading them into your database. The basic steps are familiar to any AI developer: generate vector embeddings using specialized AI models, import the vectors into the database alongside the underlying text, and tune vector indexes. In other words, build an ETL (Extract, Transform, Load) pipeline: extract the data from your database, apply transformations, run it through the AI model, reload and reformat it, reinsert it into your database, and then tune the vector indexes. This approach not only involves significant engineering complexity but also introduces latency, making it difficult to keep your embeddings in sync with your live data even though the two are stored side by side.
An additional challenge is to keep the vector index up to date, which is hard to do manually. While manually tuned indexes are performant and provide excellent results, they can be sensitive to updates in the underlying data and require performance and quality testing before they’re ready to hit the road.
Let’s walk through an example journey of an operational workload and see how AlloyDB AI’s new features remove friction from building enterprise-grade AI, and enable users to modernize applications from their database.
AlloyDB as a vector database
Imagine you run a large e-commerce platform with a products table in AlloyDB, containing structured data like product_id, color, price, and inventory_count, alongside unstructured data such as product_description.
You want to build a gen AI search feature to improve the quality of search in your application and make it more dynamic and personalized for users. You want to evolve from solely supporting simple lexical searches such as “jacket”, which perform exact matches, to searches such as “warm coat for winter” that can find semantically similar items like jackets, coats or vests. To refine the quality, you also want to combine this semantic matching with structured filters such as color = 'maroon' or price < 100. Some of these filters may even live in a different table, such as an orders table which stores information about the user’s order history.
From operational to AI-native
Before you can get started on application logic, you need to generate embeddings on your data so you can perform a vector search. For this you would typically need to:
Build an ETL pipeline to extract products data from AlloyDB
Write custom code to batch the data and send it to an embedding model API on Vertex AI
Carefully manage rate limits, token limits, and failures
Write the resulting vectors back into your database
Build another process to watch for UPDATE commands so you can do it again and again, just to keep your data fresh
AlloyDB AI’s new feature, auto vector embeddings, eliminates this entire workflow.
It provides a fully managed, scalable solution to create and maintain embeddings directly from the database. The system batches API calls to Vertex AI, maximizing throughput, and can operate as a background process to ensure that your critical transactions aren’t blocked.
To generate vector embeddings from your product_description column, you just run one SQL command:
Now that you have embeddings, you face the second hurdle: performance and quality of search. Say a user searches for “warm winter coat.” Your query may look like this:
SELECT * FROM products
WHERE color = 'maroon'
ORDER BY product_embedding <-> google_ml.embedding('gemini-embedding-001', 'warm coat for winter')
LIMIT 10;
To make this vector search query performant, you need a vector index. But traditional vector indexes require deep expertise: you have to manually configure parameters, rebuild the index periodically as data changes, and hope your tuning is correct. This complexity slows development and adds operational complexity.
-- Optimal `num_leaves` and `max_num_levels` are based on number of vectors in the
-- products table, which means the user will have to figure that out beforehand to
-- properly tune the index.

CREATE INDEX idx_products_embedding ON products
USING scann (product_embedding)
WITH (num_leaves=100000, max_num_levels=2);
The new auto vector index feature abstracts all this away and delivers a fully automated and integrated vector search experience that is self-configuring, self-maintaining, and self-tuning. To create a fully optimized index, you just run:
-- AlloyDB will automatically figure out index configuration underneath the hood.
CREATE INDEX idx_products_embedding ON products
USING scann (product_embedding)
WITH (mode = 'AUTO');
With mode = 'AUTO', AlloyDB handles everything:
Automatic configuration: It analyzes your data and automatically configures the index parameters at creation time to meet your performance and quality goals.
Automatic maintenance: The index updates incrementally and automatically as your data changes, ensuring it remains optimized without any manual intervention. It automatically splits as the index grows in size and automatically updates centroids when data distribution drifts.
Automatic query plan optimization: This is where the real magic happens. The ScaNN index leverages real-time workload statistics to self-tune and optimize the execution plan. For a deeper dive, read our previous blog, A deep dive into AlloyDB's vector search enhancements.
Two new ways to become AI-native
With AlloyDB’s new capabilities, making your operational workload AI-native no longer requires complex ETL pipelines and infrastructure code.
Auto vector embeddings transforms your data by handling the entire embedding generation and management lifecycle inside the database.
Auto vector index simplifies retrieval by providing a self-tuning, self-maintaining index that automatically optimizes complex filtered vector searches.
By removing this complexity, AlloyDB empowers you to use your existing SQL skills to build and scale world-class AI experiences with speed and confidence, moving projects from proof-of-concept to production faster than ever before. Get started with auto vector embeddings and the auto vector index today.
To get started, try our 30-day AlloyDB free trial. New Google Cloud customers also get $300 in free credits.
n8n is a powerful yet easy-to-use workflow and automation tool for multi-step AI agents, and many teams want a simple, scalable, and cost-effective way to self-host it. With just a few commands, you can deploy n8n to Cloud Run and have it up and running, ready to supercharge your business with AI workflows that can manage spreadsheets, read and draft emails, and more. The n8n docs now tell you how to deploy the official n8n Docker image to our serverless platform, connect it to Cloud SQL for persistent data storage, call Gemini as the agents’ LLM, and (optionally) connect your workflows directly to Google Workspace.
Deploy n8n to Cloud Run in minutes
You can deploy the official n8n image directly to Cloud Run. This gives you a managed, serverless environment that automatically scales from zero to handle any workload, so you only pay for what you use. That means whenever you’re not actively using n8n, you’re not paying for any compute and your n8n data is persisted in Cloud SQL.
To first try out n8n quickly on Cloud Run, deploy it with this one command:
This gives you a running instance of n8n that you can use to try out n8n and all its awesome features for workflow automation with the power of AI. Connect your first n8n agent to Gemini (provide your Gemini API key for the “Google Gemini Chat Model” credentials) and see it in action.
Then, when you're ready to use n8n for actual workflows, you can follow the steps in the n8n docs for a more durable, secure setup (using Cloud SQL, Secret Manager, etc.). You can either use a Terraform script or follow along step by step through each gcloud command in the instructions.
Connect Google Workspace tools
A key benefit of hosting on Google Cloud is the ability to easily connect n8n to your Google Workspace tools. The n8n docs walk you through the steps to configure OAuth for Google Cloud, allowing your n8n workflows to securely access and automate tasks using Google tools like Gmail, Google Calendar, and Google Drive.
Here’s a demo showing an n8n instance on Cloud Run that uses Gmail and Google Calendar to schedule appointments on your behalf whenever an email hits your inbox with a request to meet:
The two AI agents in this n8n workflow call Gemini to do the following:
The Text Classifier reads your incoming emails to see which ones are asking for time to meet
The Agent checks your calendar for your availability, and sends a response with a suggested time
Cloud Run is great for all AI apps
Cloud Run is a versatile, easy-to-use runtime for all your AI application needs. Whether your agentic app was made with n8n, LangChain, ADK, or no framework at all, you can deploy it to Cloud Run. This collaboration on Cloud Run and n8n is another example of how we aim to simplify the process for developers to build and deploy intelligent applications.
Across the world, organizations are partnering with Google Cloud to tackle their toughest challenges, drive digital transformation, and unlock new levels of growth. In Europe, organizations face unique and complex regulatory challenges. To ensure we’re delivering the best possible value and experience for our customers here, we have established a new European Advisory Board. This distinguished group of leaders from across various industries will act as a vital feedback channel, help customers navigate complex regulatory landscapes, and foster a strong, sustainable digital economy. Their counsel is key to ensuring Google Cloud products not only meet but exceed European requirements, driving our regional expertise and differentiation and ultimately supporting Europe’s digital transformation.
The board comprises renowned leaders with deep expertise spanning technology, finance, retail, and public service.
The new board members are:
Jim Snabe (Chair): A global business leader and current Chairman of Siemens AG. With a long career at the intersection of technology and innovation, including his time as Co-CEO of SAP AG, Jim brings deep expertise in guiding multinational organizations through digital transformation and growth. His leadership will be pivotal in steering the board’s strategic direction.
Stefan F Heidenreich: A business leader with extensive experience in the consumer goods industry, including as Chairman of the Management Board and CEO of Beiersdorf AG. His knowledge of brand management, market strategy, and organizational leadership will provide valuable commercial insights.
Nigel Hinshelwood: An expert in financial services with significant leadership roles at institutions like HSBC and Lloyds Banking Group. His understanding of Europe’s financial sector and regulatory environment will be crucial for guiding Google Cloud’s work with major banking and financial services clients.
Christophe Cuvillier: A prominent French businessman and former CEO of Unibail-Rodamco-Westfield. With a background in luxury, retail, and real estate, Christophe’s perspective on customer-centricity and business transformation in the consumer sector will be a key asset to the board.
Tim Radford (from Jan 2026): A former British military leader and operational commander with a background in defense and large-scale project delivery. His insights into leveraging technology to achieve strategic business objectives will be vital to the board’s discussions.
“It is a privilege to chair Google Cloud’s EMEA advisory board,” said Jim Snabe. “Europe is at a critical juncture in its digital evolution. This board’s mission is to provide counsel that helps Google Cloud not only accelerate innovation but also ensure it is done in a way that aligns with Europe’s values and priorities, fostering a secure and inclusive digital future.”
The formation of this board underscores Google Cloud’s ongoing commitment to a European-first strategy, collaborating closely with local leaders to build technology solutions that are tailored to the continent’s unique needs and opportunities. The board will meet periodically to advise Google Cloud leadership on a range of strategic issues, from product development and market entry to policy and sustainability initiatives.
Large Language Models (LLMs) are powerful, but their performance can be bottlenecked by the immense NVIDIA GPU memory footprint of the Key-Value (KV) Cache. This cache, crucial for speeding up LLM inference by storing Key (K) and Value (V) matrices, directly impacts context length, concurrency, and overall system throughput. Our primary goal is to maximize the KV Cache hit ratio by intelligently expanding NVIDIA GPU High Bandwidth Memory (HBM) with a tiered node-local storage solution.
Our collaboration with the LMCache team (Kuntai Du, Jiayi Yao, and Yihua Cheng from Tensormesh) has led to the development of an innovative solution on Google Kubernetes Engine (GKE).
Tiered Storage: Expanding the KV Cache Beyond HBM
LMCache extends the KV Cache from the NVIDIA GPU’s fast HBM (Tier 1) to larger, more cost-effective tiers like CPU RAM and local SSDs. This dramatically increases the total cache size, leading to a higher hit ratio and improved inference performance by keeping more data locally on the accelerator node. For GKE users, this means accommodating models with massive context windows while maintaining excellent performance.
Performance Benchmarking and Results
We designed tests to measure the performance of this tiered KV Cache by configuring workloads to fill each storage layer (HBM, CPU RAM, Local SSD). We benchmarked these configurations using various context lengths (1k, 5k, 10k, 50k, and 100k tokens), representing diverse use cases such as:
1k – 5k tokens: High-fidelity personas and complex instructions
10k tokens: Average user prompts (small RAG) or web page/article content
50k tokens: Prompt stuffing
100k tokens: Content equivalent to a long book
Our primary performance indicators were Time to First Token (TTFT), token input throughput, and end-to-end latency. The results highlight the best-performing storage setup for each KV Cache size and the performance improvements achieved.
Experiment Setup
We deployed a vLLM server on an A3 mega machine, leveraging local SSD for ephemeral storage via emptyDir.
Requests: Tests were conducted with system prompt lengths of 1k, 5k, 10k, 50k, and 100k tokens. Each system prompt provided a shared context for a batch of 20 inference requests, with individual requests consisting of a unique 256-token input and generating a 512-token output.
Our tests explored different total KV Cache sizes. The following results highlight the optimal storage setup for each size and the performance improvements achieved:
Test 1: Cache (1.1M – 1.3M tokens) fits entirely within HBM
Results: In this scenario, adding slower storage tiers provided no advantage, making an HBM-only configuration the optimal setup.
Test 2: Cache (4.0M – 4.3M tokens) exceeds HBM capacity but fits within HBM + CPU RAM
| System Prompt Length | Best-performing Storage Setup | Mean TTFT (ms) Change (%) vs. HBM only | Input Throughput Change (%) vs. HBM only | Mean End-to-End Latency Change (%) vs. HBM only |
| --- | --- | --- | --- | --- |
| 1000 | HBM | 0% | 0% | 0% |
| 5000 | HBM + CPU RAM | -18% | +16% | -14% |
| 10000 | HBM + CPU RAM | -44% | +50% | -33% |
| 50000 | HBM + CPU RAM + Local SSD | -68% | +179% | -64% |
| 100000 | HBM + CPU RAM + Local SSD | -79% | +264% | -73% |
Test 3: Large cache (12.6M – 13.7M tokens) saturates HBM and CPU RAM, spilling to Local SSD
| System Prompt Length | Best-performing Storage Setup | Mean TTFT (ms) Change (%) vs. HBM only | Input Throughput Change (%) vs. HBM only | Mean End-to-End Latency Change (%) vs. HBM only |
| --- | --- | --- | --- | --- |
| 1000 | HBM + CPU RAM | +5% | +1% | -1% |
| 5000 | HBM + CPU RAM | -6% | +27% | -21% |
| 10000 | HBM + CPU RAM | +121% | +23% | -19% |
| 50000 | HBM + CPU RAM + Local SSD | +48% | +69% | -41% |
| 100000 | HBM + CPU RAM + Local SSD | -3% | +130% | -57% |
Summary
These results clearly demonstrate that a tiered storage solution significantly improves LLM inference performance by leveraging node-local storage, especially in scenarios with long system prompts that generate large KV Caches.
Optimizing LLM inference is a complex challenge requiring the coordinated effort of multiple infrastructure components (storage, compute, networking). Our work is part of a broader initiative to enhance the entire end-to-end inference stack, from intelligent load balancing at the Inference Gateway to advanced caching logic within the model server.
We are actively exploring further enhancements by integrating additional remote storage solutions with LMCache.
In our latest episode of The Agent Factory, we were thrilled to welcome Logan Kilpatrick from Google DeepMind for a vibe coding session that showcased the tools shaping the future of AI development. Logan, who has had a front-row seat to the generative AI revolution at both OpenAI and now Google, gave us a hands-on tour of the vibe coding experience in Google AI Studio, showing just how fast you can go from an idea to a fully functional AI application.
This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.
The Build Experience in Google AI Studio – What is it?
This episode focused on the Build feature in Google AI Studio and Logan used the term vibe coding to describe the experience of using it. This feature is designed to radically accelerate how developers create AI-powered apps. The core idea is to move from a natural language prompt of an idea for an app to a live, running application in under a minute. It handles the scaffolding, code generation, and even error correction, allowing you to focus on iterating and refining your idea.
The Factory Floor
The Factory Floor is our segment for getting hands-on. Here, we moved from high-level concepts to practical code with live demos.
To kick things off, Logan hit the “I’m Feeling Lucky” button to generate a random app idea: a virtual food photographer for restaurant owners. The goal was to build an app that could:
Accept a simple text-based menu.
Generate realistic, high-end photography for each dish.
Allow for style toggles like “rustic and dark” or “bright and modern.”
In about 90 seconds, we had a running web app. Logan fed it a quirky menu of pizza, blueberries, and popcorn, and the app generated images of each. We also saw how you can use AI-suggested features to iteratively adjust the prepared photos (like adding butter to the popcorn) and add functionality (like changing the entire design aesthetic of the site).
Next, Logan showcased one of the most exciting new features: grounding with Google Maps. This allows the Gemini models to connect directly to Google Maps to pull in rich, real-time place data without setting up a separate API. He demonstrated a starter template app that acted as a local guide, finding Italian restaurants in Chicago and describing the neighborhood.
For developers looking for inspiration, Logan walked us through the AI Studio Gallery. This is a collection of pre-built, interactive examples that show what the models are capable of. Two highlights were:
Prompt DJ: An app that uses the Lyria model to generate novel, real-time music based on a prompt.
Vibe Check: A fun tool for visually testing and comparing how different models respond to the same prompt, which is becoming a popular way for developers to quickly evaluate a model’s suitability for their use case.
For the final demo, Logan used a speech-to-text input to describe an app idea which he called “Yap to App”. His pitch: an AI pair programmer that could generate HTML code and then vocally coach him on how to improve it. After turning his spoken request into a written prompt, AI Studio built a voice-interactive app. The AI assistant generated a simple HTML card and then, when asked, provided verbal suggestions for improvement.
In this segment, we covered some of the biggest recent launches in the agent ecosystem:
Veo 3.1: Google’s new state-of-the-art video generation model that builds on Veo 3, adding richer native audio and the ability to define the first and last frames of a video to generate seamless transitions. Smitha showcased a quick applet, built entirely in AI Studio, where users can upload a selfie of themselves and generate a video of their future career in AI using Veo 3.1.
Anthropic’s Skills: A new feature that allows you to give Claude specific tools (like an Excel script) that it can decide to use on its own to complete a task. We compared this to Gemini Gems, noting the difference in approach between creating a persona (Gem) and providing a tool (Skill).
When asked which launch developers have been most excited about, Logan admitted he was surprised by the overwhelmingly positive reception for grounding with Google Maps. He noted that the Maps API is one of the most widely used developer APIs in the world, and making it incredibly simple to integrate with Gemini unlocked key use cases for countless developers and startups.
Looking ahead, Logan shared his excitement for the continued progress on code generation, which he sees as a fundamental accelerant for all other AI capabilities. He also pointed out a trend: models are evolving from simple tools into complex systems.
Historically, a model was something that took a token in and produced a token out. Now, models are starting to look more like agents out of the box. They can take actions: spinning up code sandboxes, pinging APIs, and navigating browsers. “Folks have thought about agents and models as these decoupled concepts,” Logan said, “and it feels like they’re coming closer and closer together as the model capabilities keep improving.”
Conclusion
This conversation was a powerful reminder of how quickly the barrier to entry for building sophisticated AI applications is falling. With tools like Google AI Studio, the ability to turn a creative spark into a working prototype is no longer a matter of weeks or days, but minutes. The focus is shifting from complex scaffolding to rapid, creative iteration.
Your turn to build
We hope this episode inspired you to get hands-on. Head over to Google AI Studio to try out vibe coding for yourself, and don’t forget to watch the full episode for all the details.
The world of Generative AI is evolving rapidly, and AI agents are at the forefront of this change. An AI agent is a software system designed to act on your behalf: it exhibits reasoning, planning, and memory, and has a degree of autonomy to make decisions, learn, and adapt.
At its core, an AI agent uses a large language model (LLM), like Gemini, as its “brain” to understand and reason. This allows it to process information from various sources, create a plan, and execute a series of tasks to reach a predefined objective. This is the key difference between a simple prompt-and-response and an agent: the ability to act on a multi-step plan.
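To make that difference tangible, here is a purely illustrative plan-act-observe loop. The llm() and run_tool() functions are hypothetical stubs standing in for a real model call (such as Gemini) and real tools; only the loop structure is the point.

```python
# Illustrative sketch of the agent loop: plan, act via tools, observe, repeat.
# llm() and run_tool() are hypothetical stubs, not a real API.

def llm(prompt: str) -> str:
    """Stub for a model call; a real agent would call an LLM API here."""
    return f"[model output for: {prompt[:50]}...]"

def run_tool(action: str) -> str:
    """Stub for tool execution (search, API call, code run, ...)."""
    return f"[result of: {action[:50]}...]"

def run_agent(goal: str, max_steps: int = 3) -> str:
    memory: list[str] = []                                         # observations so far
    plan = llm(f"Break this goal into {max_steps} steps: {goal}")  # planning
    for step in range(max_steps):
        action = llm(f"Plan: {plan}\nMemory: {memory}\nChoose action {step + 1}.")
        observation = run_tool(action)                             # act on the world
        memory.append(observation)                                 # remember the result
    return llm(f"Goal: {goal}\nMemory: {memory}\nSummarize the outcome.")

print(run_agent("Plan a three-day trip to Kyoto"))
```

A single prompt-and-response call stops after the first llm() invocation; the agent keeps looping, using what it observed to decide its next action.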
The great news is that you can now easily build your own AI agents, even without deep expertise, thanks to the Agent Development Kit (ADK). ADK is an open-source Python and Java framework by Google designed to simplify agent creation.
To guide you, this post introduces three hands-on labs that cover the core patterns of agent development:
Building your first autonomous agent
Empowering that agent with tools to interact with external services
Orchestrating a multi-agent system where specialized agents collaborate
Build your first agent
This lab introduces the foundational principles of ADK by guiding you through the construction of a personal assistant agent.
You will write the code for the agent itself and will interact directly with the agent’s core reasoning engine, powered by Gemini, to see how it responds to a simple request. This lab is focused on building the fundamental scaffolding of every agent you’ll create.
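As a preview of that scaffolding, here is a minimal sketch of a first ADK agent following the pattern from the ADK quickstart; the agent name, model, and instruction text are illustrative choices, not values from the lab.

```python
# Minimal ADK agent sketch (names and instruction are illustrative).
from google.adk.agents import Agent

root_agent = Agent(
    name="personal_assistant",
    model="gemini-2.0-flash",   # any Gemini model supported by ADK
    description="A simple personal assistant.",
    instruction="Answer the user's questions clearly and concisely.",
)
```

Placed in an agent package's agent.py, an agent like this can be exercised interactively with the ADK dev UI (adk web) or from the terminal (adk run).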
Empower your agent with tools
An agent without custom tools can only rely on its built-in knowledge. To make it more powerful for your specific use case, you can give it access to specialized tools. In this lab, you will learn several ways to add tools, including:
Build a Custom Tool: Write a currency exchange tool from scratch (a rough sketch of this pattern follows the list).
Leverage a Third-Party Tool: Import and use a Wikipedia tool from the LangChain library.
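Here is a hedged sketch of the custom-tool pattern, assuming ADK's convention of passing plain Python functions (with type hints and docstrings) in the agent's tools list; the hard-coded rates are an illustrative stand-in for a real exchange-rate lookup, not the lab's implementation.

```python
# Custom-tool sketch: a plain Python function registered as an ADK tool.
from google.adk.agents import Agent

def get_exchange_rate(base_currency: str, target_currency: str) -> dict:
    """Returns the exchange rate from base_currency to target_currency."""
    rates = {("USD", "EUR"): 0.92, ("EUR", "USD"): 1.09}  # illustrative values
    rate = rates.get((base_currency.upper(), target_currency.upper()))
    if rate is None:
        return {"status": "error", "message": "Unknown currency pair."}
    return {"status": "success", "rate": rate}

currency_agent = Agent(
    name="currency_agent",
    model="gemini-2.0-flash",
    instruction="Use get_exchange_rate to answer currency conversion questions.",
    tools=[get_exchange_rate],  # the model decides when to call this function
)
```

The docstring and type hints matter: they are what the model sees when deciding whether, and how, to call the tool.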
Build a Team of Specialized Agents
When a task is too complex for a single agent, you can build out a multi-agent team. This lab goes deep into the power of multi-agent systems by having you build a “movie pitch development team” that can research, write, and analyze a film concept.
You will learn how to use ADK’s Workflow Agents to control the flow of work automatically, without needing user input at every step. You’ll also learn how to use the session state to pass information between the agents.
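As a rough illustration of that pattern, the sketch below wires a "movie pitch team" together as a sequential workflow, assuming ADK's SequentialAgent and the output_key/session-state convention; the agent names, model, and instructions are illustrative rather than the lab's exact code.

```python
# Sequential multi-agent sketch: each agent writes to session state via
# output_key, and later agents read those values through {placeholders}
# in their instructions.
from google.adk.agents import LlmAgent, SequentialAgent

researcher = LlmAgent(
    name="researcher",
    model="gemini-2.0-flash",
    instruction="Research the given film concept and list key facts and comparable films.",
    output_key="research_notes",   # saved to session state for later agents
)

writer = LlmAgent(
    name="writer",
    model="gemini-2.0-flash",
    instruction="Write a one-paragraph pitch based on: {research_notes}",
    output_key="pitch",
)

critic = LlmAgent(
    name="critic",
    model="gemini-2.0-flash",
    instruction="Critique this pitch and suggest improvements: {pitch}",
)

# The workflow agent runs its sub-agents in order, with no user input between steps.
movie_pitch_team = SequentialAgent(
    name="movie_pitch_team",
    sub_agents=[researcher, writer, critic],
)
```

The workflow agent handles the orchestration deterministically, so the LLM agents can stay focused on their individual specialties.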
Summary: Build Your First AI Teammate Today
Ready to build your first AI agents? Dive into the codelabs introduced in this post and start building your first AI teammate today.
The Amazon Web Services (AWS) Advanced .NET Data Provider Driver is now generally available for Amazon RDS and Amazon Aurora PostgreSQL and MySQL-compatible databases. This advanced database driver reduces RDS Blue/Green switchover and database failover times, improving application availability. Additionally, it supports multiple authentication mechanisms for your database, including Federated Authentication, AWS Secrets Manager authentication, and token-based authentication with AWS Identity and Access Management (IAM).
The driver builds on top of the Npgsql (PostgreSQL), native MySql.Data, and MySqlConnector drivers to further enhance functionality beyond standard database connectivity. The driver is natively integrated with Aurora and RDS databases, enabling it to monitor database cluster status and quickly connect to newly promoted writers during unexpected failures that trigger database failovers. Furthermore, the driver seamlessly works with popular frameworks like NHibernate and supports Entity Framework (EF) with MySQL databases.
The driver is available as an open-source project under the Apache 2.0 license. Refer to the instructions on the GitHub repository to get started.
Amazon Cognito user pools now support AWS PrivateLink for secure and private connectivity. With AWS PrivateLink, you can establish a private connection between your virtual private cloud (VPC) and Amazon Cognito user pools to configure, manage, and authenticate against your user pools without using the public internet. By enabling private network connectivity, this enhancement eliminates the need to use public IP addresses or to rely solely on firewall rules to access Cognito. This feature supports user pool management operations (e.g., list user pools, describe user pools), administrative operations (e.g., admin-created users), and user authentication flows (sign-in for local users stored in Cognito). OAuth 2.0 authorization code flow (Cognito managed login, hosted UI, sign-in via social identity providers), client credentials flow (Cognito machine-to-machine authorization), and federated sign-in via the SAML and OIDC standards are not supported through VPC endpoints at this time.
You can use PrivateLink connections in all AWS Regions where Amazon Cognito user pools are available, except the AWS GovCloud (US) Regions. Creating VPC endpoints on AWS PrivateLink incurs additional charges; refer to the AWS PrivateLink pricing page for details. You can get started by creating an AWS PrivateLink interface endpoint for Amazon Cognito user pools using the AWS Management Console, AWS Command Line Interface (CLI), AWS Software Development Kits (SDKs), AWS Cloud Development Kit (CDK), or AWS CloudFormation. To learn more, refer to the documentation on creating an interface VPC endpoint and Amazon Cognito’s developer guide.
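To illustrate the SDK path, here is a minimal boto3 (AWS SDK for Python) sketch that creates an interface VPC endpoint. The VPC, subnet, and security group IDs are placeholders, and the service name pattern is an assumption to confirm against the Cognito documentation for your Region.

```python
# Hedged sketch: create an interface VPC endpoint for Amazon Cognito user pools.
# IDs are placeholders; the ServiceName pattern is assumed (verify in the docs).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",                    # your VPC
    ServiceName="com.amazonaws.us-east-1.cognito-idp",  # assumed service name
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,                           # resolve the public endpoint name privately
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```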
AWS Key Management Service (KMS) announces support for the Edwards-curve Digital Signature Algorithm (EdDSA). With this new capability, you can create an elliptic curve asymmetric KMS key or data key pairs to sign and verify EdDSA signatures using the Edwards25519 curve (Ed25519). Ed25519 provides a 128-bit security level equivalent to NIST P-256, faster signing performance, and small signature (64-byte) and public key (32-byte) sizes.
Ed25519 is ideal for situations that require small key and signature sizes, such as Internet of Things (IoT) devices and blockchain applications like cryptocurrency.
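As a rough illustration of the sign/verify flow with boto3, here is a sketch. The create_key, sign, and verify calls and their parameter names are standard KMS APIs, but the Ed25519-specific KeySpec and SigningAlgorithm strings below are assumptions; confirm the exact values in the AWS KMS Developer Guide.

```python
# Hedged sketch: create an Ed25519 signing key in KMS, then sign and verify.
# The "ECC_ED25519" and "ED25519" enum strings are assumptions to verify
# against the KMS Developer Guide.
import boto3

kms = boto3.client("kms")

# Create an asymmetric signing key on the Ed25519 curve (KeySpec value assumed).
key = kms.create_key(KeySpec="ECC_ED25519", KeyUsage="SIGN_VERIFY")
key_id = key["KeyMetadata"]["KeyId"]

message = b"hello, eddsa"

# Sign the raw message (SigningAlgorithm value assumed).
signature = kms.sign(
    KeyId=key_id,
    Message=message,
    MessageType="RAW",
    SigningAlgorithm="ED25519",
)["Signature"]

# Verify the 64-byte signature with the same KMS key.
verified = kms.verify(
    KeyId=key_id,
    Message=message,
    MessageType="RAW",
    Signature=signature,
    SigningAlgorithm="ED25519",
)["SignatureValid"]
print(verified)
```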
This new capability is available in all AWS Regions, including the AWS GovCloud (US) Regions and the China Regions. To learn more, see the Asymmetric key specs section in the AWS KMS Developer Guide.