Amazon DynamoDB now supports a new warm throughput value and the ability to easily pre-warm DynamoDB tables and indexes. The warm throughput value provides visibility into the number of read and write operations your DynamoDB tables can readily handle, while pre-warming lets you proactively increase the value to meet future traffic demands.
DynamoDB automatically scales to support workloads of virtually any size. However, when you have peak events like product launches or shopping events, request rates can surge 10x or even 100x in a short period of time. You can now check your tables’ warm throughput value to assess if your table can handle large traffic spikes for peak events. If you expect an upcoming peak event to exceed the current warm throughput value for a given table, you can pre-warm that table in advance of the peak event to ensure it scales instantly to meet demand.
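For example, a minimal sketch using the AWS CLI (the table name is a placeholder, and the exact WarmThroughput field names should be verified against your CLI version):

```bash
# Inspect the current warm throughput value for a table (field names follow the
# DescribeTable/UpdateTable WarmThroughput shape; verify against your CLI version).
aws dynamodb describe-table \
  --table-name OrdersTable \
  --query "Table.WarmThroughput"

# Pre-warm the table ahead of a peak event by raising the warm throughput value.
aws dynamodb update-table \
  --table-name OrdersTable \
  --warm-throughput ReadUnitsPerSecond=150000,WriteUnitsPerSecond=50000
```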
Warm throughput values are available for all provisioned and on-demand tables and indexes at no cost. Pre-warming your table’s throughput incurs a charge. See Amazon DynamoDB Pricing page for pricing details. This capability is now available in all AWS commercial Regions. See the Developer Guide to learn more.
Amazon SageMaker Model Registry now supports tracking machine learning (ML) model lineage, enabling you to automatically capture and retain information about the steps of an ML workflow, from data preparation and training to model registration and deployment.
Customers use Amazon SageMaker Model Registry as a purpose-built metadata store to manage the entire lifecycle of ML models. With this launch, data scientists and ML engineers can now easily capture and view the model lineage details such as datasets, training jobs, and deployment endpoints in Model Registry. When they register a model, Model Registry begins tracking the lineage of the model from development to deployment. This creates an audit trail that enables traceability and reproducibility, providing visibility across the model lifecycle to improve model governance.
Managing applications across multiple Kubernetes clusters is complex, especially when those clusters span different environments or even cloud providers. One powerful and secure solution combines Google Kubernetes Engine (GKE) fleets and Argo CD, a declarative, GitOps continuous delivery tool for Kubernetes. The solution is further enhanced with Connect Gateway and Workload Identity.
This blog post guides you in setting up a robust, team-centric multi-cluster infrastructure with these offerings. We use a sample GKE fleet with application clusters for your workloads and a control cluster to host Argo CD. To streamline authentication and enhance security, we leverage Connect Gateway and Workload Identity, enabling Argo CD to securely manage clusters without the burden of maintaining cumbersome Kubernetes service accounts.
On top of this, we incorporate GKE Enterprise Teams to manage access and resources, helping to ensure that each team has the right permissions and namespaces within this secure framework.
Finally, we introduce the fleet-argocd-plugin, a custom Argo CD generator designed to simplify cluster management within this sophisticated setup. This plugin automatically imports your GKE Fleet cluster list into Argo CD and maintains synchronized cluster information, making it easier for platform admins to manage resources and for application teams to focus on deployments.
Follow along as we:
Create a GKE fleet with application and control clusters
Deploy Argo CD on the control cluster, configured to use Connect Gateway and Workload Identity
Configure GKE Enterprise Teams for granular access control
Install and leverage the fleet-argocd-plugin to manage your secure, multi-cluster fleet with team awareness
By the end, you’ll have a powerful and automated multi-cluster system using GKE Fleets, Argo CD, Connect Gateway, Workload Identity, and Teams, ready to support your organization’s diverse needs and security requirements. Let’s dive in!
Set up multi-cluster infrastructure with GKE fleet and Argo CD
Setting up a sample GKE fleet is a straightforward process:
1. Enable the required APIs in the desired Google Cloud Project. We use this project as the fleet host project.
a. The gcloud CLI must be installed, and you must be authenticated via gcloud auth login; a sketch of this setup follows below.
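A rough sketch of enabling the APIs and creating the application clusters (the project ID is a placeholder and the API list is indicative; adjust for the features you enable):

```bash
export FLEET_PROJECT_ID=my-fleet-host-project   # placeholder fleet host project ID
gcloud config set project $FLEET_PROJECT_ID

# Enable the APIs used by GKE, fleets, and Connect Gateway (indicative list).
gcloud services enable \
  container.googleapis.com \
  gkehub.googleapis.com \
  connectgateway.googleapis.com \
  cloudresourcemanager.googleapis.com

# Create two application clusters and register them to the fleet at creation time.
gcloud container clusters create app-cluster-1 --enable-fleet --region=us-central1
gcloud container clusters create app-cluster-2 --enable-fleet --region=us-central1
```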
```bash
# Create a frontend team.
gcloud container fleet scopes create frontend

# Add your application clusters to the frontend team.
gcloud container fleet memberships bindings create app-cluster-1-b \
  --membership app-cluster-1 \
  --scope frontend \
  --location us-central1

gcloud container fleet memberships bindings create app-cluster-2-b \
  --membership app-cluster-2 \
  --scope frontend \
  --location us-central1

# Create a fleet namespace for webserver.
gcloud container fleet scopes namespaces create webserver --scope=frontend

# [Optional] Verify your fleet team setup.
# Check member clusters in your fleet.
gcloud container fleet memberships list
# Verify member clusters have been added to the right team (`scope`).
gcloud container fleet memberships bindings list --membership=app-cluster-1 --location=us-central1
gcloud container fleet memberships bindings list --membership=app-cluster-2 --location=us-central1
```
4. Now, set up Argo CD and deploy it to the control cluster. Create a new GKE cluster to serve as your control cluster and enable Workload Identity on it, as sketched below.
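A hedged sketch of this step (the cluster name and region are placeholders; the Argo CD install follows the project's standard manifest-based installation):

```bash
# Create the control cluster with Workload Identity enabled (names are placeholders).
gcloud container clusters create argocd-control-cluster \
  --region=us-central1 \
  --enable-fleet \
  --workload-pool=$FLEET_PROJECT_ID.svc.id.goog

# Install Argo CD on the control cluster using the upstream manifests.
gcloud container clusters get-credentials argocd-control-cluster --region=us-central1
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
```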
5. Install the Argo CD CLI to interact with the Argo CD API server. Version 2.8.0 or higher is required. Detailed installation instructions can be found via the CLI installation documentation.
Now you’ve got your GKE fleet up and running, and you’ve installed Argo CD on the control cluster. In Argo CD, application clusters are registered with the control cluster by storing their credentials (like API server address and authentication details) as Kubernetes Secrets within the Argo CD namespace. We’ve got a way to make this whole process a lot easier!
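For context, manually registering an application cluster means maintaining a Secret like the following for every cluster (a simplified sketch with placeholder endpoint and credentials, following Argo CD's declarative cluster-secret format):

```bash
# Hypothetical example of the per-cluster Secret Argo CD expects when clusters
# are registered by hand; the fleet-argocd-plugin generates and syncs these for you.
kubectl apply -n argocd -f - <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: app-cluster-1
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: app-cluster-1
  server: https://1.2.3.4   # placeholder API server address
  config: |
    {
      "bearerToken": "<redacted-service-account-token>",
      "tlsClientConfig": {
        "caData": "<base64-encoded-ca-certificate>"
      }
    }
EOF
```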
8. To make sure the fleet-argocd-plugin works as it should, give it the right permissions for fleet management.
a. Create an IAM service account in your Argo CD control cluster and grant it the appropriate permissions. The setup follows the official onboarding guide of GKE Workload Identity Federation.
```bash
gcloud iam service-accounts create argocd-fleet-admin \
  --project=$FLEET_PROJECT_ID

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/container.developer"

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/gkehub.gatewayEditor"

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/gkehub.viewer"

# Allow the Argo CD application controller and fleet-argocd-plugin to impersonate this IAM service account.
gcloud iam service-accounts add-iam-policy-binding argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$FLEET_PROJECT_ID.svc.id.goog[argocd/argocd-application-controller]"
gcloud iam service-accounts add-iam-policy-binding argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$FLEET_PROJECT_ID.svc.id.goog[argocd/argocd-fleet-sync]"

# Annotate the Kubernetes ServiceAccount so that GKE sees the link between the service accounts.
kubectl annotate serviceaccount argocd-application-controller \
  --namespace argocd \
  iam.gke.io/gcp-service-account=argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com
```
b. You also need to allow the Google Compute Engine default service account to access images from your Artifact Registry repository, as sketched below.
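A minimal sketch, assuming your node pools run as the Compute Engine default service account and the images live in Artifact Registry in the fleet host project:

```bash
# Grant the Compute Engine default service account read access to Artifact Registry.
# FLEET_PROJECT_NUMBER is the fleet host project's number; adjust if you use a custom node service account.
gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:$FLEET_PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
  --role "roles/artifactregistry.reader"
```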
Let’s do a quick check to make sure the GKE fleet and Argo CD are playing nicely together. You should see that the secrets for your application clusters have been automatically generated.
```bash
kubectl get secret -n argocd

# Example Output:                          TYPE     DATA   AGE
# app-cluster-1.us-central1.141594892609   Opaque   3      64m
# app-cluster-2.us-central1.141594892609   Opaque   3      64m
```
Demo 1: Automatic fleet management in Argo CD
Okay, let’s see how this works! We’ll use the guestbook example app. First, we deploy it to the clusters that the frontend team uses. You should then see the guestbook app running on your application clusters, and you won’t have to deal with any cluster secrets manually!
```bash
export TEAM_ID=frontend
envsubst '$FLEET_PROJECT_NUMBER $TEAM_ID' < applicationset-demo.yaml | kubectl apply -f - -n argocd

kubectl config set-context --current --namespace=argocd
argocd app list -o name
# Example Output:
# argocd/app-cluster-1.us-central1.141594892609-webserver
# argocd/app-cluster-2.us-central1.141594892609-webserver
```
Demo 2: Evolving your fleet is easy with fleet-argocd-plugin
Suppose you decide to add another cluster to the frontend team. Create a new GKE cluster and assign it to the frontend team. Then, check to see if your guestbook app has been deployed on the new cluster.
```bash
gcloud container clusters create app-cluster-3 --enable-fleet --region=us-central1
gcloud container fleet memberships bindings create app-cluster-3-b \
  --membership app-cluster-3 \
  --scope frontend \
  --location us-central1

argocd app list -o name
# Example Output: a new app shows up!
# argocd/app-cluster-1.us-central1.141594892609-webserver
# argocd/app-cluster-2.us-central1.141594892609-webserver
# argocd/app-cluster-3.us-central1.141594892609-webserver
```
Closing thoughts
In this blog post, we’ve shown you how to combine the power of GKE fleets, Argo CD, Connect Gateway, Workload Identity, and GKE Enterprise Teams to create a robust and automated multi-cluster platform. By leveraging these tools, you can streamline your Kubernetes operations, enhance security, and empower your teams to efficiently manage and deploy applications across your fleet.
However, this is just the beginning! There’s much more to explore in the world of multi-cluster Kubernetes. Here are some next steps to further enhance your setup:
Deep dive into GKE Enterprise Teams: Explore the advanced features of GKE Enterprise Teams to fine-tune access control, resource allocation, and namespace management for your teams. Learn more in the official documentation.
Secure your clusters with Connect Gateway: Delve deeper into Connect Gateway and Workload Identity to understand how they simplify and secure authentication to your clusters, eliminating the need for VPNs or complex network configurations. Check out this blog post for a detailed guide.
Master advanced deployment strategies: Explore advanced deployment strategies with Argo CD, such as blue/green deployments, canary releases, and automated rollouts, to achieve zero-downtime deployments and minimize risk during updates. This blog post provides a great starting point.
As you continue your journey with multi-cluster Kubernetes, remember that GKE fleets and Argo CD provide a solid foundation for building a scalable, secure, and efficient platform. Embrace the power of automation, GitOps principles, and team-based management to unlock the full potential of your Kubernetes infrastructure.
As AI models increase in sophistication, there’s increasingly large model data needed to serve them. Loading the models and weights along with necessary frameworks to serve them for inference can add seconds or even minutes of scaling delay, impacting both costs and the end-user’s experience.
For example, inference servers such as Triton, Text Generation Inference (TGI), or vLLM are packaged as containers that are often over 10GB in size; this can make them slow to download, and extend pod startup times in Kubernetes. Then, once the inference pod starts, it needs to load model weights, which can be hundreds of GBs in size, further adding to the data loading problem.
This blog explores techniques to accelerate data loading for both inference serving containers and downloading models + weights, so you can accelerate the overall time to load your AI/ML inference workload on Google Kubernetes Engine (GKE).
1. Accelerating container load times using secondary boot disks to cache container images with your inference engine and applicable libraries directly on the GKE node.
The image above shows a secondary boot disk (1) that stores the container image ahead of time, avoiding the image download process during pod/container startup. And for AI/ML inference workloads with demanding speed and scale requirements, Cloud Storage Fuse (2) and Hyperdisk ML (3) are options for connecting the pod to model and weight data stored in Cloud Storage or on a network-attached disk. Let's look at each of these approaches in more detail below.
Accelerating container load times with secondary boot disks
GKE lets you pre-cache your container image into a secondary boot disk that is attached to your node at creation time. The benefit of loading your containers this way is that you skip the image download step and can begin launching your containers immediately, which drastically improves startup time. The diagram below shows container image download times grow linearly with container image size. Those times are then compared with using a cached version of the container image that is pre-loaded on the node.
Caching a 16GB container image ahead of time on a secondary boot disk has shown reductions in load time of up to 29x when compared with downloading the container image from a container registry. Additionally, this approach lets you benefit from the acceleration independent of container size, allowing for large container images to be loaded predictably fast!
To use secondary boot disks, first create a disk containing all your images, create a disk image from it, and then specify that disk image as a secondary boot disk when creating your GKE node pools, as sketched below. For more, see the documentation.
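A hedged sketch of the node-pool step (cluster, pool, and disk image names are placeholders; verify the secondary boot disk flag syntax against your gcloud version):

```bash
# Create a node pool whose nodes attach a secondary boot disk pre-loaded with the
# inference container image, used as a container image cache.
gcloud container node-pools create inference-pool \
  --cluster=my-inference-cluster \
  --region=us-central1 \
  --secondary-boot-disk=disk-image=projects/$PROJECT_ID/global/images/inference-image-cache,mode=CONTAINER_IMAGE_CACHE
```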
Accelerating model weights load times
Many ML frameworks output their checkpoints (snapshots of model weights) to object storage such as Google Cloud Storage, a common choice for long-term storage. Using Cloud Storage as the source of truth, there are two main products to retrieve your data at the GKE-pod level: Cloud Storage Fuse and Hyperdisk ML (HdML).
When selecting one product or the other there are two main considerations:
Performance – how quickly can the data be loaded by the GKE node
Operational simplicity – how easy is it to update this data
Cloud Storage Fuse provides a direct link to Cloud Storage for model weights that reside in object storage buckets. Additionally there is a caching mechanism for files that need to be read multiple times to prevent additional downloads from the source bucket (which adds latency). Cloud Storage Fuse is appealing because there are no pre-hydration operational activities for a pod to do to download new files in a given bucket. It’s important to note that if you switch buckets that the pod is connected to, you will need to restart the pod with an updated Cloud Storage Fuse configuration. To further improve performance, you can enable parallel downloads, which spawns multiple workers to download a model, significantly improving model pull performance.
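A hedged sketch of mounting a model bucket with the Cloud Storage FUSE CSI driver and parallel downloads enabled (the bucket, image, and service account names are placeholders, and the exact mount-option keys are assumptions to verify against the GKE documentation):

```bash
# Hypothetical pod spec mounting model weights from Cloud Storage via the
# Cloud Storage FUSE CSI driver, with the file cache and parallel downloads enabled.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: inference-server
  annotations:
    gke-gcsfuse/volume: "true"   # injects the gcsfuse sidecar
spec:
  serviceAccountName: inference-ksa   # assumed KSA linked to a GCP SA via Workload Identity
  containers:
  - name: server
    image: us-docker.pkg.dev/my-project/inference/vllm:latest   # placeholder image
    volumeMounts:
    - name: model-weights
      mountPath: /models
      readOnly: true
  volumes:
  - name: model-weights
    csi:
      driver: gcsfuse.csi.storage.gke.io
      readOnly: true
      volumeAttributes:
        bucketName: my-model-weights-bucket   # placeholder bucket
        mountOptions: "implicit-dirs,file-cache:enable-parallel-downloads:true"
EOF
```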
Hyperdisk ML gives you better performance and scalability than downloading files directly to the pod from Cloud Storage or another online location. Additionally, you can attach up to 2,500 nodes to a single Hyperdisk ML instance, with aggregate bandwidth of up to 1.2 TiB/sec. This makes it a strong choice for inference workloads that span many nodes and where the same data is downloaded repeatedly in a read-only fashion. To use Hyperdisk ML, load your data on the Hyperdisk ML disk prior to using it, and again upon each update. Note that this adds operational overhead if your data changes frequently.
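A minimal sketch of the pre-hydration step, assuming the hyperdisk-ml disk type and placeholder names, sizes, and zones; the GKE StorageClass and read-only-many PersistentVolume wiring is omitted here:

```bash
# Create a Hyperdisk ML volume sized for the model weights (name, size, and zone are placeholders).
gcloud compute disks create model-weights-disk \
  --type=hyperdisk-ml \
  --size=1TB \
  --zone=us-central1-a

# Attach the disk to a temporary loader VM or job, copy the weights onto it once, then detach.
# GKE pods then consume the disk read-only (many readers) through a PersistentVolume and PVC,
# and the data must be re-loaded this way whenever it changes.
```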
Which model-and-weight loading product you use depends on your use case. Beyond raw throughput, the comparison also covers availability and how data updates are handled. For Hyperdisk ML, for example:
Availability: Zonal. Data can be made regional with an automated GKE clone feature to make data available across zones.
Updating data: Create a new persistent volume, load the new data, and redeploy pods whose PVC references the new volume.
As you can see there are other considerations besides throughput to take into account when architecting a performant model loading strategy.
Conclusion
Loading large AI models, weights, and container images onto GKE can delay workload startup times. By combining the methods described above (a secondary boot disk for container images, and Hyperdisk ML or Cloud Storage Fuse for models and weights), you can accelerate data load times for your AI/ML inference applications.
AWS Control Tower customers can now use the ResetEnabledControl API to programmatically resolve control drift or re-deploy a control to its intended configuration. Control drift occurs when an AWS Control Tower managed control is modified outside of AWS Control Tower governance. Resolving drift helps you adhere to your governance and compliance requirements. You can use this API with all AWS Control Tower optional controls except service control policy (SCP)-based preventive controls. AWS Control Tower APIs enhance the end-to-end developer experience by enabling automation for integrated workflows and managing workloads at scale.
Below is the list of AWS Control Tower control APIs that are now supported in the regions where AWS Control Tower is available. Please visit the AWS Control Tower API reference for more information.
AWS Control Tower control APIs: EnableControl, DisableControl, GetControlOperation, GetEnabledControl, ListEnabledControls, UpdateEnabledControl, TagResource, UntagResource, ListTagsForResource, and ResetEnabledControl.
To learn more, visit the AWS Control Tower homepage. For more information about the AWS Regions where AWS Control Tower is available, see the AWS Region table.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M2 Mac instances are generally available (GA) in the AWS Canada (Central) Region. This marks the first time we are introducing Mac instances to an AWS Canadian Region, providing customers with even greater global access to Apple silicon hardware. Customers can now run their macOS workloads in the AWS Canada (Central) Region to satisfy data residency requirements, benefit from improved latency to end users, and integrate with their pre-existing AWS environment configurations within this Region.
M2 Mac instances deliver up to 10% faster performance than M1 Mac instances when building and testing applications for Apple platforms such as iOS, macOS, iPadOS, tvOS, watchOS, visionOS, and Safari. M2 Mac instances are powered by the AWS Nitro System and are built on Apple M2 Mac mini computers featuring an 8-core CPU, a 10-core GPU, 24 GiB of memory, and a 16-core Apple Neural Engine.
With this expansion, EC2 M2 Mac instances are available across the US East (N. Virginia, Ohio), US West (Oregon), Europe (Frankfurt), Asia Pacific (Sydney), and Canada (Central) Regions. To learn more or get started, see Amazon EC2 Mac Instances or visit the EC2 Mac documentation reference.
AWS Directory Service for Microsoft Active Directory, also known as AWS Managed Microsoft AD, and AD Connector are now available in the AWS Asia Pacific (Malaysia) Region.
Built on actual Microsoft Active Directory (AD), AWS Managed Microsoft AD enables you to migrate AD-aware applications while reducing the work of managing AD infrastructure in the AWS Cloud. You can use your Microsoft AD credentials to connect to AWS applications such as Amazon Relational Database Service (RDS) for SQL Server, Amazon RDS for PostgreSQL, and Amazon RDS for Oracle. You can keep your identities in your existing Microsoft AD or create and manage identities in your AWS managed directory.
AD Connector is a proxy that enables AWS applications to use your existing on-premises AD identities without requiring AD infrastructure in the AWS Cloud. You can also use AD Connector to join Amazon EC2 instances to your on-premises AD domain and manage these instances using your existing group policies.
You can now use Amazon Timestream for InfluxDB in the Amazon Web Services China (Beijing) Region, operated by Sinnet, and the Amazon Web Services China (Ningxia) Region, operated by NWCD. Timestream for InfluxDB makes it easy for application developers and DevOps teams to run fully managed InfluxDB databases on Amazon Web Services for real-time time-series applications using open-source APIs.
Timestream for InfluxDB offers the full feature set available in the InfluxDB 2.7 release of the open-source version, and adds deployment options with Multi-AZ high availability and enhanced durability. For high availability, Timestream for InfluxDB allows you to automatically create a primary database instance and synchronously replicate the data to an instance in a different Availability Zone. When it detects a failure, Timestream for InfluxDB automatically fails over to a standby instance without manual intervention.
With the latest release, customers can use Amazon Timestream for InfluxDB in the following regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Canada (Central), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Jakarta), Europe (Paris), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Milan), Europe (Stockholm), Europe (Spain), Middle East (UAE), Amazon Web Services China (Beijing) Region, operated by Sinnet, and Amazon Web Services China (Ningxia) Region, operated by NWCD. To get started with Amazon Timestream, visit our product page.
As generative AI evolves, we’re beginning to see the transformative potential it is having across industries and our lives. And as large language models (LLMs) increase in size — current models are reaching hundreds of billions of parameters, and the most advanced ones are approaching 2 trillion — the need for computational power will only intensify. In fact, training these large models on modern accelerators already requires clusters that exceed 10,000 nodes.
With support for 15,000-node clusters — the world’s largest — Google Kubernetes Engine (GKE) has the capacity to handle these demanding training workloads. Today, in anticipation of even larger models, we are introducing support for 65,000-node clusters.
With support for up to 65,000 nodes, we believe GKE offers more than 10X larger scale than the other two largest public cloud providers.
Unmatched scale for training or inference
Scaling to 65,000 nodes provides much-needed capacity to the world’s most resource-hungry AI workloads. Combined with innovations in accelerator computing power, this will enable customers to reduce model training time or scale models to multi-trillion parameters or more. Each node is equipped with multiple accelerators (e.g., Cloud TPU v5e node with four chips), giving the ability to manage over 250,000 accelerators in one cluster.
To develop cutting-edge AI models, customers need to be able to allocate computing resources across diverse workloads. This includes not only model training but also serving, inference, conducting ad hoc research, and managing auxiliary tasks. Centralizing computing power within the smallest number of clusters provides customers the flexibility to quickly adapt to changes in demand from inference serving, research and training workloads.
With support for 65,000 nodes, GKE now allows running five jobs in a single cluster, each matching the scale of Google Cloud’s previous world record for the world’s largest training job for LLMs.
Customers on the cutting edge of AI welcome these developments. Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems, and is excited for GKE’s expanded scale.
“GKE’s new support for larger clusters provides the scale we need to accelerate our pace of AI innovation.” – James Bradbury, Head of Compute, Anthropic
Innovations under the hood
This achievement is made possible by a variety of enhancements. For one, we are transitioning GKE from etcd, the open-source distributed key-value store, to a new, more robust key-value store based on Spanner, Google's distributed database that delivers virtually unlimited scale. On top of the ability to support larger GKE clusters, this change will usher in new levels of reliability for GKE users, providing improved latency of cluster operations (e.g., cluster startup and upgrades) and a stateless cluster control plane. By implementing the etcd API for our Spanner-based storage, we help ensure backward compatibility and avoid having to make changes in core Kubernetes to adopt the new technology.
In addition, thanks to a major overhaul of the GKE infrastructure that manages the Kubernetes control plane, GKE now scales significantly faster, meeting the demands of your deployments with fewer delays. This enhanced cluster control plane delivers multiple benefits, including the ability to run high-volume operations with exceptional consistency. The control plane now automatically adjusts to these operations, while maintaining predictable operational latencies. This is particularly important for large and dynamic applications such as SaaS, disaster recovery and fallback, batch deployments, and testing environments, especially during periods of high churn.
We’re also constantly innovating on IaaS and GKE capabilities to make Google Cloud the best place to build your AI workloads. Recent innovations in this space include:
Secondary boot disk, which provides faster workload startups through container image caching
Custom compute classes, which offer greater control over compute resource allocation and scaling
Support for Trillium, our sixth-generation TPU, the most performant and most energy-efficient TPU to date
Support for A3 Ultra VM powered by NVIDIA H200 Tensor Core GPUs with our new Titanium ML network adapter, which delivers non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE). A3 Ultra VMs will be available in preview next month.
A continued commitment to open source
Guided by Google’s long-standing and robust open-source culture, we make substantial contributions to the open-source community, including when it comes to scaling Kubernetes. With support for 65,000-node clusters, we made sure that all necessary optimizations and improvements for such scale are part of the core open-source Kubernetes.
Our investments to make Kubernetes the best foundation for AI platforms go beyond scalability. Here is a sampling of our contributions to the Kubernetes project over the past two years:
Incubated the K8S Batch Working Group to build a community around research, HPC and AI workloads, producing tools like Kueue.sh, which is becoming the de facto standard for job queueing on Kubernetes
Created the JobSet operator that is being integrated into the Kubeflow ecosystem to help run heterogeneous jobs (e.g., driver-executor)
For multihost inference use cases, created the Leader Worker Set controller
Published a highly optimized internal model server of JetStream
Incubated the Kubernetes Serving Working Group, which is driving multiple efforts including model metrics standardization, Serving Catalog and Inference Gateway
At Google Cloud, we’re dedicated to providing the best platform for running containerized workloads, consistently pushing the boundaries of innovation. These new advancements allow us to support the next generation of AI technologies. For more, listen to the Kubernetes podcast, where Maciek Rozacki and Wojtek Tyczynski join host Kaslin Fields to talk about GKE’s support for 65,000 nodes. You can also see a demo on 65,000 nodes on a single GKE cluster here.
Rapidly evolving generative AI models place unprecedented demands on the performance and efficiency of hardware accelerators. Last month, we launched our sixth-generation Tensor Processing Unit (TPU), Trillium, to address the demands of next-generation models. Trillium is purpose-built for performance at scale, from the chip to the system to our Google data center deployments, to power ultra-large scale training.
Today, we present our first MLPerf training benchmark results for Trillium. The MLPerf 4.1 training benchmarks show that Trillium delivers up to 1.8x better performance-per-dollar compared to prior-generation Cloud TPU v5p and an impressive 99% scaling efficiency (throughput).
In this blog, we offer a concise analysis of Trillium’s performance, demonstrating why it stands out as the most performant and cost-efficient TPU training system to date. We begin with a quick overview of system comparison metrics, starting with traditional scaling efficiency. We introduce convergence scaling efficiency as a crucial metric to consider in addition to scaling efficiency. We assess these two metrics along with performance per dollar and present a comparative view of Trillium against Cloud TPU v5p. We conclude with guidance that you can use to make an informed choice for your cloud accelerators.
Traditional performance metrics
Accelerator systems can be evaluated and compared across multiple dimensions, ranging from peak throughput to effective throughput to throughput scaling efficiency. Each of these metrics is a helpful indicator, but none takes convergence time into consideration.
Hardware specifications and peak performance
Traditionally, comparisons have focused on hardware specifications like peak throughput, memory bandwidth, and network connectivity. While these peak values establish theoretical boundaries, they are poor predictors of real-world performance, which depends heavily on architectural design and software implementation. Since modern ML workloads typically span hundreds or thousands of accelerators, the key metric is the effective throughput of an appropriately sized system for specific workloads.
Utilization performance
System performance can be quantified through utilization metrics like effective model FLOPS utilization (EMFU) and memory bandwidth utilization (MBU), which measure achieved throughput versus peak capacity. However, these hardware efficiency metrics don’t directly translate to business-value measures like training time or model quality.
Scaling efficiency and trade-offs
A system’s scalability is evaluated through both strong scaling (performance improvement with system size for fixed workloads) and weak scaling (efficiency when increasing both workload and system size proportionally). While both metrics are valuable indicators, the ultimate goal is to achieve high-quality models quickly, sometimes making it worthwhile to trade scaling efficiency for faster training time or better model convergence.
The need for convergence scaling efficiency
While hardware utilization and scaling metrics provide important system insights, convergence scaling efficiency focuses on the fundamental goal of training: reaching model convergence efficiently. Convergence refers to the point where a model’s output stops improving and the error rate becomes constant. Convergence scaling efficiency measures how effectively additional computing resources accelerate the training process to completion.
We define convergence scaling efficiency using two key measurements: the base case, where a cluster of N₀ accelerators achieves convergence in time T₀, and a scaled case with N₁ accelerators taking time T₁ to converge. The ratio of the speedup in convergence time to the increase in cluster size gives us:
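In formula form, using the definitions above:

$$\text{convergence scaling efficiency} = \frac{T_0 / T_1}{N_1 / N_0}$$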
A convergence scaling efficiency of 1 indicates that time-to-solution improves by the same ratio as the cluster size. It is therefore desirable to have convergence scaling efficiency as close to 1 as possible.
Now let's apply these concepts to understand our MLPerf submission for the GPT3-175b training task using Trillium and Cloud TPU v5p.
Trillium performance
We submitted GPT3-175b training results for four different Trillium configurations, and three different Cloud TPU v5p configurations. In the following analysis, we group the results by cluster sizes with the same total peak flops for comparison purposes. For example, the Cloud TPU v5p-4096 configuration is compared to 4xTrillium-256, and Cloud TPU v5p-8192 is compared with 8xTrillium-256, and so on.
All results presented in this analysis are based on MaxText, our high-performance reference implementation for Cloud TPUs and GPUs.
Weak scaling efficiency
For increasing cluster sizes with proportionately larger batch-sizes, both Trillium and TPU v5p deliver near linear scaling efficiency:
Figure-1: Weak scaling comparison for Trillium and Cloud TPU v5p. v5p-4096 and 4xTrillium-256 are considered as base for scaling factor measurement. n x Trillium-256 corresponds to n Trillium pods with 256 chips in one ICI domain. v5p-n corresponds to n/2 v5p chips in a single ICI domain.
Figure 1 demonstrates relative throughput scaling as cluster sizes increase from the base configuration. Trillium achieves 99% scaling efficiency even when operating across data-center networks using Cloud TPU multislice technology, outperforming the 94% scaling efficiency of Cloud TPU v5p cluster within a single ICI domain. For these comparisons, we used a base configuration of 1024 chips (4x Trillium-256 pods), establishing a consistent baseline with the smallest v5p submission (v5p-4096; 2048 chips). When measured against our smallest submitted configuration of 2x Trillium-256 pods, Trillium maintains a strong 97.6% scaling efficiency.
Convergence scaling efficiency
As stated above, weak scaling is useful but not a sufficient indicator of value, while convergence scaling efficiency brings time-to-solution into consideration.
Figure-2: Convergence scaling comparison for Trillium and Cloud TPU v5p.
For the largest cluster size, we observed comparable convergence scaling efficiency for Trillium and Cloud TPU v5p. In this example, a CSE of 0.8 means that for the rightmost configuration, the cluster size was 3x the (base) configuration, while the time to convergence improved by 2.4x with respect to the base configuration (2.4/3 = 0.8).
While the convergence scaling efficiency is comparable between Trillium and TPU v5p, where Trillium really shines is by delivering the convergence at a lower cost, which brings us to the last metric.
Cost-to-train
While weak scaling efficiency and convergence scaling efficiency indicate scaling properties of systems, we’ve yet to look at the most crucial metric: the cost to train.
Figure-3: Comparison of cost-to-train based on the wall-clock time and the on-demand list price for Cloud TPU v5p and Trillium.
Trillium lowers the cost to train by up to 1.8x (45% lower) compared to TPU v5p while delivering convergence to the same validation accuracy.
Making informed cloud accelerator choices
In this article, we explored the complexities of comparing accelerator systems, emphasizing the importance of looking beyond simple metrics to assess true performance and efficiency. We saw that while peak performance metrics provide a starting point, they often fall short in predicting real-world utility. Instead, metrics like Effective Model Flops Utilization (EMFU) and Memory Bandwidth Utilization (MBU) offer more meaningful insights into an accelerator’s capabilities.
We also highlighted the critical importance of scaling characteristics — both strong and weak scaling — in evaluating how systems perform as workloads and resources grow. However, the most objective measure we identified is the convergence scaling efficiency, which ensures that we’re comparing systems based on their ability to achieve the same end result, rather than just raw speed.
Applying these metrics to our benchmark submission with GPT3-175b training, we demonstrated that Trillium achieves comparable convergence scaling efficiency to Cloud TPU v5p while delivering up to 1.8x better performance per dollar, thereby lowering the cost-to-train. These results highlight the importance of evaluating accelerator systems through multiple dimensions of performance and efficiency.
For ML-accelerator evaluation, we recommend a comprehensive analysis combining resource utilization metrics (EMFU, MBU), scaling characteristics, and convergence scaling efficiency. This multi-faceted approach enables you to make data-driven decisions based on your specific workload requirements and scale.
Every November, we start sharing forward-looking insights on threats and other cybersecurity topics to help organizations and defenders prepare for the year ahead. The Cybersecurity Forecast 2025 report, available today, plays a big role in helping us accomplish this mission.
This year’s report draws on insights directly from Google Cloud’s security leaders, as well as dozens of analysts, researchers, responders, reverse engineers, and other experts on the frontlines of the latest and largest attacks.
Built on trends we are already seeing today, the Cybersecurity Forecast 2025 report provides a realistic outlook of what organizations can expect to face in the coming year. The report covers a lot of topics across all of cybersecurity, with a focus on various threats such as:
Attacker Use of Artificial Intelligence (AI): Threat actors will increasingly use AI for sophisticated phishing, vishing, and social engineering attacks. They will also leverage deepfakes for identity theft, fraud, and bypassing security measures.
AI for Information Operations (IO): IO actors will use AI to scale content creation, produce more persuasive content, and enhance inauthentic personas.
The Big Four: Russia, China, Iran, and North Korea will remain active, engaging in espionage operations, cyber crime, and information operations aligned with their geopolitical interests.
Ransomware and Multifaceted Extortion: Ransomware and multifaceted extortion will continue to be the most disruptive form of cyber crime, impacting various sectors and countries.
Infostealer Malware: Infostealer malware will continue to be a major threat, enabling data breaches and account compromises.
Democratization of Cyber Capabilities: Increased access to tools and services will lower barriers to entry for less-skilled actors.
Compromised Identities: Compromised identities in hybrid environments will pose significant risks.
Web3 and Crypto Heists: Web3 and cryptocurrency organizations will increasingly be targeted by attackers seeking to steal digital assets.
Faster Exploitation and More Vendors Targeted: The time to exploit vulnerabilities will continue to decrease, and the range of targeted vendors will expand.
Be Prepared for 2025
Read the Cybersecurity Forecast 2025 report for a more in-depth look at these and other threats, as well as other security topics such as post-quantum cryptography, and insights unique to the JAPAC and EMEA regions.
For an even deeper look at the threat landscape next year, register for our Cybersecurity Forecast 2025 webinar, which will be hosted once again by threat expert Andrew Kopcienski.
For even more insights, hear directly from our security leaders: Charles Carmakal, Sandra Joyce, Sunil Potti, and Phil Venables.
Amazon DynamoDB now enables customers to easily find frequently used tables in the DynamoDB console in the AWS GovCloud (US) Regions. Customers can favorite their tables in the console’s tables page for quicker table access.
Customers can click the favorites icon to view their favorited tables in the console’s tables page. With this update, customers have a faster and more efficient way to find and work with tables that they often monitor, manage, and explore.
Customers can start using favorite tables at no additional cost. Get started with creating a DynamoDB table from the AWS Management Console.
Today, AWS announced support for a new Apache Flink connector for Amazon DynamoDB. The new connector, contributed by AWS for the Apache Flink open source project, adds Amazon DynamoDB Streams as a new source for Apache Flink. You can now process DynamoDB streams events with Apache Flink, a popular framework and engine for processing and analyzing streaming data.
Amazon DynamoDB is a serverless, NoSQL database service that enables you to develop modern applications at any scale. DynamoDB Streams provides a time-ordered sequence of item-level changes (insert, update, and delete) in a DynamoDB table. With Amazon Managed Service for Apache Flink, you can transform and analyze DynamoDB streams data in real time using Apache Flink and integrate applications with other AWS services such as Amazon S3, Amazon OpenSearch, Amazon Managed Streaming for Apache Kafka, and more. Apache Flink connectors are software components that move data into and out of an Amazon Managed Service for Apache Flink application. You can use the new connector to read data from a DynamoDB stream starting with Apache Flink version 1.19. With Amazon Managed Service for Apache Flink there are no servers and clusters to manage, and there is no compute and storage infrastructure to set up.
The Apache Flink repo for AWS connectors can be found here. For detailed documentation and setup instructions, visit our Documentation Page.
Today, we are excited to announce that Amazon SageMaker Model Registry now supports custom machine learning (ML) model lifecycle stages. This capability further improves model governance by enabling data scientists and ML engineers to define and control the progression of their models across various stages, from development to production.
Customers use Amazon SageMaker Model Registry as a purpose-built metadata store to manage the entire lifecycle of ML models. With this launch, data scientists and ML engineers can now define custom stages such as development, testing, and production for ML models in the model registry. This makes it easy to track and manage models as they transition across different stages in the model lifecycle from training to inference. They can also track stage approval status such as Pending Approval, Approved, and Rejected to check when the model is ready to move to the next stage. These custom stages and approval status help data scientists and ML engineers define and enforce model approval workflows, ensuring that models meet specific criteria before advancing to the next stage. By implementing these custom stages and approval processes, customers can standardize their model governance practices across their organization, maintain better oversight of model progression, and ensure that only approved models reach production environments.
This capability is available in all AWS regions where Amazon SageMaker Model Registry is currently available except GovCloud regions. To learn more, see Staging Construct for your Model Lifecycle.
Starting today, customers can use Amazon Managed Service for Apache Flink in Asia Pacific (Kuala Lumpur) Region to build real-time stream processing applications.
Amazon Managed Service for Apache Flink makes it easier to transform and analyze streaming data in real time with Apache Flink. Apache Flink is an open source framework and engine for processing data streams. Amazon Managed Service for Apache Flink reduces the complexity of building and managing Apache Flink applications and integrates with Amazon Managed Streaming for Apache Kafka (Amazon MSK), Amazon Kinesis Data Streams, Amazon OpenSearch Service, Amazon DynamoDB streams, Amazon Simple Storage Service (Amazon S3), custom integrations, and more using built-in connectors.
For a list of the AWS Regions where Amazon Managed Service for Apache Flink is available, please see the AWS Region Table.
You can learn more about Amazon Managed Service for Apache Flink here.
Today, Amazon Web Services announces that Amazon Elastic Compute Cloud (Amazon EC2) Capacity Blocks for ML is available for P5 instances in two new regions: US West (Oregon) and Asia Pacific (Tokyo). You can use EC2 Capacity Blocks to reserve highly sought-after GPU instances in Amazon EC2 UltraClusters for a future date for the amount of time that you need to run your machine learning (ML) workloads.
EC2 Capacity Blocks enable you to reserve GPU capacity up to eight weeks in advance for durations up to 28 days in cluster sizes of one to 64 instances (512 GPUs), giving you the flexibility to run a broad range of ML workloads. They are ideal for short duration pre-training and fine-tuning workloads, rapid prototyping, and for handling surges in inference demand. EC2 Capacity Blocks deliver low-latency, high-throughput connectivity through colocation in Amazon EC2 UltraClusters.
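As a hedged sketch (instance counts, dates, and offering IDs are placeholders; check the exact CLI parameters against the EC2 documentation), finding and purchasing a Capacity Block looks roughly like this:

```bash
# Search for available Capacity Block offerings for 16 p5.48xlarge instances for 7 days (168 hours).
aws ec2 describe-capacity-block-offerings \
  --instance-type p5.48xlarge \
  --instance-count 16 \
  --capacity-duration-hours 168 \
  --start-date-range 2024-12-01T00:00:00Z \
  --end-date-range 2024-12-15T00:00:00Z \
  --region us-west-2

# Purchase one of the returned offerings by its ID (placeholder offering ID).
aws ec2 purchase-capacity-block \
  --capacity-block-offering-id cbr-0123456789abcdef0 \
  --instance-platform Linux/UNIX \
  --region us-west-2
```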
With this expansion, EC2 Capacity Blocks for ML are available for the following instance types and AWS Regions: P5 instances in US East (N. Virginia), US East (Ohio), US West (Oregon), and Asia Pacific (Tokyo); P5e instances in US East (Ohio); P4d instances in US East (Ohio) and US West (Oregon); Trn1 instances in Asia Pacific (Melbourne).
AWS CodeBuild now supports building Windows Docker images in reserved capacity fleets. AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages ready for deployment.
Additionally, you can bring in your own Amazon Machine Images (AMIs) in reserved capacity for Linux and Windows platforms. This enables you to customize your build environment including building and testing with different kernel modules, for more flexibility.
The feature is now available in US East (N. Virginia), US East (Ohio), US West (Oregon), South America (Sao Paulo), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), Asia Pacific (Mumbai), Europe (Ireland), and Europe (Frankfurt) where reserved capacity fleets are supported.
Today, AWS announces the availability of a new financing program supported by PNC Vendor Finance, enabling select customers in the United States (US) to finance AWS Marketplace software purchases directly from the AWS Billing and Cost Management console. For the first time, select US customers can apply for, utilize, and manage financing within the console for AWS Marketplace software purchases.
AWS Marketplace helps customers find, try, buy, and launch third-party software, while consolidating billing and management with AWS. With thousands of software products available in AWS Marketplace, this financing program enables you to buy the software you need to drive innovation. With financing amounts ranging from $10,000 – $100,000,000, subject to credit approval, you have more options to pay for your AWS Marketplace purchases. If approved, you can utilize financing for AWS Marketplace software purchases that have at least 12-month contracts. Financing can be applied to multiple purchases from multiple AWS Marketplace sellers. This financing program gives you the flexibility to better manage your cash flow by spreading payments over time, while only paying financing cost on what you use.
This new financing program supported by PNC Vendor Finance is available in the AWS Billing and Cost Management console for select AWS Marketplace customers in the US, excluding NV, NC, ND, TN, & VT.
To learn more about financing options for AWS Marketplace purchases and details about the financing program supported by PNC Vendor Finance, visit the AWS Marketplace financing page.
Today, Amazon announced the availability of detailed performance statistics for Amazon Elastic Block Store (EBS) volumes. This new capability provides you with real-time visibility into the performance of your EBS volumes, making it easier to monitor the health of your storage resources and take actions sooner.
With detailed performance statistics, you can access 11 metrics at up to a per-second granularity to monitor input/output (I/O) statistics of your EBS volumes, including driven I/O and I/O latency histograms. The granular visibility provided by these metrics helps you quickly identify and proactively troubleshoot application performance bottlenecks that may be caused by factors such as reaching an EBS volume’s provisioned IOPS or throughput limits, enabling you to enhance application performance and resiliency.
Detailed performance statistics for EBS volumes are available by default for all EBS volumes attached to a Nitro-based EC2 instance in all AWS Commercial, China, and the AWS GovCloud (US) Regions, at no additional charge.
To get started with EBS detailed performance statistics, please visit the documentation here to learn more about the available metrics and how to access them using NVMe-CLI tools.
Amazon Neptune Serverless is now available in the Europe (Paris), South America (Sao Paulo), Asia Pacific (Jakarta), Asia Pacific (Mumbai), Asia Pacific (Hong Kong), and Asia Pacific (Seoul) AWS Regions.
Amazon Neptune is a fast, reliable, and fully managed graph database service for building and running applications with highly connected datasets, such as knowledge graphs, fraud graphs, identity graphs, and security graphs. If you have unpredictable and variable workloads, Neptune Serverless automatically determines and provisions the compute and memory resources to run the graph database. Database capacity scales up and down based on the application’s changing requirements to maintain consistent performance, saving up to 90% in database costs compared to provisioning at peak capacity.
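A minimal sketch of creating a Neptune Serverless cluster in one of the newly supported Regions (identifiers and capacity bounds are placeholders; required networking parameters are omitted here, so check the Neptune documentation before running):

```bash
# Create a Neptune cluster with serverless scaling between 1 and 32 NCUs (Seoul Region shown).
aws neptune create-db-cluster \
  --db-cluster-identifier my-serverless-graph \
  --engine neptune \
  --serverless-v2-scaling-configuration MinCapacity=1,MaxCapacity=32 \
  --region ap-northeast-2

# Add a serverless instance to the cluster.
aws neptune create-db-instance \
  --db-instance-identifier my-serverless-graph-instance-1 \
  --db-cluster-identifier my-serverless-graph \
  --db-instance-class db.serverless \
  --engine neptune \
  --region ap-northeast-2
```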
With today’s launch, Neptune Serverless is available in 19 AWS Regions. For pricing and region availability, please visit the Neptune pricing page.