APIs are an integral part of modern services, and the data they exchange is often highly sensitive. Without proper authentication, authorization, and protection against data leakage, your organization and your end users will face an increased risk of cyberattacks.
The Open Worldwide Application Security Project (OWASP) develops and publishes community-led documentation and standards for critical areas of software security, including APIs. APIs are estimated to comprise over half of internet traffic today.
That number is likely to climb as AI adoption grows, because AI already relies heavily on APIs for building foundation models, streamlining integration of AI capabilities into applications, facilitating interoperability between models running on different platforms, and providing continuous access to the real-time data needed to train and improve AI models.
Given the already large and growing reliance on APIs, organizations should implement an API security strategy. OWASP’s guidance on the top 10 API security threats provides a starting point. We have taken their list and added mitigation recommendations for each risk they’ve identified. Our new whitepaper, Mitigating OWASP Top 10 API Security Threats, provides more details on each threat and how Apigee, Google Cloud’s API management platform, can help manage API risk.
What you can do about the OWASP top 10 API security risks
For organizations who are just getting started with their API security program, OWASP’s list of top 10 API security risks provides a good starting point. It represents the most critical vulnerabilities that organizations should address to protect their API systems. These threats are broadly categorized into themes of authorization, authentication, resource management, security misconfiguration, and third-party risks.
Authorization flaws, including Broken Object Level Authorization (BOLA), Broken Object Property Level Authorization (BOPLA), and Broken Function Level Authorization (BFLA), are particularly concerning, as they allow attackers to bypass access controls and manipulate data or functionality.
BOLA occurs when an API fails to enforce proper access controls on individual data objects, enabling attackers to access or modify data without proper authorization. BOPLA, on the other hand, arises when access control measures are not effectively enforced on individual properties within a data object, allowing attackers to manipulate sensitive attributes. BFLA occurs when specific functions or operations within the API lack adequate access control mechanisms, enabling attackers to perform unauthorized actions.
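To make the BOLA pattern concrete, here is a minimal sketch (not Apigee-specific; the handler and data-store names are hypothetical) contrasting a vulnerable object lookup with one that enforces object-level authorization:

```python
# Hypothetical in-memory store: order IDs mapped to their owning user.
ORDERS = {
    "order-1": {"owner": "alice", "total": 42},
    "order-2": {"owner": "bob", "total": 7},
}

def get_order_vulnerable(order_id, requester):
    # BOLA: the API authenticates the requester but never checks
    # whether the requested object actually belongs to them, so any
    # authenticated user can read any order by guessing IDs.
    return ORDERS[order_id]

def get_order_fixed(order_id, requester):
    # Object-level authorization: verify ownership before returning data.
    order = ORDERS.get(order_id)
    if order is None or order["owner"] != requester:
        raise PermissionError("not authorized for this object")
    return order
```

The same ownership check, applied per property or per function instead of per object, is the corresponding mitigation for BOPLA and BFLA.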
Authentication weaknesses, such as broken authentication, can lead to impersonation and unauthorized access. Unrestricted resource consumption and unrestricted access to sensitive business flows can also disrupt operations and expose critical data that can be exploited by attackers.
Security misconfiguration and improper inventory management of APIs can create additional vulnerabilities that attackers can exploit. Finally, unsafe consumption of third-party APIs introduces external risks, as vulnerabilities in those APIs can compromise the security of the consuming API.
Addressing these threats requires a multi-layered approach, including robust access controls, secure authentication mechanisms, proper resource management, thorough security configurations, and careful integration of third-party APIs.
Mitigating security risks with Apigee and Advanced API Security
Apigee, Google Cloud’s API management platform, enables API platform teams to program and deploy secure API proxies that can protect your backend services from these kinds of attacks. The chart below highlights some specific capabilities in Apigee and Advanced API Security that can help you keep your APIs protected from OWASP’s Top 10 API Security risks.
[Table: the OWASP Top 10 API Security Risks (2023), mapped to the Apigee and Advanced API Security mitigation capabilities that address each one. For example, security misconfiguration is addressed by API proxy security configuration checks and alerting, which check and alert on security misconfigurations across proxies; you can also use our API to integrate proxy security score checks into your CI/CD pipeline.]
Teams who want to take a layered approach to API and application security can use Apigee and Advanced API Security together with a Web Application Firewall (WAF) like Cloud Armor. Cloud Armor’s robust protection against DDoS attacks — including L3/L4 DDoS defense and DDoS thresholds — can help increase protection against unrestricted resource consumption and other security threats.
Get started on API security with Apigee
To learn more about how Apigee can help mitigate the OWASP top 10 API security threats, read our free whitepaper. It explores each threat outlined above in more detail, including specific product capabilities that can help protect against each threat.
Amazon Route 53 Traffic Flow now offers an enhanced user interface for improved DNS traffic policy editing. Route 53 Traffic Flow is a network traffic management feature which simplifies the process of creating and maintaining DNS records in large and complex configurations by providing users with an interactive DNS policy management flow chart in their web browsers. With this release, you can more easily understand and change the way traffic is routed between users and endpoints using the new features of the visual editor.
Now, Traffic Flow introduces a clearer way to craft DNS routing policies for many endpoints and multiple routing methods by moving configurations into a new sidebar, providing an undo/redo button, and introducing a new text editor for changing JavaScript Object Notation (JSON) files right within your browser. The JSON editor also includes syntax highlighting and can be used in conjunction with a new ‘Dark Mode’ theme to show where the policy edits should be made.
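Because traffic policy documents are plain JSON, the same edits you make in the new editor can also be scripted. The sketch below round-trips a minimal failover policy; the field names follow the published traffic policy document format, but treat the exact schema as an assumption and confirm it against the Route 53 documentation:

```python
import json

# A minimal (assumed-shape) traffic policy document: a failover rule
# between a primary and a secondary endpoint.
policy_text = """
{
  "AWSPolicyFormatVersion": "2015-10-01",
  "RecordType": "A",
  "StartRule": "site_switch",
  "Endpoints": {
    "primary": {"Type": "value", "Value": "192.0.2.1"},
    "secondary": {"Type": "value", "Value": "192.0.2.2"}
  },
  "Rules": {
    "site_switch": {
      "RuleType": "failover",
      "Primary": {"EndpointReference": "primary"},
      "Secondary": {"EndpointReference": "secondary"}
    }
  }
}
"""

policy = json.loads(policy_text)
# Point the secondary endpoint at a new IP, then serialize it back
# in the pretty-printed form you would paste into the JSON editor.
policy["Endpoints"]["secondary"]["Value"] = "192.0.2.3"
updated = json.dumps(policy, indent=2)
```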
The new Traffic Flow experience is available globally, except in AWS GovCloud and Amazon Web Services in China. Traffic Flow pricing information can be found here and these enhancements are offered at no additional cost. To learn more about how to use Traffic Flow, visit our documentation or see this blog post.
Amazon S3 reduces pricing for S3 object tagging by 35% in all AWS Regions to $0.0065 per 10,000 tags per month. Object tags are key-value pairs applied to S3 objects that can be created, updated, or deleted at any time during the lifetime of the object.
S3 object tags help you logically group data for a variety of reasons such as to apply IAM policies to provide fine-grained access, to specify tag-based filters to manage object lifecycle rules, and to selectively replicate data to another AWS Region. Additionally, in AWS Regions where S3 Metadata is available, you can easily capture and query custom metadata that is stored as object tags.
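As a quick illustration of those use cases, the request payload shapes for tagging an object and for a tag-filtered lifecycle rule look roughly like this (a sketch of the documented S3 API shapes; verify the field names against the S3 API reference before relying on them):

```python
# Tag set applied to an object, e.g. via a PutObjectTagging request.
tagging = {
    "TagSet": [
        {"Key": "project", "Value": "alpha"},
        {"Key": "classification", "Value": "internal"},
    ]
}

# Lifecycle rule that expires only the objects carrying a specific tag,
# leaving the rest of the bucket untouched.
lifecycle_rule = {
    "ID": "expire-alpha-temp-data",
    "Status": "Enabled",
    "Filter": {"Tag": {"Key": "project", "Value": "alpha"}},
    "Expiration": {"Days": 30},
}
```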
S3 object tags are available in all AWS Regions including the AWS China and AWS GovCloud (US) Regions. This new pricing takes effect automatically in the monthly billing cycle starting on March 1, 2025. To learn more about object tags, refer to the documentation. For more pricing details, visit the S3 pricing page.
Customers can now use regional processing profiles for Amazon Nova understanding models (Amazon Nova Lite, Amazon Nova Micro, and Amazon Nova Pro) in the Europe (Milan) and Europe (Spain) Regions.
Amazon Bedrock is a fully managed service that offers a choice of high-performing large language models (LLMs) and other FMs from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, as well as Amazon via a single API. Amazon Bedrock also provides a broad set of capabilities customers need to build generative AI applications with security, privacy, and responsible AI built in. These capabilities help you build tailored applications for multiple use cases across different industries, helping organizations unlock sustained growth from generative AI while ensuring customer trust and data governance.
AWS AppSync Events is a fully managed service that allows developers to create secure and performant WebSocket APIs. Starting today, developers can use their AppSync Events APIs to publish events directly over WebSocket connections, complementing the existing HTTP API publishing capability. This enhancement enables applications to both publish and subscribe to events using a single WebSocket connection, streamlining the implementation of real-time features.
The new WebSocket publishing capability simplifies the development of collaborative applications such as chat systems, multiplayer games, and shared document editing. Developers can now maintain a single connection for bi-directional communication, reducing complexity and improving performance by eliminating the need to manage separate connections for publishing and subscribing to events. This approach helps reduce latency in real-time interactive applications by removing the overhead of establishing new HTTP connections for each event publication.
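Over that single connection, subscribing and publishing are both just JSON frames. The sketch below builds the two frame types; the field names reflect our reading of the Events WebSocket protocol and should be checked against the AppSync Events documentation before use:

```python
import json
import uuid

def subscribe_frame(channel, auth):
    # Subscribe to a channel over the already-open WebSocket connection.
    return {
        "type": "subscribe",
        "id": str(uuid.uuid4()),
        "channel": channel,
        "authorization": auth,
    }

def publish_frame(channel, events, auth):
    # Publish on the same connection; each event travels as a JSON string.
    return {
        "type": "publish",
        "id": str(uuid.uuid4()),
        "channel": channel,
        "events": [json.dumps(e) for e in events],
        "authorization": auth,
    }

# Hypothetical authorization header block for an API-key-style setup.
auth = {"host": "example.appsync-api.us-east-1.amazonaws.com"}
sub = subscribe_frame("/default/chat", auth)
pub = publish_frame("/default/chat", [{"user": "alice", "text": "hi"}], auth)
```

A client would serialize each frame with `json.dumps` and send it over the one WebSocket used for both directions.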
This feature is now available in all AWS Regions where AWS AppSync is supported.
To get started, developers can use their favorite WebSocket client. For more information, view our new blog post and visit the AWS AppSync documentation for detailed implementation examples and best practices.
Many Google Cloud customers have deep investments in third-party ISV security solutions such as appliances to secure their networks and enforce consistent policies across multiple clouds. However, integrating these security solutions into the cloud application environment comes with its own set of challenges:
Network re-architecture: Integrating third-party appliances for traffic inspection often necessitates a network redesign to route application traffic through them. With the high rate of change in a cloud application environment, this process can be error-prone, add operational overhead, and slow down application deployment time.
High cost of operation: The inability to selectively route traffic to third-party appliances for inspection leads to overprovisioning and increased costs. Customers often invest in larger, more expensive appliances to handle all their traffic, regardless of applications’ security inspection needs.
Difficulty meeting compliance requirements: Meeting security and regulatory requirements for an application deployment can be complex and often requires customers to implement custom tooling.
Today, we’re pleased to announce Network Security Integration to address these challenges. Network Security Integration helps you integrate third-party network appliance or service deployments with your Google Cloud workloads while maintaining consistent policies across hybrid and multicloud environments — without changing your routing policies or network architecture. Network Security Integration also enables comprehensive workload traffic visibility, advanced network security, and application/network performance monitoring. It uses Generic Network Virtualization Encapsulation, a.k.a. Geneve tunneling, to securely deliver traffic to third-party inspection destinations without modifying the original packets.
Additionally, Network Security Integration helps accelerate application deployments and compliance with a producer/consumer model: infrastructure operations teams provide collector infrastructure as a service, which application development teams can consume dynamically. Support for hierarchical firewall policy management helps enforce compliance without introducing delays.
Network Security Integration offers two primary modes:
Out-of-band integration (GA): Mirrors desired traffic to a separate destination for offline analysis
In-band integration (Preview): Directs specific traffic to a third-party security stack for inline inspection
Network Security Integration out-of-band
Running out-of-band, Network Security Integration transparently mirrors packets destined to and from the workload to a destination collector group. Geneve helps ensure secure transmission to the destination.
Running Network Security Integration out-of-band lends itself to the following use cases:
Implement advanced network security – Use advanced offline analysis to detect known attacks based on predetermined signature patterns, and also identify previously unknown attacks with anomaly-based detection. Granular filtering capabilities ensure that vulnerable workload traffic is mirrored for advanced inspection.
Improve application availability and performance – Diagnose and analyze what’s going on over the wire instead of relying only on application logs. Network traffic analysis tools leverage machine learning and analytics to inspect mirrored packet data, baselining the normal behavior of the network and then detecting anomalies that might indicate potential availability or performance issues.
Support regulatory and compliance requirements – Finance and other regulated industries are required to capture and retain specific types of network traffic for a predetermined period to meet stringent requirements for auditing and forensic investigations.
Network Security Integration in-band
With in-band integration, traffic ingressing or egressing a workload can be intercepted and redirected to a security stack where the traffic is inspected for threats and compliance with security policy. The bump-in-the-wire implementation of in-band interception lets you inspect traffic between VPCs, or even between different application components within the same VPC. With this, you can shrink your security domain down to a single workload, deploying true zero-trust security in your environment.
Choose to run Network Security Integration in-band for the following scenarios:
Integrate natively with Cloud Next Generation Firewall (NGFW) and third-party firewall – Network Security Integration simplifies the deployment of Google Cloud NGFW and third-party security solutions. It allows you to deploy third-party security services for traffic that requires additional security controls, while using Cloud NGFW’s distributed firewall features for optimized inspection.
Insert your preferred network security solution into brownfield application environments – Network Security Integration in-band is an elegant solution for integrating third-party security appliances directly into your existing network infrastructure, without requiring any modifications to your current routing configuration. By implementing it in-band, you can introduce additional layers of security and protection to your application traffic, helping to ensure comprehensive safeguarding against potential network threats.
What our partners are saying
This is what major partners had to say about Google Cloud’s Network Security Integration.
Palo Alto Networks
“Our partnership with Google Cloud continues with strong momentum, and today marks another milestone. Palo Alto Networks is partnering with Google Cloud to deliver advanced inline security protection for cloud and AI applications, significantly enhancing customer usability with a new deployment option. By integrating Palo Alto Networks AI-Runtime Security and VM-Series Virtual Firewalls with Network Security Integration, customers can rapidly secure their Google Cloud environment and AI applications, applying granular security policies based on a zero trust architecture.” – Jaimin Patel, Senior Director of Product Management, Palo Alto Networks
Fortinet
“Fortinet is partnering with Google Cloud to provide AI-powered threat intelligence for applications and workloads in Google Cloud by natively integrating with FortiGate next-generation firewalls. With the integration of Fortinet and Network Security Integration, customers are able to implement consistent cloud security policies and ensure faster and more reliable security response for their cloud networks.”– Vincent Hwang, Vice President of Cloud Security, Fortinet
Check Point
“We are excited to partner with Google Cloud to offer advanced threat prevention and secure connectivity across their global infrastructure. By securing the hybrid mesh with Network Security Integration and Check Point CloudGuard, our customers can stay free from cyber threats while automating management tasks and accelerating deployments across all Google Cloud regions.”– Kit Chee, Vice President, Global Strategic Partnerships, Check Point Software Technologies
Corelight
“Integrating with Google Cloud’s Network Security Integration empowers our customers to seamlessly adjust to the fluctuating demands of cloud environments. This integration enables our shared customers to expand the Corelight Network Detection and Response (NDR) value in the cloud, allowing comprehensive network visibility and threat detection. By adopting a straightforward, policy-driven strategy, organizations can effectively secure their Google Cloud deployments regardless of their scaling trajectory, optimizing both security and operational efficiency.” – Todd Wingler, VP, Global Alliances and Channels, Corelight
Trellix
“Trellix Virtual Intrusion Prevention System (vIPS) is a next-generation intrusion detection and prevention system (IDPS) that discovers and blocks sophisticated malware threats across the network. It uses advanced detection and emulation techniques, moving beyond traditional pattern matching to defend against stealthy attacks with a high degree of accuracy. Trellix has partnered with Google Cloud to integrate the Trellix vIPS with Network Security Integration. With the new architecture, Trellix and Google Cloud can meet the security challenges of the customers in a much faster and more scalable way and streamline the security adoption for our joint customers.” – Manish Kumar, Senior Software Architect, Trellix
cPacket
“cPacket is thrilled to partner with Google Cloud on their Network Security Integration rollout. When combined with cPacket’s Cloud Suite, customers can leverage best-in-class packet replication capabilities to multiple tools, powerful always-on packet capture and network analytics, and advanced visualization capabilities by utilizing these new in-band and out-of-band solutions delivered by Google Cloud.” – Trey Moczygemba, Sr. Cloud Product Manager, cPacket
Netscout
“NETSCOUT delivers actionable intelligence in Observability and Cybersecurity through real-time deep packet inspection (DPI). With NETSCOUT and Network Security Integration, customers gain powerful insights from end-to-end, packet-level visibility into their Google Cloud workloads and hybrid or multi-cloud connected applications, ensuring both performance and security.” – Tom Bienkowski, Senior Director, Security Product Marketing
An integrated security ecosystem
At Google Cloud, we’re committed to delivering enhanced visibility and top-tier security for customers’ network traffic and their workloads. With Network Security Integration, you can continue to use your third-party security solutions in your cloud environment, with lower costs, tighter integration, increased compliance, and no routing configuration changes. To learn more, visit the documentation for Network Security Integration. For Network Security Integration in-band (preview), contact your Google representative for access. We also encourage you to explore Cloud Next Generation Firewall (NGFW), our cloud-native, fully-distributed stateful inspection firewall engine that secures your network at cloud scale, enforced at each workload.
Amazon QuickSight, a fast, scalable, and fully managed Business Intelligence service that lets you easily create and publish interactive dashboards across your organization, is now available in the Europe (Spain) Region. QuickSight dashboards can be authored in any modern web browser with no clients to install or manage; dashboards can be shared with tens of thousands of users without the need to provision or manage any infrastructure. QuickSight dashboards can also be seamlessly embedded into your applications, portals, and websites to provide rich, interactive analytics for end users.
With this launch, QuickSight expands to 23 regions, including: US East (Ohio and N. Virginia), US West (Oregon), Europe (Spain, Stockholm, Paris, Frankfurt, Ireland, London, Milan and Zurich), Asia Pacific (Mumbai, Seoul, Singapore, Sydney, Beijing, Tokyo and Jakarta), Canada (Central), South America (São Paulo), Africa (Cape Town) and AWS GovCloud (US-East, US-West).
Amazon Relational Database Service (RDS) for MySQL announces Amazon RDS Extended Support minor version 5.7.44-RDS.20250213. We recommend that you upgrade to this version to fix known security vulnerabilities and bugs in prior versions of MySQL. Learn more about upgrading your database instances, including minor and major version upgrades, in the Amazon RDS User Guide.
Amazon RDS Extended Support provides you more time, up to three years, to upgrade to a new major version to help you meet your business requirements. During Extended Support, Amazon RDS will provide critical security and bug fixes for your MySQL databases on Aurora and RDS after the community ends support for a major version. You can run your MySQL databases on Amazon RDS with Extended Support for up to three years beyond a major version’s end of standard support date. Learn more about Extended Support in the Amazon RDS User Guide and the Pricing FAQs.
Amazon RDS for MySQL makes it simple to set up, operate, and scale MySQL deployments in the cloud. See Amazon RDS for MySQL Pricing for pricing details and regional availability. Create or update a fully managed Amazon RDS database in the Amazon RDS Management Console.
Today, AWS is expanding service reference information to include resources and condition keys, providing a more comprehensive view of service permissions. Service reference information streamlines automation of policy management workflows, helping you retrieve available actions across AWS services from machine-readable files. Whether you are a security administrator establishing guardrails for workloads or a developer ensuring appropriate access to applications, you can now more easily identify the available actions, resources, and condition keys for each AWS service.
You can automate the retrieval of service reference information, eliminating manual effort and ensuring your policies align with the latest service updates. You can also incorporate this service reference directly into your existing policy management tools and processes for a seamless integration. This feature is offered at no additional cost. To get started, refer to the documentation on programmatic service reference information.
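Since the reference files are machine-readable JSON, a policy-linting step can consume them directly. The sketch below filters a made-up excerpt, shaped like a per-service reference file, for actions that support a given condition key; the real files' exact field names are an assumption here and should be confirmed in the documentation:

```python
# Made-up excerpt shaped like a per-service reference file.
service_reference = {
    "Name": "s3",
    "Actions": [
        {"Name": "GetObject",
         "ActionConditionKeys": ["s3:ExistingObjectTag/<key>"]},
        {"Name": "ListAllMyBuckets", "ActionConditionKeys": []},
    ],
    "Resources": [{"Name": "object", "ARNFormats": ["arn:aws:s3:::*/*"]}],
    "ConditionKeys": [
        {"Name": "s3:ExistingObjectTag/<key>", "Types": ["String"]},
    ],
}

def actions_supporting(reference, condition_key):
    # Return the names of actions whose condition keys include the key,
    # e.g. to verify a policy's Condition block is actually enforceable.
    return [a["Name"] for a in reference["Actions"]
            if condition_key in a.get("ActionConditionKeys", [])]

supported = actions_supporting(service_reference, "s3:ExistingObjectTag/<key>")
```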
Today, we are announcing the availability of AWS Backup support for Amazon FSx for OpenZFS in 13 additional AWS Regions. AWS Backup is a policy-based, fully managed and cost-effective solution that enables you to centralize and automate data protection of AWS services (spanning compute, storage, and databases) and third-party applications. With this launch, AWS Backup customers can help improve business continuity, disaster recovery, and compliance requirements by protecting Amazon FSx for OpenZFS backups in additional Regions.
AWS Backup support for Amazon FSx for OpenZFS is added in the following Regions: Africa (Cape Town), Asia Pacific (Hyderabad, Jakarta, Osaka), Europe (Milan, Paris, Spain, Zurich), Israel (Tel Aviv), Middle East (Bahrain, UAE), South America (São Paulo), and US West (N. California).
Today, we are announcing the availability of AWS Backup logically air-gapped vault support for Amazon FSx for Lustre, Amazon FSx for Windows File Server, and Amazon FSx for OpenZFS. Logically air-gapped vault is a type of AWS Backup vault that allows secure sharing of backups across accounts and organizations, supporting direct restore to reduce recovery time from a data loss event. A logically air-gapped vault stores immutable backup copies that are locked by default, and isolated with encryption using AWS owned keys.
You can now protect your Amazon FSx file system in logically air-gapped vaults in either the same account or across other accounts and Regions. This helps reduce the risk of downtime, ensure business continuity, and meet compliance and disaster recovery requirements.
You can get started using the AWS Backup console, API, or CLI. Target Amazon FSx backups to a logically air-gapped vault by specifying it as a copy destination in your backup plan. Share the vault for recovery or restore testing with other accounts using AWS Resource Access Manager (RAM). Once shared, you can initiate direct restore jobs from that account, eliminating the overhead of copying backups first.
AWS Backup support for the three Amazon FSx file systems is available in all the Regions where logically air-gapped vault and respective Amazon FSx file systems are supported. For more information, visit the AWS Backup product page, and documentation.
Today, Amazon Web Services (AWS) announces the availability of Amazon GuardDuty Malware Protection for Amazon S3 in AWS GovCloud (US) regions. This expansion of GuardDuty Malware Protection allows you to scan newly uploaded objects to Amazon S3 buckets for potential malware, viruses, and other suspicious uploads and take action to isolate them before they are ingested into downstream processes.
GuardDuty helps customers protect millions of Amazon S3 buckets and AWS accounts. GuardDuty Malware Protection for Amazon S3 is fully managed by AWS, alleviating the operational complexity and overhead that normally comes with managing a data-scanning pipeline, with compute infrastructure operated on your behalf. This feature also gives application owners more control over the security of their organization’s S3 buckets; they can enable GuardDuty Malware Protection for S3 even if core GuardDuty is not enabled in the account. Application owners are automatically notified of the scan results using Amazon EventBridge to build downstream workflows, such as isolation to a quarantine bucket, or define bucket policies using tags that prevent users or applications from accessing certain objects.
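A quarantine workflow of that kind might look like the following sketch: an EventBridge-triggered handler that moves flagged objects to a quarantine bucket. The event field names are an assumed shape for GuardDuty scan-result events, and the S3 client is injected so the logic is testable; check the GuardDuty documentation for the authoritative schema.

```python
def handle_scan_result(event, s3_client, quarantine_bucket):
    """Quarantine objects that GuardDuty flags as containing threats.

    NOTE: the detail-field names below are an assumed event shape,
    not the authoritative GuardDuty schema.
    """
    detail = event["detail"]
    status = detail["scanResultDetails"]["scanResultStatus"]
    bucket = detail["s3ObjectDetails"]["bucketName"]
    key = detail["s3ObjectDetails"]["objectKey"]
    if status == "THREATS_FOUND":
        # Copy the object into the quarantine bucket, then remove it
        # from the application's bucket so downstream jobs never see it.
        s3_client.copy_object(
            Bucket=quarantine_bucket, Key=key,
            CopySource={"Bucket": bucket, "Key": key})
        s3_client.delete_object(Bucket=bucket, Key=key)
        return "quarantined"
    return "clean"
```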
GuardDuty Malware Protection for Amazon S3 is available in all AWS Regions where GuardDuty is available, excluding China Regions.
AWS Amplify Hosting is excited to offer Skew Protection, a powerful feature that guarantees version consistency across your deployments. This feature ensures frontend requests are always routed to the correct server backend version—eliminating version skew and making deployments more reliable.
You can enable this feature at the branch level in the Amplify Console under App Settings → Branch Settings. There is no additional cost associated with this feature and it is available to all customers.
This feature is available in all 20 AWS Amplify Hosting regions: US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Hong Kong), Asia Pacific (Tokyo), Asia Pacific (Osaka), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Canada (Central), Europe (Frankfurt), Europe (Stockholm), Europe (Milan), Europe (Ireland), Europe (London), Europe (Paris), Middle East (Bahrain) and South America (São Paulo).
AWS announces the general availability of one new larger size (48xlarge) on Amazon EC2 I8g instances in the US East (N. Virginia) and US West (Oregon) Regions. The new size expands the I8g portfolio to support up to 192 vCPUs, providing additional compute options to scale up existing workloads or run larger applications that need additional CPU and memory. I8g instances are powered by AWS Graviton4 processors that deliver up to 60% better compute performance compared to previous-generation I4g instances. I8g instances use the latest third-generation AWS Nitro SSDs, local NVMe storage that delivers up to 65% better real-time storage performance per TB, while offering up to 50% lower storage I/O latency and up to 60% lower storage I/O latency variability. These instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software, enhancing the performance and security of your workloads.
I8g instances offer instance sizes up to 48xlarge, 1,536 GiB of memory, and 45 TB of instance storage. They are ideal for real-time applications like relational databases, non-relational databases, streaming databases, search queries, and data analytics.
Amazon S3 Tables now seamlessly integrate with Amazon SageMaker Lakehouse, making it easy to query and join S3 Tables with data in S3 data lakes, Amazon Redshift data warehouses, and third-party data sources. S3 Tables deliver the first cloud object store with built-in Apache Iceberg support. SageMaker Lakehouse is a unified, open, and secure data lakehouse that simplifies your analytics and artificial intelligence (AI). All data in SageMaker Lakehouse can be queried from SageMaker Unified Studio and engines such as Amazon EMR, AWS Glue, Amazon Redshift, Amazon Athena, and Apache Iceberg-compatible engines like Apache Spark or PyIceberg.
SageMaker Lakehouse provides the flexibility to access and query data in-place across S3 Tables, S3 buckets, and Redshift warehouses using the Apache Iceberg open standard. You can secure and centrally manage your data in the lakehouse by defining fine-grained permissions that are consistently applied across all analytics and ML tools and engines. You can access SageMaker Lakehouse from Amazon SageMaker Unified Studio, a single data and AI development environment that brings together functionality and tools from AWS analytics and AI/ML services.
The integrated experience to access S3 Tables with SageMaker Lakehouse is generally available in all AWS Regions where S3 Tables are available. To get started, enable S3 Tables integration with Amazon SageMaker Lakehouse, which allows AWS analytics services to automatically discover and access your S3 Tables data. To learn more about S3 Tables integration, visit the documentation and product page. To learn more about SageMaker Lakehouse, visit the documentation, product page, and read the launch blog.
AWS announces the general availability of Amazon SageMaker Unified Studio, a single data and AI development environment that brings together functionality and tools from AWS analytics and AI/ML services, including Amazon EMR, AWS Glue, Amazon Athena, Amazon Redshift, Amazon Bedrock, and Amazon SageMaker AI. This launch includes simplified permissions management that makes it easier to bring existing AWS resources to the unified studio. SageMaker Unified Studio allows you to find, access, and query data and AI assets across your organization, then collaborate in projects to securely build and share analytics and AI artifacts, including data, models, and generative AI applications. Unified access to your data is provided by Amazon SageMaker Lakehouse and governance capabilities are built in via Amazon SageMaker Catalog.
Amazon Q Developer is now generally available in SageMaker Unified Studio, providing generative AI-powered assistance across the development lifecycle. Amazon Q Developer streamlines development by offering natural language, conversational interfaces that simplify tasks like writing SQL queries, building ETL jobs, troubleshooting, and generating real-time code suggestions. The Free Tier of Amazon Q Developer is available by default in SageMaker Unified Studio; customers with existing Amazon Q Developer Pro Tier subscriptions can access additional features.
Selected capabilities from Amazon Bedrock are also generally available in SageMaker Unified Studio. You can rapidly prototype, customize, and share generative AI applications using high-performing foundation models and advanced features such as Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, Amazon Bedrock Agents, and Amazon Bedrock Flows to create tailored solutions aligned to your requirements and responsible AI guidelines.
See Supported Regions for a list of AWS Regions where SageMaker Unified Studio is generally available. To learn more about SageMaker Unified Studio and how it can accelerate data and AI development, see the Amazon SageMaker Unified Studio webpage or documentation. You can start using SageMaker Unified Studio today by selecting “Amazon SageMaker” in the AWS Console.
Amazon S3 Tables now offer table management APIs that are compatible with the Apache Iceberg REST Catalog standard, enabling any Iceberg-compatible application to easily create, update, list, and delete tables in an S3 table bucket.
These new table management APIs, which map directly to S3 Tables operations, make it easier for you to get started with S3 Tables if you have a custom catalog implementation, need only basic read and write access to tabular data in a single S3 table bucket, or use an APN partner-provided catalog. For unified data management across all of your tabular data, data governance, and fine-grained access controls, you can use S3 Tables with SageMaker Lakehouse.
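Because the new APIs follow the Iceberg REST Catalog standard, an Iceberg client such as PyIceberg can point at a table bucket directly. The sketch below shows the general shape of that configuration; the account ID, bucket name, and region are placeholders, so adjust them for your environment and consult the S3 Tables documentation for the exact endpoint and signing properties.

```python
# Sketch: connecting an Iceberg-compatible client (PyIceberg) to the
# S3 Tables Iceberg REST endpoint. The ARN and region are placeholders.
REGION = "us-east-1"
TABLE_BUCKET_ARN = "arn:aws:s3tables:us-east-1:111122223333:bucket/my-table-bucket"

catalog_props = {
    "type": "rest",
    "uri": f"https://s3tables.{REGION}.amazonaws.com/iceberg",
    "warehouse": TABLE_BUCKET_ARN,
    # Requests to the REST endpoint are signed with SigV4.
    "rest.sigv4-enabled": "true",
    "rest.signing-name": "s3tables",
    "rest.signing-region": REGION,
}

# With PyIceberg installed and AWS credentials configured, the catalog
# can then be loaded and used like any other Iceberg REST catalog:
# from pyiceberg.catalog import load_catalog
# catalog = load_catalog("s3tables", **catalog_props)
# catalog.list_namespaces()
```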
The new table management APIs are available in all AWS Regions where S3 Tables are available, at no additional cost. To learn more about S3 Tables, visit the documentation and product page. To learn more about SageMaker Lakehouse, visit the product page.
At Definity, a leading Canadian P&C insurer with a history spanning over 150 years, we have a long tradition of innovating to help our customers and communities adapt and thrive. To stay ahead in our rapidly evolving industry, we knew a unified data foundation was key to realizing the business and customer experience opportunities offered by modern analytics and AI.
While our legacy on-premises Cloudera platform had served us well, it could no longer support our growing needs for scale, innovation, and harnessing the power of data and AI. So, we embarked on a critical mission: modernizing our data infrastructure.
Legacy limitations stifling innovation
We faced a combination of interconnected challenges, which impact many organizations today:
Limited scalability and AI/ML workload support: Our existing infrastructure, constantly running at 80% utilization, was stretched thin. Processing billions of daily events for real-time analytics and scaling AI and ML workflows was a constant battle, limiting our ability to gain timely insights and develop innovative, data-driven products and experiences.
Data silos, fragmented insights: Our data resided in various systems, creating a fragmented view of our business. This made it difficult to get a holistic understanding of our customers and hindered initiatives like building a comprehensive customer 360º view and delivering personalized recommendations at a moment of relevance.
Escalating costs: Maintaining and scaling our Cloudera platform, which hosted massive data volumes (200TB compressed, 1PB uncompressed), was increasingly expensive and diverting valuable fiscal and people resources away from strategic priorities.
Faced with these pressing issues, the timing of our next renewal presented a strategic window of opportunity. We had a critical decision to make — migrate both technology and business platforms within 10 months or invest in upgrading our legacy Cloudera environment.
Building a unified data and AI platform with Google Cloud
We chose Google Cloud and its powerful duo, BigQuery and Vertex AI, to build the Strategic Data Platform (SDP) — our new modern data analytics platform. BigQuery’s serverless architecture, unmatched scalability, built-in ML capabilities, and seamless integration with Vertex AI made it the ideal solution to power our data-driven transformation. Our migration was a remarkably fast-paced effort, carried out in close collaboration with Google Cloud and Quantiphi, a Google Cloud partner.
Like many enterprises, we adopted a hybrid approach. We retained Databricks on Google Cloud for specific ETL workloads, utilizing Quantiphi’s expertise in converting legacy systems. At the same time, we migrated the bulk of our data processing to BigQuery for optimal cost-efficiency and performance. We also used Cloud Composer to orchestrate our complex data pipelines and ensure secure, private connectivity within our Google Cloud environment, a crucial requirement for handling sensitive customer data.
As a result, our dedicated team of over 100 Definity employees completed the migration in just ten months — 50% faster than the industry average. This rapid transition was aided by innovative tools, such as the “nifi-migration” solution built by Google Cloud Consulting. This open-source tool provided a visual and highly configurable way to automate real-time data flow between different systems, minimizing disruption and helping us surpass our initial migration timeline expectations.
Our CTO, Tatjana Lalkovic, who championed this effort to consolidate our structured and multimodal data to accelerate our AI/ML use cases, shared her perspective on the impact of our decision, saying:
“As we reimagined where data and AI could take our business, industry, and customer experience, Google Cloud BigQuery and Vertex AI stood out as modern, enterprise-ready serverless solutions prepared to meet the AI moment — not just today but for the foreseen future. The speed and success of this migration has created a lot of trust in our partnership and has been a significant boost to our digital transformation to streamline operations, improve products, and better serve customers.”
Strategic Data Platform – High level design on Google Cloud
Transforming insurance with data and AI
The results of our migration to BigQuery and Vertex AI have been transformative for Definity. We’ve seen exceptional user satisfaction, with the SDP achieving a remarkable Net Promoter Score (NPS) of 9.9 out of 10. The move has also saved us millions on our annual spend on non-strategic technologies and delivered a roughly 75% reduction in planned downtime for our digital platforms.
Performance has also dramatically improved, with processes to gain insights that once took days now completing in an average of 4.5 hours. Moreover, migrating to Google Cloud has helped us increase agility and innovation. We’re now able to double our business releases per year — achieving a 30% increase in testing automation, a 63% improvement in deployment time, and a 10x faster infrastructure setup.
By combining structured and unstructured data in BigQuery, we’ve unlocked new analytical possibilities and improved price-performance. This unified data foundation has empowered our business intelligence tools with richer, more comprehensive data, leading to more informed business decisions. The seamless integration with Vertex AI has enabled us to develop, deploy, and scale AI models, driving innovation in areas like fraud detection, automated intake, and personalized call center experiences. At the same time, we benefit from Google Cloud’s strong commitment to data security and privacy, helping us to strengthen our security posture and keep our customers safe.
As our VP of Data, Ron Mills, said:
“BigQuery’s serverless architecture has been a game-changer. The ‘nothing to manage’ approach is a huge differentiator. For enterprises like us that are migrating from on-prem clusters constantly running at 80% capacity, it’s like night and day.”
Lessons learned from our migration journey
Migrating a core data platform is a significant undertaking, and we’ve learned a lot along the way. For other organizations considering the same journey, here are some key takeaways from our experience:
One team, one goal: Foster a collaborative environment where technology and business teams, vendors, contractors, and consultants work together seamlessly towards a shared objective.
Leadership trust and commitment: Executive leadership trust in the delivery team’s decision-making is crucial for maintaining momentum and navigating challenges.
Be bold: Don’t be afraid to think outside the box, make timely decisions, and be prepared to adapt quickly to unforeseen setbacks.
Plan for the unknown: Anticipate potential roadblocks and have a core team dedicated to developing alternative solutions and addressing unforeseen issues.
Strong business partnership: A trusted relationship with business teams is essential for smooth user acceptance testing, change management, and avoiding unnecessary disruptions during the migration.
Balanced governance: Independent governance should provide guidance and support calculated risk-taking, acting as a partner in problem-solving rather than a blocker.
Motivated team: Cultivate a team-oriented environment where ownership of the project extends beyond leadership to every team member.
Transparent communication: Maintain open and consistent communication among all stakeholders (in our case, over 250 people) to ensure everyone is aligned and informed.
Fast fail and incremental delivery: Avoid a “big bang” approach. Embrace incremental releases (we aimed for 2-5 daily releases) to learn quickly, adapt, and iterate.
Parallel run: Plan for a parallel run of your systems on both the legacy and target cloud platforms to ensure a smooth transition and validate the new environment.
A data-driven future with limitless potential
Our migration to BigQuery and Vertex AI is just the first step in Definity’s data transformation journey. With a modern, scalable, and AI-ready data foundation now in place, we are empowered to unlock even greater value from our data and continue to lead innovation in the insurance industry. We are excited about the possibilities that lie ahead and are already actively developing our next AI use cases, including several focused on legal summarization and IT functions. We are confident that our partnership with Google Cloud will be instrumental in helping us achieve our goals.
Get ready to dive deep and level up your cloud skills at Google Cloud Next ’25. Whether you’re a seasoned pro or just starting your cloud journey, you’ll have more learning opportunities at Next than ever before. From hands-on challenges to expert-led workshops on AI and ML, Next ’25 (April 9-11, 2025) is your chance to transform your knowledge into real-world expertise.
The first-ever Skills Challenge: your chance to win big
This year, we’re launching a new, on-the-ground game: The Skills Challenge. Think of it as your personal learning adventure at Google Cloud Next, complete with:
Hands-on labs: Master practical skills.
Certification kickstarters: Pave your way to certification.
AI Agent Builder Bar: Experience AI agent development.
Quizzathon at Makerspace: Test your Google Cloud knowledge and win.
Leaderboard competition: See how you stack up against your peers and compete for grand prizes.
Early access: Be the first to try new gamification features on Google Cloud Skills Boost.
Don’t miss out on limited-edition swag and bragging rights on the leaderboard displayed at the Learning and Certification booth. Top the charts by the third day of Next for a chance to win a grand prize.
Dive into expert-led workshops
Developers, these lab-style sessions are for you. They’re led by knowledgeable experts who can help you build better, faster, and smarter applications. Secure your spot today by adding training sessions to your agenda.
Gain insights from breakout sessions and lightning talks
From in-depth technical learning to strategic insights for leaders, explore topics like credentialing a modern workforce and building a culture of continuous learning with expert-led presentations and panel discussions. Hear from customers about their technical use cases and gain valuable takeaways about their specific approaches and solutions.
Recharge at the Google Developer Experts and Certified Lounge
Connect with your peers, recharge, and refuel at an exclusive lounge reserved for Google Developer Experts. Enjoy light refreshments, meetups, and a photowall to capture your Next ’25 experience. When you register for Next, remember to identify yourself as Google Cloud Certified on your Profile page for easy lounge entry!
Can’t wait? Start learning today with Google Cloud Skills Boost
Don’t wait until Next to start your learning journey. Jump-start your skills with courses, labs, learning paths and more on Google Cloud Skills Boost. Join the Innovators program to get 35 credits at no cost and use them to keep learning on Skills Boost.
We’ll see you April 9-11 at Google Cloud Next ‘25! Register today.
Today’s insurance customers expect more: simple digital services, instant access to service representatives when they want to discuss personal matters, and quick feedback on submitted invoices. Meeting these demands has become increasingly difficult for insurers due to rising inquiry volumes, a shortage of skilled workers, and the loss of expertise as employees retire.
Recognizing the growing need for immediate and accurate responses, SIGNAL IDUNA, a leading German full-service insurer, particularly prominent in health insurance, introduced a cutting-edge AI knowledge assistant, powered by Google Cloud generative AI.
“We’ve pioneered to unlock the power of human-AI collaboration: To redefine process efficiency by bringing together technology and subject matter experts to deliver exceptional customer experiences,” said Johannes Rath, board member for Customer, Service, and Transformation at SIGNAL IDUNA.
SIGNAL IDUNA, in collaboration with Google Cloud, BCG and Deloitte, has developed an AI knowledge assistant that empowers service agents to quickly and accurately resolve complex customer inquiries. This innovative solution uses Google Cloud AI, including Google’s multimodal Gemini models, to help agents find relevant documents and provide comprehensive answers 30% faster — ultimately, enhancing customer satisfaction.
The Challenge: Meeting modern expectations
Like many organizations in the insurance sector, SIGNAL IDUNA faced significant operational burdens. The complexity of insurance products, along with the growing demand for immediate and accurate responses, often leads to bottlenecks that can impact service experiences.
For example, prior to introducing its AI knowledge assistant, service agents had to manually search thousands of internal documents for hundreds of different tariffs to find the information needed to answer questions or resolve customer issues — including insurance conditions, tariff information, guidelines, and standard operating procedures. As a result, 27% of inquiries required further escalation to other departments or specialists, resulting in delayed resolutions, increased costs, and potential damage to reputation.
Though complex, SIGNAL IDUNA prioritized this process as one of its top gen AI use cases, developing an AI assistant to help agents provide quick and accurate answers to customer inquiries, particularly those about health insurance. The AI knowledge assistant is grounded in more than 2,000 internal documents for more than 600 different tariffs, allowing agents to ask questions in natural language and receive accurate answers, significantly reducing the time spent searching for relevant information.
A deep dive into SIGNAL IDUNA’s gen AI system
Working with Google Cloud, BCG, and Deloitte, SIGNAL IDUNA built a sophisticated generative AI architecture using Google Cloud’s AI platform, Vertex AI, and utilized Gemini 1.5 Pro’s long-context capabilities to develop an AI knowledge assistant that can provide quick and accurate access to the right information within a vast collection of documents. The system employs multiple steps to aggregate and process extensive information from diverse sources, ensuring agents can access the complete context necessary to effectively address customer inquiries.
Here’s a breakdown of the key steps:
An end-to-end architecture diagram
1. Data pre-processing and extraction
The knowledge base is built from various document types, which are typically in PDF format, including policy documents, operating procedures, and general terms and conditions.
SIGNAL IDUNA utilizes a hybrid approach that combines Layout Parser in Google Cloud Document AI and PDFPlumber to parse these PDFs and extract the text content. While the Layout Parser is responsible for extracting the text segments, SIGNAL IDUNA enhances the extraction of tables with PDFPlumber if the quality of the PDFs allows. The extracted texts are then cleaned, chunked, embedded with Google’s Gecko multilingual embedding model, and enhanced with additional metadata, so the information can be processed and analyzed effectively later.
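The chunking step can be sketched as follows. The chunk size, overlap, and metadata fields here are illustrative assumptions, not SIGNAL IDUNA's actual parameters, which would depend on the embedding model's context window and the structure of the documents.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split extracted text into overlapping word-based chunks.

    chunk_size and overlap are illustrative defaults; production values
    depend on the embedding model and document structure.
    """
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, max(len(words), 1), step):
        piece = words[start:start + chunk_size]
        if not piece:
            break
        chunks.append({
            "text": " ".join(piece),
            # Metadata attached to each chunk for later filtering/reranking.
            "start_word": start,
        })
    return chunks
```

Overlapping chunks reduce the chance that an answer is split across a chunk boundary and lost to retrieval.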
For storing the vectorized texts, Google Cloud SQL for PostgreSQL is used with the pgvector PostgreSQL extension, which provides a highly effective vector database solution for our needs. By storing vectorized text chunks in Cloud SQL, SIGNAL IDUNA benefits from its scalability, reliability, and seamless integration with other Google Cloud services, while pgvector empowers efficient similarity search functionality.
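The shape of such a pgvector-backed store is sketched below. The table and column names are hypothetical, not SIGNAL IDUNA's actual schema; `<=>` is pgvector's cosine-distance operator, and the small helper shows the math behind it.

```python
import math

# Hypothetical pgvector schema for storing embedded chunks.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE chunks (
    id bigserial PRIMARY KEY,
    content text,
    metadata jsonb,
    embedding vector(768)   -- Gecko embeddings are 768-dimensional
);
"""

# '<=>' is pgvector's cosine-distance operator: smaller means more similar.
SEARCH_SQL = """
SELECT content, embedding <=> %(query_embedding)s AS distance
FROM chunks
ORDER BY embedding <=> %(query_embedding)s
LIMIT 5;
"""

def cosine_distance(a, b):
    """The math behind pgvector's '<=>': 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm
```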
2. Query augmentation
Query augmentation generates multiple queries to improve the formulation of user questions for both document retrieval from the vector store and answer generation. The original question is reformulated into several variants, creating three versions in total: the original query, a rewritten query, and an imitation query. These are then used to retrieve relevant documents and generate the final answer.
For the rewritten query, the system uses Gemini 1.5 Pro to correct spelling errors in the original question. Additionally, the query is expanded by adding synonyms for predefined terms and tagging specific terms (e.g., “remedies,” “assistive devices,” “wahlleistung/selective benefits”) with categories. The system also uses information about selected tariffs to enrich the query. For example, tariff attributes, such as brand or contract type, are extracted from a database and appended to the query in a structured format. These specific adjustments make it possible to handle special tariff codes and add further context based on tariff prefixes.
The imitation query uses Gemini 1.5 Pro to rephrase the question to mimic the language of technical insurance documents, improving the semantic similarity with the source material. It considers conversation history and handles age formatting.
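The three-variant augmentation can be sketched as below. The rewrite and imitation functions stand in for calls to Gemini 1.5 Pro, and the synonym table and tariff attributes are made-up illustrations rather than SIGNAL IDUNA's actual configuration.

```python
# Illustrative synonym table; the real system uses a curated list.
SYNONYMS = {"remedies": ["therapeutic appliances"], "assistive devices": ["aids"]}

def rewrite_query(question, tariff_attributes):
    """Stand-in for the LLM rewrite: synonym expansion plus tariff context."""
    q = question.strip()
    for term, syns in SYNONYMS.items():
        if term in q.lower():
            q += " (" + ", ".join(syns) + ")"
    # Tariff attributes pulled from a database, appended in structured form.
    for key, value in tariff_attributes.items():
        q += f" [{key}={value}]"
    return q

def imitation_query(question):
    """Stand-in for the LLM call that mimics insurance-document phrasing."""
    return f"Regarding the insured benefit: {question}"

def augment(question, tariff_attributes):
    """Return the three variants used for retrieval and answer generation."""
    return {
        "original": question,
        "rewritten": rewrite_query(question, tariff_attributes),
        "imitation": imitation_query(question),
    }
```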
3. Retrieval
First, the system checks the query cache, which stores previously answered questions and their corresponding correct answers. If the question, or one very similar to it, has already been successfully resolved, the cached answer is retrieved, helping to provide a rapid answer. This efficient approach ensures quick access to information and avoids redundant processing.
The accuracy of the cache is maintained through a user feedback loop, which identifies correctly answered questions to be stored in the cache through upvotes. A downvote on a cached answer triggers an immediate cache invalidation, ensuring only relevant and helpful responses are served. This dynamic approach improves the efficiency and accuracy of the system over time. If no matching questions are found in the query cache, the retrieval process falls back on the vector store, ensuring that the system can answer novel questions.
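The cache-with-feedback behavior can be sketched as a small class. Real matching would use embedding similarity to catch near-duplicate questions; exact matching on a normalized string keeps this sketch simple.

```python
class FeedbackQueryCache:
    """Minimal sketch of a feedback-driven query cache: upvotes store
    answers, downvotes invalidate them immediately."""

    def __init__(self):
        self._cache = {}

    @staticmethod
    def _key(question):
        # Normalize whitespace and case; a real system would compare
        # embeddings to match semantically similar questions.
        return " ".join(question.lower().split())

    def lookup(self, question):
        return self._cache.get(self._key(question))

    def upvote(self, question, answer):
        # An upvote marks the answer as correct and worth caching.
        self._cache[self._key(question)] = answer

    def downvote(self, question):
        # A downvote triggers immediate cache invalidation.
        self._cache.pop(self._key(question), None)
```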
After retrieving any relevant information chunks from the query cache or vector store, the system uses the Vertex AI ranking API to rerank them. This crucial process analyzes various signals to refine the results, prioritizing relevance and ensuring the most accurate and helpful information is presented.
Ensuring complete and accurate answers is paramount during retrieval, and SIGNAL IDUNA found that some queries required information beyond what was available in the source documents. To address this issue, the system uses keyword-based augmentations to supplement the final prompt, providing a more comprehensive context for generating responses.
4. Generation
The answer generation process involves three key components: the user’s question with multiple queries, retrieved chunks of relevant information, and augmentations that add further context. These elements are combined to create the final response using a complex prompt template.
Delivering a near real-time experience is crucial for service agents, so SIGNAL IDUNA also streams the generated response. During development, minimizing latency based on the input posed a significant technical hurdle. To address this issue, SIGNAL IDUNA reduced processing times using asynchronous APIs to help stream data and handle multiple requests. Currently, the system has achieved an average response time of approximately 6 seconds, and SIGNAL IDUNA is experimenting with newer, faster models to reduce this time even further.
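The streaming pattern can be sketched with `asyncio`. The token generator below stands in for a streaming model API; in production each token would be pushed to the agent's UI as it arrives rather than collected into a list.

```python
import asyncio

async def generate_tokens(answer, delay=0.0):
    """Stand-in for a model endpoint that streams its answer token by token."""
    for token in answer.split():
        await asyncio.sleep(delay)  # simulates per-token model/network latency
        yield token

async def stream_answer(answer):
    """Consume tokens as they arrive instead of waiting for the full response.

    In production, each token would be forwarded to the client here, which is
    what makes the response feel near real-time to the agent.
    """
    received = []
    async for token in generate_tokens(answer):
        received.append(token)
    return " ".join(received)
```

Because the handler is asynchronous, the same event loop can serve many concurrent agent requests while each response streams.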
5. Evaluation
Rigorous evaluation is essential for optimizing Retrieval Augmented Generation (RAG) systems. SIGNAL IDUNA uses the Gen AI evaluation service in Vertex AI to automate the assessment of both response quality and the performance of all process components, such as retrieval. A comprehensive question set, created with input from SIGNAL IDUNA’s service agents, forms the basis of these automated tests.
Here’s a closer look at how Looker helps evaluate the AI knowledge assistant:
Chunk retrieval: First, SIGNAL IDUNA evaluates retrieval of relevant information chunks. Metrics at this stage help assess how effectively the model identifies and gathers the necessary information from the source data. This includes tracking retrieval metrics, such as recall, precision, and F1-score, to pinpoint areas for improvement in the retrieval process. This is crucial as retrieving the correct information is the foundation of a good generated response.
Document reranking: Once the relevant chunks are retrieved, they’re reranked to prioritize the most pertinent information. The Looker dashboard allows monitoring the effectiveness of this reranking process.
Generated vs. expected response comparison: The final stage involves comparing the generated response with the expected response. SIGNAL IDUNA evaluates the quality, accuracy, and completeness of the generated output, utilizing large language models (LLMs) to score the similarity between the generated response and the expected response.
Explanation generation: To understand the reasoning behind an LLM’s evaluation, SIGNAL IDUNA generates explanations for its judgments. This provides valuable insights into the strengths and weaknesses of the generated responses, helping the developers identify specific areas for improvement.
This multi-stage evaluation approach provides SIGNAL IDUNA a holistic view of the model’s performance, enabling data-driven optimization at each stage. The Looker dashboard plays a vital role in visualizing these metrics, making it easier for the developers to identify areas where the model excels and where it needs improvement.
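The chunk-level metrics from the first evaluation stage reduce to standard set arithmetic, sketched here for a single query (the dashboard would aggregate these across the whole question set):

```python
def retrieval_metrics(retrieved, relevant):
    """Precision, recall, and F1 for one retrieval step.

    retrieved: chunk IDs returned by the retriever.
    relevant:  chunk IDs a human (or gold answer) marked as containing
               the needed information.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}
```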
Real-world impact: AI-powered efficiency and productivity
To determine whether the AI assistant provided measurable added value for its workforce, SIGNAL IDUNA conducted an experiment with a total of 20 employees (both internal staff and external service providers). During the experiment, customer requests were processed with and without the AI knowledge assistant to assess its impact.
One of the key benefits observed was a reduction in processing time. Searching across numerous data sources used to be a time-consuming process. The experiment showed that using the AI knowledge assistant reduced the core processing time (information search and response formulation) by approximately 30% and increased the quality of the response based on expert evaluations. The time saved was particularly notable for employees with less than two years of experience in health insurance.
In addition, the AI knowledge assistant significantly increased the case closure rate. Health insurance is a very complex field, and the use of external service providers means that not every employee can always answer every customer question. With support from the AI knowledge assistant, SIGNAL IDUNA’s case closure rate increased by approximately 24 percentage points, rising from 73% to almost 98%.
Scaling for the Future
“Together with Google, we at SIGNAL IDUNA have successfully applied gen AI to one of our core business processes,” Stefan Lemke, CIO at SIGNAL IDUNA, said. “Now, it’s time to scale this powerful technology across our entire organization. We’re not just scaling a tool, we’re scaling innovation, learning, and the possibilities of what we can achieve.”
Gen AI offers enormous potential for optimizing processes and developing innovative solutions. With its innovative approach — business teams experimenting with the technology in a decentralized manner and developing customized applications — SIGNAL IDUNA is primed to pioneer the next generation of insurance solutions and services.
At the same time, SIGNAL IDUNA is establishing central standards to scale insights gained across the company and tap into the combined power of its teams, resources, and lines of business. This strategic decision has helped create valuable resources like code libraries, infrastructure blueprints, and centrally offered services.
By combining agility with established standards and best practices, SIGNAL IDUNA can now react quickly to new requirements, setting a new standard for efficiency and customer satisfaction.
This project was delivered by the following core team members: Max Tschochohei, Anant Nawalgaria, and Corinna Ludwig from Google, and Christopher Masch and Michelle Mäding from SIGNAL IDUNA.