In July, we announced the availability of Mistral AI’s models on Vertex AI: Codestral for code generation tasks, Mistral Large 2 for high-complexity tasks, and the lightweight Mistral Nemo for reasoning tasks like creative writing. Today, we’re announcing the availability of Mistral AI’s newest model on Vertex AI Model Garden: Mistral-Large-Instruct-2411 is now generally available.
Mistral-Large-Instruct-2411 is an advanced, dense large language model (LLM) with 123B parameters and strong reasoning, knowledge, and coding capabilities, extending its predecessor with improved long context, function calling, and system prompt support. The model is ideal for use cases that include complex agentic workflows with precise instruction following and JSON outputs, large-context applications requiring strong adherence for retrieval-augmented generation (RAG), and code generation.
You can access and deploy the new Mistral AI Large-Instruct-2411 model on Vertex AI through our Model-as-a-Service (MaaS) or self-service offering today.
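If you want a quick feel for the MaaS path, here is a minimal sketch of calling the model from Python through the Vertex AI rawPredict endpoint for partner models. The region, model ID, and payload shape below are assumptions for illustration; check the Model Garden model card for the exact values.

# Minimal sketch: calling Mistral-Large-Instruct-2411 via Vertex AI MaaS.
# The region, model ID, and payload shape are assumptions; consult the
# Model Garden model card for the exact values.
import google.auth
import google.auth.transport.requests
import requests

PROJECT = "your-project-id"   # placeholder
REGION = "us-central1"        # assumed region where the model is offered
MODEL = "mistral-large-2411"  # assumed Model Garden model ID

# Authenticate with Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

url = (
    f"https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT}"
    f"/locations/{REGION}/publishers/mistralai/models/{MODEL}:rawPredict"
)
payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Summarize RAG in two sentences."}],
}
response = requests.post(
    url, headers={"Authorization": f"Bearer {credentials.token}"}, json=payload
)
print(response.json())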
What can you do with the new Mistral AI models on Vertex AI?
By building with Mistral’s models on Vertex AI, you can:
Select the best model for your use case: Choose from a range of Mistral AI models, including efficient options for low-latency needs and powerful models for complex tasks like agentic workflows. Vertex AI makes it easy to evaluate and select the optimal model.
Experiment with confidence: Mistral AI models are available as fully managed Model-as-a-Service on Vertex AI. You can explore Mistral AI models through simple API calls and comprehensive side-by-side evaluations within our intuitive environment.
Manage models without overhead: Simplify how you deploy the new Mistral AI models at scale with fully managed infrastructure designed for AI workloads and the flexibility of pay-as-you-go pricing.
Tune the models to your needs: In the coming weeks, you will be able to fine-tune Mistral AI’s models to create bespoke solutions, with your unique data and domain knowledge.
Craft intelligent agents: Create and orchestrate agents powered by Mistral AI models, using Vertex AI’s comprehensive set of tools, including LangChain on Vertex AI. Integrate Mistral AI models into your production-ready AI experiences with Genkit’s Vertex AI plugin.
Build with enterprise-grade security and compliance: Leverage Google Cloud’s built-in security, privacy, and compliance measures. Enterprise controls, such as Vertex AI Model Garden’s new organization policy, provide the right access controls to make sure only approved models can be accessed.
Get started with Mistral AI models on Google Cloud
These additions continue Google Cloud’s commitment to open and flexible AI ecosystems that help you build solutions best-suited to your needs. Our collaboration with Mistral AI is a testament to our open approach, within a unified, enterprise-ready environment. Vertex AI provides a curated collection of first-party, open-source, and third-party models, many of which — including the new Mistral AI models — can be delivered as a fully managed Model-as-a-Service (MaaS) offering, providing you with the simplicity of a single bill and enterprise-grade security on our fully managed infrastructure.
To start building with Mistral’s newest models, visit Model Garden and select the Mistral Large model tile. The models are also available on Google Cloud Marketplace here: Mistral Large.
We’ve seen a sharp rise in demand from enterprises that want to use AI agents to automate complex tasks, personalize customer experiences, and increase operational efficiency. Today, we’re announcing a Google Cloud AI agent ecosystem program to help partners build and co-innovate AI agents with technical and go-to-market resources from Google Cloud. We’re also launching AI Agent Space, a new category in our Google Cloud Marketplace for customers to easily find and deploy partner-built AI agents.
Through this program, we’ll provide product support, marketing amplification, and co-selling opportunities to help our services and ISV partners bring these solutions to market faster, reach more customers, and grow their AI agent businesses. Our goal is to provide customers with a rich ecosystem of solutions that sit on top of our world-class infrastructure and offer the choice and optionality needed to tailor AI for their businesses and maximize value from AI investments.
New resources for partners building AI Agents
To increase the development and adoption of AI agents, we’re focusing on supporting partners in three key areas:
Accelerated agent development: We’ll provide partners with direct access to Google Cloud’s product and engineering teams for guidance and optimization of their AI agents. Partners will also receive early access to our latest AI technologies, technical enablement and best practices, and dedicated support for bringing their solutions to market quickly via Google Cloud Marketplace.
Go-to-market success: New go-to-market programs and co-selling opportunities specifically designed for AI agent solutions will help partners more effectively promote their offerings and drive adoption across a wider range of customers.
Increased customer visibility: We will highlight the innovative work of our partners through targeted marketing resources, blogs, and dedicated events, which will increase visibility of partner-built AI agents and help them stand out in our growing AI ecosystem.
Offerings from services partners
We’ve seen significant momentum from services partners who have used Google Cloud’s technology to help customers successfully build and deploy AI agents. Through this program, our services partners will make their AI agents available to even more customers, including on AI Agent Space in the future. Here are some of their innovative agent solutions:
Accenture is transforming customer support at a major retailer by offering convenient self-service options through virtual assistants, enhancing the overall customer experience.
Bain supports SEB’s wealth management division with an AI agent that enhances end-customer conversations with suggested responses and generates call summaries that help increase efficiency by 15%.
BCG provides a CRM-optimization tool to improve the effectiveness and impact of insurance advisors.
Capgemini optimizes the ecommerce experience by helping retailers accept customer orders through new revenue channels and to accelerate the order-to-cash process for digital stores.
Cognizant helps legal teams draft contracts, assigning risk scores and recommendations for how to optimize operational impact.
Deloitte offers a “Care Finder” agent as part of its Agent Fleet, helping care seekers find in-network providers often in less than a minute — significantly faster than the average call time of 5-8 minutes.
HCLTech helps predict and eliminate different types of defects in manufactured products with its manufacturing quality agent, Insight.
Infosys optimizes digital marketplaces for a leading consumer brand manufacturer, providing actionable insights on inventory planning, promotions, and product descriptions.
PwC uses AI agent technology to help oncology clinics streamline administrative work so that doctors can optimize their time with patients.
TCS helps build persona-based AI agents contextualized with enterprise knowledge to accelerate software development.
Wipro supports a national healthcare provider in using agent technology to develop and adjust contracts, streamlining a complex and time-consuming task while improving accuracy.
Partners have already given us positive feedback about the support we’ve provided to more effectively scale their agent solutions, including Datatonic, Kyndryl, Quantiphi, and Slalom, who plan to bring new agents to market soon. Here’s what partners had to say:
“Leaders who prioritize and invest in agentic architecture will be at the forefront of their industries, driving future growth with generative AI. For example, Accenture’s marketing team is using autonomous agents to streamline campaign creation and execution, reducing manual steps by 25-35%, saving 6% in costs, and speeding up time-to-market by 25-55%.” – Scott Alfieri, Global Lead, Google Business Group, Accenture
“BCG continues to see strong business value partnering with Google Cloud to deliver gen AI transformations for our joint clients across industries. Google Cloud’s support for a robust ecosystem of AI agents demonstrates its commitment to innovation and democratization of AI.” – Val Elbert, Managing Director and Senior Partner, BCG
“By partnering with Google Cloud, we are building AI agents that transform customer experiences and bring efficiency to business processes. Google Cloud’s Agent Marketplace empowers Capgemini to continue developing and deploying innovative AI agents, leveraging our deep understanding of our customers.” – Fernando Alvarez, Chief Strategy and Development Officer and Group Executive Board Member, Capgemini
“Deloitte has helped some of its largest clients improve how they operate with AI agents built with Google Cloud’s technology. As agentic AI takes off, this initiative can enhance our agent-building and distribution capabilities, thus enabling us to accelerate our clients’ time to business value with AI solutions.” – Gopal Srinivasan, Alphabet Google Alliance Generative AI Leader, Deloitte Consulting LLP
Offerings from ISV partners
Our ISV partners are leveraging the power of Google Cloud’s AI technology, including Vertex AI and Gemini models, to develop cutting-edge AI agent solutions. Many have already made their offerings available on Google Cloud Marketplace, and we’re thrilled that they will be expanding their reach through AI Agent Space to make it even easier for customers to deploy and benefit from these innovative AI agents.
Here are some examples of their agent capabilities:
Bud Financial uses its “Financial LLM” to provide personalized answers to customer queries and supports automation of banking tasks such as moving money between accounts to avoid overdrafts.
Dun & Bradstreet uses its Hoovers SmartSearch AI to help customers quickly build targeted lists of companies and contacts matching specific criteria such as location, industry, and company size, making it easier to identify and act on targeted opportunities.
Elastic helps SREs and SecOps interpret log messages and errors, optimize code, write reports, and even identify and execute a runbook.
Exabeam enhances cybersecurity with natural language search, visualization, and investigation acceleration, automating threat explanations and next steps for multi-terabyte datasets.
FullStory integrates its real-time data capture with Google Cloud’s AI to create context-aware conversational agents, enabling faster data discovery and analysis of web and mobile interactions and more intelligent AI responses.
GrowthLoop gives marketers tools that automate audience building, suggest optimal targeting, and create custom attributes, optimizing the power of BigQuery data.
OpenText enables users to quickly find fast, accurate answers to inquiries that span a broad set of business domains, such as DevOps, customer service, and content management.
Quantum Metric uses its Felix AI agent to help customer service associates quickly summarize and identify important takeaways from consumer engagements, with reporting metrics that help businesses enhance inquiry resolutions.
Sprinklr offers multiple AI agents that can help businesses improve decision-making, resolve service queries, and handle complex tasks end-to-end.
Teradata helps analyze, categorize, and summarize customer inquiries or complaints by using multimodal capabilities that process text and voice data, identifying key trends and actionable insights to enhance customer loyalty.
ThoughtSpot uses its Spotter agent to empower customers with autonomous analytics capabilities and a natural-language chat interface that brings deep data analysis and contextual reasoning to any user.
Typeface enables users to automate marketing workflows across teams with its Arc Agent, which supports marketers with campaign performance, creative content creation, and audience optimization.
UKG enhances the workplace experience with Bryte AI, a conversational agent that enables HR administrators and people managers to request information about company policies, business insights, and more.
ISV partners are successfully using our AI to enhance their agent solutions, which they expect to grow through our ecosystem. Here’s what they had to say:
“Dun & Bradstreet built Hoovers SmartSearch AI with Google’s AI to revolutionize sales prospecting by instantly generating targeted lists of companies and contacts. Through this innovative initiative, customer adoption of our AI agent will be accelerated to help users effortlessly identify ideal customers and accelerate revenue growth.” – Michael Manos, Chief Technology Officer, Dun & Bradstreet
“Elastic AI Assistant uses Vertex AI and Gemini models to empower SREs and SecOps teams to build intelligent agents that interpret log messages, optimize code, automate reports, and even generate runbooks. This is the future of agentic architecture, and it’s available now in partnership with Google Cloud.” – Ken Exner, CPO, Elastic
“By leveraging Google’s advanced AI capabilities, ThoughtSpot Spotter delivers an autonomous analytics agent that empowers users to extract valuable insights from their data through natural language interactions. We’re excited to scale our AI agent to even more customers in partnership with Google Cloud.” – Sumeet Arora, Chief Development Officer, ThoughtSpot
“UKG leverages Vertex AI to power UKG Bryte AI, a gen AI sidekick for UKG’s Pro and Ready HCM solutions. Bryte AI is built on UKG’s proprietary people, culture, and work data to enhance insights and decision-making, and to enable more conversational AI experiences.” – Venkat Ramamurthy, Head of Product, AI, and Data, UKG
We’re pleased by how quickly partners have built AI agents to help customers improve their businesses. Additional partners with powerful AI agent capabilities available through Google Cloud include AUI.io, Automation Anywhere, Big SUR AI, BigCommerce, DataStax, Decagon.ai, Dialpad, Elastic, ema.co, Livex.ai, Lyzr.ai, Mojix, Moveo.ai, Regnology, Tamr, Tektonic AI, Vijil, VMware, Wisdom AI, and Zeotap.
Joining AI Agent Space
AI Agent Space is available today with solutions from select partners, and we plan to add hundreds of additional AI agents over the coming months. Partners interested in learning more can visit Google Cloud Marketplace to start listing AI agent solutions, and they can apply to the program here or reach out to their partner representative to explore additional collaboration opportunities.
We’re dedicated to empowering our partners with the tools, resources, and support they need to build and deploy successful AI agents. We’re excited to see the transformative solutions they create and the positive impact they’ll have on customers in the coming year.
Your next big customer doesn’t speak your language. In fact, 40% of global consumers won’t even consider buying from websites not in their native tongue. With 51.6% of internet users speaking languages other than English, you’re potentially missing half your market.
Until now, enterprises faced an impossible choice in addressing translation use cases. They had to choose one of the following:
Human translators: High quality but slow and expensive
Basic machine translation: Fast but lacks nuance
DIY solutions: Inconsistent and risky
But the challenge with translation is that you need all three – and traditional translation methods can’t keep up. This isn’t just about converting words; it’s about connecting with people using the right context and tone.
That’s why at Google Cloud, we built Translation AI in Vertex AI. We’re excited to share the latest updates, and how you can apply it to your business.
Translation AI: Unmatched translation quality, but your way
Google Cloud’s Translation AI includes two offerings for you to choose from:
Translation API Basic: An essential toolkit for translation capabilities. You can instantly detect languages and translate text using our advanced Neural Machine Translation (NMT) model. Translation API Basic is perfect for chat conversations, short-form content, and scenarios where speed and consistency are crucial.
Translation API Advanced: Process entire documents, run batch translations, and maintain terminology consistency with custom glossaries. You can leverage our Gemini-powered Translation model for long-form content, or use Adaptive Translation to capture your brand’s unique voice and tone. You can even customize translations by applying glossaries, fine-tuning our industry-leading translation models, or adapting translation predictions in real time (see the glossary sketch below).
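As a concrete illustration of the Advanced tier, here is a minimal sketch that applies a custom glossary during translation using the google-cloud-translate Python client. The project ID and glossary name are placeholders, and the glossary itself must be created beforehand.

# Minimal sketch: Translation API Advanced with a custom glossary.
# The project ID and glossary name are placeholders; the glossary must
# already exist in the given location.
from google.cloud import translate_v3 as translate

client = translate.TranslationServiceClient()
parent = "projects/your-project-id/locations/us-central1"  # placeholder project

glossary_config = translate.TranslateTextGlossaryConfig(
    glossary=f"{parent}/glossaries/your-brand-glossary"  # placeholder glossary
)
response = client.translate_text(
    request={
        "parent": parent,
        "contents": ["Welcome to our rewards program."],
        "mime_type": "text/plain",
        "source_language_code": "en",
        "target_language_code": "de",
        "glossary_config": glossary_config,
    }
)
# Glossary-aware translations are returned alongside the default ones.
for translation in response.glossary_translations:
    print(translation.translated_text)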
What’s new in Translation AI
Expanded reach and accuracy: You can now reach global audiences with our expanded language support, now covering 189 languages — including Cantonese, Fijian, and Balinese — while maintaining lightning-fast performance, perfect for user content and contact centers. “Per our evaluations, Google NMT is among the best-performing real-time NMT models for 97% of the language-domain combinations we’ve tested (87 out of 90) — which is 15% more than the closest competitor.” — Konstantin Savenkov, CEO & Co-Founder, Intento, Inc.
Smarter adaptive translation: You can customize your translations’ tone and style with as few as five examples, or use up to 30,000 for ultimate precision.
Model selection based on your use case: Using Cloud Translation Advanced, you have the option to choose from multiple approaches based on the complexity of your translation use case. For example, you can use our NMT model for translating general text or choose Adaptive Translation for customization in real-time.
Quality without compromise: While leaderboards and reports offer insights into overall model performance, they don’t reveal how a model handles your specific needs. The gen AI evaluation service helps you pick your own evaluation criteria, giving you a clear understanding of how well AI models and applications align with your use case. For example, Google’s MetricX and the widely used COMET correlate strongly with human evaluation of translation quality, and both are available now in the Vertex AI gen AI evaluation service. Compare models, prototype solutions, and select the best translation approach for your needs (see the evaluation sketch after this list).
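To show what that looks like in practice, here is a minimal sketch using the Vertex AI gen AI evaluation SDK. It scores candidate translations against references, with BLEU as a stand-in metric; the exact identifiers for the MetricX and COMET metrics mentioned above should be taken from the service documentation.

# Minimal sketch: evaluating translation quality with the Vertex AI
# gen AI evaluation service. "bleu" is a stand-in metric here; see the
# docs for the identifiers of the COMET and MetricX metrics.
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask

vertexai.init(project="your-project-id", location="us-central1")  # placeholders

dataset = pd.DataFrame(
    {
        "response": ["Bonjour le monde", "Merci pour votre commande"],   # candidates
        "reference": ["Bonjour tout le monde", "Merci pour votre commande"],  # references
    }
)
task = EvalTask(dataset=dataset, metrics=["bleu"])
result = task.evaluate()
print(result.summary_metrics)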
We built Translation AI with a dual focus – transform how you translate and transform how you work with translation. While most vendors offer either powerful translation or easy implementation, we deliver on both in three critical ways.
Production-ready APIs for your existing workflows: Plug our Translation API (NMT) directly into your applications for real-time, high-volume translations. Switch your model selection to our Adaptive Translation Gemini-powered model via the same Translation API when tone and context matter most. Both models integrate into your existing workflows and automatically scale with your needs. See the model-switching sketch after this list.
Customization without coding: Train custom translation models on your specific industry terms and phrases. Simply upload your domain-specific data and let Translation AI build a custom model that speaks your language. It’s perfect for specialized content in legal, medical, or technical fields—no ML expertise required.
Full control with Vertex AI: Own your complete translation pipeline using Translation AI through our comprehensive platform – Vertex AI. With Vertex AI, you can select your preferred models, customize their behavior, and monitor real-world performance. Integrate seamlessly with your existing CI/CD processes for true enterprise-grade translation at scale.
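Here is a minimal sketch of that per-request model selection: the same translate_text call switches between the general NMT model and a model trained on your data. The custom model ID is a hypothetical placeholder.

# Minimal sketch: per-request model selection with translate_text.
# The custom model ID is a placeholder; list the models in your project
# for real values.
from google.cloud import translate_v3 as translate

client = translate.TranslationServiceClient()
parent = "projects/your-project-id/locations/us-central1"  # placeholders

def translate_with(model_path: str, text: str) -> str:
    response = client.translate_text(
        request={
            "parent": parent,
            "contents": [text],
            "mime_type": "text/plain",
            "source_language_code": "en",
            "target_language_code": "fr",
            "model": model_path,
        }
    )
    return response.translations[0].translated_text

# General-purpose NMT model.
print(translate_with(f"{parent}/models/general/nmt", "Your order has shipped."))
# Hypothetical custom model trained on domain data.
print(translate_with(f"{parent}/models/your-custom-model-id", "Your order has shipped."))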
Real impact: The Uber story
Uber is leveraging the Google Cloud Translation AI product suite to achieve its mission to help people go anywhere, get anything, and earn their way.
“Operating in tens of thousands of cities worldwide, Uber prioritizes seamless communication and support for riders, drivers, couriers, and eaters across diverse languages. Misinterpretations can result in delays, frustration, and even safety concerns. For years, Google has been our trusted translation platform. With recent advancements in Translation models, automatic quality metrics, and language expansion, we’re excited to partner with Google to deliver innovative multilingual experiences to our users.” — Megha Yethadka, Senior Director, Uber.
Get started with Translation AI
Here are a few resources to help you get started with the latest features in Translation AI. Gemini-powered Translation model and Adaptive Translation are publicly available to use today. You can try them out in Vertex AI Studio.
Red Hat and Google Cloud have a long history of collaborating on and contributing to Kubernetes as well as other Cloud Native Computing Foundation (CNCF) projects including Istio, Knative, and Tekton. Together, these projects make up the basis for OpenShift, Red Hat’s platform that helps developers build, deploy, and manage applications. In fact, Google and Red Hat have been collaborating since before Kubernetes was even conceived, including co-developing cgroups, a precursor to Linux containers. When Google open-sourced Kubernetes, Red Hat was one of the first to jump on board, betting the Red Hat OpenShift platform on the new open-source standard. Today, Google and Red Hat hold prominent leadership roles in Kubernetes governance, and are the #1 and #2 largest contributors to Kubernetes, respectively.
Google Cloud infrastructure is highly optimized for OpenShift. Custom machine shapes let you optimize OpenShift pod-to-node bin-packing, reducing how much compute capacity you need to provision for a typical OpenShift workload. Hyperdisk Storage Pools enable thin provisioning for OpenShift PersistentVolumes, reducing the amount of storage that needs to be provisioned. Additionally, support for live migration in a wide array of Compute Engine machine families lets you provide higher uptime guarantees for stateful OpenShift workloads, which are common in enterprise application portfolios.
And when you deploy OpenShift workloads on Google Cloud, you can count on access to a deep bench of L3/L4 engineers who are experts in the core Kubernetes components underlying the OpenShift runtime. Given Google staff’s strong participation as core maintainers and technical leads in Kubernetes, this provides you with enterprise-grade support and coverage for mission-critical workloads.
NetApp Volumes storage comes to OpenShift on Google Cloud
When you deploy OpenShift workloads on Google Cloud, there’s a wide array of options for modernizing your operations, with OpenShift-native integrations into managed infrastructure services across compute, networking, storage, monitoring/logging, secrets/encryption, serverless, CI/CD, etc.
These managed infrastructure services give you the ability to “carry far fewer pagers” than you typically would with an on-prem OpenShift deployment. However, sometimes you are migrating applications that have requirements or dependencies on specific solutions for infrastructure pillars (such as storage). The typical approach is to rely on self-managing the architecture — and going back to carrying pagers.
With support for Google Cloud NetApp Volumes in OpenShift, you benefit from the best of both worlds for your file storage needs: the modernization, toil-reduction, and efficiency benefits of a managed service, with enterprise-ready features, compatibility, and familiarity of NetApp on-premises storage.
You can maximize data performance and reliability for your Red Hat OpenShift workloads on Google Cloud by leveraging high-performance file storage on Google Cloud infrastructure along with NetApp Volumes features like automated snapshots. Red Hat OpenShift-native persistent storage integration helps ensure high availability and fault tolerance across your workloads.
Streamlined deployment for a variety of workloads
Collaboration between Google Cloud, NetApp, and Red Hat makes it easier to quickly configure and deploy Red Hat OpenShift clusters and workloads in Google Cloud with NetApp Volumes for file storage, with streamlined access to Google Cloud IAM, service account management, and the Certificate Authority Service, among others. NetApp Volumes provides volumes as small as 1 GiB, read-write-many (RWX) PVs, low-latency performance, and up to 12.5 GiB/s throughput with large volumes, all while protecting your applications and data with customer-managed encryption keys (CMEK).
“Google Cloud is heavily invested in our partner community with the common goal of providing a world-class experience for our customers. Building on our long-standing technical collaborations with industry-leading partners like Red Hat and NetApp, we are deeply aligned on the core principles of both openness and reliability to help enterprise customers get what they need done. Our customers are increasingly turning to us to help them transform their business, and together, through a joint partnership with NetApp and Red Hat, we can help customers with a new way to cloud, while leveraging familiarity and consistency that brings together innovations across their business.” – Stephen Orban, Google Cloud VP of Migrations, ISVs, and Marketplace
De-risk cloud adoption while accelerating time to value
We are partnering with one of our trusted resellers, Converge Technology Solutions, to bridge on-premises, multi-cloud, and Google Cloud environments, and provide a unified management and operational experience for your Red Hat OpenShift and NetApp workloads. Converge’s expertise in Red Hat OpenShift and NetApp technologies helps ensure solutions are architected for peak performance, scalability, and reliability. You can take advantage of their deep understanding of hybrid cloud and Kubernetes/Red Hat OpenShift for a smooth transition to Google Cloud, minimizing disruption and maximizing uptime. At the same time, Converge’s best-practice-aligned methodologies help streamline Google Cloud deployments while integrating Red Hat OpenShift and NetApp Volumes, so you can run persistent containerized workloads on an enterprise-class hybrid cloud environment.
“Converge is thrilled to partner with Google Cloud, Red Hat, and NetApp to deliver this powerful new solution for OpenShift on Google Cloud. Our deep expertise in hybrid cloud and Kubernetes, coupled with our proven methodologies, ensures a seamless transition and rapid time-to-value for clients adopting this innovative offering. This collaboration empowers enterprises to modernize their operations, optimize their infrastructure, and unlock the full potential of containerized workloads in a secure and reliable hybrid cloud environment.” – David Luftig, Executive Vice President Strategy and Solutions, Converge Technology Solutions
About NetApp
NetApp is the intelligent data infrastructure company, combining unified data storage, integrated data services, and CloudOps solutions to turn a world of disruption into opportunity for every customer. NetApp creates silo-free infrastructure, harnessing observability and AI to enable the industry’s best data management. As the only enterprise-grade storage service natively embedded in the world’s biggest clouds, our data storage delivers seamless flexibility. In addition, our data services create a data advantage through superior cyber resilience, governance, and application agility. Our CloudOps solutions provide continuous optimization of performance and efficiency through observability and AI. No matter the data type, workload, or environment, with NetApp you can transform your data infrastructure to realize your business possibilities.
About Red Hat, Inc.
Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver reliable and high-performing Linux, hybrid cloud, container, and Kubernetes technologies. Red Hat helps customers integrate new and existing IT applications, develop cloud-native applications, standardize on our industry-leading operating system, and automate, secure, and manage complex environments. Award-winning support, training, and consulting services make Red Hat a trusted adviser to the Fortune 500. As a strategic partner to cloud providers, system integrators, application vendors, customers, and open source communities, Red Hat can help organizations prepare for the digital future.
About Converge
Converge Technology Solutions Corp. is a services-led, software-enabled, IT & Cloud Solutions provider focused on delivering industry-leading solutions. Converge’s global approach delivers advanced analytics, artificial intelligence (AI), application modernization, cloud platforms, cybersecurity, digital infrastructure, and digital workplace offerings to clients across various industries. The Company supports these solutions with advisory, implementation, and managed services expertise across all major IT vendors in the marketplace. This multi-faceted approach enables Converge to address the unique business and technology requirements for all clients in the public and private sectors. For more information, visit convergetp.com.
AI is rapidly reshaping the public sector, ushering in a new era of intelligent and AI-powered service delivery and mission impact. Chief AI Officers (CAIOs) and other agency leaders play a critical role as AI becomes more pervasive. At Google, we’ve long believed that AI is a foundational and transformational technology, with the potential to benefit people and society. Realizing its full potential to improve government services, enhance decision-making, and ultimately create a more efficient and effective public sector requires leadership and a clear commitment.
Google recently commissioned IDC to conduct a study that surveyed 161 federal CAIOs, government AI leaders and other decision makers to understand how agency leaders are leading in this new AI era – and the value they are already bringing when it comes to AI governance, collaboration, and building public trust and citizen engagement¹. I recently sat down with Ruthbea Yesner, Vice-President of IDC Government Insights to explore the key findings of this research and what it means for the public sector – see excerpts of our discussion and key insights below.
Key Finding #1: 62% of those surveyed say strengthening cybersecurity is a top motivator for AI investments
Agencies are embracing AI to enhance cybersecurity and protect critical infrastructure – with 60% of respondents indicating that internal cybersecurity protection is their top AI/ML use case. Over 40% of federal agencies surveyed state that protecting critical infrastructure is a key driver for their AI investments going forward. Additionally, respondents believe that applying AI to strengthen cybersecurity and protect critical infrastructure will deliver positive outcomes in just 9 months, the second-fastest time to value of any expected outcome of AI.
CAIOs and other agency leaders play a crucial role in driving AI adoption and ensuring that agencies are able to leverage this powerful technology. While 50% of federal agencies have already appointed a CAIO, the rest are expected to follow soon. As adoption accelerates and AI maturity grows, CAIOs need to prioritize robust cybersecurity measures and risk mitigation strategies in all AI initiatives, ensuring the protection of sensitive data and systems.
Key Finding #2: Higher AI maturity increases likelihood to explore other Gen AI use cases by 4x
IDC created a 5-phase approach to assessing AI maturity, and the findings are remarkable: 50% of agencies surveyed reported high levels of AI maturity, which correspond to mature behaviors like piloting and implementing generative AI use cases to drive innovation and mission impact. Mature AI agencies are embracing an innovation culture and are focused on AI use cases and projects with high potential for impact.
We’re seeing some agencies solving for one specific problem or use case and creating quick wins and the appetite to do more, and in other cases, they are tackling big, complex challenges head-on. By adopting an AI-first mindset, incorporating AI into their workflows and scaling their use of AI, they are creating the groundswell to do more. This has a compounding effect as AI becomes more pervasive across the agency, and individuals increasingly feel part of its positive cultural change and impact.
“This has a catalyst effect; it just takes one person doing something amazing with AI to motivate others to learn and apply AI.” – Ruthbea Yesner, Vice-President of IDC Government Insights
Generative AI is the future – attracting 42% of AI investments. Agencies are eager to explore its potential – and innovation will be a key motivator for continued AI investment going forward. As organizations prioritize AI, the CAIO role becomes even more multifaceted, demanding not just technical expertise but also visionary leadership to drive organizational culture change and develop a truly AI-enabled workforce.
Key Finding #3: An AI-ready workforce is the key to unlocking AI’s potential
The rapid pace of AI adoption has highlighted a significant challenge: a shortage of AI expertise. 39% of survey respondents report that their biggest challenge is a lack of in-house AI skills and expertise, and 68% are focused on training and retaining their workforce.
Google is tackling this skills challenge head-on. We recently announced our Google Cloud Launchpad for Veterans – a no-cost training and certification journey to equip veterans in all roles and at all levels with the cloud knowledge and skills needed to drive innovation, and contribute to their employer’s digital transformation strategy. And we also announced a new AI training initiative through Google.org’s AI Opportunity Fund – with $15 million for AI skills training for US government workers for the Partnership for Public Service and InnovateUS. This also includes a grant to the Partnership for Public Service to establish the new Center for Federal AI to provide AI skills and literacy to federal leaders and workers, including 2,000 senior government officials.
One thing is clear – AI requires leadership, and the CAIO is an important new C-suite role signaling the government’s commitment to harness AI and reach its full potential. CAIOs and other agency leaders are critical to charting this new AI era and providing the expertise and leadership necessary to leverage AI for the public good.
To learn more about how CAIOs are leading in this new AI era, download The Chief Artificial Intelligence Officer (CAIO) Playbook: A Practical Guide for Advancing AI Innovation in Government. By embracing its recommendations, agencies can create their own roadmap to drive AI adoption to accelerate mission outcomes and impact. To hear the full interview with Ruthbea Yesner, Vice-President of IDC Government Insights, please register to join the Google Public Sector Summit On-Demand on December 3rd.
¹ IDC Signature White Paper, The Chief Artificial Intelligence Officer (CAIO) Playbook: A Practical Guide for Advancing AI Innovation in Government, sponsored by Google Public Sector, Doc# US52616824, October 2024.
The content in this blog post was originally published last week as a members-only email to the Google Cloud Innovators community. To get this content directly in your inbox (not to mention lots of other benefits), sign up to be an Innovator today.
New and shiny
Three new things to know this week
Ground Gemini’s answers with Google Search in Vertex AI and Google AI Studio. There’s brand new support in Google AI Studio for connecting the Gemini model’s output to verifiable sources of data through Google Search. This functionality is already part of Vertex AI, but both platforms now support dynamic retrieval. This means that grounding only happens if we predict that the query needs it. See how it works in Vertex AI, and learn how to get started in Google AI Studio. (A minimal grounding sketch follows this list.)
Use Google’s own Arm-based CPUs. The Axion CPU is now ready for you! Get some excellent price performance and better energy efficiency by deploying C4A VMs powered by our first Arm-based processors.
Business process automation service gets AI upgrade, increased sophistication. It’s likely flying below your radar, but take a look at Google Cloud Application Integration. You can model out workflows for connecting systems in all sorts of ways. There’s now Gemini Code Assist functionality to help you build integrations, model out data transformations, create test cases, and even apply complex retry strategies.
Build and deploy gen AI applications on Google Cloud with Genkit and Go. Join us on November 19th for a hands-on workshop to build and deploy a generative AI app on Google Cloud! Use Genkit, Vertex AI, and Go to create and automate a reusable app deployment pipeline — perfect for beginners and pros alike.
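For the grounding item above, here is a minimal sketch using the Vertex AI Python SDK; project, region, and model name are placeholders. The dynamic retrieval behavior mentioned above is configured on the retrieval tool, and its exact signature varies by SDK version, so it is omitted here.

# Minimal sketch: grounding Gemini answers with Google Search in Vertex AI.
# Project, region, and model name are placeholders. Dynamic retrieval is
# configured on the retrieval tool; its signature varies by SDK version.
import vertexai
from vertexai.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="your-project-id", location="us-central1")

search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())
model = GenerativeModel("gemini-1.5-flash-002")

response = model.generate_content(
    "What changed in the most recent Kubernetes release?",
    tools=[search_tool],
)
# Grounded answers carry source citations in their grounding metadata.
print(response.text)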
Watch this
Learn advanced RAG techniques. Watch this excellent video series to get up to speed on LLM fundamentals. This edition looks at retrieval augmented generation and enhancing quality of responses.
Community cuts
Every week I round up some of my favorite links from builders around the Google Cloud-iverse. Want to see your blog or video in the next issue? Drop Richard a line!
“Perfect” is a strong word, but yeah, we’re pretty good. Simon at SADA makes the case that you should be looking at Google Cloud for your AI work because of our models, security posture, tools, and expertise. Who am I to disagree?
Build an ETL pipeline locally and then transition to the cloud. Thomas looks at the exercise of taking a working data pipeline and using our data and compute services to get it running successfully in Google Cloud.
What exactly is Firebase? It’s been part of Google for a while and is used by mobile devs around the world. Now it’s appealing to new audiences, and this post from Hermant explains what it offers to modern developers.
Learn and grow
Three ways to build your cloud muscles this week
In-person Workshop! AI In Action – AlloyDB and Vertex AI Agent Builder. Level up your AI skills with a hands-on journey in building knowledge-driven chat applications! Dive into AlloyDB and Vertex AI Agent Builder to create intelligent, interactive customer solutions. Sign up now! New York – 11/20, Toronto – 11/22, Bay Area – 12/3, Seattle – 12/5.
Let’s modernize our old apps. Here’s a great code lab that walks you through, step-by-step, the process to modernize an old PHP app. Learn what it takes to containerize the app, automate the path to production, add generative AI features, and introduce modern operations.
Running Apache Airflow? You have choices. If you’re orchestrating data, there’s a good chance you’ve come across Apache Airflow. This post points out that you can run it yourself on VMs, use a more managed GKE environment, or embrace a fully managed service with Cloud Composer.
Cloud Workstations is ready for devs in government. You deserve nice things, wherever you may work. Those in restricted environments sometimes have to settle for less. But now, Cloud Workstations is FedRAMP High Authorized. Curious about Cloud Workstations? Stanal has a good new overview post.
Standard storage format, and all the BigQuery goodness. We just shipped a preview of BigQuery tables for Apache Iceberg. Use this open format to store data, but get all the lakehouse goodness that BigQuery offers.
One more thing
Gemini is used across Google to create helpful AI experiences. Jeff highlights Sundar’s message about the billions of Google users with access to Gemini.
Become an Innovator to stay up-to-date on the latest news, product updates, events, and learning opportunities with Google Cloud.
Protecting sensitive company data is no longer just a best practice—it’s business critical. In today’s world, data breaches can have serious consequences, from financial losses and reputational damage to legal repercussions and operational disruptions. That’s why Chrome Enterprise Premium, our advanced secure enterprise browser offering, includes a growing suite of Data Loss Prevention (DLP) capabilities to help organizations safeguard their sensitive information and maintain compliance.
We recently launched a number of enhancements to our DLP capabilities, giving you even more granular control over your company’s data. This blog post will explore how these new capabilities support your organization’s comprehensive DLP journey—from discovering potential risks and user behavior, to controlling data flow with robust security measures, to investigating potential incidents with detailed reporting and analysis, and finally, to expanding protection beyond desktops.
Discover and understand user behavior
Understanding how your users interact with data is the first step in preventing data leaks. Chrome Enterprise provides powerful tools to gain visibility into user activity and to identify potential risks:
1. Chrome Security Insights
Chrome Security Insights empowers Chrome Enterprise customers to proactively identify potential threats with simplified security monitoring. This feature monitors key security configurations, security event logging, and 50 common DLP detectors with just a few clicks. Administrators gain valuable insights into high-risk activities through detailed reports on users, domains, and sensitive data transfers, enabling swift identification and remediation of security concerns. Start your 30-day Chrome Enterprise Premium trial and enable Chrome Security Insights here.
2. URL Filtering Audit Mode [Currently in Public Preview (beta), general availability coming soon]
Chrome Enterprise Premium’s URL Filtering Audit Mode offers a valuable tool for organizations seeking to refine their web access policies. It allows administrators to selectively activate monitoring of employee browsing activity without enforcing restrictions, providing insights into user behavior and potential security risks. By analyzing this data, IT and security teams can make informed decisions regarding URL filtering rules, striking an effective balance between security and user productivity. See here to learn how to configure URL Filtering Audit Mode.
Enforce DLP controls
Once you understand your users’ behavior, it’s time to put the right controls in place to prevent data leaks. Chrome Enterprise offers a robust set of in-browser protections.
1. Copy and paste protections
Controlling how users interact with sensitive data is crucial. Chrome Enterprise Premium’s copy and paste protections allow you to restrict or block users from copying sensitive information from web pages or pasting it into unauthorized applications or websites. This granular control helps prevent data exfiltration and ensures that sensitive information stays within designated boundaries, reducing the risk of data breaches and helping with compliance with data protection regulations. The copy and paste protections include:
Preventing data leakage to Incognito mode: Concerned about sensitive data being copied into incognito mode, where it can potentially bypass security measures? Chrome Enterprise Premium now allows you to block or warn users when they attempt to copy data between regular browsing sessions and incognito windows.
Controlling data sharing between applications: For organizations looking to prevent data leakage to external applications, Chrome Enterprise Premium now allows you to block or warn users when they attempt to copy data from your web applications into external programs like Notepad, Microsoft Word, or other apps.
Isolating data between Chrome profiles: Shared devices or users with multiple Chrome profiles can introduce risks of data cross-contamination. Chrome Enterprise Premium’s new copy-paste controls now allow you to block or warn users when they attempt to move data between different profiles.
Securing sensitive emails: Emails often contain highly confidential information requiring stringent protection. With Chrome Enterprise Premium, you can implement specific rules, such as blocking any copying from Gmail unless it’s being pasted back into Gmail.
See more details about setting up copy and paste protections here.
2. Watermarking
Watermarking acts as a deterrent to unauthorized data sharing. Chrome Enterprise Premium allows you to apply visible watermarks to sensitive company documents viewed in the browser, displaying information like the user’s email address, date, or a custom message. This helps discourage data exfiltration and makes it easier to trace the source of any leaked information. See here for how to set up watermarking with Chrome Enterprise Premium.
3. Screenshot protections
Screenshots can be a convenient way to capture information, but they also pose a data leak risk. Chrome Enterprise Premium’s screenshot protection allows you to prevent users from taking screenshots of sensitive content within the browser. This adds another layer of protection to your DLP strategy, limiting the potential for unauthorized data capture. Learn how to set up screenshot protection rules here.
These controls work together to create a comprehensive security strategy, limiting the ways in which data can be exfiltrated from your organization.
Investigate potential data leaks
Even with the best preventative measures in place, it’s crucial to be prepared to investigate potential security incidents. Chrome Enterprise provides tools to help you quickly identify and address threats:
1. Evidence Locker [Currently in Private Preview, general availability coming soon]
The evidence locker provides a secure repository for storing files and data that require further investigation by security teams. For instance, if an employee attempts to download a non-public financial report, Chrome Enterprise Premium can block the action and retain a copy of the file in the evidence locker. This triggers a detailed report for IT and security teams, enabling them to take appropriate investigation and remediation steps. Stay tuned for more information on the upcoming release of Evidence Locker.
2. Chrome Extension Telemetry in Google Security Operations
Chrome Enterprise Core integrates with Google Security Operations, our cloud-native security analytics platform, to provide deeper visibility into browser activity. Previously, detection and response teams were limited to analyzing static extension attributes. Now, you can set dynamic rules that continuously monitor extension behavior in your production environment, enabling proactive identification and remediation of risks before they escalate into threats. For example, you can monitor if extensions are unexpectedly contacting remote hosts or accessing cookies. This enhanced visibility empowers your security team to detect and mitigate data theft and infrastructure attacks in near real-time, significantly reducing your organization’s vulnerability to malicious extensions. See how to set this up here.
Expand protection to other platforms
Chrome Enterprise is committed to extending its threat protection capabilities beyond the desktop.
1. Mobile threat protections
With the growing use of mobile devices for work, securing the browser on these devices is essential. Chrome Enterprise Core is extending its threat protection capabilities to Android devices with download blocking. This feature will allow organizations to set policies to prevent users from downloading malicious files flagged by Google Safe Browsing from the web onto their mobile devices, bringing threat protections beyond desktops. Organizations can also choose to block all downloads on Android in managed Chrome. Get started with Chrome Enterprise Core today at no additional cost.
Chrome Enterprise Premium: Your partner in DLP
These features are just a glimpse into the comprehensive DLP capabilities offered by Chrome Enterprise. We are consistently enhancing our security capabilities to help organizations like yours take a proactive approach to data loss prevention, safeguarding sensitive information at the critical browser layer and ensuring compliance in today’s increasingly complex digital landscape.
Start using Chrome Enterprise Core today at no additional cost to gain foundational security capabilities. Or, experience Chrome Enterprise Premium’s advanced security and DLP features with a free 60-day trial and enable Chrome Security Insights here.
One of Google Cloud’s major missions is to arm security professionals with modern tools to help them defend against the latest threats. Part of that mission involves moving closer to a more autonomous, adaptive approach in threat intelligence automation.
In our latest advancements in malware analysis, we’re equipping Gemini with new capabilities to address obfuscation techniques and obtain real-time insights on indicators of compromise (IOCs). By integrating the Code Interpreter extension, Gemini can now dynamically create and execute code to help deobfuscate specific strings or code sections, while Google Threat Intelligence (GTI) function calling enables it to query GTI for additional context on URLs, IPs, and domains found within malware samples. These tools are a step toward transforming Gemini into a more adaptive agent for malware analysis, enhancing its ability to interpret obfuscated elements and gather contextual information based on the unique characteristics of each sample.
Building on this foundation, we previously explored critical preparatory steps with Gemini 1.5 Pro, leveraging its expansive 2-million-token input window to process substantial sections of decompiled code in a single pass. To further enhance scalability, we introduced Gemini 1.5 Flash, incorporating automated binary unpacking through Mandiant Backscatter before the decompilation phase to tackle certain obfuscation techniques. Yet, as any seasoned malware analyst knows, the true challenge often begins once the code is exposed. Malware developers frequently employ obfuscation tactics to conceal critical IOCs and underlying logic. Malware may also download additional malicious code, making it challenging to fully understand the behavior of a given sample.
For large language models (LLMs), obfuscation techniques and additional payloads create unique challenges. When dealing with obfuscated strings such as URLs, IPs, domains, or file names, LLMs often “hallucinate” without explicit decoding methods. Additionally, LLMs cannot access, for example, URLs that host additional payloads, often resulting in speculative interpretations about the sample’s behavior.
To help with these challenges, Code Interpreter and GTI function calling tools provide targeted solutions. Code Interpreter enables Gemini to autonomously create and execute custom scripts, as needed, using its own judgment to decode obfuscated elements within a sample, such as strings encoded with XOR-based algorithms. This capability minimizes interpretation errors and enhances Gemini’s ability to reveal hidden logic without requiring manual intervention.
Meanwhile, GTI function calling expands Gemini’s reach by retrieving contextualized information from Google Threat Intelligence on suspicious external resources such as URLs, IPs, or domains, providing verified insights without speculative guesses. Together, these tools equip Gemini to better handle obfuscated or externally hosted data, bringing it closer to the goal of functioning as an autonomous agent for malware analysis.
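To make the function-calling mechanics concrete, here is a minimal, illustrative sketch using the Vertex AI SDK. The lookup_ioc function and its schema are hypothetical stand-ins for the GTI integration, not the actual interface.

# Illustrative sketch: wiring a threat-intel lookup into Gemini via
# function calling. lookup_ioc is a hypothetical stand-in for the GTI
# integration, not the actual interface.
from vertexai.generative_models import (
    FunctionDeclaration,
    GenerativeModel,
    Part,
    Tool,
)

lookup_ioc = FunctionDeclaration(
    name="lookup_ioc",
    description="Look up a URL, IP, or domain in a threat intelligence service.",
    parameters={
        "type": "object",
        "properties": {
            "indicator": {"type": "string", "description": "The IOC to query."},
            "indicator_type": {"type": "string", "enum": ["url", "ip", "domain"]},
        },
        "required": ["indicator", "indicator_type"],
    },
)
model = GenerativeModel(
    "gemini-1.5-pro-002",
    tools=[Tool(function_declarations=[lookup_ioc])],
)
chat = model.start_chat()

response = chat.send_message(
    "Enrich any network indicators you find in this deobfuscated script: ..."
)
# The model responds with a structured request to call lookup_ioc.
function_call = response.candidates[0].content.parts[0].function_call
print(function_call.name, dict(function_call.args))

# Execute the real lookup out of band, then hand the result back.
result = {"verdict": "malicious", "associations": ["UNC5687"]}  # illustrative
response = chat.send_message(
    Part.from_function_response(name="lookup_ioc", response={"content": result})
)
print(response.text)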
To illustrate how these enhancements boost Gemini’s capabilities, let’s look at a practical example. In this case, we’re analyzing a PowerShell script that contains an obfuscated URL that hosts a second-stage payload. This particular sample was previously analyzed with some of the most advanced publicly available LLM models, which incorporate code generation and execution as part of their reasoning process. Despite these capabilities, each model “hallucinated,” generating completely fabricated URLs instead of accurately revealing the correct one.
Utilizing Code Interpreter and GTI function calling as part of its reasoning process, Gemini autonomously generated the following report without any human intervention. When deemed necessary, it applies these tools to process and extract additional information from the sample.
Final Report: PowerShell Malware with URL Deobfuscation
This report analyzes a PowerShell script designed to download and execute a file from an obfuscated URL. The script employs a custom obfuscation routine resembling the RC4 stream cipher to conceal the malicious URL.
Functionality:
Obfuscation Routine: The script defines a function F that performs byte-level XOR encryption using a key derived from the string “tox2”. This function is used to deobfuscate the malicious URL.
URL Decryption: The obfuscated URL, stored as a byte array, is decrypted using the F function with “tox2” as the key.
File Download: The script downloads a file from the decrypted URL (hxxps://filedn[.]eu/lODWTgN8sswHA6Pn8HXWe1J/tox2/Scan_docs%2398097960[.]msi) and saves it to the system’s temporary directory under a randomly generated filename.
Execution: Finally, the script executes the downloaded file.
def deobfuscate(x, y):
    # Key-scheduling (RC4 KSA): build and permute the 256-byte state
    # array using the key string y.
    s = list(range(256))
    j = 0
    k = [ord(c) for c in y]
    for i in range(256):
        j = (j + s[i] + k[i % len(k)]) % 256
        s[i], s[j] = s[j], s[i]
    # Keystream generation (RC4 PRGA): XOR each input byte with the
    # keystream to recover the plaintext.
    i = 0
    j = 0
    output = []
    for m in range(len(x)):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        output.append(x[m] ^ s[(s[i] + s[j]) % 256])
    return output

# Obfuscated URL bytes extracted from the PowerShell script.
a = [0x25, 0x7B, 0x5A, 0xD8, 0x9C, 0xFF, 0xC3, 0xA4, 0x95, 0xB7,
     0x9D, 0x6E, 0x5F, 0x4D, 0x49, 0xFB, 0x78, 0x32, 0xB8, 0x17, 0x72,
     0x01, 0x31, 0xFB, 0xC0, 0x58, 0x45, 0xCB, 0x5C, 0x3C, 0xDD, 0x00,
     0x60, 0x87, 0x25, 0xB2, 0x44, 0xA6, 0xC5, 0xEA, 0x71, 0xAA, 0x9D,
     0xA1, 0x05, 0x5F, 0xBD, 0x41, 0x28, 0x70, 0x30, 0x6B, 0x96, 0x24,
     0x92, 0xA0, 0x9D, 0x92, 0x23, 0x79, 0x69, 0x2C, 0x1B, 0x1E, 0xEB,
     0x69, 0x8F, 0x04, 0x10, 0x74, 0xBC]
b = "tox2"  # key hardcoded in the script
c = deobfuscate(a, b)
decoded_url = "".join(chr(i) for i in c)
print(decoded_url)
Google Threat Intelligence (GTI) Lookup:
The GTI lookup provides valuable context regarding the identified IOC:
Threat Actor: UNC5687, known for using MESHAGENT in phishing campaigns impersonating the Security Service of Ukraine.
Campaign: This specific URL is linked to a campaign spreading MESHAGENT via phishing emails with malicious links. The campaign involved a captcha page leading to the download of either a malicious MSI or EXE file.
Malware: The campaign delivers MESHAGENT, an open-source remote access framework, configured to communicate with C2 domains linked to a service called “AnonVNC.”
Conclusion:
The analyzed PowerShell script is confirmed to be malicious, associated with a phishing campaign by UNC5687 distributing the MESHAGENT remote access tool.
Gemini identified that the script employs an XOR-based obfuscation algorithm that resembles RC4 to conceal the download URL. Recognizing this pattern, Gemini autonomously generates and executes a Python deobfuscation script within the Code Interpreter sandbox, successfully revealing the external resource.
With the URL in hand, Gemini then utilizes GTI function calling to query Google Threat Intelligence for further context. This analysis links the URL to UNC5687, a threat cluster known for using a remote access tool in phishing campaigns impersonating the Security Service of Ukraine.
As we’ve seen, the integration of these tools has strengthened Gemini’s ability to function as a malware analyst capable of adapting its approach to address obfuscation and gathering vital context on IOCs. By incorporating the Code Interpreter and GTI function calling, Gemini is better equipped to navigate complex samples by autonomously interpreting hidden elements and contextualizing external references.
While these are significant advancements, many challenges remain, especially given the vast diversity of malware and scenarios that exist in the threat landscape. We’re committed to making steady progress, and future updates will continue to enhance Gemini’s capabilities, moving us closer to a more autonomous, adaptive approach in threat intelligence automation.
Cassandra, a distributed NoSQL database, is prized for its speed and scalability, and is used broadly for applications that require rapid data retrieval and storage, such as caching, session management, and real-time analytics. Its simple, denormalized data model helps ensure high performance and easy management, especially for large datasets.
But this simplicity also brings limitations: poor support for complex queries, potential data redundancy, and difficulty modeling intricate relationships. Spanner, Google Cloud’s always-on, globally consistent, and virtually unlimited-scale database, combines the scalability and availability of NoSQL with the strong consistency and relational model of traditional databases, making it a natural fit for traditional Cassandra workloads. And today, it’s easier than ever to switch from Cassandra to Spanner with the introduction of the Cassandra to Spanner Proxy Adapter, an open-source tool for plug-and-play migrations of Cassandra workloads to Spanner, without any changes to your application logic.
Spanner for NoSQL workloads
Spanner provides strong consistency, high availability, virtually unlimited scalability, and a familiar relational data model with support for SQL and ACID transactions for data integrity. As a fully managed service, it helps simplify operations, allowing teams to focus on application development rather than database administration. Furthermore, Spanner’s high availability, even at a massive global scale, supports business continuity by minimizing database downtime.
We’re constantly evolving Spanner to meet the needs of modern businesses. Some of the latest Spanner capabilities include enhanced multi-model capabilities such as graph, full-text search, vector search, improved performance for analytical queries with Spanner Data Boost, and unique enterprise features such as geo-partitioning and dual-region configurations. For Cassandra users, these powerful features, along with Spanner’s compelling price-performance, unlock a world of new, exciting possibilities.
The Cassandra to Spanner adapter — battle-tested by Yahoo!
If you’re wondering, “Spanner sounds like a leap forward from Cassandra. How do I get started?”, the proxy adapter provides a plug-and-play way to forward your client applications’ Cassandra Query Language (CQL) traffic to Spanner. Under the hood, the adapter presents itself as a Cassandra client to the application while internally interacting with Spanner for all data manipulation tasks. With the Cassandra to Spanner proxy adapter, no changes to your application code are needed — it just works!
Yahoo successfully migrated from Cassandra to Spanner, reaping the benefits of improved performance, scalability, consistency, and operational efficiency. And the proxy adapter made it easy to migrate.
“The Cassandra Adapter has provided a foundation for migrating the Yahoo Contacts workload from Cassandra to Spanner without changing any of our CQL queries. Our migration strategy has more flexibility, and we can focus on other engineering activities while utilizing the scale, redundancy, and support of Spanner without updating the codebase. Spanner is cost-effective for our specific needs, delivering the performance required for a business of our scale. This transition enables us to maintain operational continuity while optimizing cost and performance.” – Patrick JD Newnan, Principal Product Manager, Core Mail and Analytics, Yahoo
Another Google Cloud customer that successfully migrated from Cassandra to Spanner recently is Reltio. Reltio benefited from an effortless migration process to minimize downtime and disruption to their services while reaping the benefits of a fully managed, globally distributed, and strongly consistent database.
These success stories demonstrate that migrating from Cassandra to Spanner can be a transformative step for businesses seeking to modernize their data infrastructure, unlock new capabilities, and accelerate innovation.
How does the new proxy adapter simplify your migration? Of the steps in a typical database migration, two — migrating your application and migrating the data — are more complex than the others. The proxy adapter vastly simplifies pointing a Cassandra-backed application at Spanner. Here’s a high-level overview of the steps involved when using the new proxy adapter:
1. Assessment: Evaluate your Cassandra schema, data model, and query patterns to identify which ones you can simplify after moving to Spanner.
2. Schema design: Spanner’s table declaration syntax and data types are similar to Cassandra’s; the documentation covers these similarities and differences in depth. With Spanner, you can also take advantage of relational capabilities and features like interleaved tables for optimal performance (see the sketch after this list).
3. Data migration: There are several steps to migrate your data:
Replicate incoming data: Replicate incoming updates to your Cassandra cluster to Spanner in real time using Cassandra’s Change Data Capture (CDC).
Another possibility is to update your application logic to perform dual-writes to Cassandra and Spanner. We don’t recommend this approach if you’re trying to minimize changes to your application code.
4. Set up the proxy adapter and update your Cassandra configuration: Download and run the Cassandra to Spanner Proxy Adapter, which runs as a sidecar next to your application. By default, the proxy adapter runs on port 9042. In case you decide to use a different port, don’t forget to update your application code to point to the proxy adapter.
5. Testing: Thoroughly test your migrated application and data in a non-production environment to ensure everything works as expected.
6. Cutover: Once you’re confident in the migration, switch your application traffic to Spanner. Monitor closely for any issues and fine-tune performance as needed.
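To make the schema design step concrete, here’s a minimal sketch of creating a parent table and an interleaved child table with the google-cloud-spanner Python client. The project, instance, database, and table names are hypothetical placeholders, and the schema is illustrative rather than a recommended mapping for any particular workload.

from google.cloud import spanner

# Hypothetical project, instance, and database names.
client = spanner.Client(project="my-project")
database = client.instance("my-instance").database("my-database")

# Contacts rows are stored physically with their parent Users row, a
# common pattern when remodeling Cassandra partition and clustering
# keys in Spanner.
operation = database.update_ddl([
    """CREATE TABLE Users (
           UserId INT64 NOT NULL,
           Name STRING(MAX),
       ) PRIMARY KEY (UserId)""",
    """CREATE TABLE Contacts (
           UserId INT64 NOT NULL,
           ContactId INT64 NOT NULL,
           Email STRING(MAX),
       ) PRIMARY KEY (UserId, ContactId),
         INTERLEAVE IN PARENT Users ON DELETE CASCADE""",
])
operation.result()  # Wait for the schema change to complete.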
What’s under the hood of the new proxy adapter?
The new proxy adapter presents itself as a Cassandra client to the application. From the application’s perspective, the only noticeable change is the IP address or hostname of the Cassandra endpoint, which now points to the proxy adapter. This streamlines the Spanner migration, without requiring extensive modifications to application code.
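For example, an application using the open-source Python driver for Cassandra only needs its contact point changed to wherever the proxy adapter is listening (port 9042 by default when run as a sidecar). Here’s a minimal sketch, with a hypothetical keyspace and table:

from cassandra.cluster import Cluster

# Point the driver at the proxy adapter instead of the Cassandra cluster.
cluster = Cluster(contact_points=["127.0.0.1"], port=9042)
session = cluster.connect("my_keyspace")  # hypothetical keyspace

# The same CQL the application already issues; the proxy translates it
# into Spanner operations behind the scenes.
row = session.execute(
    "SELECT email FROM contacts WHERE user_id = %s AND contact_id = %s",
    (42, 7),
).one()
print(row.email if row else "not found")

cluster.shutdown()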
We designed the proxy adapter to establish a one-to-one mapping between each Cassandra cluster and a corresponding Spanner database. The proxy instance employs a multi-listener architecture, with each listener bound to a distinct port. This facilitates concurrent handling of multiple client connections, where each listener manages a distinct connection with the specified Spanner database.
The proxy’s translation layer handles the intricacies of the Cassandra protocol. This layer performs message decoding and encoding, manages buffers and caches, and crucially, parses incoming CQL queries and translates them into Spanner-compatible equivalents.
For more details about different ways of setting up the adapter, limitations, mapping of CQL data types to Spanner, and more, refer to the proxy adapter documentation.
Addressing common concerns and challenges
Let’s address a few concerns you may have with your migrations:
Cost: Have a look at Accenture’s benchmark result that demonstrates that Spanner ensures not only consistent latency and throughput but also cost efficiency. Furthermore, Spanner now offers a new tiered pricing model (Spanner editions) that delivers better cost transparency and cost savings opportunities to help you take advantage of all of Spanner’s capabilities.
Latency increases: To minimize any increase in query latencies, we recommend running the proxy adapter on the same host as the client application (as a sidecar proxy), or on the same Docker network when the proxy adapter runs in a Docker container. We also recommend keeping the CPU utilization of the proxy adapter host under 80%.
Schema flexibility: While Cassandra offers schema flexibility, Spanner’s stricter relational schema provides advantages in terms of data integrity, query power, and consistency.
Learning curve: Spanner’s data types have some differences with Cassandra’s. Have a look at this comprehensive documentation that can ease the transition.
Get started today
The benefits of strong consistency, simplified operations, enhanced data integrity, and global scalability make Spanner a compelling option for businesses looking to leverage the cloud’s full potential for NoSQL workloads. With the new Cassandra to Spanner proxy adapter, we are making it easier to plan and execute on your migration strategy, so you can unlock a new era of data-driven innovation for your organization.
AlloyDB Omni is back with a new release, version 15.7.0, and it’s bringing serious enhancements to your PostgreSQL workflows, including:
Faster performance
A new ultra-fast disk cache
An enhanced columnar engine
The general availability of ScaNN vector indexing
A new release of the AlloyDB Omni Kubernetes operator
From transactional and analytical workloads to cutting-edge vector search, this update delivers across the board: in your data center, at the edge, on your laptop, and in any cloud, all with 100% PostgreSQL compatibility.
Let’s jump in.
Better performance
Many workloads already get a boost compared to standard PostgreSQL. In our performance tests, AlloyDB Omni is more than 2x faster than standard PostgreSQL for transactional workloads, with most of the tuning done for you automatically, without special configurations. One of the key advantages is the memory agent, which optimizes shared buffers while avoiding out-of-memory errors. In general, the more memory you configure for AlloyDB Omni, the better it performs, serving more queries from the shared buffers and reducing the need to make calls to disk, which can be orders of magnitude slower than memory, particularly when using durable network storage.
An ultra-fast disk cache
This trade-off between memory and disk storage also just got more flexible, with the introduction of an ultra-fast disk cache. It allows you to configure a fast, local, and not necessarily durable storage device as an extension of Postgres’ buffer cache. Instead of aging data out of memory to make space for new data, AlloyDB Omni can keep a copy of not-quite-hot data in the disk cache, where it can be accessed faster than from persistent disk.
Enhanced columnar engine
AlloyDB Omni’s analytics accelerator is changing the game for mixed workloads. Developers are finding it invaluable for gaining real-time analytical insights from their transactional data, all without the overhead of managing extra data pipelines or separate databases. You can instead enable the columnar engine, assign a portion of your memory to it, and let AlloyDB Omni decide which columns or tables to populate in the columnar engine to speed up queries. In our benchmarks, the columnar engine speeds up analytical queries up to 100x compared to standard PostgreSQL.
Until now, the practical size limit of the analytics accelerator was determined by the amount of memory you could assign to the columnar engine. What’s new is a feature that allows you to configure a fast local storage device for the columnar engine to spill to, increasing the volume of data you can run analytical queries on.
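As a rough illustration of working with the columnar engine from Python, the sketch below checks that the engine is enabled and manually adds a table to the column store using psycopg2. The flag and helper function names follow AlloyDB Omni’s documented columnar engine interface, but verify the exact names against the current documentation; the connection details and table name are placeholders.

import psycopg2

# Connection details are placeholders for an AlloyDB Omni instance.
conn = psycopg2.connect(host="localhost", dbname="postgres",
                        user="postgres", password="secret")
conn.autocommit = True
cur = conn.cursor()

# The engine itself is enabled at the instance level (for example, by
# setting google_columnar_engine.enabled in postgresql.conf); here we
# simply confirm the setting.
cur.execute("SHOW google_columnar_engine.enabled;")
print("columnar engine enabled:", cur.fetchone()[0])

# Explicitly populate a frequently queried table into the column store.
cur.execute("SELECT google_columnar_engine_add('orders');")

cur.close()
conn.close()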
ScaNN goes GA
Lastly, for vector database use cases, AlloyDB Omni already offers great performance with pgvector using either the ivf or hnsw indexes. But while vector indexes are a great way to accelerate queries, they can be slow to build and rebuild. At Google Cloud Next 2024, we introduced the ScaNN index as another available index type. AlloyDB AI’s ScaNN index surpasses standard PostgreSQL’s HNSW index by offering up to 4x faster vector queries. Beyond pure speed, ScaNN delivers significant advantages for real-world applications:
Rapid indexing: Accelerate development and eliminate bottlenecks in large-scale deployments with significantly faster index build times.
Optimized memory utilization: Reduce memory consumption by 3-4x compared to PostgreSQL’s HNSW index. This allows larger workloads to run on smaller hardware and boosts performance for diverse, hybrid workloads.
As of AlloyDB Omni version 15.7.0, AlloyDB AI ScaNN indexing is generally available.
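As a quick sketch of what this looks like in practice, the snippet below creates a ScaNN index on a pgvector embedding column. It assumes the alloydb_scann extension is available on your instance; the connection details, table, column, and tuning value are placeholders to check against the documentation.

import psycopg2

conn = psycopg2.connect(host="localhost", dbname="postgres",
                        user="postgres", password="secret")
conn.autocommit = True
cur = conn.cursor()

# Enable the ScaNN index extension (assumed extension name).
cur.execute("CREATE EXTENSION IF NOT EXISTS alloydb_scann;")

# Create a ScaNN index on a vector column; cosine distance and the
# number of leaf partitions are illustrative choices.
cur.execute("""
    CREATE INDEX product_embedding_scann
    ON products USING scann (embedding cosine)
    WITH (num_leaves = 1000);
""")

cur.close()
conn.close()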
A new Kubernetes operator
In addition to the new version of AlloyDB Omni, we have also released version 1.2.0 of the AlloyDB Omni Kubernetes operator. This release adds support for more configuration options for health checks when high availability is enabled, support for configuring high availability to be enabled when a disaster recovery secondary cluster is promoted to primary, and support for log rotation to help manage storage space used by PostgreSQL log files.
At Google Cloud, we’re rapidly advancing our high-performance computing (HPC) capabilities, providing researchers and engineers with powerful tools and infrastructure to tackle the most demanding computational challenges. Here’s a look at some of the key developments driving HPC innovation on Google Cloud, as well as our presence at Supercomputing 2024.
We began our H-series with H3 VMs, specifically designed to meet the needs of demanding HPC workloads. Now, we’re excited to share some key features of the next generation of the H family, bringing even more innovation and performance to the table. The upcoming VMs will feature:
Improved workload scalability via RDMA-enabled 200 Gbps networking
Native support to directly provision full, tightly-coupled HPC clusters on demand
Titanium technology that delivers superior performance, reliability, and security
We provide system blueprints for setting up turnkey, pre-configured HPC clusters on our H series VMs.
The next generation of H series is coming in early 2025.
Parallelstore: World’s first fully-managed DAOS offering
Parallelstore is a fully managed, scalable, high-performance storage solution based on next-generation DAOS technology, designed for demanding HPC and AI workloads. It is now generally available and provides:
Up to 6x greater read throughput performance compared to competitive Lustre scratch offerings
Low latency (<0.5ms at p50) and high throughput (>1GiB/s per TiB) to access data with minimal delays, even at massive scale
High IOPS (30K IOPS per TiB) for metadata operations
Simplified management that reduces operational overhead with a fully managed service
Parallelstore is great for applications requiring fast access to large datasets, such as:
Analyzing massive genomic datasets for personalized medicine
Training large language models (LLMs) and other AI applications efficiently
Running complex HPC simulations with rapid data access
A3 Ultra VMs with NVIDIA H200 Tensor Core GPUs
For GPU-based HPC workloads, we recently announced A3 Ultra VMs, which feature NVIDIA H200 Tensor Core GPUs. A3 Ultra VMs offer a significant leap in performance over previous generations. They are built on servers with our new Titanium ML network adapter, optimized to deliver a secure, high-performance cloud experience for AI workloads, and powered by NVIDIA ConnectX-7 networking. Combined with our datacenter-wide 4-way rail-aligned network, A3 Ultra VMs deliver non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE).
Compared with A3 Mega, A3 Ultra offers:
2x the GPU-to-GPU networking bandwidth, powered by Google Cloud’s Titanium ML network adapter and backed by our Jupiter data center network
Up to 2x higher LLM inferencing performance with nearly double the memory capacity and 1.4x more memory bandwidth
Ability to scale to tens of thousands of GPUs in a dense, performance-optimized cluster for large AI and HPC workloads
With system blueprints, available through Cluster Toolkit, customers can quickly and easily create turnkey, pre-configured HPC clusters with Slurm support on A3 VMs.
A3 Ultra VMs will also be available through Google Kubernetes Engine (GKE), which provides an open, portable, extensible, and highly-scalable platform for large-scale training and serving of AI workloads.
Trillium: Ushering in a new era of TPU performance for AI
Tensor Processing Units, or TPUs, power our most advanced AI models such as Gemini, popular Google services like Search, Photos, and Maps, as well as scientific breakthroughs like AlphaFold 2 — which led to a Nobel Prize this year! Trillium, our sixth-generation TPU, delivers:
4.7x increase in peak compute performance per chip
Double the high bandwidth memory capacity
Double the interchip interconnect bandwidth
Cluster Toolkit: Streamlining HPC deployments
We continue to improve Cluster Toolkit, providing open-source tools for deploying and managing HPC environments on Google Cloud. Recent updates include:
Slurm-gcp V6 is now generally available, providing faster deployments and robust reconfiguration, among other benefits.
Google Cloud Customer Care is now available for Cluster Toolkit. You can find more information here on how to get support via the Cloud Customer Care console.
GKE: Container orchestration with scale and performance
GKE continues to lead the way for containerized workloads with support for the largest Kubernetes clusters in the industry. With support for up to 65,000 nodes, we believe GKE offers more than 10x the scale of the other two largest public cloud providers.
At the same time, we continue to invest in automating and simplifying the building of HPC and AI platforms, with:
Secondary boot disk, which provides faster workload startups through container image caching
Custom compute classes, offering greater control over compute resource allocation and scaling
Extensive innovations in Kueue.sh, which is becoming the de facto standard for job queueing on Kubernetes with topology-aware scheduling, priority and fairness in queueing, multi-cluster support (see demo by Google and CERN engineers), and more
Customer success stories: Atommap and beyond
Atommap, a company specializing in atomic-scale materials design, is using Google Cloud HPC to accelerate its research and development efforts. With H3 VMs and Parallelstore, Atommap has achieved:
Significant speedup in simulations: Reduced time-to-results by more than half, enabling faster innovation
Improved scalability: Easily scaled resources for 1,000s to 10,000s of molecular simulations, to meet growing computational demands
Better cost-effectiveness: Optimized infrastructure costs, with savings of up to 80%, while achieving high performance
Atommap’s success story highlights the transformative potential of Google Cloud HPC for organizations pushing the boundaries of scientific discovery and technological advancement.
Looking ahead
Google Cloud is committed to continuous innovation for HPC. Expect further enhancements to HPC VMs, Parallelstore, Cluster Toolkit, Slurm-gcp, and other HPC products and solutions. With a focus on performance, scalability, compatibility, and ease of use, we’re empowering researchers and engineers to tackle the world’s most complex computational challenges.
Google Cloud Advanced Computing Community
We’re excited to announce the launch of the Google Cloud Advanced Computing Community, a new kind of community of practice for sharing and growing HPC, AI, and quantum computing expertise, innovation, and impact.
This community of practice will bring together thought leaders and experts from Google, its partners, and HPC, AI, and quantum computing organizations around the world for engaging presentations and panels on innovative technologies and their applications. The Community will also leverage Google’s powerful, comprehensive, and cloud-native tools to create an interactive, dynamic, and engaging forum for discussion and collaboration.
The Community launches now, with meetings starting in December 2024 and a full rollout of learning and collaboration resources in early 2025. To learn more, register here.
Google Cloud at Supercomputing 2024
The annual Supercomputing Conference series brings together the global HPC community to showcase the latest advancements in HPC, networking, storage and data analysis. Google Cloud is excited to return to Supercomputing 2024 in Atlanta with our largest presence ever.
Visit Google Cloud at booth #1730 to jump in and learn about our HPC, AI infrastructure, and quantum solutions. The booth will feature a Trillium TPU board, NVIDIA H200 GPU and ConnectX-7 NIC, hands-on labs, a full schedule of talks, a comfortable lounge space, and plenty of great swag!
The booth theater will include talks from ARM, Altair, Ansys, Intel, NAG, SchedMD, Siemens, Sycomp, Weka, and more. Booth labs will get you deploying Slurm clusters to fine-tune the Llama 2 model or run GROMACS, and using Cloud Batch to run microbenchmarks or quantum simulations, and more.
We’re also involved in several parts of SC24’s technical program, including BoFs, User Groups, and Workshops, with Googlers participating in a number of technical sessions.
Finally, we’ll be holding private meetings and roadmap briefings with our HPC leadership throughout the conference. To schedule a meeting, please contact hpc-sales@google.com.
Cloud compliance can present significant regulatory and technical challenges for organizations. These complexities often include delineating compliance responsibilities and accountabilities between the customer and cloud provider.
At Google Cloud, we understand these challenges faced by our customers’ cloud engineering, compliance, and audit teams, and want to help make them easier to manage. That’s why we’re pleased to announce that our Audit Manager service, which can digitize and help streamline the compliance auditing process, is now generally available.
Traditional compliance methodologies, reliant on manual processes for evidence collection, are inefficient, prone to errors, and resource-intensive. According to the Gartner® Audit Survey, “When surveyed on their key priorities for 2024, 75% of chief audit executives (CAEs) cited audit’s ability to keep up with the fast-evolving cybersecurity landscape as their top priority — making it the most commonly cited priority.”
Introducing Audit Manager
Audit Manager can help organizations accelerate compliance efforts by providing:
Clear shared responsibility outlines: A matrix of shared responsibilities that delineates compliance duties between the cloud provider and customers, offering actionable recommendations tailored to your workloads.
Automated compliance assessments: Evaluation of your workloads against industry-standard technical control requirements in a simple and automated manner. Audit Manager already supports popular industry and regulatory frameworks including NIST 800-53, ISO, SOC, and CSA-CCM. You can see the full list of supported frameworks here.
Audit-ready evidence: Automated generation of comprehensive, verifiable evidence reports to support your compliance claims and overarching governance activity. Audit Manager provides you with a quick execution summary of compliance at a framework level and the ability to deep-dive using control-level reports.
Actionable remediation guidance: Insights to swiftly address each compliance gap that is identified.
The compliance audit journey with Audit Manager
The cloud compliance audit process involves defining responsibilities, identifying and mitigating risks, collecting supporting data, and generating a final report. This process requires collaboration between Governance, Risk, and Compliance analysts, compliance managers, developers, and auditors, each with their own specific tasks. Audit Manager streamlines this process for all involved roles, which can help simplify their work and improve efficiency.
Customer case study: Deutsche Börse Group
Deutsche Börse Group, an international stock exchange organization and innovative market infrastructure provider, began their strategic partnership with Google Cloud in 2022. Their cloud transformation journey is well under way, which brings with it the challenge of achieving and documenting compliance in their environment.
Florian Rodeit, head of cloud governance for Google Cloud at Deutsche Börse Group, first heard about Audit Manager during a session at Google Cloud Next 2024 in Las Vegas.
“The Audit Manager product promises a level of automation and audit control that has a lot of potential. At Deutsche Börse Group, we were excited to access the preview, explore the functionality further and build out a joint solution,” he said.
Following the European preview launch of Audit Manager, Deutsche Börse Group and Google Cloud set up a collaborative project to explore automating cloud controls via Audit Manager. Deutsche Börse Group had already created a comprehensive control catalog to manage their cloud control requirements across the organization. They analyzed the Cloud Security Alliance’s Cloud Controls Matrix against their written rules framework to create inputs for Audit Manager, and set out ownership and implementation guidelines for cloud-specific controls.
Now, Deutsche Börse Group can use Audit Manager to check whether any resources are configured in ways that deviate from the control framework, such as resources set up outside of approved regions. This provides automated, auditable evidence to support their specific requirements for compliant usage of Google Cloud resources.
Benjamin Möller, expert cloud governance, vice-president, Deutsche Börse Group, has been leading the collaborative project. “Moving forward, we hope that Audit Manager will allow us to automate many of our technical controls — giving us robust assurance that we are compliant, enabling us to quickly identify and rectify non-compliance, and minimizing the manual overhead of audit evidence. We are excited to continue making progress on our joint venture,” he said.
Take the next step
To use Audit Manager, access the tool directly from your Google Cloud console. Navigate to the Compliance tab in your Google Cloud console, and select Audit Manager. For a comprehensive guide on using Audit Manager, please refer to our detailed product documentation. We encourage you to share your feedback on this service to help us improve Audit Manager’s user experience.
We are thrilled to announce new capabilities that make running Dataproc Serverless even faster, easier, and more intelligent.
Elevate your Spark experience with:
Native query execution: Experience significant performance gains with the new Native query execution in the Premium tier.
Seamless monitoring with Spark UI: Track job progress in real time with a built-in Spark UI available by default for all Spark batches and sessions.
Streamlined investigation: Troubleshoot batch jobs from a central “Investigate” tab displaying all the essential metrics highlights and logs filtered by errors automatically.
Proactive autotuning and assisted troubleshooting with Gemini: Let Gemini minimize failures and autotune performance based on historical patterns. Quickly resolve issues using Gemini-powered insights and recommendations.
Accelerate your Spark jobs with native query execution
You can unlock considerable speed improvements for your Spark batch jobs in the Premium tier on Dataproc Serverless Runtimes 2.2.26+ or 1.2.26+ by enabling native query execution — no application changes required.
This new feature in the Dataproc Serverless Premium tier improved query performance by ~47% in our tests on queries derived from TPC-DS and TPC-H benchmarks.
Start now by running the native query execution qualification tool that can help you easily identify eligible jobs and estimate potential performance gains. Once you have the list of batch jobs identified for native query execution, you can enable it and have the jobs run faster and potentially save costs.
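For illustration, here’s how such a batch might be submitted with the Dataproc Python client. The property names for the native engine and Premium compute tier are assumptions to verify against the Dataproc Serverless documentation, and the project, region, bucket, and file paths are placeholders.

from google.cloud import dataproc_v1

region = "us-central1"  # placeholder region
client = dataproc_v1.BatchControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

batch = dataproc_v1.Batch(
    pyspark_batch=dataproc_v1.PySparkBatch(
        main_python_file_uri="gs://my-bucket/jobs/etl_job.py",  # placeholder
    ),
    runtime_config=dataproc_v1.RuntimeConfig(
        version="2.2",  # Runtimes 2.2.26+ (or 1.2.26+) support the feature.
        properties={
            # Assumed property names for native query execution and the
            # Premium tier; check the current documentation.
            "spark.dataproc.runtimeEngine": "native",
            "spark.dataproc.driver.compute.tier": "premium",
            "spark.dataproc.executor.compute.tier": "premium",
        },
    ),
)

operation = client.create_batch(
    parent=f"projects/my-project/locations/{region}", batch=batch
)
print("Submitted batch:", operation.result().name)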
Seamless monitoring with Spark UI
Tired of wrestling with setting up the persistent history server (PHS) clusters and maintaining them just to debug your Spark batches? Wouldn’t it be easier if you could avoid the ongoing costs of the history server and yet see the Spark UI in real-time?
Until now, monitoring and troubleshooting Spark jobs in Dataproc Serverless required setting up and managing a separate Spark persistent history server. Crucially, each batch job had to be configured to use the history server. Otherwise, the open-source UI would be unavailable for analysis for the batch job. Additionally, the open-source UI suffered from slow navigation between applications.
We’ve heard you, loud and clear. We’re excited to announce a fully managed Spark UI in Dataproc Serverless that makes monitoring and troubleshooting a breeze.
The new Spark UI is built-in and automatically available for every batch job and session in both Standard and Premium tiers of Dataproc Serverless at no additional cost. Simply submit your job and start analyzing performance in real time with the Spark UI right away.
Here’s why you’ll love the Serverless Spark UI:
Effort: With the traditional approach, you create and manage a Spark history server cluster and configure each batch job to use it. With the new Dataproc Serverless Spark UI, no cluster setup or management is required; the UI is available by default for all your batches, accessible directly from the Batch or Session details page in the Google Cloud console.
Latency: Traditionally, UI performance can degrade with increased load and requires active resource management. The new UI is responsive and automatically scales to handle even the most demanding workloads.
Availability: Traditionally, the UI is only available while the history server cluster is running. The new Spark UI remains accessible for 90 days after your batch job is submitted.
Data freshness: Traditionally, you wait for a stage to complete before its events appear in the UI. The new UI shows regularly updated data without waiting for stages to complete.
Functionality: The traditional UI is the basic open-source Spark UI. The new UI is enhanced, with ongoing improvements based on user feedback.
Cost: The traditional approach incurs ongoing costs for the PHS cluster. The new Spark UI comes at no additional charge.
Accessing the Spark UI
To gain deeper insights into your Spark batches and sessions — whether they’re still running or completed — simply navigate to the Batch Details or Session Details page in the Google Cloud console. You’ll find a “VIEW SPARK UI” link in the top right corner.
The new Spark UI provides the same powerful features as the open-source Spark History Server, giving you deep insights into your Spark job performance. Easily browse both running and completed applications, explore jobs, stages, and tasks, and analyze SQL queries for a comprehensive understanding of the execution of your application. Quickly identify bottlenecks and troubleshoot issues with detailed execution information. For even deeper analysis, the ‘Executors’ tab provides direct links to the relevant logs in Cloud Logging, allowing you to quickly investigate issues related to specific executors.
You can still use the “VIEW SPARK HISTORY SERVER” link to view the Persistent Spark History Server if you had already configured one.
A new “Investigate” tab in the Batch details screen gives you instant diagnostic highlights collected at a single place.
In the “Metrics highlights” section, the essential metrics are automatically displayed, giving you a clear picture of your batch job’s health. You can further create a custom dashboard if you need more metrics.
Below the metrics highlights, a widget “Job Logs” shows the logs filtered by errors, so you can instantly spot and address problems. If you would like to dig further into the logs, you can go to the Logs Explorer.
Proactive autotuning and assisted troubleshooting with Gemini (Preview)
Last but not least, Gemini in BigQuery can help reduce the complexity of optimizing hundreds of Spark properties in your batch job configurations while submitting the job. If the job fails or runs slow, Gemini can save the effort of wading through several GBs of logs to troubleshoot the job.
Optimize performance: Gemini can automatically fine-tune the Spark configurations of your Dataproc Serverless batch jobs for optimal performance and reliability.
Simplify troubleshooting: You can quickly diagnose and resolve issues with slow or failed jobs by clicking “Ask Gemini” for AI-powered analysis and guidance.
Sign up here for a free preview of the Gemini features and “Investigate” tab for Dataproc Serverless.
Climate change is the biggest challenge our society faces. As scientists, governments, and industry leaders gather in Baku, Azerbaijan for the 2024 United Nations Climate Change Conference, a.k.a. COP29, it’s incumbent upon all of us to find innovative solutions that can drive impact at a global scale.
The gravity of climate change requires solutions that go beyond incremental change. To find those solutions, we need the ability to make better decisions about how to approach climate mitigation and adaptation across every human activity — from transport, industry, and agriculture to communications, finance, and housing. This requires processing vast volumes of data generated by these industries. The combination of AI and cloud technologies offer the potential to unlock climate change solutions that can be both transformational and global in scale.
We already have a lot of examples that we can draw from.
Today, for example, Google Earth Engine is being used by the Forest Data Partnership, a collaboration for global monitoring of commodity-driven deforestation, to monitor every oil palm plantation around the globe, providing participating companies live early-warning signals for deforestation risks, and dramatically reducing the costs involved in forest monitoring. Similarly, NGIS is using Google Earth Engine to power TraceMark, helping businesses deliver traceability and transparency across global supply chains.
Another example is Global Fishing Watch, an international nonprofit co-founded by Google that is using geospatial analytics and AI to understand how human activity impacts the seas, global industries, climate, biodiversity and more. The datasets map global ocean infrastructure and vessels that don’t publicly broadcast their positions. This helps to advance policy conversations about offshore renewables development, provides insight into carbon dioxide emissions from maritime vessels, and enables marine protection.
It’s clear that AI can process large volumes of data, optimize complex systems, and drive the development of new business models. We see businesses harnessing the technology in the fight against climate change in four ways:
1. Measuring business performance
Businesses are using AI-powered insights to help monitor their advance towards sustainability targets, which ultimately contributes to building business resilience.
In today’s business landscape, this is of paramount importance as companies face growing demands for transparency and accountability regarding their environmental and social impact.
We are seeing cloud and AI being used to de-risk investments, improve transparency, and increase profitability through the use of large-scale datasets, machine learning, and generative AI. These technologies allow companies to analyze their ESG performance, gain insights into climate risks, and monitor supplier behaviors.
For example, Palo Alto Networks partnered with Watershed, a Google Cloud Ready – Sustainability Partner, to measure and track their carbon emissions across their entire business using Google Cloud. This partnership enabled them to gain a comprehensive understanding of their environmental impact and set actionable targets for reducing emissions.
Another example is HSBC, which developed a new credit ranking tool on Google Cloud that allows them to run multiple climate risk scenarios simultaneously. This tool empowers HSBC to make more informed investment decisions while considering the potential impact of climate change on their portfolio.
2. Optimizing operations and supply chains
Businesses are also using AI to optimize their operations and supply chains for energy and resource efficiency, as well as to cut costs.
This is crucial for companies seeking to enhance their sustainability performance while simultaneously improving their bottom line. Through the use of AI and machine learning, cloud technologies empower organizations to optimize their existing operations, improve cost efficiency, and minimize waste.
For example, Geotab, another Google Cloud Ready – Sustainability partner, is managing 75 billion data records in BigQuery for 4 million commercial fleet vehicles every day to optimize vehicle routes, increase driver safety behaviors and accelerate the path to fleet electrification.
3. Identifying cleaner business models
As the world shifts towards more sustainable practices, businesses must adapt and identify new avenues for growth. Cloud and AI are helping businesses do just that, allowing organizations to reimagine their business models, explore new markets, and create innovative products and services that align with their sustainability goals.
Recykal, for instance, has partnered with Google Cloud to build Asia’s largest circular economy marketplace. By leveraging Google Cloud’s AI and machine learning capabilities, Recykal is revolutionizing waste management and promoting sustainable practices in Asia.
Another example is Einride, a company that is reimagining freight transport by using electric, self-driving vehicles and an AI-powered platform. Their innovative approach to logistics is disrupting the transportation industry and contributing to a more sustainable future.
More recently, Climate Engine and Robeco have started using AI and geospatial technologies, combined with their scientific expertise and investment knowledge, to assess how publicly traded companies’ actions impact biodiversity. You can read their joint thought leadership paper here.
4. Building more sustainably
Finally, and very importantly, businesses want to ensure that the actual use of cloud and AI technologies doesn’t lead to increased climate impacts. From the get-go, developers need to take concrete steps towards reducing the carbon footprint and cost of their applications in the cloud.
This is why, through our Carbon Sense suite, we provide developers with the tools and resources they need to build and deploy applications in a way that minimizes their environmental impact, all while maintaining cost efficiency.
L’Oréal, for example, leverages Google Cloud’s Carbon Footprint tool to track the gross carbon emissions associated with their cloud usage. This allows L’Oréal to understand the environmental impact of their technology decisions and implement strategies to reduce their footprint.
Finally, Google takes its own carbon footprint very seriously, and is pursuing an ambitious goal to achieve net-zero emissions across all of its operations and value chain, supported by a goal to run on 24/7 carbon-free energy on every grid where it operates by 2030.
Google Cloud is committed to helping organizations of all sizes achieve their sustainability goals. With cloud, data analytics, and AI, we’re delivering new ways to build resilience, reduce costs, and unlock sustainable growth, while also accelerating the impact of organizations’ sustainability initiatives through the smarter use of data. This is an opportunity to drive tangible business results and create a more sustainable future for all.
Crafting the perfect prompt for generative AI models can be an art in itself. The difference between a useful and a generic AI response can sometimes be a well-crafted prompt. But, getting there often requires time-consuming tweaking, iteration, and a learning curve. That’s why we’re thrilled to announce new updates to the AI-powered prompt writing tools in Vertex AI, designed to make prompting easier and more accessible for all developers.
We’re introducing two powerful features designed to streamline your prompt engineering workflow: Generate prompt and Refine prompt.
Imagine you need a prompt to summarize customer reviews about your latest product. Instead of crafting the prompt yourself, you can simply tell the Generate prompt feature your goal. It will then create a comprehensive prompt, including placeholders for the reviews, which you can easily populate with your own data later. Generate prompt takes the guesswork out of prompt engineering by:
Turning simple objectives into tailor-made, effective prompts. This way, you don’t need to agonize over phrasing and keywords.
Generating placeholders for context, like customer reviews, news articles, or code snippets. This allows you to quickly add your specific data and get immediate results.
Speeding up the prompt writing process. Focus on your core tasks, not on perfecting prompt syntax.
Refine prompt: Iterate and improve with AI-powered suggestions
Once you have a prompt, either crafted by Generate prompt or one you’ve written yourself, Refine prompt helps you modify it for optimal performance. Here’s how it works:
Provide feedback: After running your prompt, simply provide feedback on the response, the same way you would critique a writer.
Instant suggestions: Vertex AI generates a new, suggested prompt in one step, taking your feedback into account.
Iterate and improve: You can accept or reject the suggestion and continue iterating by running the refined prompt and providing further feedback.
Prompt refinement boosts the quality of the prompt while saving significant time during prompt design. Quality is typically improved by augmenting the prompt instructions in ways that Gemini can better understand.
Below are some sample prompts that were revised with Refine prompt:
Original prompt: Suggest engaging lesson plan ideas for art class
Refined prompt: Suggest 3 engaging lesson plan ideas for a high school art class, each focusing on a different art form. Be concise and only include the most relevant information, such as the art form, target age group, and key activity.
Original prompt: Plan a schedule for a week with focus time and meeting time. Take in account that there are 2 teams with 6 hour delay
Refined prompt: Create a detailed weekly schedule for a team with a 6-hour time difference. The schedule should include:
Specific time blocks for focus time and meetings.
Consideration of overlapping work hours to ensure effective communication and collaboration.
A balance of individual work and team interactions.
Suggestions for time zone conversion tools or strategies to facilitate scheduling.
A powerful duo: Generate prompt meets Refine prompt
These two features work in tandem to help you craft the most effective prompt for your objective – irrespective of your skill level. Generate prompt gets you started quickly, while Refine prompt allows for iterative improvement in five steps:
Define your objective: Tell Generate prompt what you want to achieve.
Generate a prompt: Generate prompt creates a ready-to-use prompt, often with helpful placeholders for context.
Run the prompt and review the output: Execute the prompt with your chosen LLM in Vertex AI.
Refine with feedback: Use Refine prompt to provide feedback on the output and receive AI-powered suggestions for prompt improvement.
Iterate until ideal performance: Continue refining and rerunning your prompt until you achieve your desired results.
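As a sketch of step 3, here’s how you might run a generated prompt, with its placeholder filled in, against Gemini using the Vertex AI Python SDK. The project, region, model version, and prompt text are hypothetical examples.

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholders

# A prompt of the kind Generate prompt produces, with a {reviews}
# placeholder filled in with our own data.
prompt_template = (
    "Summarize the following customer reviews into three short bullet "
    "points covering praise, complaints, and suggestions:\n\n{reviews}"
)
reviews = "Great battery life. Screen scratches easily. Wish it had USB-C."

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(prompt_template.format(reviews=reviews))
print(response.text)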
How to get started
Go ahead and try out AI-assisted prompt writing through our interactive critiquing workflow. Vertex AI’s easy-to-use UI for refining prompts can be tested without setting up a Google Cloud account through this link (to demo without a Google Cloud account, be sure you are logged out of your Google account in your web browser or use incognito mode). For those with an account, you’ll have the ability to save, manage, and fine-tune your prompts.
Generative AI presents both immense opportunities and challenges for the Department of Defense (DoD). The potential to enhance situational awareness, streamline tasks, and improve decision-making is significant. However, the DoD’s unique requirements, especially their stringent security standards for cloud services (IL5), necessitate carefully crafted AI solutions that balance innovation with security.
The DoD’s 2023 Data, Analytics, and Artificial Intelligence Adoption Strategy report emphasizes the need to “strengthen the organizational environment” for AI deployment. This underscores the importance of solutions that seamlessly integrate into existing infrastructure, prioritize data security, and enable responsible and intelligent use of AI.
Google Public Sector’s 4 AI pillars: A framework for DoD AI adoption
To meet the DoD’s unique challenges, Google AI for Public Sector has focused on 4 areas when designing solutions to help empower the DoD:
Adaptive: AI solutions must seamlessly integrate into the DoD’s existing complex and evolving technology ecosystem. Google prioritizes adaptable solutions that minimize disruption and enable rapid adoption, aligning with the DoD’s focus on agile innovation.
Secure: Protecting sensitive DoD data is paramount. Google’s AI solutions are engineered with robust security measures, including Zero Trust architecture and adherence to IL5 requirements, ensuring the confidentiality and integrity of critical information.
Intelligent: Google’s AI capabilities are designed to deliver actionable insights from vast and diverse datasets. By harnessing the power of machine learning and natural language processing, our solutions enable the DoD to make data-driven decisions with greater speed and accuracy.
Responsible: Google is committed to developing and deploying AI in a responsible and ethical manner. Our AI Principles guide our research, product development, and deployment decisions, ensuring that AI is used for good and avoids harmful applications.
Breaking down data silos and delivering insights with enterprise search
Google Cloud’s solution for enterprise search is a powerful tool designed to help organizations overcome the challenges of data fragmentation. It acts as a central hub, seamlessly connecting to diverse data sources across the department, including structured and unstructured data.
Intelligent Information Retrieval: Leveraging advanced AI and natural language processing, enterprise search delivers precise and contextually relevant answers to queries, even when dealing with unstructured data like documents, images, and reports.
Seamless Integration: Federated search combined with Retrieval Augmented Generation (RAG) provides relevant query responses without the need to move data or train a custom Large Language Model (LLM).
Enhanced Transparency and Trust: The solution provides links to source documents alongside AI-generated responses, allowing users to verify information and build confidence in the system.
Robust Security: With all services used in the solution submitted for IL5 accreditation, enterprise search incorporates industry-leading security measures, including Role-Based Access Control (RBAC) and Common Access Card (CAC) compatibility, to safeguard sensitive DoD data.
Future-Proof Flexibility: The solution supports a wide range of Large Language Models (LLMs), including Google’s Gemini family of models and Gemma, our family of lightweight, state-of-the-art open models. Google offers choice, adaptability and avoids vendor lock-in, allowing the DoD to leverage the latest AI advancements without extensive redevelopment.
Google Cloud’s generative AI infused solution directly supports the DoD’s mission by consolidating data access, enhancing discoverability, and providing rapid, accurate insights, leading to improved decision-making and a strategic advantage.
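As a rough sketch of how an application might query such a data store programmatically, the snippet below uses the Vertex AI Search (Discovery Engine) Python client; the project, data store ID, and query are hypothetical placeholders.

from google.cloud import discoveryengine_v1 as discoveryengine

# Hypothetical project and data store identifiers.
serving_config = (
    "projects/my-project/locations/global/collections/default_collection/"
    "dataStores/my-datastore/servingConfigs/default_search"
)

client = discoveryengine.SearchServiceClient()
request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query="maintenance procedures for hydraulic systems",
    page_size=5,
)

# Each result links back to its source document, supporting verification
# of AI-generated answers against the underlying records.
for result in client.search(request):
    print(result.document.name)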
Google Cloud is committed to supporting the DoD’s AI journey by providing solutions that are not only powerful and innovative, but also secure, responsible, and adaptable. By empowering the DoD to harness the full potential of its data, we are helping to enable more agile, informed, and effective service members. Learn more about how Google Public Sector’s AI solutions can empower your agency and visit Google AI for Public Sector for examples of how we are helping accelerate mission impact with AI.
Welcome to the first Cloud CISO Perspectives for November 2024. Today I’m joined by Andy Wen, Google Cloud’s senior director of product management for Google Workspace, to discuss a new Google survey into the high security costs of legacy tech.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
Confronting the high security cost of legacy tech
By Phil Venables, VP, TI Security & CISO, Google Cloud, and Andy Wen, senior director, product management, Google Workspace
From a business perspective, it’s easy to understand why many organizations continue to rely on outdated technology. Replacing older systems can be expensive, but relying on them comes with hidden costs that can far outstrip the benefits.
Legacy technology can greatly increase the business and security risks that an organization will face, a serious concern given that the global average total cost of a security breach in 2024 was $4.88 million. Despite the availability of a plethora of more modern solutions, we’re still seeing too many organizations rely on defenses that were designed for the desktop era, according to a new Google Workspace global cyber security survey of more than 2,000 security and IT decision-makers.
The numbers paint a dire picture of the security impact of operating legacy systems:
71% said that legacy technology has left organizations less prepared for the future.
63% believe that their organization’s technology landscape is less secure than it was in the past.
More than 66% told us that their organizations are investing more time and money than ever in securing their environments — but still experience costly security incidents.
81% of organizations experience at least one security incident per year.
Organizations experience eight security incidents on average per year.
We know many security leaders have convinced the business to invest in more security tools, because the survey also found that 61% of organizations are using more security tools than they did two years ago. Yet while more than two-thirds of organizations are investing more time and money in securing their environments, many are still experiencing expensive security incidents.
Environments with more security tools often attempt to compensate for legacy platforms that continue to be vulnerable to security incidents. Meanwhile, 81% of security leaders believe cloud-first platforms are safer than legacy platforms.
Organizations with 10 or more security tools reported an average of 14 security incidents per year, with 34% of them spending more than $250,000 on incidents per year.
Organizations with fewer than 10 tools reported an average of six incidents per year, with 19% of them spending more than $250,000 on incidents per year.
“The solution is not more security tools, but more secure tools,” said CISA Director Jen Easterly at her mWISE Conference keynote in September.
We have also made this point often. To be truly resilient in today’s security landscape, organizations must consider an IT overhaul and rethink their strategy toward solutions with modern, secure-by-design architectures that nullify classes of vulnerabilities and attack vectors.
It may be daunting to take on an overhaul, especially for large organizations, but security leaders need to look at investing in a cloud-first solution to be resilient. The change can be made in small steps to minimize disruption and evaluate return on investment, such as using Chrome Enterprise for secure browsing and providing Google Workspace to specific teams.
The bottom line is that adopting modern technology can help eliminate entire classes of threats, as well as improve business outcomes.
We’d like to highlight three customer interactions that underscore organizational value gained by modernizing. Organizations need a centralized solution that can evolve, especially as attacks continue to increase in quantity and sophistication. We recently did some work with the cybersecurity company Trellix, which did a complete overhaul of its security infrastructure.
Trellix was running into issues where its old software stack felt stagnant and didn’t connect into new things they were doing or building. These older solutions made it hard to control where data was sitting and who was accessing it. They’ve since fully migrated to Google Workspace, adopted the Zero Trust capabilities we’ve built in, and augmented them with their own security solutions, including a security operations console, email security, and endpoint protection.
Employees can now chat, email, view files, edit documents, and join meetings from their device of choice without worrying about security and access permissions. All these capabilities live within the same platform, making it easier and simpler for security admins to oversee data safety with features like endpoint management and Zero Trust access controls in Workspace — without slowing down employee collaboration.
Similarly, the city of Dearborn, Mich., replaced its legacy email solution. After making the switch to Gmail, users noticed a meaningful decrease in spam, phishing, and malware, which helped reduce their cybersecurity risks.
Humana’s dilemma was driven by a legacy suite of desktop-based office applications that its IT team needed to spend 70% of its time maintaining. Humana’s IT team rolled out Google Workspace to 13,000 Humana employees in the field and in the office in four months, migrating 22 terabytes of data. Workspace’s built-in security features and browser-based apps saved the team time and reduced costs, and also led to a steady reduction in help desk tickets during and after rollout.
For more leadership guidance from Google Cloud experts, please see our CISO Insights hub.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Join our upcoming Security Talks to unlock the Defender’s Advantage: Our next Security Talks is coming on Nov. 19, and will focus on the Defender’s Advantage. This free, day-long virtual event is packed with insights and strategies to help you proactively secure your cloud environment. Register today.
Cyber risk top 5: What every board should know: Boards should learn about security and digital transformation to better manage their organizations. Here are the five top risks they need to know — and prepare for. Read more.
Mandatory MFA is coming to Google Cloud. Here’s what you need to know: To help keep our customers secure, starting in 2025 we will require them to use MFA when accessing Google Cloud. Read more.
Google Cloud expands CVE program: As part of our commitment to security and transparency on vulnerabilities found in our products and services, we now will issue CVEs for critical Google Cloud vulnerabilities. Read more.
Our 2025 Forecast report: Get ready for the next year in cybersecurity with our 2025 Forecast report, now available. Read more.
From AI to Zero Trust, Google Cloud Security delivers comprehensive public sector solutions: Google Cloud Security is committed to helping government agencies and organizations strengthen their defenses, and we recently made several announcements at the Google Public Sector Summit. Read more.
FedRAMP High development in the cloud: Code with Cloud Workstations: A Forrester Total Economic Impact™ (TEI) study found that Google Cloud Workstations enhance consistency, agility, and security while reducing costs and risks. Read more.
Please visit the Google Cloud blog for more security stories published this month.
(In)tuned to take-overs: Abusing Intune permissions for lateral movement and privilege escalation: Learn how the Mandiant Red Team was able to move laterally from a customer’s on-premises environment to their Microsoft Entra ID tenant and obtain privileges to compromise existing Entra ID service principals installed in the tenant. Also learn how to defend against it. Read more.
Flare-On 11 Challenge solutions: The latest Flare-On challenge is over, and it proved a doozy: Only 275 players out of 5,300 completed all 10 stages. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
Gen AI security: Unseen attack surfaces and pentesting lessons: What’s the current state of gen AI security? From common mistakes to novel attack surfaces to unique challenges, podcast hosts Anton Chuvakin and Tim Peacock discuss with Ante Gojsalic, co-founder and CTO, SplxAI, today’s gen AI security concerns and their potential impact on tomorrow’s tech. Listen here.
Get the Google Security Operations perspective on SIEM and security data lakes: What’s a disassembled SIEM, and why you should care: Travis Lanham, uber tech lead for Security Operations Engineering, Google Cloud, goes SIEM-deep with Anton and Tim. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in two weeks with more security-related updates from Google Cloud.
It’s an exciting time in the world of data and analytics, with more organizations harnessing the power of data and AI to help transform and grow their businesses. But in a threat landscape with increasingly sophisticated attacks around every corner, ensuring the security and integrity of that data is critical.
Google Cloud offers a comprehensive suite of tools to help protect your data while unlocking its potential. In our new ebook, Building a Secure Data Platform with Google Cloud, we dig into the many data security capabilities within Google Cloud and share how they can help support data-based innovation strategies.
Take a peek inside the ebook, then download the full version here.
Unlock data platform-level security with BigQuery
BigQuery, Google Cloud’s unified data platform, offers a robust set of integrated security features to help you safeguard your data. The platform automatically encrypts all data at rest, which provides a foundational layer of defense against unauthorized access. For data sharing, BigQuery Analytics Hub and data clean rooms allow you to efficiently, securely, and easily share data across organizational boundaries. The platform also includes Dataplex, which enables you to implement comprehensive policies to govern how data is accessed, used, and shared within your organization.
Shield assets with granular access controls and guardrails
With Cloud Identity and Access Management (IAM), you can manage access to critical data across BigQuery, Cloud Run, Cloud Run functions, and Google Kubernetes Engine (GKE) resources. Organization restrictions place further limits on which users can access resources in your organization. Combined with Cloud IAM, this feature supports your organization policies and helps you maintain a secure perimeter around your Google Cloud environment.
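As one illustrative example of such a control, granting a user read-only access to BigQuery data can be a single policy binding. The project ID and user below are placeholders, not values from this post:

# Grant read-only access to BigQuery data in a project
gcloud projects add-iam-policy-binding <your_project_id> \
    --member="user:analyst@example.com" \
    --role="roles/bigquery.dataViewer"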
Create security boundaries with perimeter and data protection
Google Cloud offers several ways to reinforce your perimeter. VPC Service Controls help prevent data exfiltration from cloud resources, providing precise control over access and movement by external entities or by insiders.
Meanwhile, with Sensitive Data Protection, you can identify and classify your sensitive data within BigQuery, which can help you implement targeted protection measures such as masking, tokenization, and redaction. You can also gain even more granular control over your encryption keys with Customer-managed encryption keys for BigQuery.
Strengthen data security posture with automated monitoring and compliance
Establishing robust security controls for your data is essential for improving your security posture, but it’s just as important to monitor your environment for threats and maintain compliance with industry standards. Security Command Center gives you a comprehensive view of your security posture with direct visibility into your BigQuery datasets. With Cloud Logging, you can collect, store, and analyze logs to gain insights into system activities, detect anomalies, and respond to security incidents. Assured Workloads further simplifies compliance, providing peace of mind that you’ve established strong baseline controls and compliant configurations.
All-in-one data security with integrated solutions from Google Cloud
Building a secure data ecosystem requires a multi-layered approach. With comprehensive security features from Google Cloud, you can safeguard your sensitive data, comply with industry regulations, and discover the full potential of your data. Dive deeper into these tools, solutions, and strategies in the full ebook — Building a Secure Data Platform with Google Cloud — to ensure the safety and integrity of your organization’s most valuable asset. Download the full version here.
As open-source large language models (LLMs) become increasingly popular, developers are looking for better ways to access new models and deploy them in the cloud. That’s why Cloud Run now offers fully managed NVIDIA GPUs, which remove the complexity of driver installations and library configurations. This means you’ll benefit from the same on-demand availability and effortless scalability that you love with Cloud Run’s CPU and memory, with the added power of NVIDIA GPUs. When your application is idle, your GPU-equipped instances automatically scale down to zero, optimizing your costs.
In this blog post, we’ll guide you through deploying the Meta Llama 3.2 1B Instruction model on Cloud Run. We’ll also share best practices to streamline your development process using local model testing with Text Generation Inference (TGI) Docker image, making troubleshooting easy and boosting your productivity.
Why Cloud Run with GPU?
There are four critical reasons developers benefit from deploying open models on Cloud Run with GPU:
Fully managed: No need to worry about drivers, libraries, or infrastructure.
On-demand scaling: Scale up or down automatically based on demand.
Cost effective: Only pay for what you use, with automatic scaling down to zero when idle.
Performance: NVIDIA GPUs deliver optimized inference performance for models like Meta Llama 3.2.
Initial Setup
First, create a Hugging Face token.
Second, check that your Hugging Face token has permission to access and download the Llama 3.2 model weights here. Keep your token handy for the next step.
Third, use Google Cloud’s Secret Manager to store your Hugging Face token securely. In this example, we will be using Google user credentials. You may need to authenticate the gcloud CLI, set a default project ID, enable the necessary APIs, and grant access to Secret Manager and Cloud Storage.
# Authenticate CLI
gcloud auth login

# Set default project
gcloud config set project <your_project_id>

# Create new secret key, remember to update <your_huggingface_token>
gcloud secrets create HF_TOKEN --replication-policy="automatic"
echo -n <your_huggingface_token> | gcloud secrets versions add HF_TOKEN --data-file=-

# Retrieve the key
HF_TOKEN=$(gcloud secrets versions access latest --secret="HF_TOKEN")
Local debugging
Install the huggingface_hub Python package (which provides the huggingface-cli tool) in your virtual environment.
Run huggingface-cli login to set up a Hugging Face credential.
Use the TGI Docker image to test your model locally, so you can iterate and debug before deploying it to Cloud Run.
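For example, a minimal local test might look like the following sketch. It requires a local NVIDIA GPU, and the image tag, environment variable name, and port mapping are assumptions rather than values from this post:

# Run the TGI container locally with the Llama 3.2 1B Instruct model
docker run --gpus all --shm-size 1g -p 8080:80 \
    -e HF_TOKEN=$HF_TOKEN \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id meta-llama/Llama-3.2-1B-Instruct

# Send a test prompt to the local endpoint
curl http://localhost:8080/generate \
    -H "Content-Type: application/json" \
    -d '{"inputs": "What is Cloud Run?", "parameters": {"max_new_tokens": 50}}'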
Now, we will create a new Cloud Run service using a deployment script like the one below. (Remember to update BUCKET_NAME; you may also need to update the network and subnet names.)
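Here is a hedged sketch of such a deployment script. The service name, region, machine shape, and volume layout are our assumptions, not the exact script from this post; BUCKET_NAME, the network, and the subnet are the placeholders you need to update:

gcloud beta run deploy llama-tgi \
    --image=ghcr.io/huggingface/text-generation-inference:latest \
    --region=us-central1 \
    --gpu=1 --gpu-type=nvidia-l4 --no-cpu-throttling \
    --cpu=8 --memory=32Gi --max-instances=1 \
    --set-secrets=HF_TOKEN=HF_TOKEN:latest \
    --add-volume=name=models,type=cloud-storage,bucket=<BUCKET_NAME> \
    --add-volume-mount=volume=models,mount-path=/models \
    --network=<your_network> --subnet=<your_subnet> \
    --args=--model-id=meta-llama/Llama-3.2-1B-Instruct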
New solutions, old problems. Artificial intelligence (AI) and large language models (LLMs) are here to signal a new day in the cybersecurity world, but what does that mean for us—the attackers and defenders—and our battle to improve security through all the noise?
Data is everywhere. For most organizations, the access to security data is no longer the primary issue. Rather, it is the vast quantities of it, the noise in it, and the disjointed and spread-out nature of it. Understanding and making sense of it—THAT is the real challenge.
When we conduct adversarial emulation (red team) engagements, making sense of all the network, user, and domain data available to us is how we find the path forward. From a defensive perspective, efficiently finding the sharpest and most dangerous needles in the haystack—for example, easily accessible credentials on fileshares—is how we prioritize, improve, and defend.
How do you make sense of this vast amount of structured and unstructured data, and give yourself the advantage?
Data permeates the modern organization. This data can be challenging to parse, process, and understand from a security implication perspective, but AI might just change all that.
This blog post will focus on a number of case studies where we obtained data during our complex adversarial emulation engagements with our global clients, and how we innovated using AI and LLM systems to process this into structured data that could be used to better defend organizations. We will showcase the lessons learned and key takeaways for all organizations and highlight other problems that can be solved with this approach for both red and blue teams.
Approach
Data parsing and understanding is one of the biggest early benefits of AI. We have seen many situations where AI can help process data at a fast rate. Throughout this post, we use an LLM to process unstructured data, meaning that the data did not have a structure or format that we knew about before parsing the data.
If you want to try these examples out yourself, please make sure you use either a local model, or you have permission to send the data to an external service.
Getting Structured Data Out of an LLM
Step one is to get the data into a format we can use. If you have ever used an LLM, you will have noticed that it outputs its answer as prose, especially in chat-based versions. For a lot of use cases, this is fine; however, we want to analyze the data, so we need structured output. Thus, the first problem we have to solve is getting the LLM to output data in a format we can specify. The simple method is to ask the LLM to output the data in a machine-readable format like JSON, XML, or CSV. However, you will quickly notice that you have to be quite specific about the data format, and the LLM can easily output data in another format, ignoring your instructions.
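To make that failure mode concrete, here is a minimal sketch. The call_llm helper is hypothetical, standing in for any chat completion call:

import json

def parse_llm_json(response_text: str) -> dict | None:
    # Naive approach: hope the model returned bare JSON. In practice the
    # model often wraps it in prose or markdown fences, and parsing fails.
    try:
        return json.loads(response_text)
    except json.JSONDecodeError:
        return None

# call_llm is a hypothetical helper standing in for any LLM API call:
# response_text = call_llm("Suggest a pet as JSON with keys pet_type and name.")
print(parse_llm_json('Sure! Here is my suggestion: {"pet_type": "dog"}'))  # None
print(parse_llm_json('{"pet_type": "dog", "name": "Rex"}'))  # parsed dict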
Luckily for us, other people have encountered this problem and have solved it with something called Guardrails. One of the projects we have found is called guardrails-ai. It is a Python library that allows you to create guardrails—specific requirements—for a model based on Pydantic.
To illustrate, take a simple Python class from the documentation to validate a pet from the output of the LLM:
from pydantic import BaseModel, Field

class Pet(BaseModel):
    pet_type: str = Field(description="Species of pet")
    name: str = Field(description="a unique pet name")
You can use the next code from the Guardrails documentation to process the output of the LLM into a structured object:
from guardrails import Guard
import openai

prompt = """
What kind of pet should I get and what should I name it?

${gr.complete_json_suffix_v2}
"""

guard = Guard.from_pydantic(output_class=Pet, prompt=prompt)

raw_output, validated_output, *rest = guard(
    llm_api=openai.completions.create,
    engine="gpt-3.5-turbo-instruct"
)

print(validated_output)
If we look at what this library generates under the hood for this prompt, we see that it adds a structured-object section with instructions for the LLM to output data in a specific way. This streamlines how you can get structured data from an LLM.
For the next use case, we will show the Pydantic models we’ve created to process the output.
Red Team Use Cases
The next sections contain some use cases where we can use an LLM to get structured data out of data obtained. The use cases are divided into three categories of the attack lifecycle:
Initial Reconnaissance
Escalate Privileges
Internal Reconnaissance
Initial Reconnaissance
Open Source Intelligence (OSINT) is an important part of red teaming. It includes gathering data about the target organization from news articles, social media, and corporate reports.
This information can then be used in other red team phases such as during phishing. For defenders, it helps them understand which parts of their organization are exposed to the internet, anticipating a possible future attack. In the next use case, we talk about processing social media information to process roles and extract useful information.
Use Case 1: Social Media Job Functions Information
During OSINT, we often try to get information from employees about their function in their company. This helps with performing phishing attacks, as we do not want to target IT professionals, especially those that work in cybersecurity.
Social media sites allow their users to write about their job titles in a free format. This means that the information is unstructured and can be written in any language and any format.
We can try to extract the information from the title with simple matches; however, because the users can fill in anything and in any language, this problem can be better solved with an LLM.
Data Model
First, we create a Pydantic model for the Guardrail:
class RoleOutput(BaseModel):
    role: str = Field(description="Role being analyzed")
    it: bool = Field(description="The role is related to IT")
    cybersecurity: bool = Field(description="The role is related to CyberSecurity")
    experience_level: str = Field(
        description="Experience level of the role.",
    )
This model has two Boolean fields that indicate whether the role is related to IT or cybersecurity. Additionally, we would like to know the experience level of the role.
Prompt
Next, let’s create a prompt to instruct the LLM to extract the requested information from the role. This prompt is quite simple and just asks the LLM to fill in the data.
Given the following role, answer the following questions.
If the answer doesn't exist in the role, enter ``.
${role}
${gr.complete_xml_suffix_v2}
The last two lines are placeholders used by guardrails-ai.
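To show how the pieces fit together, here is a hedged sketch of the batch loop. The custom LLM wrapper pattern is documented by guardrails-ai, but the exact client code and model wiring in our engagement may have differed, and loading the job_titles list is out of scope here:

import google.generativeai as genai
from guardrails import Guard

genai.configure(api_key="<your_api_key>")
gemini = genai.GenerativeModel("gemini-1.0-pro")

def gemini_api(prompt: str, **kwargs) -> str:
    # Custom LLM wrapper in the style supported by guardrails-ai:
    # take a prompt, return the raw model output as a string.
    return gemini.generate_content(prompt).text

guard = Guard.from_pydantic(output_class=RoleOutput, prompt=prompt)

results = []
for title in job_titles:  # the scraped titles from social media
    raw_output, validated_output, *rest = guard(
        llm_api=gemini_api, prompt_params={"role": title}
    )
    if validated_output:
        results.append(validated_output)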
Results
To test the model, we scraped the titles that employees use on social media. The dataset contained 235 entries. For testing, we used the gemini-1.0-pro model.
Gemini managed to parse 232 entries. The results are shown in Table 1.
                                            Not IT    IT    Cybersecurity
Gemini                                         183    49                5
Manual evaluation by a red team operator       185    47                5
False positive                                   1     3                0
Table 1: Results of Gemini parsing 232 job title entries
In the end, Gemini processed the roles roughly on par with a human. Most of the false positives were questionable because it was not clear whether the role was actually IT related. The experience level did not perform well: the model deemed the experience level “unknown” or “none” for most of the entries. To resolve this issue, we changed the field so that the experience level should be a number from 1 to 10. Running the analysis again yielded better results for the experience level. The lowest experience levels (1–4) contained function titles like “intern,” “specialist,” or “assistant,” which usually indicated that the person had been employed in that role for a shorter period of time. The updated data model is shown as follows:
class RoleOutput(BaseModel):
    role: str = Field(description="Role being analyzed")
    it: bool = Field(description="The role is related to IT")
    cybersecurity: bool = Field(description="The role is related to CyberSecurity")
    experience_level: int = Field(
        description="Estimate of the experience level of the role on "
        "a scale of 1-10. Where 1 is low experience and 10 is high.",
    )
This approach helped us sort through a large dataset of phishing targets by identifying employees without IT or cybersecurity roles and ranking them by experience level. This can speed up target selection for large organizations and may allow us to better emulate attackers by changing the prompts or selection criteria.
Defending against this kind of data analysis is more difficult. In theory, you could instruct all your employees to include “Cybersecurity” in their role, but that does not scale well or solve the underlying phishing problem. The best approach with regard to phishing is, in our experience, to invest in phishing-resistant multifactor authentication (MFA) and application allowlisting. If applied well, these solutions can mitigate phishing attacks as an initial access vector.
Escalate Privileges
Once attackers establish a foothold in an organization, one of their first acts is often to improve their level of access or control through privilege escalation. Quite a few methods can be used for this: some are local and system-based while others are domain-wide, with some based on exploits or misconfigurations and others on finding sensitive information when searching through files.
Our focus will be on the final aspect, which aligns with our challenge of identifying the desired information within the vast amount of data, like finding a needle in a haystack.
Use Case 2: Credentials in Files
After gaining initial access to the target network, one of the more common enumeration methods employed by attackers is to perform share enumeration and try to locate interesting files. There are quite a few tools that can do this, such as Snaffler.
After you identify files that potentially contain credentials, you can go through them manually to find useful ones. However, if you do this in a large organization, there is a chance that you will have hundreds to thousands of hits. In that case, there are some tools that can help with finding and classifying credentials like TruffleHog and Nosey Parker. Additionally, the Python library detect-secrets can help with this task.
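As a point of reference, detect-secrets can be driven from Python in just a few lines, following its documented API; the path below is a hypothetical collected file:

import json

from detect_secrets import SecretsCollection
from detect_secrets.settings import default_settings

secrets = SecretsCollection()
with default_settings():  # use the default set of secret detectors
    secrets.scan_file("loot/config.ini")  # hypothetical collected file

# Dump any findings (detector type, file, line number) as JSON
print(json.dumps(secrets.json(), indent=2))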
Most of these tools look for common patterns or file types that they understand. To cover unknown file types or credentials in emails or other formats, it might instead be valuable to use an LLM to analyze the files to find any unknown or unrecognized formats.
Technically, we can just run all tools and use a linear regression model to combine the results into one. An anonymized example of a file with a password that we encountered during our tests is shown as follows:
@Echo Off
Net Use /Del * /Yes
Set /p Path=<"path.txt"
Net Use %Path% Welcome01@ /User:CHAOS.LOCAL\WorkstationAdmin
If Not Exist "C:\Data" MKDIR "C:\Data"
Copy %Path%\. C:\Data
Timeout 02
Data Model
We used the following Python classes to instruct Gemini to retrieve credentials with an optional domain. One file can contain multiple credentials, so we use a list of credentials to instruct Gemini to optionally retrieve multiple credentials from one file.
from typing import Optional

class Credential(BaseModel):
    password: str = Field(description="Potential password of an account")
    username: str = Field(description="Potential username of an account")
    domain: Optional[str] = Field(
        description="Optional domain of an account", default=""
    )

class ListOfCredentials(BaseModel):
    credentials: list[Credential] = []
Prompt
In the prompt, we give some examples of what we are looking for, and instruct the model to output JSON once again:
Given the following file, check if there are credentials in the file.
Only include results if there is at least one username and password.
If the domain doesn't exist in the file, enter `` as a default value.
${file}
${gr.complete_xml_suffix_v2}
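A sketch of the per-file loop follows, reusing the gemini_api wrapper from the earlier sketch. The directory name is hypothetical, and in guardrails-ai the validated output arrives as a dictionary matching the Pydantic schema:

from pathlib import Path

guard = Guard.from_pydantic(output_class=ListOfCredentials, prompt=prompt)

findings = {}
for path in Path("loot").rglob("*"):  # hypothetical directory of collected files
    if not path.is_file():
        continue
    raw_output, validated_output, *rest = guard(
        llm_api=gemini_api,
        prompt_params={"file": path.read_text(errors="ignore")},
    )
    if validated_output and validated_output["credentials"]:
        findings[str(path)] = validated_output["credentials"]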
Results
We tested on 600 files, of which 304 contain credentials and 296 do not. Testing occurred with the gemini-1.5 model, and each file took about five seconds to process.
To compare results with other tools, we also tested Nosey Parker and TruffleHog. Both are built to find credentials in files, including repositories, in a structured way; their usual use case is known file formats and randomly structured files.
The results are summarized in Table 2.
Tool            True Negative    False Positive    False Negative    True Positive
Nosey Parker    284 (47%)        12 (2%)           136 (23%)         168 (28%)
TruffleHog      294 (49%)        2 (<1%)           180 (30%)         124 (21%)
Gemini          278 (46%)        18 (3%)           23 (4%)           281 (47%)
Table 2: Results of testing for credentials in files, where 304 contain them and 296 do not
In this context, the definitions of true negative, false positive, false negative, and true positive are as follows:
True Negative: A file does not contain any credentials, and the tool correctly indicates that there are no credentials.
False Positive: The tool incorrectly indicates that a file contains credentials when it does not.
False Negative: The tool incorrectly indicates that a file does not contain any credentials when it does.
True Positive: The tool correctly indicates that a file contains credentials.
In conclusion, Gemini finds the most files with credentials, at the cost of a slightly higher false positive rate. TruffleHog has the lowest false positive rate, but also finds the fewest true positives. This is to be expected, as a higher true positive rate is usually accompanied by a higher false positive rate. The current dataset has an almost equal number of files with and without credentials; in real-world scenarios this ratio can differ wildly, which means the false positive rate still matters even though the percentages are quite close.
To optimize this approach, you can use all three tools, combine the output signals to a single signal, and then sort the potential files based on this combined signal.
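A minimal sketch of that combination step is below. The weights are illustrative, not tuned values from our testing:

def combined_score(hits: dict[str, bool]) -> float:
    """Fuse per-tool detections into a single ranking signal."""
    weights = {"nosey_parker": 0.3, "trufflehog": 0.3, "gemini": 0.4}
    return sum(weight for tool, weight in weights.items() if hits.get(tool))

# Example: review the highest-scoring files first
files = {
    "backup_script.bat": {"nosey_parker": True, "trufflehog": False, "gemini": True},
    "readme.txt": {"nosey_parker": False, "trufflehog": False, "gemini": False},
}
for name in sorted(files, key=lambda f: combined_score(files[f]), reverse=True):
    print(f"{combined_score(files[name]):.1f}  {name}")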
Defenders can, and should, use the same techniques previously described to enumerate the internal file shares and remove or limit access to files that contain credentials. Make sure to check what file shares each server and workstation exposes to the network, because in some cases file shares are exposed accidentally or were forgotten about.
Internal Reconnaissance
When attackers have gained a better position in the network, the next step in their playbooks is understanding the domain in which they have landed so they can construct a path to their ultimate goal. This could be full domain control or access to specific systems or users, depending on the threat actor’s mission. From a red team perspective, we need to be able to emulate this. From a defender’s perspective, we need to find these paths before the attackers exploit them.
The main tool that red teamers use to analyze Active Directory is BloodHound, which uses a graph database to find paths in the Active Directory. BloodHound is executed in two steps. First, an ingester retrieves the data from the target Active Directory. Second, this data is ingested and analyzed by BloodHound to find attack paths.
Some tools that can gather data to be used in BloodHound are:
SharpHound
BloodHound.py
RustHound
ADExplorer
BOFHound
SOAPHound
These tools gather data from the Active Directory and other systems and output it in a format that BloodHound can read. In theory, if we have all the information about the network in the graph, then we can just query the graph to figure out how to achieve our objective.
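For instance, a shortest-path query against BloodHound’s Neo4j database can be run from Python with the official neo4j driver; the connection details and object names below are placeholders:

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))

# Shortest attack path from a given user to Domain Admins, if one exists
query = """
MATCH p = shortestPath((u:User {name: $user})-[*1..]->(g:Group {name: $group}))
RETURN p
"""
with driver.session() as session:
    for record in session.run(query, user="ALICE@TEST.LOCAL",
                              group="DOMAIN ADMINS@TEST.LOCAL"):
        print(record["p"])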
To improve the data in BloodHound, we have thought of additional use cases. Use Case 3 is about finding high-value systems. Discovering more hidden edges in BloodHound is part of Use Case 4 and Use Case 5.
Use Case 3: High-Value Target Detection in Active Directory
By default, BloodHound deems some groups and computers as high value. One of the main activities in internal reconnaissance is figuring out which systems in the client’s network are high-value targets. Some examples of systems that we are interested in, and that can lead to domain compromise, are:
Backup systems
SCCM
Certificate services
Exchange
WSUS systems
There are many ways to indicate which servers are used for a certain function, and it depends on how the IT administrators have configured it in their domain. There are some fields that may contain data in various forms to indicate what the system is used for. This is a prime example of unstructured data that might be analyzable with an LLM.
The following fields in the Active Directory might contain the relevant information:
name
sAMAccountName
description
distinguishedName
servicePrincipalName (SPNs)
Data Model
In the end, we would like to have a list of names of the systems the LLM has deemed high value. During development, we noticed that LLM results improved dramatically if you asked it to specify a reason. Thus, our Pydantic model looks like this:
class HighValueSystem(BaseModel):
    name: str = Field(description="Name of this system")
    reason: str = Field(
        description="Reason why this system is high value", default=""
    )

class HighValueResults(BaseModel):
    systems: list[HighValueSystem] = Field(
        description="high value systems", default=[]
    )
Prompt
In the prompt, we give some examples of what kind of systems we are looking for:
Given the data, identify which systems are high value targets,
look for: sccm servers, jump systems, certificate systems, backup
systems and other valuable systems. Use the first (name) field to
identify the systems.
Results
We tested this prompt on a dataset of 400 systems and executed it five times. All systems were sent in one query to the model. To accommodate this, we used the gemini-1.5 model because it has a huge context window. Here are some examples of the reasons Gemini provided, and what we think each reason was based on:
Domain controller: Looks like this was based on the “OU=Domain Controllers” distinguishedname field of BloodHound
Jumpbox: Based on the “OU=Jumpboxes,OU=Bastion Servers” distinguishedname
Lansweeper: Based on the description field of the computer
Backup Server: Based on “OU=Backup Servers” distinguishedname
Some of the high-value targets are valid but already known, like domain controllers. Others are good finds, like the jumpbox and backup servers. This method can process system names in other languages and more verbose descriptions of systems to determine which systems may be high value. Additionally, this method can be adapted to a more specific query that might suit a different client environment, for example:
Given the data, identify which systems are related to
SWIFT. Use the first (name) field to identify the systems.
In this case, the LLM will look for SWIFT servers and may save you some time searching for them manually. This approach can potentially work even better when you combine this data with internal documentation, giving you results even if the Active Directory information lacks any detail about a system’s usage.
For defenders, there are some ways to deal with this situation:
Limit the amount of information in the Active Directory and put the system descriptions in your documentation instead
Limit the amount of information a regular user can retrieve from the Active Directory
Monitor LDAP queries to see if a large amount of data is being retrieved from LDAP
Use Case 4: User Clustering
After gaining an initial strong position and understanding the systems in the network, attackers often need to find the right users to compromise to gain further privileges in the domain. For defenders, legacy user accounts and administrators with too many rights are a common security issue.
Administrators often have multiple user accounts: one for normal operations like reading email and using it on their workstations, and one or multiple administrator accounts. This separation is done to make it harder for attackers to compromise the administrator account.
There are some common flaws in the implementations that sometimes make it possible to bypass these separations. Most of the methods require the attacker to cluster the users together to see which accounts belong to the same employee. In many cases, this can be done by inspecting the Active Directory objects and searching for patterns in the display name, description, or other fields. To automate this, we tried to find these patterns with Gemini.
Data Model
For this use case, we would like to have the account names that Gemini clusters together. During initial testing, the results were quite random. However, after adding a “reason” field, the results improved dramatically. So we used the following Pydantic model:
# Account is a Pydantic model, not shown in the original post,
# describing a single account entry
class User(BaseModel):
    accounts: list[Account] = Field(
        description="accounts that probably belongs to this user",
        default=[]
    )
    reason: str = Field(
        description="Reason why these accounts belong to this user",
        default=""
    )

class UserAccountResults(BaseModel):
    users: list[User] = Field(
        description="users with multiple accounts", default=[]
    )
Prompt
In the prompt, we describe the clustering we are looking for:
Given the data, cluster the accounts that belong to a single person
by checking for similarities in the name, displayname and sam.
Only include results that are likely to be the same user. Only include
results when there is a user with multiple accounts. It is possible
that a user has more than two accounts. Please specify a reason
why those accounts belong to the same user. Use the first (name)
field to identify the accounts.
Results
The test dataset had about 900 users. We manually determined that some users have two to four accounts with various permissions. Some of these accounts followed the same pattern, like “user@test.local” and “adm-user@test.local.” However, for other accounts, the admin account name was based on the first letters of the employee’s first and last names. For example, a main account had the pattern matthijs.gielen@test.local, and the admin account was named adm-magi@test.local. To keep track of those accounts, the description of the admin account contained text similar to “admin account of Matthijs Gielen.”
With this prompt, Gemini clustered 50 groups of accounts in our dataset. After manual verification, some of the results were discarded because they contained only one account in the cluster, leaving 43 correct clusters of accounts. We found the same correlations manually; however, where Gemini output this information in a couple of minutes, the manual analysis and correlation took quite a bit longer. This information was used in preparation for further attacks, as shown in the next use case.
Use Case 5: Correlation Between Users and Their Machines
Knowing which users to target or defend is often not enough. We also need to find them within the network in order to compromise them. Domain administrators are (usually) physical people; they need somewhere to type in their commands and perform administrative actions. This means that we need to correlate which domain administrator is working from which workstation. This is called session information, and BloodHound uses this information in an edge called “HasSession.”
In the past, it was possible to get all session information with a regular user during red teaming.
Using the technique in Use Case 4, we can correlate the different user accounts that one employee may have. The next step is to figure out which workstation belongs to that employee. Then we can target that workstation, and from there, hopefully recover the passwords of their administrator accounts.
In this case, employees have corporate laptops, and the company needs to keep track of which laptop belongs to which employee. Often this information is stored in one of the fields of the computer object in the Active Directory. However, there are many ways to do this, and using Gemini to parse the unstructured data is one such example.
Data Model
This model is quite simple: we just want to correlate machines to their users, and we have Gemini give us a reason why, to improve the output of the model. Because we will send all users and all computers at once, we need a list of results.
class UserComputerCorrelation(BaseModel):
    user: str = Field(description="name of the user")
    computer: str = Field(description="name of the computer")
    reason: str = Field(
        description="Reason why these accounts belong to this user",
        default=""
    )

class CorrelationResults(BaseModel):
    results: list[UserComputerCorrelation] = Field(
        description="users and computers that correlate", default=[]
    )
Prompt
In the prompt, we describe the correlation we are looking for:
Given the two data sets, find the computer that correlates
to a user by checking for similarities in the name, displayname
and sam. Only include results that are likely to correspond.
Please specify a reason why that user and computer correlates.
Use the first (name) field to identify the users and computers.
Results
The dataset used contains around 900 users and 400 computers. During the assignment, we determined that the administrators correlated users and their machines via the description field of the computer, which roughly matched the display name of the user. Gemini correctly picked up this connection, correlating around 120 users to their respective laptops (Figure 3).
Gemini helped us to select an appropriate workstation, which enabled us to perform lateral movement to a workstation and obtain the password of an administrator, getting us closer to our goal.
To defend against these threats, it can be valuable to run tools like BloodHound in the network. As discussed, BloodHound might not find all the “hidden” edges in your network, but you can add these yourself to the graph. This will allow you to find more Active Directory-based attack paths that are possible in your network and mitigate these before an attacker has an opportunity to exploit those attack paths.
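A hedged sketch of adding such an edge yourself, using correlations like those from Use Case 5, is shown below; the names and credentials are placeholders:

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))

# Record a session edge that no ingester picked up, so BloodHound's
# path finding can take it into account.
query = """
MATCH (c:Computer {name: $computer})
MATCH (u:User {name: $user})
MERGE (c)-[:HasSession]->(u)
"""
with driver.session() as session:
    session.run(query, computer="WS001.TEST.LOCAL", user="ADM-MAGI@TEST.LOCAL")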
Conclusion
In this blog post, we looked at processing red team data using LLMs to aid in adversarial emulation or improving defenses. These use cases were related to processing human-generated, unstructured data. Table 3 summarizes the results.
Roles
Accuracy: High. There were a few false positives that were in the gray area.
Usefulness: High. Especially when going through a large list of user roles, this approach provides fairly fast results.
Credentials in files
Accuracy: High. Found more credentials than comparable tools; more testing should look into the false positive rate in real scenarios.
Usefulness: Medium. This approach finds a lot more results; however, processing with Gemini is a lot slower (five seconds per file) than many other alternatives.
High-value targets
Accuracy: Medium. Not all results were new, nor were all high-value targets.
Usefulness: Medium. Some of the results were useful; however, all of them still require manual verification.
Account clustering
Accuracy: High. After discarding the clusters with only one account, the remaining ones were well clustered.
Usefulness: High. Clustering users manually is usually a tedious process; this gives fairly reliable results if you filter out the clusters with only one account.
Computer correlation
Accuracy: High. All results correctly correlated users to their computers.
Usefulness: High. This approach produces accurate results, potentially providing insights into additional possible attack paths.
Table 3: The results of our experiments of data processing with Gemini
As the results show, using an LLM like Gemini can help in converting this type of data into structured data to aid attackers and defenders. However, keep in mind that LLMs are not a silver bullet and have limitations. For example, they can sometimes produce false positives or be slow to process large amounts of data.
There are quite a few use cases we have not covered in this blog post. Some other examples where you can use this approach are:
Correlating user groups to administrator privileges on workstations and servers
Summarizing internal website content or documentation to search for target systems
Ingesting documentation to generate password candidates for cracking passwords
The Future
This was just an initial step that we on the Advanced Capabilities team of the Mandiant Red Team have taken in using LLMs for adversarial emulation and defense. For next steps, we know the models and prompts can be improved by testing variations in the prompts, and other data sources can be investigated to see if Gemini can help analyze them. We are also looking at using linear regression models, as well as clustering and pathfinding algorithms, to enable cybersecurity practitioners to quickly evaluate attack paths that may exist in a network.