Editor’s note: In the heart of the fintech revolution, Current is on a mission to transform the financial landscape for millions of Americans living paycheck to paycheck. Founded on the belief that everyone deserves access to modern financial tools, Current is redefining what it means to be a financial institution in the digital age. Central to their success is a cloud-native infrastructure built on Google Cloud, with Spanner, Google’s globally distributed database with virtually unlimited scale, serving as the bedrock of their core platform.
More than 100 million Americans struggle to make ends meet, including the 23% of low-income Americans the Federal Reserve estimates do not have a bank account. Current was created to address their needs with a unique business model focused on payments, rather than the deposits and withdrawals of traditional financial institutions. We offer an easily accessible experience designed to make financial services available to all Americans, regardless of age or income.
Our innovative approach — built on proprietary banking core technology with minimal reliance on third-party providers — enables us to rapidly deploy financial solutions tailored to our members’ immediate needs. More importantly, these solutions are flexible enough to evolve alongside them in the future.
In our mission to deliver an exceptional experience, one of the biggest challenges we faced was creating a scalable and robust technological foundation for our financial services. To address this, we developed a modern core banking system to power our platform. Central to this core is our user graph service, which manages all member entities — such as users, products, wallets, and gateways.
Many unbanked and disadvantaged Americans go without bank accounts as much from a lack of trust in institutions as from any lack of funds. If we were going to win their trust and business, we knew we had to offer a secure, seamless, and reliable service.
A cloud-native core with Spanner
Our previous self-hosted graph database solution lacked cloud-native capabilities and horizontal scalability. To address these limitations, we strategically transitioned to managed persistence layers, which significantly improved our risk posture. Features like point-in-time restore and multi-regional redundancy enhanced our resilience, reduced our recovery time objective (RTO), and improved our recovery point objective (RPO). Additionally, push-button scaling optimized our cloud budget and operational efficiency.
This cloud-native platform necessitated a database with consistent writes, horizontal scalability, low read latency under load, and multi-region failover. Given our extensive use of Google Cloud, we prioritized its database offerings, and Spanner emerged as the ideal solution, fulfilling all of these requirements. Its seamless scalability — particularly the decoupling of compute and storage resources — proved invaluable in adapting to our dynamic consumer environment.
This robust and scalable infrastructure empowers Current to deliver reliable and efficient financial services, which is critical for building and maintaining member trust. We are the primary financial relationship for millions of Americans who trust us with their money week after week. Our experience migrating from a third-party database to Spanner proved that transitioning to a globally scalable, highly available database can be seamless.
Our strategic migration to Spanner employed a write-ahead commit log to ensure a seamless transition. By migrating reads first and verifying their accuracy before shifting writes, we minimized risk and maximized efficiency: we transitioned reads to Spanner service by service, confirmed accuracy, and only then migrated writes, resulting in a zero-downtime, zero-loss cutover.
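As a rough illustration of this pattern (not Current's actual code), here is a minimal sketch of the read-verification phase, assuming hypothetical legacy_db and spanner_db client objects:

```python
# Minimal sketch of dual-read verification during a database migration.
# legacy_db and spanner_db are hypothetical clients; in the real pattern,
# a write-ahead commit log keeps Spanner in sync while reads are verified.
import logging

def verified_read(entity_id, legacy_db, spanner_db):
    legacy_row = legacy_db.get(entity_id)    # source of truth before cutover
    spanner_row = spanner_db.get(entity_id)  # shadow read for comparison
    if spanner_row != legacy_row:
        # Mismatches are logged; writes cut over only once these reach zero.
        logging.warning("Mismatch for %s: %r != %r",
                        entity_id, legacy_row, spanner_row)
    return legacy_row  # keep serving from the legacy store until cutover
```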
Ultimately, our Spanner-powered user graph service delivered the consistency, reliability, and scalability essential for our financial platform. It renewed our confidence in our ability to serve millions of customers reliably, and gave us new headroom to scale our existing services and future offerings.
Unwavering Reliability and Enhanced Operational Efficiency
Spanner has dramatically improved our resilience, reducing RTO and RPO by more than 10x, cutting times to just one hour. With Spanner’s streamlined data restoration process, we can now recover data with a few simple clicks. Offloading operational management has also significantly decreased our team’s maintenance burden. With nearly 5,000 transactions per second, we continue to be impressed by Spanner’s performance and scalability.
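As a sketch of what that restore flow can look like programmatically (this assumes the google-cloud-spanner Python client rather than the console, and all IDs are placeholders):

```python
# Sketch: back up and restore a Spanner database with google-cloud-spanner.
# Project, instance, and database IDs are placeholders.
import datetime
from google.cloud import spanner

client = spanner.Client(project="my-project")
instance = client.instance("my-instance")
database = instance.database("my-database")

# Create a backup that expires in 14 days.
expire_time = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=14)
backup = instance.backup("my-backup", database=database, expire_time=expire_time)
backup.create().result()  # waits on the long-running operation

# Restore the backup into a new database.
restored = instance.database("my-restored-db")
restored.restore(source=backup).result()
```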
Additionally, since migrating to Spanner, we have reduced our availability-related incidents to zero. Such incidents could disrupt essential banking functions like accessing funds or making payments, leading to customer dissatisfaction and potential churn, as well as increased operational costs for issue resolution. Elimination of these occurrences is critical for building and maintaining member trust, enhancing retention, and improving the developer experience.
Building Financial Resilience with Google Cloud
Looking ahead, we envision a future where our platform continues to evolve, delivering innovative financial solutions that meet the ever-changing needs of our members. With Spanner as the foundation of our core platform — you could call it the core of cores — we are confident in building a resilient and reliable platform that enables millions more Americans to improve their financial outcomes.
In today’s congested digital landscape, businesses of all sizes face the challenge of optimizing their marketing budgets. They must find ways to stand out amid the bombardment of messages vying for potential customers’ attention. Moreover, they grapple with rising customer acquisition costs and dwindling retention rates, impeding their profitability.
Adding to this complexity is the abundance of consumer data, which businesses often struggle to harness effectively to target the right audience. To address these challenges, companies are seeking data-driven approaches to enhance their advertising effectiveness, to help ensure their continued relevance and profitability.
Moloco offers AI-powered advertising solutions that drive user acquisition, retention, and monetization efforts. Moloco Ads, its demand-side platform (DSP), utilizes its customers’ unique first-party data, helping them to target and acquire high-value users based on real-time consumer behavior — ultimately, delivering higher conversion rates and return on investment.
To meet this demand, Moloco leverages predictions from a dozen deep neural networks, while continuously designing and evaluating new models. The platform ingests 10 petabytes of data per day and processes bid requests at a peak rate of 10.5 million queries per second (QPS).
Moloco has seen tremendous growth over the last three years, with its business growing over 8X and multiple customers spending more than $50 million annually. Moloco’s rapid growth required an infrastructure that could handle massive data processing and real-time ML predictions while remaining cost effective. As Moloco’s models grew in complexity, training times increased, hindering productivity and innovation. Separately, the Moloco team realized that they also needed to optimize serving efficiency to scale low-latency ad experiences for users across the globe.
Training complex ML models with GKE
After evaluating multiple cloud providers and their solutions, Moloco opted for Google Cloud for its scalability, flexibility, and robust partner ecosystem. The infrastructure provided by Google Cloud aligned with Moloco’s requirements for handling its rapidly growing data and machine learning workloads, which are instrumental to optimizing customers’ advertising performance.
Google Kubernetes Engine (GKE) was a primary reason for Moloco selecting Google Cloud over other cloud providers. As Moloco discovered, GKE is more than a container orchestration tool; it’s a gateway to harnessing the full potential of AI and ML. GKE provides scalability and performance optimization tools to meet diverse ML workloads, and supports a wide range of frameworks, allowing Moloco to customize the platform according to their specific needs.
GKE serves as the foundation for a unified AI/ML platform, integrating with other Google Cloud services to facilitate a robust environment for the data processing and distributed computing that underpin Moloco’s complex AI and ML tasks. GKE’s ML data layer offers the high-throughput storage solutions that are crucial for read-heavy workloads, and features like the cluster autoscaler, node auto-provisioner, and pod autoscalers ensure efficient resource allocation.
“Scaling our infrastructure as Moloco’s Ads business grew exponentially was a huge challenge. GKE’s autoscaling capabilities enabled the engineering team to focus on development without spending a ton of effort on operations.” – Sechan Oh, Director of Machine Learning, Moloco
Shortly after migrating to Google Cloud, Moloco began using GKE for model training. However, Moloco quickly found that traditional CPUs were not competitive at its scale, in terms of both cost and velocity. GKE’s ability to autoscale on multi-host Tensor Processing Units (TPUs), Google’s specialized processors for machine learning workloads, was critical to Moloco’s success: it allowed Moloco to harness TPUs at scale, significantly improving training speed and efficiency.
Moloco further leveraged GKE’s AI and ML capabilities to optimize the management of its compute resources, minimizing idle time and generating cost savings while improving performance. Notably, GKE empowered Moloco to scale its ML infrastructure to accommodate exponential business growth without straining its engineering team. This enabled Moloco’s engineers to concentrate on developing AI and ML software instead of managing infrastructure.
“The GKE team collaborated closely with us to enable autoscaling for multi-host TPUs, which is a recently added feature. Their help has really enabled amazing performance on TPUs, reducing our cost per training job by 2-4 times.” – Kunal Kukreja, Senior Machine Learning Engineer, Moloco
In addition to training models on TPUs, Moloco also uses GPUs on GKE to deploy ML models into production. This lets the Moloco platform handle real-time inference requests effectively and benefit from GKE’s scalability and operational stability, enhancing performance and supporting more complex models.
Moloco collaborated closely with the Google Cloud team throughout the implementation process, leveraging their expertise and guidance. The Google Cloud team supported Moloco in implementing solutions that ensured a smooth transition and minimal disruption to operations. Specifically, Moloco worked with the Google Cloud team to migrate its ML workloads to GKE using the platform’s autoscaling and pod prioritization capabilities to optimize resource utilization and cost efficiency. Additionally, Moloco integrated Cloud TPUs into its training pipeline, resulting in significantly reduced training times for complex ML models. Furthermore, Moloco optimized its serving infrastructure with GPUs, ensuring low-latency ad experiences for its customers.
A powerful foundation for ML training and inference
Moloco’s collaboration with Google Cloud profoundly transformed its capacity for innovation.
“By harnessing Google Cloud’s solutions, such as GKE and Cloud TPU, Moloco dramatically reduced ML training times by up to tenfold.” – Sechan Oh, Director of Machine Learning, Moloco
This in turn facilitated swift model iteration and experimentation, empowering Moloco’s engineers to innovate with unprecedented speed and efficiency. Moreover, the scalability and performance of Google Cloud’s infrastructure enabled Moloco to manage increasingly intricate models and expansive datasets, to create and implement cutting-edge machine learning solutions. Notably, Moloco’s low-latency advertising experiences, bolstered by GPUs, fostered enhanced customer satisfaction and retention.
Moloco’s success demonstrates the power of Google Cloud’s solutions to help businesses achieve their full potential. By leveraging GKE, Cloud TPU, and GPUs, Moloco was able to scale its infrastructure, accelerate its ML training, and deliver exceptional ad experiences to its customers. As Moloco continues to grow and innovate, Google Cloud will remain a critical partner in its success.
Meanwhile, GKE is transforming the AI and ML landscape by offering a blend of scalability, flexibility, cost-efficiency, and performance. And Google Cloud continues to invest in GKE so it can handle even the most demanding AI training workloads. For example, GKE now supports 65,000-node clusters, offering unmatched scale for training or inference. For more, watch this demo of 65,000 nodes on a single GKE cluster.
Based on your feedback, Partner Summit 2025 will begin on Tuesday, April 8 – one day before Google Cloud Next kicks off – to offer a dedicated day of partner breakout sessions and learning opportunities before the main event begins. The Partner Summit Lounge, partner keynote, lightning talks, and more will all be available April 9–11, 2025.
Partner Summit is your exclusive opportunity to:
Accelerate your business by aligning on joint business goals, learning about new programmatic and incentive opportunities, and diving deep into cutting-edge insights in our Partner Summit breakout sessions and lightning talks.
Build new connections as you network with other partners and Googlers while you explore the activities and perks located in our exclusive Partner Summit Lounge.
Get a look at what’s next from Google Cloud leadership at the dedicated partner keynote to learn about where cloud is headed – and how our partners are central to our mission.
Make the most of our partnership with personalized advice from Google Cloud team members on incentives, certifications, co-marketing, and more at our Meet the Experts booths.
Get ready to learn, connect, and build the future of business with us. Early bird registration is now open for $999. This special rate is only available through February 14, 2025, or until tickets are sold out.
Google Cloud Next returns to Las Vegas, April 9–11, 2025*, and I’m thrilled to share that registration is now live! This past April, we welcomed 30,000 attendees to the largest flagship conference in Google Cloud history, and 2025 will be even bigger and better.
Join us for an unforgettable week of hands-on experiences, inspiring content, and problem-solving with our top partners, and seize the opportunity to learn from top experts and peers tackling the same challenges you face day in and day out. Walk away with new ideas, breakthrough skills, and actionable knowledge only available at Google Cloud Next 2025.
Early bird registration is now available for just $999 for a limited time**.
Here’s why you need to be at Next:
Experience AI in Action: Immerse yourself in the latest technology; build your next agent; explore our demos, hackathons, and workshops; and learn how others are harnessing the power of AI to propel their businesses to new heights.
Forge Powerful Connections: Network with peers, industry experts, and the brightest minds in tech to exchange ideas, spark collaborations, and shape the future of your industry.
Build and Learn Live: With a wealth of demos and workshops, hackathons, keynotes, and deep dives, Next is the place to be for the builders, dreamers, and doers shaping the future of technology.
* Select programming to take place in the afternoon of April 8. ** Space is limited, and this offer is only valid through 11:59 PM PT on February 14, 2025, or until tickets are sold out.
Through our collaboration, the Air Force Research Laboratory (AFRL) is leveraging Google Cloud’s cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities to tackle complex challenges across various domains, from materials science and bioinformatics to human performance optimization. AFRL, the center for scientific research and development for the U.S. Air Force and Space Force, is embracing the transformative power of AI and cloud computing to accelerate its mission of developing and transitioning advanced technologies to the air, space, and cyberspace forces.
This collaboration not only enhances AFRL’s research capabilities, but also aligns with broader Department of Defense (DoD) initiatives to integrate AI into critical operations, bolster national security, and maintain technological advantage by demonstrating game-changing technologies that enable technical superiority and help the Air Force adopt cutting-edge technologies as soon as they are released. By harnessing Google Cloud’s scalable infrastructure, comprehensive generative AI offerings, and collaborative environment, the AFRL is driving innovation and ensuring the U.S. Air Force and Space Force remain at the forefront of technological advancement.
Let’s delve into examples of how the AFRL and Google Cloud are collaborating to realize the benefits of AI and cloud services:
Bioinformatics breakthroughs: The AFRL’s bioinformatics research was once hindered by time-consuming manual processes and data bottlenecks: delays in moving and sharing data, getting access to US-based tools, using standard storage and hardware, and establishing the right system communications and integrations across third-party infrastructure. As a result, cross-team collaboration and experiment expansion were severely limited and inefficiently tracked. Despite having very little cloud experience, the team created a siloed environment using Google Cloud infrastructure such as Google Compute Engine, Cloud Workstations, and Cloud Run to build analytic pipelines that helped them test, store, and analyze data in an automated, streamlined way. That data pipeline automation paved the way for further exploration and expansion on a use case that had never been done before.
Web app efficiency for lab management: The AFRL’s complex lab equipment scheduling process made it challenging to provide scalable, secure access to important content and information for users in different labs. To mitigate these challenges and ease maintenance for non-programmer researchers and lab staff, the team built a custom web application on Google App Engine, integrated with Google Workspace and Apps Script, to capture usage metrics for future hardware investment decisions and automate admin tasks that were taking time away from research. The result: significantly faster changes without administrator intervention, a variety of self-service options for users to schedule time on equipment and request training, and an enhanced, scalable design architecture with built-in SSO that helped streamline internal content for multiple labs.
Modeling insights into human performance: Understanding and optimizing human performance is critical for the AFRL’s mission. The FOCUS Mission Readiness App, built on Google Cloud, utilizes infrastructure services such as Cloud Run, Cloud SQL, and GKE, and integrates with the Garmin Connect APIs to collect and analyze real-time data from wearables.
By leveraging BigQuery and other Google Cloud analytics tools, the app provides personalized insights, fatigue interventions, and predictions that help improve cognitive effectiveness and overall well-being for Airmen.
Streamlined AI model development with Vertex AI: The AFRL wanted to replicate the functionality of university HPC clusters, especially since a diverse set of users needed extra compute and not everyone was trained on these tools. They wanted an easy GUI and persistent, active connections where they could develop AI models and test their research with confidence. They leveraged Google Cloud’s Vertex AI and Jupyter notebooks through Workbench, along with Compute Engine, Cloud Shell, Cloud Build, and more, to get a head start in creating a pipeline for sharing, ingesting, and cleaning their code. Access to these resources created a flexible environment for researchers to develop and test models at an accelerated pace.
Cloud capabilities and AI/ML tools provide a flexible and adaptable environment that empowers our researchers to rapidly prototype and deploy innovative solutions. It’s like having a toolbox filled with powerful AI building blocks that can be combined to tackle our unique research challenges.
Dr. Dan Berrigan
Air Force Research Laboratory
The AFRL’s collaboration with Google Cloud exemplifies how AI and cloud services can be a driving force behind innovation, efficiency, and problem-solving across agencies. As the government continues to invest in AI research and development, collaborations like this will be crucial for unlocking the full potential of AI and cloud computing, ensuring that agencies across the federal landscape can leverage these transformative technologies to create a more efficient, effective, and secure future for all.
Learn more about how we’ve helped government agencies accelerate their mission and impact with AI.
Watch the Google Public Sector Summit On Demand to gain crucial insights on the critical intersection of AI and Security in the public sector.
Written by: Ilyass El Hadi, Louis Dion-Marcil, Charles Prevost
Executive Summary
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security (AppSec) expertise enables organizations to better simulate the tactics and techniques of modern adversaries. This includes:
Leveraging minimal access for maximum impact: High-privilege escalation is often unnecessary. Red Team objectives can frequently be achieved with limited access, highlighting the importance of securing all internet-facing assets.
Recognizing the potential of low-impact vulnerabilities through vulnerability chaining: Low- and medium-impact vulnerabilities can be exploited in combination to achieve significant impact.
Developing your own exploits: Skilled adversaries or consultants will invest the time and resources to reverse-engineer and/or find zero-day vulnerabilities in the absence of public proof-of-concept exploits.
Employing diverse skill sets: Red Team members should include individuals with a wide range of expertise, including AppSec.
Fostering collaboration: Combining diverse skill sets can spark creativity and lead to more effective attack simulations.
Integrating AppSec throughout the engagement: Offensive application security contributions can benefit Red Teams at every stage of the project.
By embracing this approach, organizations can proactively defend against a constantly evolving threat landscape, ensuring a more robust and resilient security posture.
Introduction
In today’s rapidly evolving threat landscape, organizations find themselves engaged in an ongoing arms race against increasingly sophisticated cyber criminals and nation-state actors. To stay ahead of these adversaries, many organizations turn to Red Team assessments, simulating real-world attacks to expose vulnerabilities before they are exploited. However, many traditional Red Team assessments typically prioritize attacking network and infrastructure components, often overlooking a critical aspect of modern attack surfaces: web applications.
This gap hasn’t gone unnoticed by cyber criminals. In recent years, industry reports consistently highlight the evolving trend of attackers exploiting public-facing application vulnerabilities as a primary entry point into organizations. This aligns with Mandiant’s observations of common tactics used by threat actors, as observed in our 2024 M-Trends Report: “In intrusions where the initial intrusion vector was identified, 38% of intrusions started with an exploit. This is a six percentage point increase from 2022.”
The 2024 M-Trends Report also documents that 28.7% of Initial Compromise access is obtained through exploiting public-facing web applications (MITRE T1190).
At Mandiant, we recognize this gap and are committed to closing it by integrating AppSec expertise into our Red Team assessments. This optional approach is offered to customers who wish to increase the coverage of their external perimeters to gain a deeper understanding of their security posture. While most infrastructure typically receives a considerable amount of security scrutiny, web applications and edge devices often lack the same level of consideration, making them prime targets for attackers.
This integrated approach is not limited to full-scope Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
The Role of Application Security in Red Team Assessments
The integration of AppSec specialists into Red Team assessments manifests in a unique staffing approach. The role of this specialist is to augment the Red Team’s capabilities with the ever-evolving exploitation techniques used by adversaries to breach organizations from the external perimeter.
The AppSec specialist will often get involved as early as possible in an engagement, even during the scoping and early planning stages. They perform a meticulous review of the target perimeter, mapping out the application inventory and identifying vulnerabilities within the various components of web applications and application programming interfaces (APIs) exposed to the internet.
While examination is underway, Red Team operators concurrently focus on other crucial aspects of the assessment, including infrastructure preparation, crafting convincing phishing campaigns, developing and refining tools, and creating effective payloads that will evade the target environment’s controls and defense mechanisms.
Once an AppSec vulnerability of critical impact is discovered, the team will generally proceed to its exploitation, notifying our primary point of contact of our preliminary findings and validating the potential impacts of our discovery. It is important to note that a successful finding doesn’t always result in a direct foothold in the target environment. The intelligence gathered through the extensive reconnaissance and perimeter review phase can be repurposed for various aspects of the Red Team mission. This could include:
Identifying valuable reconnaissance targets or technologies to fine-tune a social engineering campaign
Further tailoring an attack payload
Establishing a temporary foothold that might lead to further exploitation
Hosting malicious payloads for later stages of the attack simulation
Once the external perimeter examination phase is complete, our Red Team operators will begin carrying out the remaining mission objectives, empowered with the AppSec team’s insights and intelligence, including identified vulnerabilities and associated exploits. Even though the Red Team operators perform most of the remaining activities at this point, the AppSec consultants stay close to the engagement and often step in to further support internal exploitation efforts. For example, applications that are only accessible internally generally receive far less scrutiny and are consequently assessed much less frequently than externally accessible assets.
By incorporating AppSec expertise, we’ve achieved a significant increase in engagements where our Red Team gained a decisive advantage during a customer’s external perimeter review, such as obtaining a foothold or gaining access to confidential information. This approach translates to a more realistic and valuable assessment for our customers, ensuring comprehensive coverage of both network and application security risks. By uncovering and addressing vulnerabilities across the entire attack surface, Mandiant empowers organizations to proactively defend against a wide array of threats, strengthening their overall security posture.
Case Studies: Demonstrating the Impact of Application Security Support
In this section, we focus on four of the multiple real-world scenarios where the support of Mandiant’s AppSec Team has significantly enhanced the effectiveness of Red Team assessments. Each case study highlights the attack vectors, the narrative behind the attack, key takeaways from the experience, and the associated assumptions and misconceptions.
These case studies highlight the value of incorporating application security support in Red Team engagements, while also offering valuable learning opportunities that promote collaboration and knowledge sharing.
Unlocking the Vault: Exposed API Key to Sensitive Internal Document Access
Context
A company in the energy sector engaged Mandiant to assess the efficiency of its cybersecurity team’s abilities in detection, prevention, and response. Because the organization had grown significantly in the past years following multiple acquisitions, Mandiant suggested an increased focus on their external perimeter. This would allow the organization to measure the subsidiaries’ external security posture, compared to the parent organization’s.
Target of Interest
Following a thorough reconnaissance phase, the AppSec Team began examination of a mobile application developed by the customer for its business partners. Once the mobile application was decompiled, a hardcoded API key granting unauthorized access to an external API service was discovered. Leveraging the API key, authenticated reconnaissance on the API service was conducted, which led to the discovery of a significant vulnerability within the application’s PDF generation feature: a full-read Server-Side Request Forgery (SSRF), enabled through HTML injection.
Vulnerability Identification
During the initial reconnaissance phase, the team observed that numerous internal systems’ hostnames were publicly accessible through certificate transparency logs. With that in mind, the objective was to exploit the SSRF vulnerability to determine if any of these internal systems were reachable via the external API service. Eventually, one such host was identified: a commercial ASP.NET document management solution. Once the solution’s name and version were identified, the AppSec Team searched for known vulnerabilities online. Among the findings was a recent CVE entry regarding insecure ViewState deserialization, which included details about the affected dynamic-link library (DLL) name.
Exploitation
With no public exploit proof-of-concepts available, the team searched for the DLL without success until the file was found in VirusTotal’s public corpus. The DLL was then decompiled into C# code, revealing the vulnerable function, which provided all the necessary components for a successful exploitation. Next, the application security consultants leveraged the post-authentication SSRF vector to exploit the ViewState deserialization vulnerability, affecting the internal application. This attack chain led to a reliable foothold into the parent organization’s internal network.
Takeaways
The organization’s demilitarized zone (DMZ) was now breached, and the remote access could be passed off to the Red Team operators. This enabled the operators to perform lateral movement into the network and achieve various predetermined objectives. However, the customer expressed high satisfaction with the demonstrated impact prior to lateral movement, especially since the application server housed numerous sensitive documents. This underscores a common misconception that exploiting the external perimeter must necessarily result in facilitating lateral movement within the internal network. Yet, the impact was evident even before lateral movement, simply by gaining access to the customer’s sensitive data.
Breaking Barriers: Blind XSS as a Gateway to Internal Networks
Context
A company operating in the technology industry engaged Mandiant for a Red Team assessment. This company, with a very mature security program, requested that no phishing be performed because they were already conducting numerous internal phishing and vishing exercises. They highlighted that all previous Red Team engagements had relied heavily on various social engineering methods, and the success rate was consistently low.
Target of Interest
During the external reconnaissance efforts, the AppSec Team identified multiple targets of interest, such as a custom-built customer relationship management (CRM) solution. Leveraging the Wayback Machine on the CRM hostname, a legacy endpoint was discovered, which appeared obsolete but still accessible without authentication.
Vulnerability Identification
Despite not being accessible through the CRM’s user interface, the endpoint contained a functional form to request support. The AppSec Team injected a blind cross-site scripting (XSS) payload into the form, which loaded an external JavaScript file containing post-exploitation code. When successful, this method allows an adversary to temporarily hijack the targeted user’s browser tab, allowing attackers to perform actions on behalf of the user. Moments later, the team received a notification that the payload successfully executed within the context of a user browsing an internal customer support administration panel.
The AppSec Team analyzed the exfiltrated Document Object Model (DOM) to further understand the payload’s execution context and assess the data accessible within this internal application. The analysis revealed references to Apache Tapestry version 3, a framework initially released in 2004. Shortly after identifying the internal application’s framework, Mandiant deployed a local Tapestry v3 instance to identify potential security pitfalls. Through code review, Mandiant discovered a zero-day deserialization vulnerability in the core framework, which led to remote code execution (RCE). The Apache Software Foundation assigned CVE-2022-46366 to this RCE.
Exploitation
The zero-day, which affected the internal customer support application, was exploited by submitting an additional blind XSS payload. Crafted to trigger upon form submission, the payload autonomously executed in an employee’s browser, exploiting the internal application’s deserialization flaw. This led to a crucial foothold within the client’s infrastructure, enabling the Red Team to progress with their lateral movement until all objectives were successfully accomplished.
Takeaways
This real-world scenario highlights a common misconception that cross-site scripting holds minimal relevance in Red Team assessments. The significance and impact of this particular attack vector in this case study were evident: it acted as a gateway, breaching the external network and leveraging an employee’s internal network position as a proxy to exploit the internal application. Mandiant had not previously identified XSS vulnerabilities on the external perimeter, which further highlights how the security posture of the external perimeter can be much more robust than that of the internal network.
Logger Danger: From Log Files to Unauthorized Cloud Access
Context
An organization in the transportation sector engaged Mandiant to perform a Red Team assessment, with the goal of emulating an initial access broker (IAB) threat group, focused on breaching externally exposed systems and services. Those groups, who typically resell illegitimate access to compromised victims’ environments, were previously identified as a significant threat to the organization by the Google Threat Intelligence (GTI) team while building a threat profile to help support assessment activities.
Target of Interest
Among hundreds of external applications identified during the reconnaissance phase, one stood out: a commercial Java-based supply chain management solution hosted in the cloud. This application drew additional attention upon discovery of an online forum post describing its installation procedures. Within the post, a link to an unlisted YouTube video was shared, offering detailed installation and administration guidance. Upon reviewing the video, the AppSec Team noted the URL for the application’s trial installer, still accessible online despite not being referenced or indexed anywhere else.
Following installation and local deployment, an administration manual was available within the installation folder. This manual contained a section for a web-based performance monitor plugin that was deployed by default with the application, along with its default credentials. The plugin’s functionality included logging performance metrics and stack traces locally in files upon encountering unhandled errors. Furthermore, the plugin’s endpoint name was uniquely distinct, making it highly unlikely to be discovered with conventional directory brute-forcing methods.
Vulnerability Identification
The AppSec Team successfully logged into the organization’s performance monitor plugin by using the default credentials sourced from the administration manual and resumed local testing to identify post-authentication vulnerabilities. Conducting code review in parallel with manual testing, a log management feature was identified, which allowed authenticated users to manipulate log filenames and directories. The team also observed they could induce errors through targeted, malformed HTTP requests. In conjunction with the log filename manipulation, it was possible to force arbitrary data to be stored at an arbitrary file location on the underlying server’s file system.
Exploitation
The strategy involved intentionally triggering exceptions, which the performance monitor would then log in an attacker-defined Jakarta Server Pages (JSP) file within the web application’s root directory. The AppSec Team crafted an exploit that injected arbitrary JSP code into an HTTP request’s parameter, forcing the performance monitor to log errors into the attacker-controlled JSP file. Upon accessing the JSP log file, the injected code executed, enabling Mandiant to breach the customer’s cloud environment and access thousands of sensitive logistics documents.
Takeaways
A common assumption that breaches should lead to internal on-premises network access or to Active Directory compromise was challenged in this case study. While lateral movement was constrained by time, the primary objective was achieved: emulating an initial access broker. This involved breaching the cloud environment, where the client lacked visibility compared to its internal Active Directory network, and gaining access to business-critical crown jewels.
Collaborative Intrusion: Webhooks to CI/CD Pipeline Access
Context
A company in the automotive sector engaged Mandiant to perform a Red Team assessment, with the goal of obtaining access to their continuous integration and continuous delivery/deployment (CI/CD) pipeline. Due to the sheer number of externally exposed systems, the AppSec Team was staffed to support the Red Team’s reconnaissance and breaching efforts.
Target of Interest
Most of the interesting applications redirected to the customer’s single sign-on (SSO) provider. However, one application behaved differently. By querying the Wayback Machine, the team uncovered an endpoint that did not redirect to the SSO. Instead, it presented a blank page with a unique favicon. To identify the application’s underlying technology, the favicon’s hash was calculated and queried using Shodan. The results returned many other live applications sharing the same favicon. Interestingly, some of these applications operated independently of SSO, aiding the team in identifying the application’s name and vendor.
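For context, favicon fingerprinting is typically done by hashing the icon the same way Shodan does. A minimal sketch of the idea, assuming the third-party mmh3 and requests packages and a placeholder URL:

```python
# Sketch: compute the MurmurHash3 value used by Shodan's favicon filter.
# The target URL is a placeholder for illustration.
import base64

import mmh3
import requests

resp = requests.get("https://example.com/favicon.ico", timeout=10)
favicon_b64 = base64.encodebytes(resp.content)  # Shodan hashes the base64 form
print(f"http.favicon.hash:{mmh3.hash(favicon_b64)}")  # paste into Shodan search
```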
Vulnerability Identification
Once the application’s name was identified, the team visited the vendor’s website and accessed their public API documentation. Among the API endpoints, one stood out—it could be directly accessed on the customer’s application without redirection to the SSO. This API endpoint did not require authentication and only took an incremental numerical ID as its parameter’s value. Upon querying, the response contained sensitive employee information, including email addresses and phone numbers. The team systematically iterated through the API endpoint, incrementing the ID parameter to compile a comprehensive list of employee email addresses and phone numbers. However, the Red Team refrained from leveraging this data, as another intriguing application was discovered. This application exposed a feature that could be manipulated into sending fully user-controlled emails from the company’s no-reply@ email address.
Capitalizing on these vulnerabilities, the Red Team initiated a phishing campaign, successfully gaining a foothold in the customer’s network before the AppSec Team could identify an external breach vector. As internal post-exploitation continued, the application security consultants shifted their focus to supporting the Red Team’s efforts within the internal network.
Exploitation
Digging into network shares, the Red Team found credentials for a developer account on an enterprise source control application. The AppSec Team sifted through reconnaissance data and flagged that the same source control application server was exposed externally. The credentials were successfully used to log in, as multi-factor authentication was absent for this user. Within the GitHub interface, the team uncovered a pre-defined webhook linked to the company’s internal Jenkins—an integration commonly employed to facilitate communication between source control systems and CI/CD pipelines. Leveraging this discovery, the team created a new webhook. When manually triggered by the team, this webhook would perform an SSRF to internal URLs. This eventually led to the exploitation of an unauthenticated Jenkins sandbox bypass vulnerability (CVE-2019-1003030), and ultimately to remote code execution, effectively compromising the organization’s CI/CD pipeline.
Takeaways
In this case study, the efficacy of collaboration between the Red Team and the AppSec Team was demonstrated. Leveraging insights gathered collectively, the teams devised a strategic plan to achieve the main objective set by the customer: accessing its CI/CD pipelines. Moreover, we challenged the misconception that singular critical vulnerabilities are indispensable for reaching objectives. Instead, we revealed the reality where achieving goals often requires innovative detours. In fact, a combination of vulnerabilities or misconfigurations, whether they are discovered by the AppSec Team or the Red Team, can be strategically chained together to accomplish the mission.
Conclusion
As this blog post demonstrated, the integration of application security expertise into Red Team assessments yields significant benefits for organizations seeking to understand and strengthen their security posture. By proactively identifying and addressing vulnerabilities across the entire attack surface, including those commonly overlooked by traditional approaches, businesses can minimize the risk of breaches, protect critical assets, and hopefully avoid the financial and reputational damage associated with successful attacks.
This integrated approach is not limited to Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security expertise enables organizations to better simulate the tactics and techniques of modern adversaries.
Google Cloud is delighted to announce the opening of our 41st cloud region in Querétaro, Mexico. This marks our third cloud region in Latin America, joining Santiago, Chile, and São Paulo, Brazil. From Querétaro, we’ll provide fast, reliable cloud services to businesses and public sector organizations throughout Mexico and beyond. This new region offers low latency, high performance, and local data residency, empowering organizations to innovate and accelerate digital transformation initiatives.
Helping organizations in Mexico thrive in the cloud
Google Cloud regions are major investments to bring best-in-class infrastructure, cloud and AI technologies closer to customers. Enterprises, startups, and public sector organizations can leverage Google Cloud’s infrastructure economy of scale and global network to deliver applications and digital services to their end users.
With this new region in Querétaro, Mexico, Google Cloud customers enjoy:
Speed: Serve your end users with fast, low-latency experiences, and transfer large amounts of data between networks easily across Google’s global network.
Security: Keep your organizations’ and customers’ data secure and compliant, including meeting the requirements of CNBV contractual frameworks, and maintain local data residency.
Capacity: Scale to meet growing user and business needs.
Sustainability: Reduce the carbon footprint of your IT environment and help meet sustainability targets.
Google Cloud customers are eager to benefit from the new possibilities that this cloud region offers:
“At Prosa, we have been undergoing a transformation process for the past three years that involves adopting technology and developing digital skills within our teams. The partnership with Google has been key to carrying out projects, evolving towards digital business models, enabling the ecosystem, promoting the API-ification of services, and improving data analysis. This alliance is only deepened with the launch of the new Google Cloud region, which will facilitate the integration of participants into the payment ecosystem in a secure and highly available manner, improving the customer experience and delivering value more quickly and agilely,” said Salvador Espinosa, CEO of Prosa, a payment technology company that processed more than 10 million transactions in 2023.
The new Google Cloud region in Querétaro, Mexico is also welcomed by the Mexican public sector.
“The new Google cloud region in Mexico will be key to build a digital government accountable to citizens, deepening our path to digital transformation. Since 2018, the Auditoria Superior de la Federación (ASF) has pioneered digital transformation in Mexico, promoting innovation and the responsible use of technology, while using advanced technologies like Google Cloud’s Vertex AI, among other proprietary tools, to enhance data analysis, automate processes, and improve collaboration. This enables more accurate decision-making, optimized oversight of public spending, increased inspection coverage, and transparent use of resources. Thanks to the cloud, we see a future where technology is a strategic ally to execute efficient, agile and exhaustive digital audits, detect irregularities early, and strengthen accountability. ASF’s focus on transparency and efficiency aligns with President Claudia Sheinbaum’s public innovation policy.” – Emilio Barriga Delgado, Special Auditor of Federalized Expenditure, Auditoria Superior de la Federación
The new cloud region also opens new opportunities for our global ecosystem of over 100,000 incredibly diverse partners.
“For Amarello and our customers, the availability of a new region in Mexico demonstrates the great growth of Google Cloud and its commitment to Mexico. It’s also a great milestone for the country, putting us on par with other economies. This will create jobs that will speed up our clients’ adoption of strategic projects and latency-sensitive technological services such as financial services or mission-critical operations. At the same time, the new region will enable projects that require information to be maintained within the national territory, now on the most innovative and secure public cloud.” – Mauricio Sánchez Valderrama, managing partner, Amarello Tecnologías de Información
And for global companies looking to tap into the Mexican market:
“As networks shift to a cloud-first approach, and hybrid work enables work from anywhere, businesses in the Mexico region can now securely accelerate innovation, boost efficiency, and enhance customer experiences with Palo Alto Networks AI-powered solutions, like Prisma SASE, built in the cloud to secure the cloud at scale. The powerful collaboration between Google Cloud and Palo Alto Networks reinforces our commitment to security and innovation so organizations can confidently embrace the AI-driven future, knowing their users, data, and applications are protected from evolving threats.” – Anupam Upadhyaya, Vice President, Product Management, Palo Alto Networks
Delivering on our commitment to Latin America
In 2022, we announced a five-year, $1.2 billion commitment to Latin America, focusing on four key areas: digital infrastructure, digital skills, entrepreneurship, and inclusive, sustainable communities.
We’re equally committed to creating new career opportunities for people in Mexico and Latin America: We’re working with over 550 universities across Latin America to offer a robust and continuously updated portfolio of learning resources so students can seize the opportunities created by new digital technologies like AI and the cloud. As a result, we’ve already granted more than 14,000 digital skill badges to students and individual developers in Mexico over the last 24 months.
Another example of our commitment is the “Súbete a la nube” program that we created in partnership with the Inter-American Development Bank (IDB), with a focus on women and the southern region of the country. To date, 12,500 people have registered for essential digital skills training in cloud computing through the program.
Today, we’re also announcing a commitment to train 1 million Mexicans in AI and cloud technologies over the coming years. Google Cloud will continue to skill Mexico’s local talent with a variety of no-cost training programs for students, developers and customers. Some of the ongoing training programs will include no-cost, localized courses available through YouTube, credentials through the Google Cloud Skills Boost platform, community support by Google Developer Groups, and scholarships for the Google Career Certificates that help prepare learners for high-growth, in-demand jobs in fields like cybersecurity and data analytics, so the cloud can truly democratize innovation and technology.
This new Google Cloud region is also a step towards providing generative AI products and services to Latin American customers. Cloud computing will increasingly be a key gateway towards the development and usage of AI, helping organizations compete and innovate at global scale.
Google Cloud is dedicated to being the partner of choice for customers undergoing digital transformation. We’re focused on providing sustainable, low-carbon options for running applications and infrastructure. Since 2017, we’ve matched 100% of our global annual electricity use with renewable energy. We’re aiming even higher with our 2030 goal: operating on 24/7 carbon-free energy across every electricity grid where we operate, including Mexico.
We’re incredibly excited to open the Querétaro, Mexico region, bringing low-latency, reliable cloud services to Mexico and Latin America, so organizations can take advantage of all that the cloud has to offer. Stay tuned for even more Google Cloud regions coming in 2025 (and beyond), and click here to learn more about Google Cloud’s global infrastructure.
AI agents are revolutionizing the landscape of gen AI application development. Retrieval augmented generation (RAG) has significantly enhanced the capabilities of large language models (LLMs), enabling them to access and leverage external data sources such as databases. This empowers LLMs to generate more informed and contextually relevant responses. Agentic RAG represents a significant leap forward, combining the power of information retrieval with advanced action planning capabilities. AI agents can execute complex tasks that involve multiple steps that reason, plan and make decisions, and then take actions to execute goals over multiple iterations. This opens up new possibilities for automating intricate workflows and processes, leading to increased efficiency and productivity.
LlamaIndex has emerged as a leading framework for building knowledge-driven and agentic systems. It offers a comprehensive suite of tools and functionality that facilitate the development of sophisticated AI agents. Notably, LlamaIndex provides both pre-built agent architectures that can be readily deployed for common use cases, as well as customizable workflows, which enable developers to tailor the behavior of AI agents to their specific requirements.
Today, we’re excited to announce a collaboration with LlamaIndex on open-source integrations for Google Cloud databases including AlloyDB for PostgreSQL and Cloud SQL for PostgreSQL.
These LlamaIndex integrations, available on PyPI as llama-index-alloydb-pg and llama-index-cloud-sql-pg, empower developers to build agentic applications that connect with Google databases. The integrations include support for vector stores, document stores, and index stores.
In addition, developers can also access previously published LlamaIndex integrations for Firestore, including for Vector Store and Index Store.
Integration benefits
LlamaIndex supports a broad spectrum of different industry use cases, including agentic RAG, report generation, customer support, SQL agents, and productivity assistants. LlamaIndex’s multi-modal functionality extends to applications like retrieval-augmented image captioning, showcasing its versatility in integrating diverse data types. Through these use cases, joint customers of LlamaIndex and Google Cloud databases can expect to see an enhanced developer experience, complete with:
Streamlined knowledge retrieval: These packages make it easier for developers to build knowledge-retrieval applications with Google databases. Developers can leverage AlloyDB and Cloud SQL vector stores to store and semantically search unstructured data to provide models with richer context. The LlamaIndex vector store integrations let you filter metadata effectively, select from vector similarity strategies, and help improve performance with custom vector indexes (see the filtering sketch after this list).
Complex document parsing: LlamaIndex’s first-class document parser, LlamaParse, converts complex document formats with images, charts and rich tables into a form more easily understood by LLMs; this produces demonstrably better results for LLMs attempting to understand the content of these documents.
Secure authentication and authorization: LlamaIndex integrations to Google databases utilize the principle of least privilege, a best practice, when creating database connection pools, authenticating, and authorizing access to database instances.
Fast prototyping: Developers can quickly build and set up agentic systems with readily available pre-built agent and tool architectures on LlamaHub.
Flow control: For production use cases, LlamaIndex Workflows provide the flexibility to build and deploy complex agentic systems with granular control of conditional execution, as well as powerful state management.
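To make the metadata filtering mentioned above concrete, here is a minimal sketch using LlamaIndex’s core filter types; `index` is assumed to be an already-built vector index whose nodes carry a publication_date metadata field:

```python
# Sketch: metadata-filtered retrieval over a LlamaIndex vector index.
# `index` is an already-built VectorStoreIndex (assumed).
from llama_index.core.vector_stores import (
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)

filters = MetadataFilters(
    filters=[
        MetadataFilter(
            key="publication_date",
            operator=FilterOperator.GTE,
            value="2024-01-01",
        )
    ]
)

# Only nodes published on or after the cutoff date are considered.
query_engine = index.as_query_engine(filters=filters)
print(query_engine.query("What's new in RAG evaluation?"))
```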
A report generation use case
Agentic RAG workflows are moving beyond simple question-and-answer chatbots. Agents can synthesize information from across sources and knowledge bases to generate in-depth reports. Report generation spans many industries — from legal, where agents can do prework such as research, to financial services, where agents can analyze earnings call reports. Agents mimic experts who sift through information to generate insights. And even if agent reasoning and retrieval take several minutes, automating these reports can save teams several hours.
LlamaIndex provides all the key components to generate reports:
Structured output definitions with the ability to organize outputs into report templates
Intelligent document parsing to easily extract and chunk text and other media
Knowledge base storage and integration across the customer’s ecosystem
Agentic workflows to define tasks and guide agent reasoning
Now let’s see how these concepts work, and consider how to build a report generation agent that provides daily updates on new research papers about LLMs and RAG.
1. Prepare data: Load and parse documents
The key to any RAG workflow is a well-constructed knowledge base. Before you store the data, you need to ensure it is clean and useful. Data for the knowledge base can come from your enterprise data or other sources. To generate reports on top research articles, developers can use the Arxiv SDK to pull free, open-access publications.
Rather than using the ArxivReader to load and convert articles to plain text, developers can use LlamaParse, which supports varying paper formats, tables, and multimodal media, leading to more accurate document parsing.
To improve the knowledge base’s effectiveness, we recommend adding metadata to documents. This allows for advanced filtering or support for additional tooling. Learn more about metadata extraction.
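Here is a minimal sketch of this preparation step. It assumes the open-source arxiv Python package and a LlamaParse API key (set via LLAMA_CLOUD_API_KEY); the query string is a placeholder, and the publication_date metadata key matches the column used in the next step.

import arxiv
from llama_parse import LlamaParse  # requires LLAMA_CLOUD_API_KEY

# Pull recent open-access papers on RAG from arXiv (query is a placeholder).
client = arxiv.Client()
search = arxiv.Search(
    query="retrieval augmented generation",
    max_results=5,
    sort_by=arxiv.SortCriterion.SubmittedDate,
)

parser = LlamaParse(result_type="markdown")  # handles tables and rich layouts

documents = []
for paper in client.results(search):
    pdf_path = paper.download_pdf()
    for doc in parser.load_data(pdf_path):
        # Attach metadata now so the knowledge base can filter on it later.
        doc.metadata["publication_date"] = paper.published.date().isoformat()
        doc.metadata["title"] = paper.title
        documents.append(doc)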
2. Create a knowledge base: Store data for retrieval
Now, the data needs to be saved for long-term use. The LlamaIndex Google Cloud database integrations support storage and retrieval of a growing knowledge base.
2.1. Create a secure connection to the AlloyDB or Cloud SQL database
Utilize the AlloyDBEngine class to easily create a shareable connection pool that securely connects to your PostgreSQL instance.
Create only the necessary tables needed for your knowledge base. Creating separate tables reduces the level of access permissions that your agent needs. You can also specify a special “publication_date” metadata column that you can filter on later.
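As a sketch of this step (treat the helper and parameter names as assumptions drawn from the llama-index-alloydb-pg documentation; the project, cluster, and instance IDs are placeholders):

from llama_index_alloydb_pg import AlloyDBEngine, Column

# Shareable connection pool to the AlloyDB instance (placeholder IDs).
engine = AlloyDBEngine.from_instance(
    project_id="my-project",
    region="us-central1",
    cluster="my-cluster",
    instance="my-instance",
    database="research_kb",
)

# Create only the tables the agent needs, reducing required permissions,
# and add a filterable "publication_date" metadata column.
engine.init_doc_store_table(table_name="doc_store")
engine.init_index_store_table(table_name="index_store")
engine.init_vector_store_table(
    table_name="vector_store",
    vector_size=768,  # must match the embedding model's dimensionality
    metadata_columns=[Column("publication_date", "DATE")],
)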
2.2. Customize the underlying storage with the Document Store, Index Store, and Vector Store. For the vector store, specify the metadata field “publication_date” that you created previously.
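A corresponding sketch for this step, again with class and method names assumed from the package's documentation:

from llama_index_alloydb_pg import AlloyDBDocumentStore, AlloyDBIndexStore, AlloyDBVectorStore

# Back LlamaIndex's storage abstractions with the AlloyDB tables created above.
doc_store = AlloyDBDocumentStore.create_sync(engine=engine, table_name="doc_store")
index_store = AlloyDBIndexStore.create_sync(engine=engine, table_name="index_store")

# The vector store exposes the "publication_date" column for metadata filters.
vector_store = AlloyDBVectorStore.create_sync(engine=engine, table_name="vector_store")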
2.4. Create tools from indexes to be used by the agent.
search_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    description="Useful for retrieving specific snippets from research publications.",
)

summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_index.as_query_engine(),
    description="Useful for answering questions about research publications.",
)
3. Prompt: Create an outline for the report
Reports often have requirements on sections and formatting, so the agent needs formatting instructions. Here is an example outline of a report format:
outline = """
# DATE Daily report: TOPIC

## Executive Summary

## Top Challenges / Description of problems

## Summary of papers

| Title | Authors | Summary | Links |
| ----- | ------- | ------- | ----- |
| LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data | Liana Patel, Siddharth Jha, Carlos Guestrin, Matei Zaharia | ... | https://arxiv.org/abs/2407.11418v1 |
"""
4. Define the workflow: Outline agentic steps
Next, you define the workflow that guides the agent’s actions. In this example workflow, the agent reasons about which tool to call: the summary tool or the vector search tool. Once the agent determines it doesn’t need additional data, it exits the research loop to generate a report.
LlamaIndex Workflows provides an easy-to-use SDK to build any type of workflow, as sketched below:
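The following is a minimal, illustrative two-step workflow using the llama_index.core.workflow API; gather_notes and write_with_outline are hypothetical helpers standing in for the tool-calling research loop described above, and the result dictionary’s response key matches the report-saving snippet further down.

from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

class ResearchEvent(Event):
    notes: str

class ReportWorkflow(Workflow):
    @step
    async def research(self, ev: StartEvent) -> ResearchEvent:
        # Reason over the tools (search_tool, summary_tool) and gather
        # context until no additional data is needed.
        notes = await gather_notes(ev.query)  # hypothetical helper
        return ResearchEvent(notes=notes)

    @step
    async def write_report(self, ev: ResearchEvent) -> StopEvent:
        # Format the gathered notes according to the outline prompt.
        report = await write_with_outline(ev.notes, outline)  # hypothetical helper
        return StopEvent(result={"response": report})

agent = ReportWorkflow(timeout=600)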
Now that you’ve set up a knowledge base and defined an agent, you can set up automation to generate a report!
query = "What are the recently published RAG techniques"
report = await agent.run(query=query)

# Save the report
with open("report.md", "w") as f:
    f.write(report['response'])
There you have it! A complete report that summarizes recent research in LLM and RAG techniques. How easy was that?
Get started today
In short, these LlamaIndex integrations with Google Cloud databases enable application developers to leverage the data in their operational databases to easily build complex agentic RAG workflows. This collaboration supports Google Cloud’s long-term commitment to being an open, integrated, and innovative database platform. With LlamaIndex’s extensive user base, these integrations further expand the possibilities for developers to create cutting-edge, knowledge-driven AI agents.
Ready to get started? Take a look at the following Notebook-based tutorials:
Browser isolation is a security technology where web browsing activity is separated from the user’s local device by running the browser in a secure environment, such as a cloud server or a virtual machine, and then streaming the visual content to the user’s device.
Browser isolation is often used by organizations to combat phishing threats, protect the device from browser-delivered attacks, and deter typical command-and-control (C2 or C&C) tactics used by attackers.
In this blog post, Mandiant demonstrates a novel technique that can be used to circumvent all three current types of browser isolation (remote, on-premises, and local) for the purpose of controlling a malicious implant via C2. Mandiant shows how attackers can use machine-readable QR codes to send commands from an attacker-controlled server to a victim device.
Background on Browser Isolation
The great folks at SpecterOps released a blog post earlier this year on browser isolation and how penetration testers and red team operators may work around browser isolation scenarios for ingress tool transfer, egress data transfer, and general bypass techniques. In summary, browser isolation protects users from web-based attacks by sandboxing the web browser in a secure environment (either local or remote) and streaming the visual content back to the user’s local browser. The experience is (ideally) fully transparent to the end user. According to most documentation, three types of browser isolation exist:
Remote browser isolation (RBI), the most secure and the most common variant, sandboxes the browser in a cloud-based environment.
On-premises browser isolation is similar to RBI but runs the sandboxed browser on-premises. The advantage of this approach is that on-premises web-based applications can be accessed without requiring complex cloud-to-on-premises connectivity.
Local browser isolation, or client-side browser isolation, runs the sandboxed browser in a local containerized or virtual machine environment (e.g., Docker or Windows Sandbox).
The remote browser handles everything from page rendering to executing JavaScript. Only the visual appearance of the web page is sent back to the user’s local browser (a stream of pixels). Keypresses and clicks in the local browser are forwarded to the remote browser, allowing the user to interact with the web application. Organizations often use proxies to ensure all web traffic is served through the browser isolation technology, thereby limiting egress network traffic and restricting an attacker’s ability to bypass the browser isolation.
SpecterOps detailed some of the challenges that offensive security professionals face when operating in browser isolation environments. They document possible approaches on how to circumvent browser isolation by abusing misconfigurations, such as using HTTP headers, cookies, or authentication parameters to bypass the isolation features.
Command and control (C2 or C&C) refers to an attacker’s ability to remotely control compromised systems via malicious implants. The most common channel to send commands to and from a victim device is through HTTP requests:
The implant requests a command from the attacker-controlled C2 server through an HTTP request (e.g., in the HTTP parameters, headers, or request body).
The C2 server returns the command to execute in the HTTP response (e.g., in headers or response body).
The implant decodes the HTTP response and executes the command.
The implant submits the command output back to the C2 server with another HTTP request.
The implant “sleeps” for a while, then repeats the cycle.
However, this approach presents challenges when browser isolation is in use: the HTTP response returned to the local browser only contains the streaming engine used to render the remote browser’s visual page contents. The original HTTP response from the web server is only available in the remote browser, which renders it and sends the local browser nothing but a stream of pixels. This prevents typical HTTP-based C2 because the local device cannot decode the HTTP response (step 3).
In this blog post, we will explore a different approach to achieving C2 with compromised systems in browser isolation environments, working entirely within the browser isolation context.
Sending C2 Data Through Pixels
Mandiant’s Red Team developed a novel solution to this problem. Instead of returning the C2 data in the HTTP request headers or body, the C2 server returns a valid web page that visually shows a QR code. The implant then uses a local headless browser (e.g., using Selenium) to render the page, grabs a screenshot, and reads the QR code to retrieve the embedded data. By taking advantage of machine-readable QR codes, an attacker can send data from the attacker-controlled server to a malicious implant even when the web page is rendered in a remote browser.
Instead of decoding the HTTP response to obtain the command to execute, the implant visually renders the web page (via the browser isolation’s pixel streaming engine) and decodes the command from the QR code displayed on the page. The new C2 loop is as follows:
The implant controls a local headless browser via the DevTools protocol.
The implant retrieves the web page from the C2 server via the headless browser. This request is forwarded to the remote (isolated) browser and ultimately lands on the C2 server.
The C2 server returns a valid HTML web page with the command data encoded in a QR code (visually shown on the page).
The remote browser returns the pixel streaming engine back to the local browser, starting a visual stream showing the rendered web page obtained from the C2 server.
The implant waits for the page to fully render, then grabs a screenshot of the local browser. This screenshot contains the QR code.
The implant uses an embedded QR scanning library to read the QR code data from the screenshot, thereby obtaining the embedded data.
The implant executes the command on the compromised device.
The implant (again through the local browser) navigates to a new URL that includes the command output encoded in a URL parameter. This parameter is passed through to the remote browser and ultimately to the C2 server (after all, in legitimate cases, URL parameters may be required to return the correct web page). The C2 server can decode the command output as in traditional HTTP-based C2.
The implant “sleeps” for a while, then repeats the cycle.
Mandiant developed a proof-of-concept (PoC) implant using Puppeteer and the Google Chrome browser in headless mode (though any modern browser could be used). We even went a step further and integrated the implant with Cobalt Strike’s External C2 feature, allowing the use of Cobalt Strike’s BEACON implant while communicating over HTTP requests and QR code responses.
Because this technique relies on the visual content of the web page, it works in all three browser isolation types (remote, on-premises, and local).
While the PoC demonstrated the feasibility of this technique, there are some considerations and drawbacks:
During Mandiant’s testing, using QR codes with the maximum data size (2,953 bytes, 177×177 grid, Error Correction Level “L”) was infeasible, as the visual stream of the web page rendered in the local browser was of insufficient quality to reliably read the QR code contents. Mandiant fell back to QR codes containing a maximum of 2,189 bytes of content. Note: QR codes can store up to 2,953 bytes per instance, depending on the Error Correction Level (ECL). Higher ECL settings make the QR code more reliably readable, but reduce the maximum data size.
Due to the overhead of using Chrome in headless mode, the remote browser startup time, the page rendering requirements, and the stream of visual content from the remote browser back to the local browser, each request takes ~5s to reliably show and scan the QR code. This introduces significant latency in the C2 channel. For example, at the time of writing, a BEACON payload is ~323 KiB. At 2,189 bytes per QR code and 5s per request, a full BEACON payload is transferred in approximately 12m20s (~438 bytes/s, assuming every QR code can be successfully scanned and every network request goes through seamlessly). While this throughput is certainly sufficient for typical C2 operations, some techniques (e.g., SOCKS proxying) become infeasible.
Other security features of browser isolation, such as domain reputation, URL scanning, data loss prevention, and request heuristics, are not considered in this blog post. Offensive security professionals will have to overcome these protection measures as well when operating in browser isolation environments.
Conclusion and Recommendations
In this blog post, Mandiant demonstrated a novel technique to establish C2 when faced with browser isolation. While this technique proves that browser isolation technologies have weaknesses, Mandiant still recommends browser isolation as a strong protection measure against other types of attacks (e.g., client-side browser exploitation, phishing, etc). Organizations should not solely rely on browser isolation to protect themselves from web-based threats, but rather embrace the “defense in depth” strategy and establish a well-rounded cyber defense posture. Mandiant recommends the following controls:
Monitor for anomalous network traffic: Even when using browser isolation, organizations should inspect network traffic and monitor for anomalous usage. The C2 method described in this post is low-bandwidth, hence transferring even small datasets will require many HTTP requests.
Monitor for browsers in automation mode: Organizations can monitor for browsers used in automation mode by inspecting the process command line. Chromium-based browsers use flags such as --enable-automation and --remote-debugging-port to enable other processes to control the browser through the DevTools protocol. Organizations can monitor for these flags during process creation; the sketch below illustrates one approach.
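As an illustration only, here is a minimal polling sketch using the third-party psutil package; the browser process names are assumptions, and a production control would consume process-creation telemetry (for example, from an EDR or Sysmon) rather than poll a process list.

import psutil

AUTOMATION_FLAGS = ("--enable-automation", "--remote-debugging-port")
BROWSER_NAMES = ("chrome", "chromium", "msedge")  # illustrative; tune per fleet

def find_automated_browsers():
    """Return (pid, cmdline) for browser processes started with automation flags."""
    hits = []
    for proc in psutil.process_iter(["pid", "name", "cmdline"]):
        name = (proc.info.get("name") or "").lower()
        cmdline = " ".join(proc.info.get("cmdline") or [])
        if any(b in name for b in BROWSER_NAMES) and any(f in cmdline for f in AUTOMATION_FLAGS):
            hits.append((proc.info["pid"], cmdline))
    return hits

if __name__ == "__main__":
    for pid, cmdline in find_automated_browsers():
        print(f"[!] PID {pid} launched with automation flags: {cmdline}")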
Through numerous adversarial emulation engagements and Red Team and Purple Team assessments, Mandiant has gained an in-depth understanding of the unique paths attackers may take in compromising their targets. Review our Technical Assurance services and contact us for more information.
Businesses across all industries are turning to AI for a clear view of their operations in real-time. Whether it’s a busy factory floor, a crowded retail space, or a bustling restaurant kitchen, the ability to monitor your work environment helps businesses be more proactive and ultimately, more efficient.
Gemini 1.5 Pro’s multimodal and long context window capabilities can improve operational efficiency for businesses by automating tasks from inventory management to safety assessments. One powerful use case that’s emerged for developers is AI-powered kitchen analysis for busy restaurants. AI-powered kitchen analysis can benefit everyone – it can help a restaurant’s bottom line, and also train employees more efficiently while improving safety assessments that help create a safer work environment.
In this post, we’ll show you how this works, and ways you can apply it to your business.
Understanding multimodal AI and long context windows
Before we step into the kitchen, let’s break down what “multimodal” and “long context window” mean in the world of AI:
Multimodal AI can process and understand multiple types of data. Think of it as an AI system that can see, hear, read, and understand all at once. In our context, it can take the following forms:
Text: Recipes, orders, and inventory lists
Images: Food presentation and kitchen layouts
Audio: Kitchen commands and customer feedback
Video: Real-time cooking processes and staff movements
Taken together, these data representations can reach gigabytes in size, which is where Gemini’s long context window comes into play. Long context windows can consume millions of tokens (data points) at once. This makes it possible to input all the data mentioned above – from text to video – and generate cohesive outputs without losing any of your context.
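To make this concrete, here is a minimal sketch of the kind of multimodal call involved, assuming the Vertex AI Python SDK; the project, bucket URI, and prompt are placeholders, not a prescribed setup.

import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")  # placeholder project
model = GenerativeModel("gemini-1.5-pro")

# A kitchen video uploaded to Cloud Storage (placeholder URI).
video = Part.from_uri("gs://my-bucket/kitchen-rush.mp4", mime_type="video/mp4")

prompt = (
    "Analyze this kitchen video. Return timestamps for each meal preparation "
    "step, flag potential safety issues, and estimate ingredient usage."
)

# The long context window lets a multi-minute video and the prompt fit in one call.
response = model.generate_content([video, prompt])
print(response.text)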
With the multimodal AI market projected to exceed $13 billion by 2032, growing at a staggering CAGR of around 30% from 2024 to 2032, multimodal and long context window capabilities are the secret ingredients for success.
Let’s look at a real world example
When it comes to running a restaurant, AI can step in as your inventory manager and safety inspector all rolled into one. In the following test, we fed Gemini a five-minute video of a chef preparing meals during peak operating hours.
With a simple prompt, we asked Gemini to analyze the video and return several values that would help us assess the efficiency of meal preparation. First, we asked Gemini for the timestamps spent on each part of the process:
Next, to find bottlenecks and optimize workflows we asked Gemini to identify the following key moments:
Positive moments
Potential safety issues
Inventory counts
Suggestions for improvement
Together, we put these values in a graph that broke down the efficiency of each task and identified opportunities for improvement. We also asked Gemini to translate this into several different languages for a diverse kitchen staff.
The final result: Here’s how Gemini analyzed the kitchen
1. Real-time meal preparation and object tracking:
Gemini’s object detection capabilities identified ingredients and monitored cooking processes in real-time. By extracting the start and end timestamps for each meal preparation, you can precisely measure meal prep times.
2. Inventory management:
Say goodbye to the “Oops, we’re out of that” moment. By accurately tracking ingredient usage, Gemini helped prevent stock-outs and enabled proactive inventory replenishment.
3. Safety assessments:
From detecting a slippery floor to noticing an unattended flame, Gemini picked up on those details that are easy to miss. It’s not about replacing human vigilance—it’s about enhancing it, creating a safer environment for both staff and diners.
4. Multilingual capabilities:
In a global culinary landscape, language barriers can be troublesome. Gemini broke down these barriers, ensuring that whether your chef speaks Mandarin or your server speaks Spanish, everyone’s on the same page.
Gemini’s analysis of a five-minute video could help restaurants optimize operations, reduce costs, and enhance the customer experience. By automating and optimizing mundane tasks, staff can focus on what matters—creating culinary masterpieces and delivering exceptional service. It also helps businesses grow by improving cost savings – optimized inventory and resource management translate directly to a business’s financial bottom line.
And, proactive hazard detection means fewer accidents and a safer work environment. It’s not just about avoiding lawsuits—it’s about creating a culture of care.
The future is served
Gemini’s models are pioneers in the market, unlocking use cases that are made possible with Google’s research and advancements. But Gemini’s impact extends far beyond the restaurant industry – its long context window allows businesses to analyze vast amounts of data, unlocking insights that were previously too costly to attain.
Enterprises across industries are investing in AI technologies to move faster, be more productive, and give their customers the products and services that they need. But moving AI from prototype to production isn’t easy. That’s why we created Fireworks AI.
The story of Fireworks AI started seven years ago at Meta AI, where a group of innovators worked on PyTorch — an ambitious project building leading AI infrastructure from scratch. Today, PyTorch is one of the most popular open-source AI frameworks, serving trillions of inferences daily.
Many companies building AI products struggle to balance total cost of ownership (TCO) with performance quality and inference speed, while transitions from prototype to production can also be challenging. Leaders at PyTorch saw a tremendous opportunity to use their years of experience to help companies solve this challenge. And so, Fireworks AI was born.
Fireworks AI delivers the fastest and most efficient gen AI inference engine to date. We’re pushing the boundaries with compound AI systems, which replace more traditional single AI models with multiple interacting models. Think of a voice-based search application that uses audio recognition models to transcribe questions and language models to answer them.
With support from partners like NVIDIA and their incredible CUDA and CUTLASS libraries, we’re evolving fast so companies can start taking their next big steps into gen AI.
Here’s how we work with Google Cloud to tackle the scale, cost, and complexity challenges of gen AI.
Matching customer growth with scale
Scale is a primary concern when moving into production, because AI moves fast. Fireworks’ customers might develop new models that they want to roll out right away or find that their demand has doubled overnight, so we need to be able to scale quickly and immediately.
While we’re building state-of-the-art infrastructure software for gen AI, we look to top partners to provide architectural components for our customers. Google Cloud’s engineering strength provides an incredible environment for performance, reliability, and scalability. It’s designed to handle high-volume workloads while maintaining excellent uptime. Currently, Fireworks processes over 140 billion tokens daily with 99.99% API uptime, so our customers never experience interruptions.
Google Kubernetes Engine (GKE) and Compute Engine are also essential to our environment, helping us run control plane APIs and manage the fleet of GPUs.
Google Cloud offers us outstanding scalability, so we’re always using right-sized infrastructure. When customers need to scale, we can instantly meet their requests.
Since Fireworks is a member of the Google for Startups program, Google Cloud provided us with credits that were essential for growing our operations.
Stopping runaway costs of AI
Scale isn’t the only thing companies need to worry about. Costs can balloon overnight after deploying AI, and enterprises need efficient ways to scale to maintain sustainable growth. By analyzing performance and environments, Fireworks can help them balance scale and efficiency.
We use Cloud Pub/Sub and Cloud Functions for reporting and billing event processing, and Cloud Monitoring for logging, analytics, and alerting. All the request and billing data is then stored in BigQuery, where we can analyze usage and volumes for each customer model. This helps us determine whether we have extra capacity, whether we need to scale, and by how much.
Google Cloud’s blue-chip environment also allows us to provide more to our customers without breaking budgets. Because we can offer 4x lower latency and 4x higher throughput compared to competing hosted services, we provide better performance at reduced prices. Customers then won’t need to swell their budgets to increase performance, keeping TCO down.
The right environment for any customer
Every gen AI solution has its own complexities and nuances, so we need to remain flexible to tailor the environment for each customer. Some enterprises might need different GPUs for different parts of a compound AI system, or they might want to deploy smaller fine-tuned models alongside larger models. Google Cloud gives us the freedom to split up tasks and use any GPUs that we need, as well as integrate with a diverse range of models and environments.
This is especially important when it comes to data privacy and security concerns for customers in sensitive industries such as finance and healthcare. Google Cloud provides robust security features like encryption and secure VPC connectivity, and it helps comply with compliance statutes such as HIPAA and SOC 2.
Meeting our customers where they are – which is a moving target – is critical to our success in gen AI. Companies like Google Cloud and NVIDIA help us do just that.
Powering innovation in gen AI
Our philosophy is that enterprises of all sizes should be able to experiment with and build AI products. AI is a powerful technology that can transform industries and help businesses compete on a global scale.
Keeping AI open source and accessible is paramount, and that’s one of the reasons we continue to work with Google Cloud. With Google Cloud, we can enable more companies to drive value from innovative uses of gen AI.
Generative AI is leading to real business growth and transformation. Among enterprise companies with gen AI in production, 86% report an increase in revenue1, with an estimated 6% growth. That’s why Google is investing in its AI technology with new models like Veo, our most advanced video generation model, and Imagen 3, our highest quality image generation model. Today, we’re building on that momentum at Google Cloud by offering our customers access to these advanced generative media models on Vertex AI:
Veo, now available on Vertex AI in private preview, empowers companies to effortlessly generate high-quality videos from simple text or image prompts. As the first hyperscaler to offer an image-to-video model, we’re helping companies transform their existing creative assets into dynamic visuals. This groundbreaking technology unlocks new possibilities for creative expression and streamlines video production workflows.
Imagen 3 will be available to all Vertex AI customers starting next week. Imagen 3 generates the most realistic and highest quality images from simple text prompts, surpassing previous versions of Imagen in detail, lighting, and artifact reduction. Businesses can seamlessly create high quality images that reflect their own brand style and logos for use in marketing, advertising, or product design.
Vertex AI provides an orchestration platform that makes it simple to customize, evaluate performance, and deploy these models on our leading infrastructure. In alignment with our AI Principles, the development and deployment of Veo and Imagen 3 on Vertex AI prioritize safety and responsibility, with built-in precautions like digital watermarking, safety filters, and data governance.
Veo: our most capable video generation model, now available on Vertex AI
Developed by Google DeepMind, Veo generates high-quality, high-definition videos based on text or image prompts in a wide range of cinematic and visual styles with exceptional speed. With an advanced understanding of natural language and visual semantics, it generates video that closely aligns to the prompt. Veo on Vertex AI creates footage that’s consistent and coherent, so people, animals, and objects move realistically throughout shots. See examples of Veo’s image-to-video generation capabilities on Vertex AI below:
Image-to-video: Veo generates videos from existing or AI-generated images. Below are examples of how Veo uses images generated using Imagen 3 (top two images) and real-world images (bottom two images) to create short video clips.
Text-to-video: Below are examples of how Veo uses text to create short video clips.
Veo on Vertex AI empowers companies to effortlessly generate high-quality videos from simple text or image prompts. This means faster production, reduced costs, and the ability to quickly prototype and iterate on video content. Veo’s technology can be a great partner for human creativity by allowing creators to focus on higher-level tasks while AI can help handle tedious or repetitive aspects of video production. Customers like Agoda are using the power of AI models like Veo, Gemini, and Imagen to streamline their video ad production, achieving a significant reduction in production time. Whether you’re a marketer crafting engaging social media posts, a sales team creating compelling presentations, or a production team exploring new concepts, Veo streamlines your workflow and unlocks new possibilities for visual storytelling.
Imagen 3: Our highest quality image generation model, now generally available on Vertex AI
Imagen 3 is our highest quality text-to-image model. It generates an incredible level of detail, producing photorealistic, lifelike images, with far fewer distracting visual artifacts than our prior models.
Starting next week, all Google Cloud customers will be able to access Imagen 3 on Vertex AI. With Imagen 3 on Vertex AI, you can generate high-definition images from a simple text prompt. See examples of Imagen 3’s image generation capabilities below:
Additionally, we’re making new features generally available to customers on our allowlist that help companies edit and customize images to meet their business needs. To join the allowlist, apply here.
Imagen 3 editing provides a powerful and user-friendly way to refine and tailor any image. You can edit photos with a simple text prompt, edit only parts of an image (mask-based editing) including updating product backgrounds, or upscale the image to meet size requirements.
Imagen 3 Customization provides greater control by guiding the model to generate images with your desired characteristics. It is now possible to infuse your own brand, style, logo, subject, or product features when generating new images. This opens up new creative possibilities and accelerates the development of advertising and marketing assets.
Build with enterprise safety and security
Designing and developing AI to be secure, safe, and responsible is paramount. Consistent with our AI Principles, Veo and Imagen 3 on Vertex AI were built with safety at the core.
Digital watermarking: Google DeepMind’s SynthID embeds invisible watermarks into every image and frame that Imagen 3 and Veo produce, helping decrease misinformation and misattribution concerns.
Safety filters:Veo and Imagen 3 both have built-in safeguards to help protect against the creation of harmful content and adhere to Google’s Responsible AI Principles. We will continue investing in new techniques to improve the safety and privacy protections of our models.
Data governance: We do not use customer data to train our models, in accordance with Google Cloud’s built-in data governance and privacy controls. Your customer data is only processed according to your instructions.
Copyright indemnity: Our indemnity for generative AI services offers peace of mind with an industry-first approach to copyright concerns.
Customers delivering value with Veo and Imagen on Vertex AI
Leading consumer packaged goods company Mondelez International, which includes brands such as Chips Ahoy!, Cadbury, Oreo, and Milka, is using generative AI to accelerate and enhance campaign content creation, allowing rapid development of consumer-ready visuals at scale for 100+ brands sold in 150 countries.
“Our collaboration with Google Cloud has been instrumental in harnessing the power of generative AI, notably through Imagen 3, to revolutionize content production. This technology has enabled us to produce hundreds of thousands of customized assets, enhancing creative quality while significantly reducing both time to market and costs. With the introduction of Veo, Mondelez and its agency partners (Accenture, Publicis, The Martin Agency, VCCP, Vayner and WPP) are poised to expand these capabilities into video content, further streamlining production processes and setting new benchmarks in marketing.” — Jon Halvorson, SVP of Consumer Experience & Digital Commerce, Mondelez International
“Mondelez is embarking on a bold journey of AI-driven transformation, partnering strategically with Google Cloud as our core AI platform. This is not simply a technology adoption; it’s a deep, collaborative partnership leveraging Google’s cutting-edge AI capabilities and infrastructure to fuel our innovation and growth ambitions. This partnership reinforces Mondelez’s commitment to continuous adoption of leading-edge technology to advance our business capabilities.” — Tiffani Sossei, SVP Chief Digital Experience Officer, Mondelez International
WPP is a world leader in marketing and communication services. Its AI-powered operating system for marketing transformation, WPP Open, already utilizes Imagen 3 for image generation and will soon incorporate Veo for video generation, streamlining the ideation and production of content. This expansion empowers WPP to unlock even greater levels of creativity and efficiency.
“At WPP, we believe in the transformative power of AI to enable our people to do their best work. We built WPP Open from the ground up and leverage Google Cloud’s AI capabilities within it to help bring to life the creative vision of clients such as L’Oréal, resulting in the production of compelling content and making iteration and concepting easier than ever before. With Veo and Imagen, we are narrowing the gap between imagination and execution, enabling our people to develop high-quality, photo-realistic, campaign-ready visuals in a matter of minutes.” – Stephan Pretorius, Chief Technology Officer, WPP
Agoda is a digital travel platform that helps travelers see the world for less with its great value deals on a global network of over 4.5M hotels and holiday properties worldwide, plus flights, activities, and more. They’re now testing Imagen and Veo on Vertex AI to create visuals, allowing Agoda teams to generate unique images of travel destinations which would then be used to generate videos.
“At Agoda, we’re committed to helping people see the world for less and making travel experiences more accessible. We are exploring the media generation capabilities of Google Cloud AI, using Imagen to create unique visuals of dream destinations in various styles. These images are then brought to life as videos through experiments with Veo’s image-to-video technology. These technologies hold the potential to streamline our content creation process from days to hours. By continuing our testing, we aim to explore how this combination can enhance creative possibilities and personalized advertising efficiently. With these tools, we hope to engage customers meaningfully and inspire future adventures.” – Matteo Frigerio, Chief Marketing Officer, Agoda
Quora, a leading online platform for people worldwide to share knowledge and learn from each other, has developed Poe, a platform that allows users to interact with leading gen AI models, including Gemini, Imagen, and now Veo through Vertex AI. With Veo and Imagen, Poe users can unlock new levels of creativity and bring their ideas to life with incredible ease and speed.
“We created Poe to democratize access to the world’s best gen AI models. With Veo, we’re now enabling millions of users to bring their ideas to life through stunning, high-quality generative video. Through partnerships with leaders like Google, we’re expanding creative possibilities across all AI modalities. We can’t wait to see what our community creates with Veo.” – Spencer Chan, Product Lead, Poe by Quora
Honor is a leading global provider of smart devices. They are now bringing the power of AI image generation directly to consumers’ fingertips by integrating Imagen into millions of smartphones. This allows users to easily enhance and customize their photos with features like outpainting and stylization.
“At Honor, we’re committed to delivering cutting-edge technology that our millions of users can implement to enhance their daily lives. We chose to integrate Imagen on Vertex AI because it provides outstanding image generation capabilities that are both powerful and user-friendly. With Imagen, our customers can effortlessly create, edit, and reimagine images directly on their smartphones, transforming everyday moments into extraordinary visuals. We look forward to innovating with Google Cloud as their latest generative media models continue to push the boundaries of creative expression.” – George Zhao, CEO, Honor
Get started
To get started with Veo on Vertex AI, reach out to your Google Cloud account representative. To get started with Imagen on Vertex AI, find our documentation. You’ll be able to access Imagen 3 on Vertex AI starting next week.
At the Gemini for Work event in September, we showcased how generative AI is transforming the way enterprises work. Across all the customer innovation we saw at the event, one thing was clear – if last year was about gen AI exploration and experimentation, this year is about achieving real-world impact.
Gen AI has the potential to revolutionize how we work, but only if its output is reliable and relevant. Large language models (LLMs), with their knowledge frozen in time during training, often lack access to the latest information and your internal data. In addition, they are by design creative and probabilistic, and therefore prone to hallucinations. And finally, they do not offer built-in source attribution. These limitations hinder their ability to provide up-to-date, contextually relevant and dependable responses.
To overcome these challenges, we need to connect LLMs with sources of truth. This is where concepts like grounding, retrieval augmented generation (RAG), and search come into play. Grounding means providing an LLM with external information to root its response in reality, which reduces the chances of it hallucinating or making things up. RAG is a specific technique for grounding that finds relevant information from a knowledge base and gives it to the LLM as context. Search is the core retrieval technology behind RAG, as it’s how the system finds the right information in the knowledge base.
To unlock the true potential of gen AI, businesses need to ground their LLMs in what we at Google call enterprise truth: trusted internal data across documents, emails, and storage systems; third-party applications; and even fresh information from the internet that helps knowledge workers perform their jobs better.
By tapping into your enterprise truth, grounded LLMs can deliver more accurate, contextually relevant, and up-to-date responses, enabling you to use generative AI for real-world impact. This means enhanced customer service with more accurate and personalized support, automated tasks like generating reports and summarizing documents with greater accuracy, deeper insights derived from analyzing multiple data sources to identify trends and opportunities, and ultimately, driving innovation by developing new products and services based on a richer understanding of customer needs and market trends.
Now let’s look at how you can easily overcome these challenges with the latest enhancements from Vertex AI, Google Cloud’s AI platform.
Tap into the latest knowledge from the internet
LLMs have a fundamental limitation: their knowledge is anchored to the data they were trained on, which becomes outdated over time. This impacts the quality of responses to any question that needs fresh data – the latest news, a company’s 10-K results, dates for a sports event or a concert. Grounding with Google Search allows the language model to find fresh information from the internet. It even provides source links so you can fact-check or learn more. Grounding with Google Search is offered with our Gemini models out of the box. Just toggle it on, and Gemini will ground the answer using Google Search.
If you’re not sure whether your next request requires grounding with Google Search, you can now use the new “dynamic retrieval” feature. Just turn it on and Gemini will interpret your query and predict whether it needs up-to-date information to increase the accuracy of the answer. You can set the prediction score threshold at which Gemini will be triggered to use grounding with Google Search. This means you get the best of both worlds: high-quality results when you need them, and lower costs, because Gemini will only tap Google Search when your users’ queries need it.
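For illustration, here is a minimal sketch with the Vertex AI Python SDK; the model name and project are placeholders, and the DynamicRetrievalConfig parameters are assumptions based on the SDK’s documented grounding surface at the time of writing.

import vertexai
from vertexai.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="my-project", location="us-central1")  # placeholder project

# Ground responses in Google Search; the dynamic threshold (0.7 here) is the
# prediction score above which Gemini actually performs retrieval.
search_tool = Tool.from_google_search_retrieval(
    grounding.GoogleSearchRetrieval(
        dynamic_retrieval_config=grounding.DynamicRetrievalConfig(dynamic_threshold=0.7)
    )
)

model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    "What did the latest 10-K filing say about revenue growth?",
    tools=[search_tool],
)
print(response.text)  # grounded answers carry source links in grounding metadata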
Connect data across all your enterprise truth
Connecting to fresh facts is just the start. The value for any enterprise is grounding in its proprietary data. RAG is a technique that enhances LLMs by connecting them to non-training data sources, helping them retrieve information from this data before generating a response. There are several options available for RAG, but many of them don’t work for enterprises because they lack quality, reliability, or scalability. The quality of grounded gen AI apps can only be as good as their ability to retrieve your data.
That’s where Vertex AI comes in. Whether you are looking for a simple solution that works out-of-the-box, want to build your own RAG system with APIs, or use highly performative vector embeddings for RAG, Vertex AI offers a comprehensive set of offerings to help meet your needs.
Here’s an easy guide to RAG for the enterprise:
First, use out-of-the-box RAG for most enterprise applications: Vertex AI Search simplifies the end-to-end information discovery process with Google-quality RAG (that is, search). With Vertex AI Search, Google Cloud manages your RAG service and all the various parts of building a RAG system: Optical Character Recognition (OCR), data understanding and annotation, smart chunking, embedding, indexing, storing, query rewriting, spell checking, and so on. Vertex AI Search connects to your data, including your documents, websites, databases, and structured data, as well as third-party apps like JIRA and Slack with built-in connectors. The best part is that it can be set up in just a few minutes; a minimal query sketch appears after this guide.
Developers can get a taste of grounding with Google Search and enterprise data in the Vertex Grounded Generation playground on Github where you can compare grounded and ungrounded responses to queries side by side.
Then, build your own RAG for specific use cases: If you need to build your own RAG system, Vertex AI offers the various pieces off the shelf as individual APIs for layout parsing, ranking, grounded generation, check grounding, text embeddings and vector search. The layout parser can transform unstructured documents into structured representations and comes with multimodal understanding of charts and figures, which significantly enhances search quality across documents – like PDFs with embedded tables and images, which are challenging for many RAG systems.
Our vector search offering is particularly valuable for enterprises that need custom, highly performant embeddings-based information retrieval. Vector search can scale to billions of vectors and find the nearest neighbors in a few milliseconds, making it suitable for the needs of large enterprises. Vector search now offers hybrid search, which combines embedding-based semantic search with keyword search to ensure the most relevant and accurate responses for your users.
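To give a feel for the out-of-the-box path described above, here is a minimal query sketch against an existing Vertex AI Search data store, assuming the google-cloud-discoveryengine Python client; the project and data store IDs and the query are placeholders.

from google.cloud import discoveryengine_v1 as discoveryengine

client = discoveryengine.SearchServiceClient()

# Default serving config for an existing data store (placeholder IDs).
serving_config = client.serving_config_path(
    project="my-project",
    location="global",
    data_store="my-data-store",
    serving_config="default_config",
)

request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query="What is our travel reimbursement policy?",
    page_size=5,
)

# Results arrive ranked; each carries the matching document for grounding.
for result in client.search(request):
    print(result.document.id)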
No matter how you build your gen AI apps, thorough evaluation is essential to ensure they meet your specific needs. The gen AI evaluation service in Vertex AI empowers you to go beyond generic benchmarks and define your own evaluation criteria. This means you get a truly accurate picture of how well a model aligns with your unique use case, whether it’s generating creative content, or analyzing documents.
Moving beyond the hype for real world impact
The initial excitement surrounding gen AI has given way to a more pragmatic focus on real-world applications and tangible business value. Grounding is important for achieving this goal, ensuring that your AI models are not just generating text, but generating insights that are grounded in your unique enterprise truth.
Alaska Airlines is developing natural language search, providing travelers with a conversational experience powered by AI that’s akin to interacting with a knowledgeable travel agent. This chatbot aims to streamline travel booking, enhance customer experience, and reinforce brand identity.
Motorola Mobility’s Moto AI leverages Gemini and Imagen to help smartphone users unlock new levels of productivity, creativity, and enjoyment with features such as conversation summaries, notification digests, image creation, and natural language search — all with reliable responses grounded in Google Search.
Cintas is using Vertex AI Search to develop an internal knowledge center for customer service and sales teams to easily find key information.
Workday is using natural language processing in Vertex AI to make data insights more accessible for technical and non-technical users alike.
By embracing grounding, businesses can unlock the full potential of gen AI and lead the way in this transformative era. To learn more, check out my session from Gemini at Work where I cover our grounding offerings in more detail. Download our ebook to see how better search (including grounding) can lead to better business outcomes.
At PayPal, revolutionizing commerce globally has been a core mission for over 25 years. We create innovative experiences that make moving money, selling, and shopping simple, personalized, and secure, empowering consumers and businesses in approximately 200 markets. Ensuring the availability of services offered to both merchants and consumers is paramount.
PayPal’s journey with Dataflow has been a success – empowering the company to overcome streaming analytics challenges, unlock new opportunities, and build a more reliable, efficient, and scalable observability platform.
The observability platform team at PayPal is responsible for providing a telemetry platform for developers, technical account teams, and product managers. They own the SDKs, open telemetry collectors, and data streaming pipelines for receiving, processing, and exporting metrics and traces to their backend. PayPal developers rely on this observability platform for telemetry data to detect and fix problems in the shortest possible time. With applications running on diverse stacks like Java, Go, and Node.js, producing around three petabytes of logs per day, a robust, high-throughput, low-latency data streaming solution is critical for generating log-based metrics and traces.
Until 2023, PayPal’s observability platform used a self-managed Apache Flink-based infrastructure for streaming logs-based pipelines that generated metrics and spans. However, this solution presented several challenges:
Reliability: The system was highly unreliable, with no checkpointing in most pipelines, leading to data loss during restarts.
Efficiency: Managing the system was expensive and inefficient. Pipelines had to be planned for peak load, even if it occurred infrequently.
Security: The deployment needed to better conform to security guidelines.
Cluster management: Cluster creation and maintenance were manual tasks, requiring significant engineering time.
Community Support: The solution was proprietary, limiting community support and collaboration.
Software upgrades: Customizations required updating the binary, which was no longer supported.
Long-term support: The solution was an end-of-sale product, placing business continuity at risk.
PayPal needed a cloud-native solution that could address these challenges and unlock new opportunities. Their key requirements included:
Effortless scalability: Handling massive data volumes and fluctuating workloads with automatic scaling and resource optimization.
Cost reduction: Optimizing resource utilization and eliminating costly infrastructure management.
Seamless integration: Connecting with other data and AI tools within PayPal’s ecosystem.
Empowering real-time AI/ML: Leveraging advanced streaming ML capabilities for data enrichment, model training, and real-time inference.
After extensive research and a successful proof of concept, PayPal decided to migrate to Google Cloud’s Dataflow. Dataflow is a fully managed, serverless streaming analytics platform built on Apache Beam, offering unparalleled scalability, flexibility, and cost-effectiveness.
The migration process involved several key steps:
Initial POC: PayPal tested and validated Dataflow’s capabilities to meet their specific requirements.
Pipeline Optimization: Working with Google Cloud experts, PayPal fine-tuned pipelines for maximum efficiency, including redesigning the partitioning scheme and optimizing data shuffling.
Technical Benefits
Dataflow’s automatic scaling capabilities ensure consistent performance and cost efficiency by dynamically adjusting resources based on real-time data demands. Its robust state management capabilities enable accurate and reliable real-time insights from complex streaming operations, while its ability to process data with minimal latency provides up-to-the-minute insights for faster decision-making. Additionally, Dataflow’s comprehensive monitoring tools and integration with other Google Cloud services simplify troubleshooting and performance optimization.
Business benefits
The serverless architecture and dynamic resource allocation of Dataflow have significantly reduced infrastructure and operational costs for PayPal. They’ve also seen enhanced stability and uptime of critical streaming pipelines, leading to greater business continuity. Furthermore, Dataflow’s simplified programming model and rich tooling have accelerated development and deployment cycles, boosting developer productivity.
Implementing a high-throughput, low-latency streaming platform is critical to providing high-cardinality analytics to business, developer, and command center teams. The Dataflow integration has empowered our engineering teams with a strong platform to monitor paypal.com 24x7, thereby ensuring PayPal is highly available for our consumers and merchants.
Perhaps most importantly, Dataflow has freed up PayPal’s engineering resources to focus on high-value initiatives. This includes integrating with Google BigQuery for real-time Failed Custom Interaction (FCI) analytics, providing the Site Reliability Engineering team with immediate insights. They’re also implementing real-time merchant monitoring, analyzing high-cardinality merchant API traffic for enhanced insights and risk management.
PayPal is excited to continue exploring Dataflow’s capabilities and further leverage its power to drive innovation and deliver exceptional experiences for their customers.
Back in January of 2020, we announced the availability of IBM Power Systems for Google Cloud. But while the pandemic accelerated cloud computing adoption, many large enterprises still faced challenges with critical workloads such as those often found on the Enterprise IBM Power platform.
At the beginning of 2022, we partnered with Converge Technology Solutions, a company with deep expertise in this market, to expand our support for customers with IBM Power workloads. Converge was already an important partner, and it has since upgraded the service by enhancing network connectivity to Google Cloud and bringing full support to the IBM i operating system.
Today, Converge Enterprise Cloud with IBM Power for Google Cloud, or simply IP4G, supports all three major environments for Power: AIX, IBM i and Linux. In addition, it’s now available in four new regions in production — two in Canada and two in EMEA, bringing the total to six:
Based on these developments and Converge’s expert engagement, we have seen a tremendous increase in customer adoption for IP4G.
“Infor was one of the original IP4G subscribers, and years later, we continue to run mission-critical IBM Power workloads in IP4G for our clients. IP4G’s availability and performance have more than met our requirements, and we are extremely satisfied with our overall IP4G experience.” – Scott Vassh, Vice President, WMS Development
Are you thinking of moving your IBM Power workloads to the cloud? For questions and information regarding custom cloud plans, please reach out to power4gcp@googlegroups.com. Your email is private to Converge and Google Cloud representatives, who will follow up with you. Looking for a bit more information first? Check out our data center migration and mainframe modernization solution webpages.
Saturday, November 30, 2024, is Small Business Saturday, a day where we celebrate and support small businesses and their impact on our economy and local communities. Like many small businesses, Google was built in a garage with the spirit of doing things differently.1 We are committed to providing tools and services that support local small businesses.2
Small businesses create jobs, drive innovation and contribute to the overall well-being of communities. In fact, 99.9% of American businesses are small.2 In addition, small businesses employ 45.9% of American workers, or about 59 million people, and make up 43.5% of US Gross Domestic Product (GDP).3
In addition to creating jobs for local talent, small businesses give back to their communities directly by contributing tax revenues and supporting community-based programs. Local charity programs rely on help from small businesses through sponsorship, volunteering, and fundraising efforts.4 With the support of their local chambers of commerce, small businesses are deeply ingrained in their communities, generating social and economic good.
Like any organization, small businesses rely on technology to run their business. But when small businesses are time- and resource-strapped, it can be difficult to get the most out of technology. For instance, small businesses are three times more likely to be targeted by cyber criminals,5 and in a recent study, IT managers indicated they have to spend up to half their work week securing and managing devices.6 How can small businesses keep up? We believe it is critical to help small businesses succeed with devices that are secure, simpler to manage, and affordable.
How ChromeOS empowers small businesses
ChromeOS, the OS at the heart of every ChromeOS device, is designed to keep businesses safe, simplify IT management and save businesses money.
Security
ChromeOS is the most secure OS out of the box.7 In other words, you are protected from the moment you boot up the machine, without having to add any antivirus software. In fact, there have been zero reported instances of successful virus or ransomware attacks on ChromeOS devices as of 2024.*
Calbag Metals is a West Coast leader in scrap metal recycling and is located in Portland, Oregon. Calbag’s mission is protecting the environment, recycling the past, and preserving the future. The business has been led by three family generations. As the business grew, they started to face integration challenges with a wide range of devices to manage, spending a significant amount of their work week on management. Recognizing the security benefits of ChromeOS, like not storing files locally, automatic updates, and easily blocking apps across all devices from one place, Calbag made the switch to ChromeOS.
We’re happy to leave the security to Google. The updates to ChromeOS are automatic and run in the background, so there’s no need to visit every workstation to confirm security,
Jim Perris
Senior Vice President of Finance and Operations, Calbag Metals
Simple to manage
When it comes to device management, ChromeOS gives time back; ChromeOS devices are 63% faster to deploy and 36% easier to manage than other operating systems.8
Sage Goddess is an e-commerce and e-learning provider for spiritual tools and teachings, reaching over two million people across the globe every week.
As the business grew with new employees and new devices, the business needed cost effective device management. Sage Goddess deployed 60 ChromeOS devices across different functions in the organization, and were able to centrally manage devices and keep them secure.
ChromeOS makes IT management easy, so we can focus on growing our business. We appreciate the simplicity because in the past, those management tasks were often time-consuming and complicated,
David Meizlik
President and COO, Sage Goddess
Save money
Chromebooks are generally more affordable than traditional laptops, making them an attractive option for budget-conscious small businesses. Additionally, ChromeOS devices require minimal maintenance or software add-ons, reducing IT costs.
One business benefiting from this is Triple Impact Connections, a veteran-owned business process outsourcing (BPO) company based in Killeen, Texas. Triple Impact Connections delivers contact center services for banking, healthcare, and retail. The company’s management and agent workforce is made up almost entirely of military spouses and disabled veterans.
To stay competitive, Triple Impact Connections was looking to save on costs without compromising high performance. ChromeOS allows Triple Impact Connections to use ChromeOS devices for longer periods with automatic updates, helping the company save 30% on deployment costs when onboarding new employees.
ChromeOS devices are designed to be durable and receive 10 years of automatic updates,** allowing Triple Impact Connections agents to use them for longer periods and reduce device replacement costs. In addition, since ChromeOS automatically updates in the background, we can rest assured that our devices are secure, saving us $60,000 per year on cybersecurity monitoring.
These small businesses are all driving transformative change for their communities in their own unique ways. ChromeOS supports your journey by providing secure and cost-effective devices, freeing up IT teams, and giving time back to small business owners to do what they love. If you’re a small business owner looking for security you can trust, explore the possibilities with ChromeOS. Visit ChromeOS to learn more, or try our quiz to learn how you can get started with ChromeOS.
Ready to celebrate Small Business Saturday? Show your support by shopping local, leaving positive reviews, and spreading the word about your favorite small businesses.
*As of 2024, there has been no evidence of any documented, successful virus attack or ransomware attack on ChromeOS. Data based on ChromeOS monitoring of various national and internal databases.
**For devices prior to 2021 that are eligible to receive extended updates, some features and services may not be supported. See our Help Center for details.
Today, we’re announcing the Australia Connect initiative to further the reach, reliability, and resilience of digital connectivity in Australia and the Indo-Pacific region. This investment will deliver new subsea cable systems and build on the Pacific Connect initiative.
The Bosun subsea cable will connect Darwin, Australia to Christmas Island, which has onward connectivity to Singapore. The name, Bosun, refers to both the White-tailed Tropicbird — the iconic bird of Christmas Island — and the nautical term for a ship’s lead deckhand. Additionally, a new interlink cable will connect Melbourne, Perth, and Christmas Island. In Melbourne, the interlink cable will connect to the Honomoana cable system, part of the Pacific Connect initiative, creating a new interconnection point for services from the U.S. to Asia.
Once operational, Bosun and the interlink cable will deliver new digital pathways for Australia, enhancing the reliability and resilience of the Internet within the country and throughout the Indo-Pacific region.
In addition to the Bosun subsea cable system, we’re working with partners like Vocus to deliver terrestrial fiber pairs between Darwin and the Sunshine Coast, linking Bosun with the Tabua subsea cable system that connects the United States and Australia to Fiji.
The Australia Connect initiative is a collaborative effort involving Google and several key partners, including NEXTDC, SUBCO, and Vocus, along with state and local governments in Darwin, Perth, and the Sunshine Coast.
Michelle Rowland, MP, Minister for Communications, Australia: “The Australian Government welcomes the announcement of the Australia Connect initiative by Google and its partners. These new cable systems will not only expand and strengthen the resilience of Australia’s own digital connectivity through new and diversified routes, but will also complement the Government’s active work with industry and government partners to support secure, resilient and reliable connectivity across the Pacific.”
Craig Scroggie, CEO & Managing Director, NEXTDC: “Submarine cables are the critical, often unseen lifelines linking Australia to the global digital ecosystem. We’re proud to be working in partnership with Google in establishing cable landing stations in Darwin, the Sunshine Coast, and Melbourne. These investments across our national data center network will improve every customer’s experience by boosting data speeds, enhancing reliability and redundancy, and strengthening cybersecurity across Australia and the Indo-Pacific.”
Belle Lajoie, Co-CEO, Soda Infrastructure/SUBCO: “We are excited to partner with Google as they expand their subsea infrastructure in Australia. This collaboration allows both parties to harness shared infrastructure, enhancing resiliency, speeding up project delivery, and minimizing environmental and community impact. Together, we’re delivering vital subsea connectivity to Australia’s major cities and establishing new, robust subsea cable routes between Sydney and Melbourne, strengthening connectivity across the region.”
Rosanna Natoli, Mayor, Sunshine Coast: “We are excited to partner with Google and NEXTDC on this project, to help improve digital resilience here, across the country and the Indo-Pacific. Investing in digital infrastructure is helping to develop a connected, thriving and tech-ready future for the Sunshine Coast and beyond.”
Jarrod Nink, Interim CEO, Vocus: “Vocus is thrilled to have the opportunity to deepen our strategic network partnership with Google, and to play a part in establishing critical digital infrastructure for our region. Australia Connect will bolster our nation’s strategic position as a vital gateway between Asia and the United States by connecting key nodes located in Australia’s East, West, and North to global digital markets. Australia Connect will create a low latency, secure, and stable network architecture while providing added reliability for Google, our customers, and partners.”
Google is deeply committed to building a strong digital future for all Australians. In 2021, we launched the Digital Future Initiative, an AU$1 billion, five-year initiative that builds on our long-term commitment to infrastructure, local partnerships, and research capabilities. Analysys Mason estimates Google’s previous submarine cable deployments in Australia will lead to a cumulative increase in GDP of AU$98.5 billion (US$64 billion) between 2022 and 2026 and support the creation of around 68,000 additional jobs by 2027.
We look forward to sharing more as we work closely with our partners in Australia and the Indo-Pacific.
Artificial intelligence is not just a technological advancement; it’s a national security priority. In this new era, AI is both a powerful technology that can bolster any organization’s cybersecurity capabilities and also a critical part of the technology infrastructure that we need to defend and protect.
Google recently commissioned IDC to conduct a study that surveyed 161 federal CAIOs, government AI leaders and other decision makers to understand how agency leaders are leading in this new AI era. The study found that internal cybersecurity protection is currently the top AI use case for federal agencies according to 60% of those surveyed.
In terms of their key drivers for implementing AI, 62% of federal agencies surveyed identified strengthening cybersecurity as a top motivator, and 40% of federal agencies stated that protecting critical infrastructure is a key driver of their AI investments.1
Security leadership in the AI era
Agencies with high AI maturity, including the implementation of policies and standards for responsible AI practices, are leading the charge, strategically applying AI to implement security posture management tools and practices that help assess risk, monitor threats, and respond to incidents. Championing a culture of AI innovation can bolster security and drive significant mission impact.
The partnership between the CAIO, CISO, and CTO is crucial for successful AI adoption. Of the federal agencies that responded, 54% consider the CTO the most important collaborator in executing the CAIO’s initiatives, followed by the CISO at 51%. By working together, these leaders can harness the power of AI while effectively assessing risk, monitoring threats, and responding to incidents, ultimately safeguarding sensitive data and critical infrastructure.
Security and AI leadership in action: CalHEERS and Covered California
In the U.S., California, a state at the forefront of AI leadership, is spearheading the evaluation and deployment of ethical, transparent, and trustworthy AI. As part of statewide initiatives to study AI’s development, use, and risks, Covered California is applying AI technology in an effort to streamline the process of providing essential benefits to residents.
The California Healthcare Eligibility, Enrollment, and Retention System (CalHEERS) fortified its defenses, ensuring the protection of sensitive health information for millions of Californians. The team successfully implemented Google Security Operations, an intelligence-driven and AI-powered platform, to protect sensitive patient data and operating systems across a multi-cloud infrastructure.
“Google Security Operations, combined with Deloitte’s expertise, has provided us with the tools and capabilities we need to effectively navigate the complex cyber landscape and safeguard our multi-cloud environment,” said Kevin Cornish, Chief Information Officer, Covered California.
Google’s commitment to your mission with Secure AI
Google Public Sector is supporting agencies with an adaptive, secure, responsible, and intelligent way forward, protecting data and decisions with secure AI solutions and AI-powered security tools. Underscoring our dedication to this vision, we unveiled several important announcements at our recent Google Public Sector Summit, showcasing how our comprehensive approach, including Google’s robust security infrastructure, industry-leading intelligence, AI-powered capabilities, and the Secure AI Framework (SAIF), can empower agencies to innovate with AI confidently and responsibly.
Cybersecurity is critical for national and economic security – and agencies need to continuously enhance their ability to address and respond to evolving threats. Google Public Sector is dedicated to partnering with government organizations to secure their AI systems and also provide them with AI-powered security solutions.
Join us at our Google Public Sector Summit On-Demand
To delve deeper into the critical intersection of AI and Security, register for the Google Public Sector Summit On-Demand on Tuesday, December 3rd, 2024. Google Public Sector leaders will explore a range of topics including how AI can be used to enhance national security while upholding safety and responsibility standards.
¹ IDC Signature White Paper, The Chief Artificial Intelligence Officer (CAIO) Playbook: A Practical Guide for Advancing AI Innovation in Government, sponsored by Google Public Sector, Doc# US52616824, October 2024.
Welcome to the second Cloud CISO Perspectives for November 2024. Today, Monica Shokrai, head of business risk and insurance, Google Cloud, and Kimberly Goody, cybercrime analysis lead, Google Threat Intelligence Group, explore the role cyber-insurance can play in combating the scourge of ransomware.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
Ending the ransomware scourge starts with reporting, not blocking cyber-insurance
By Monica Shokrai, head of business risk and insurance, Google Cloud, and Kimberly Goody, cybercrime analysis lead, Google Threat Intelligence Group
Ransomware is wreaking havoc around the world, underscoring the need for better collective defensive action from public and private sector organizations.
Globally, ransomware continues to be a complicated and pernicious threat, according to our M-Trends 2024 report. It accounts for more than 20 percent of cyberattacks, year after year. A ransomware attack on one U.S. health insurance organization earlier this year forced hospitals and pharmacies to shut down operations for several weeks, and has cost the company an estimated $872 million so far.
The numbers paint a dire picture of the security impact of operating legacy systems:
71% said that legacy technology has left organizations less prepared for the future.
63% believe that their organization’s technology landscape is less secure than it was in the past.
More than 66% told us that their organizations are investing more time and money than ever in securing their environments — but still experience costly security incidents.
81% of organizations experience at least one security incident per year.
Organizations experience eight security incidents on average per year.
We know many security leaders have convinced the business to invest in more security tools, because the survey also found that 61% of organizations are using more security tools than they did two years ago. Yet while more than two-thirds of organizations are investing more time and money in securing their environments, many are still experiencing expensive security incidents.
Victims of these attacks are often left with the difficult decision to pay a ransom. At least $3.1 billion has been paid in ransom for more than 4,900 ransomware attacks since 2021, wrote Anne Neuberger, U.S. deputy national security adviser for cyber and emerging technology, in October — and these are only the attacks that we know of because they’ve been reported.
Law enforcement and impacted organizations have stepped up their fight against ransomware this year. Some have developed a multifaceted approach that combines strategic interventions, technological defenses, and law enforcement efforts, and so far that has proven helpful: these efforts had led to 14 law-enforcement disruptions of ransomware operations as of September.
Despite these actions, attacks continue. Defending against ransomware is so complicated that even some independent cybersecurity researchers, who had been calling for bans on insurance payments to organizations suffering from ransomware attacks, have backed down from their hard-line positions.
While solutions to the threat are complex, cyber-insurance can play a key role. Cyber-insurers can help reduce attackers’ financial gains from incidents, first and most importantly by requiring a minimum level of security standards to strengthen an organization’s defenses before approving an insurance policy.
Insurers have also been shown to reduce attackers’ financial gains by limiting or avoiding ransom payments altogether and by advising on best practices, particularly regarding backups. If a ransomware attacker demands a $2 million ransom to restore data, cyber-insurance can embolden the organization under attack to more confidently assert a counter-demand for a reduced payment, strengthening its position and helping it pay a lower sum, or none at all.
However, some believe that cyber-insurance encourages ransomware payments, and would prefer that cyber-insurance coverage for ransomware be banned. Outright bans on coverage for ransomware payments are likely to harm small businesses more than large ones: larger businesses are often better positioned to absorb the financial cost of ransomware payments on their own, while a ban would hurt smaller businesses in outsized ways.
If the ultimate goal of banning insurers from reimbursing ransomware payments is to reduce the profitability of ransomware attacks, then actions that require victims to report payments have the potential to be more impactful. Mandatory reporting could improve law enforcement tracking efforts and introduce more opportunities to recover funds even after payment is sent.
If larger companies continue to pay the ransom despite insurance not covering it, the impact of a ban on the insurance coverage becomes less meaningful. However, a more effective approach may be to incentivize the adoption of policies that improve the digital resilience of private and public-sector organizations to drive down the risks they face. As Phil and Andy wrote in the previous edition of this newsletter, this often means updating legacy IT.
One approach is to incentivize the adoption of secure by design and secure by default technologies, such as those that we develop at Google Cloud. Cowbell Cyber, a cyber-insurance firm, recently found that “businesses using Google Cloud report a 28% lower frequency of cyber incidents relative to other cloud users.” The report also found that Google Cloud exhibited the lowest severity of cyber incidents compared to other cloud service providers.
At-Bay, another cyber-insurance firm, found customers using Google Workspace experienced, on average, 54% fewer email security incidents.
There is an opportunity with AI, as well, to better scale existing anti-ransomware efforts to meet the needs of defenders. We’ve already begun to see AI have a positive impact by helping organizations grow their threat detection efforts and more efficiently address vulnerabilities before attackers can exploit them.
In your fight against ransomware, Google Cloud is here to help you every step of the way. From technology solutions and Mandiant Consulting Services, to threat intelligence insight, we can help you prepare for, protect against, and respond to ransomware attacks. You can learn more about the latest ransomware protection and containment strategies in this report.
For more leadership guidance from Google Cloud experts, please see our CISO Insights hub.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Cyber risk top 5: What every board should know: Boards should learn about security and digital transformation to better manage their organizations. Here are the five top risks they need to know. Read more.
Make IAM for GKE easier to use with Workload Identity Federation: Workload Identity Federation for GKE is now even easier to use with deeper IAM integration. Here’s what you need to know. Read more.
Shift-left your cloud compliance auditing with Audit Manager: Our Audit Manager service, which can help streamline the compliance auditing process, is now generally available. Read more.
Learn how to build a secure data platform: A new ebook, Building a Secure Data Platform with Google Cloud, details the tools available to protect your data as you use it to grow your business. Read more.
Bug hunting in Google Cloud’s VPC Service Controls: You can get rewarded for finding vulnerabilities in VPC Service Controls, which helps prevent data exfiltration. Here’s how. Read more.
Finding bugs in Chrome with CodeQL: Learn how to use CodeQL, a static analysis tool, to search for vulnerabilities in Chrome. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
Using AI to enhance red team engagements: Mandiant researchers look at several case studies that demonstrate how we can use AI to analyze data from complex adversarial emulation engagements to better defend organizations. Read more.
Empowering Gemini for malware analysis: In our latest advancements in malware analysis, we’re equipping Gemini with new capabilities to address obfuscation techniques and obtain real-time insights on indicators of compromise, by integrating the Code Interpreter extension and Google Threat Intelligence function calling. Read more.
Understanding the digital marketing ecosystem spreading pro-PRC influence operations: GLASSBRIDGE is an umbrella group of four different companies that operate networks of “fake” news sites and newswire services tracked by the Google Threat Intelligence Group. They publish thematically similar, inauthentic content that emphasizes narratives aligned to the political interests of the People’s Republic of China. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
Your top cloud IAM pet peeves (and how to fix them): Google Cloud’s Michele Chubirka, staff cloud security advocate, and Sita Lakshmi Sangameswaran, senior developer relations engineer, join host Anton Chuvakin for a deep dive into the state of Identity Access Management in the cloud, why you might be doing IAM wrong, and how to get it right. Listen here.
Behind the Binary: Motivation, community, and the future with YARA-X: Victor Manuel Alvarez, the creator of YARA, sits down with host Josh Stroschein to talk about how YARA became one of the most powerful tools in cybersecurity, and why we need a ground-up rewrite of this venerable tool. Listen here.
Behind the Binary: A look at the history of incident response, Mandiant, and Flare-On: Nick Harbour joins Josh to discuss his career journey from the Air Force to Mandiant, share insights into the evolution of malware analysis, and the development of the reverse engineering Flare-On contest. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in two weeks with more security-related updates from Google Cloud.
In the domain of software development, AI-driven assistance is emerging as a transformative force that enhances developer experience and productivity and ultimately optimizes overall software delivery performance. Many organizations have started to leverage AI-based assistants, such as Gemini Code Assist, in developer IDEs to help solve difficult problems, understand unfamiliar code, generate test cases, and handle many other common programming tasks. Based on the productivity gains experienced by individual developers in their IDEs, many organizations are looking to expand their use of generative AI technologies to other aspects of the software development lifecycle, including pull requests, code reviews, and generating release notes.
In this article, we explore how to use generative AI to enhance quality and efficiency in software delivery. We also provide a practical example of how to leverage Gemini models in Vertex AI within a continuous delivery pipeline to support code reviews and generate release notes for pull requests.
Generative AI beyond the IDE
Whilst AI-powered coding assistance within an IDE offers a significant boost to a developer’s productivity, the benefits of this technology are not limited to the direct interaction between the developer and the codebase. By expanding the use of large language models to other aspects of the software delivery lifecycle, we open up a range of new opportunities to streamline time-consuming tasks. By integrating AI capabilities within automated CI/CD pipelines, we not only free up time for developers to focus on more strategic and creative aspects of their work, but also get a chance to enhance overall code quality and detect issues early, before they make it to production environments.
The concept of using automated tooling within a CI/CD pipeline to proactively detect issues with code quality isn’t entirely new. We’ve used several forms of static code analysis for decades to identify potential errors and vulnerabilities and to enforce coding standards. However, the advances in generative AI present new opportunities that go beyond the capabilities of traditional code analysis. With their advanced language understanding and contextual awareness, large language models can provide more nuanced commentary and more grounded recommendations on how to improve a given codebase. In many cases these tools can reduce the cognitive load and labor-intensive work that human developers previously had to perform in code reviews, helping them focus on the bigger picture and the overall impact on the codebase.
This doesn’t mean that AI tools are in a position to replace trusted tools and processes altogether. As illustrated in the practical example below, these tools are most impactful when embedded alongside deterministic tools and human experts, with each performing the tasks it is best equipped for.
Ingredients for an AI-infused SDLC
To illustrate how generative AI can be used to enhance software delivery we’ll use the following products and tools:
Gemini models in Vertex AI
Gemini models are designed to process and understand vast amounts of information, enabling more accurate and nuanced responses to user prompts. With a focus on enhanced capabilities in areas like logical reasoning, coding, and creative collaboration, Gemini has revolutionized the way we are able to collaborate with AI.
Gemini can be used directly, or indirectly when it powers a packaged experience. For example, Gemini Code Assist is an end-user application built on top of the Gemini models that provides an assistant for code generation, transformation, and understanding, as mentioned above.
Developers can also directly integrate Gemini models into their own applications through Vertex AI, an end-to-end platform that lets them create, customize, manage, and scale AI applications.
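As an illustration, here is a minimal sketch of calling a Gemini model through the Vertex AI SDK for Python; the project ID, region, and model name are placeholder assumptions, not values from the example pipeline.

# Minimal sketch: prompt a Gemini model via the Vertex AI SDK.
# Project, location, and model name below are placeholder assumptions.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="europe-west1")
model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content("Summarize what a Git diff is in one sentence.")
print(response.text)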
In this example we will use Gemini in Vertex AI to build a custom extension of a CI/CD pipeline that uses Gemini’s language and text generation capabilities to provide meaningful assistance in a code review process.
Friendly CI-CD Helper
To abstract away the mechanics of interacting with the Gemini APIs in Vertex AI, and to centrally manage aspects like prompt design and how context is fed to the model, we built a small demo tool called friendly-cicd-helper. The tool can be used either as a standalone Python application or as a container that can run in a container-based CI/CD pipeline such as Cloud Build.
At its core, friendly-cicd-helper uses Gemini to analyze code changes (here in the form of a Git diff) and can generate the following outputs:
A summary of the changes to help speed up an MR/PR review
MR/PR comments on code changes to provide initial feedback to the author
Release notes for code changes
We use the friendly-cicd-helper tool as an example of how to leverage Gemini capabilities in a CI/CD pipeline. It is not an official product, and most use cases will require you to build your own implementation based on your own needs and preferences.
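To make the shape of such a tool concrete, here is a hypothetical sketch of its command-line dispatch; the actual friendly-cicd-helper may structure its argument parsing differently, so treat this as an assumption mirrored from the commands shown later in this article.

# Hypothetical sketch of the helper's CLI surface; not the tool's
# actual implementation. It dispatches to the vertex_api module
# shown later in this article.
import argparse
import lib.vertex_api as vertex

def main() -> None:
    parser = argparse.ArgumentParser(prog="friendly-cicd-helper")
    sub = parser.add_subparsers(dest="command", required=True)
    for cmd in ("vertex-code-review", "vertex-release-notes"):
        p = sub.add_parser(cmd)
        p.add_argument("--diff", required=True, help="Path to a Git diff file")
    args = parser.parse_args()
    if args.command == "vertex-code-review":
        vertex.code_review(args.diff)
    else:
        vertex.release_notes(args.diff)

if __name__ == "__main__":
    main()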
Cloud Build
Cloud Build is a fully managed, serverless CI/CD (Continuous Integration/Continuous Delivery) platform provided by Google Cloud. It allows you to automate the process of building, testing, and deploying your applications across various environments like VMs, Kubernetes, serverless platforms, and Firebase.
You can define how the above tasks are linked together in your build through a build config specification, in which each task is defined as a build step.
Your build can be linked to a source-code repository so that your source code is cloned in your workspace as part of your build, and triggers can be configured to run the build automatically when a specific event, such as a new merge request, occurs.
Example Cloud Build Pipeline with Gemini
In our example, the following Cloud Build pipeline is triggered when a developer opens a merge request in GitLab (any other repository supported by Cloud Build would work as well). The pipeline first fetches the latest version of the merge request’s source branch and executes the following steps in order:
1. The first step generates a Git diff to collect the code changes proposed in the merge request and writes it to a file. The file is persisted in the workspace mount that is shared between the steps, so that it can later be used as context for the LLM prompts. A rough Python equivalent of this step is sketched below.
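In the pipeline this step is just a git invocation in a build step; as an illustration of the logic, a Python equivalent might look like the following sketch, where the branch name and output path are assumptions.

# Rough sketch of step 1: collect the merge request's proposed changes
# as a diff and persist it to the shared /workspace volume.
# Branch name and output path are illustrative assumptions.
import subprocess

def write_merge_request_diff(target_branch: str = "origin/main",
                             out_path: str = "/workspace/diff.txt") -> None:
    # `git diff target...HEAD` shows only the changes introduced by
    # the merge request's source branch (diff against the merge base).
    diff = subprocess.run(
        ["git", "diff", f"{target_branch}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    with open(out_path, "w") as f:
        f.write(diff)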
2. Then we use Gemini to generate an automated code review of our merge request with the friendly-cicd-helper vertex-code-review --diff /workspace/diff.txt command. The model response is then appended to the GitLab merge request thread as a comment.
- id: Using Vertex AI to provide an automated MR Review
  name: 'europe-west1-docker.pkg.dev/$PROJECT_ID/tools/friendly-cicd-helper'
  entrypoint: sh
  args:
    - -c
    - |
      export VERTEX_GCP_PROJECT=$PROJECT_ID
      echo "## Automated Merge Request Review Notes (generated by Vertex AI)" | tee mergerequest-review.md
      echo "_Note that the following notes do not replace a thorough code review by an expert:_" | tee -a mergerequest-review.md

      friendly-cicd-helper vertex-code-review --diff /workspace/diff.txt | tee -a mergerequest-review.md

      cat mergerequest-review.md | friendly-cicd-helper gitlab-comment --project $_GITLAB_PROJECT --mergerequest $$(cat /workspace/gitlab_merge_request_iid)
  secretEnv: ['GITLAB_TOKEN']
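The gitlab-comment subcommand used above posts the generated text to the merge request thread. As a hypothetical sketch of what it might do (not the tool’s actual implementation), this can be achieved with the python-gitlab library:

# Hypothetical sketch of the gitlab-comment subcommand using the
# python-gitlab library; URL, token handling, and argument names
# are assumptions, not the tool's actual implementation.
import os
import sys
import gitlab

def gitlab_comment(project_id: str, mergerequest_iid: int) -> None:
    gl = gitlab.Gitlab("https://gitlab.com",
                       private_token=os.environ["GITLAB_TOKEN"])
    project = gl.projects.get(project_id)
    mr = project.mergerequests.get(mergerequest_iid)
    # Post whatever was piped to stdin as a new comment on the MR thread.
    mr.notes.create({"body": sys.stdin.read()})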
If you look at friendly-cicd-helper.py, you’ll see that the vertex_code_review function calls the code_review function from the vertex_api.py module:
def vertex_code_review(diff):
    """
    Review on a Git Diff
    """
    import lib.vertex_api as vertex
    return vertex.code_review(diff)
That function submits a prompt to Gemini to get a code review, using the Git diff as context:
def code_review(diff_path):
    """
    Generate a code review based on a Git diff.
    """
    response = model.generate_content(
        f"""
You are an experienced software engineer.
You only comment on code that you found in the merge request diff.
Provide a code review with suggestions for the most important
improvements based on the following Git diff:

${load_diff(diff_path)}
        """,
        generation_config=generation_config
    )
    print(response.text.strip())
    return response.text
3. The same pattern can be repeated to generate other artifacts, such as suggested release notes that describe the changes contained in the MR, and to append them to the same thread as a comment.
- id: Using Vertex AI to provide automated Release Notes
  name: 'europe-west1-docker.pkg.dev/$PROJECT_ID/tools/friendly-cicd-helper'
  entrypoint: sh
  args:
    - -c
    - |
      export VERTEX_GCP_PROJECT=$PROJECT_ID
      echo "## Automated Suggestions for Release Notes (generated by Vertex AI)" | tee mergerequest-release-notes.md

      friendly-cicd-helper vertex-release-notes --diff /workspace/diff.txt | tee -a mergerequest-release-notes.md

      cat mergerequest-release-notes.md | friendly-cicd-helper gitlab-comment --project $_GITLAB_PROJECT --mergerequest $$(cat /workspace/gitlab_merge_request_iid)
  secretEnv: ['GITLAB_TOKEN']
Here you can see the prompt submitted to Vertex AI from the vertex_api.py module:
def release_notes(diff_path):
    """
    Generate release notes based on a Git diff in unified format.
    """
    response = model.generate_content(
        f"""
You are an experienced tech writer.
Write short release notes in markdown bullet point format for the most important changes based on the following Git diff:

${load_diff(diff_path)}
        """,
        generation_config=generation_config
    )
    print(response.text.strip())
    return response.text
4. Lastly, our pipeline builds a container image with the updated code and deploys the application to a QA environment using Cloud Deploy, where user acceptance testing (UAT) can be executed.
- id: Build the image with Skaffold
  name: gcr.io/k8s-skaffold/skaffold
  entrypoint: /bin/bash
  args:
    - -c
    - |
      skaffold build --interactive=false --file-output=/workspace/artifacts.json --default-repo=$_REPO
- id: Create a release in Cloud Deploy and rollout to staging
  name: gcr.io/cloud-builders/gcloud
  entrypoint: 'bash'
  args:
    - '-c'
    - |
      MERGE_REQUEST_IID=$$(cat /workspace/gitlab_merge_request_iid)
      gcloud deploy releases create ledgerwriter-${SHORT_SHA} --delivery-pipeline genai-sw-delivery \
        --region europe-west1 --annotations "commitId=${REVISION_ID},gitlab_mr=$$MERGE_REQUEST_IID" --build-artifacts /workspace/artifacts.json
Seeing the pipeline in action
We will try our pipeline in the context of Bank of Anthos, a sample web app that simulates a bank’s payment processing network, allowing users to create artificial bank accounts and complete transactions.
For the purpose of this demo, we’ve modified the ledger writer service, which accepts and validates incoming transactions before writing them to the ledger. The repository fork is available here.
Starting from the existing code, we added the method below to the TransactionValidator class to obfuscate account numbers for logging purposes:
public String obfuscateAccountNumber(String acctNum) {
    String obfuscated = "";
    for (int i = 0; i < acctNum.length(); i++) {
        if (Character.isDigit(acctNum.charAt(i))) {
            obfuscated += "0";
        } else {
            obfuscated += "x";
        }
    }
    return obfuscated;
}
In addition to that, we created a new TransactionValidatorTest class and added a test for the new method.
We then open an MR in GitLab and insert the /gcbrun comment that we configured our Cloud Build trigger to require. This triggers the pipeline outlined above, which appends the AI-generated review notes as a comment in the MR thread.
The requested release note suggestions are similarly appended to the comment thread.
Summary
In this article, you saw an example of automating code reviews and release note generation using Gemini in Vertex AI.
You can try this yourself: start from the example repository and friendly-cicd-helper, tune the prompts, or implement your own script to submit prompts to Gemini in your CD pipeline.