We are excited to announce that Amazon Location Service now supports AWS PrivateLink integration, enabling customers to establish private connectivity between their VPCs and Amazon Location Service without data ever traversing the public internet.
With this new capability, customers can now access Amazon Location Service APIs through private IP addresses within their VPC, significantly enhancing their security posture. This integration simplifies network architecture by eliminating the need for internet gateways, NAT devices, or public IP addresses, while helping customers meet strict regulatory and compliance requirements by keeping all traffic within the AWS network.
Setting up AWS PrivateLink for Amazon Location Service is straightforward. Customers can create interface VPC endpoints through the AWS Management Console or AWS Command Line Interface (AWS CLI) commands. Once configured, applications can immediately begin accessing Amazon Location Service APIs using private IP addresses, with all traffic remaining secure within the AWS network.
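For example, a minimal boto3 sketch of creating an interface endpoint might look like the following. The Region, VPC, subnet, and security group IDs are placeholders, and the endpoint service name for Amazon Location Service is an assumption to verify for your Region.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",                     # placeholder VPC
    ServiceName="com.amazonaws.us-east-1.geo",         # assumed Location Service endpoint name
    SubnetIds=["subnet-0123456789abcdef0"],            # placeholder subnet
    SecurityGroupIds=["sg-0123456789abcdef0"],         # placeholder security group
    PrivateDnsEnabled=True,  # resolve the service's public DNS name to private IPs in the VPC
)
print(response["VpcEndpoint"]["VpcEndpointId"])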
Amazon Relational Database Service (Amazon RDS) for Oracle now supports the Spatial Patch Bundle (SPB) for the January 2025 Release Update (RU) for Oracle Database version 19c. This update delivers important fixes for Oracle Spatial and Graph functionality, helping ensure reliable and optimal performance for your spatial operations.
You can now create new DB instances or upgrade existing ones to engine version ‘19.0.0.0.ru-2025-01.spb-1.r1’. The SPB engine version will be visible in the AWS Console by selecting the “Spatial Patch Bundle Engine Versions” checkbox in the engine version selector, making it simple to identify and implement the latest spatial patches for your database environment.
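As an illustration, upgrading an existing instance to the SPB engine version can be done with a short boto3 call like this sketch; the instance identifier is a placeholder, and you may prefer to defer the change to the next maintenance window rather than applying it immediately.

import boto3

rds = boto3.client("rds")

# Upgrade a placeholder instance to the January 2025 SPB engine version.
rds.modify_db_instance(
    DBInstanceIdentifier="my-oracle-instance",
    EngineVersion="19.0.0.0.ru-2025-01.spb-1.r1",
    ApplyImmediately=True,  # or omit to apply during the next maintenance window
)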
To learn more about Oracle SPBs supported on Amazon RDS for each engine version, see the Amazon RDS for Oracle Release notes. For more information about the AWS Regions where Amazon RDS for Oracle is available, see the AWS Region table.
When it comes to data center power systems, batteries play an important role. The applications that run in our data centers require nearly continuous uptime. And while utility power is highly reliable, power outages are unavoidable.
When an outage happens, batteries can supply short-duration power, allowing servers to operate continuously when the facility switches between AC power sources, or to ride through transient power disturbances. Or, if a facility loses both primary and alternate power sources for an extended period of time, batteries can supply sufficient power to allow machines to execute a clean shutdown procedure. This is helpful in expediting machine restarts after the power outage. More importantly, it helps ensure that critical user data is safely stored to disk and not lost in the power disruption.
At Google, we rely on a 48Vdc rack power system with integrated battery backup units (BBUs), and in 2015, we became one of the first hyperscale data center providers to deploy Lithium-ion BBUs. These Li-ion batteries had twice the life, twice the power and half the volume of previous-generation lead-acid batteries. Switching from lead-acid batteries to Li-ion means we deploy only one-quarter the number of batteries, greatly reducing the battery waste generated by our data centers.
We recently reached an important milestone: Google has more than 100 million cells deployed in battery packs across our global data center fleet. This is remarkable, and only possible thanks to the safety-first approach we take to deploy Li-ion batteries at scale.
The main safety risk associated with Li-ion batteries is thermal runaway, which can occur if a battery is accidentally mishandled or exposed to excessive temperatures or overcharging. While thermal runaway is rare, the resulting fire is extremely difficult to extinguish: the large amount of heat generated can drive a chain reaction of thermal runaway in nearby cells.
To deploy this large fleet of Li-ion cells, we have had to make safety a core principle of our battery design. Specifically, as an early adopter of the UL9540A thermal runaway test method, we subject our Li-ion BBU designs to rigorous flame safety testing that demonstrates their ability to limit thermal runaway. As a result, Google has successfully been granted permits to deploy BBUs in some of the world’s most stringent jurisdictions in the APAC region.
In addition, our Li-ion BBUs benefit from our distributed UPS architecture that offers significant availability and TCO benefits compared to traditional monolithic UPS systems. The distributed UPS architecture improves machine availability by: 1) reducing the failure-domain blast radius to a single rack, and 2) locating the batteries in the rack to eliminate intermediate points of failure between the UPS and machines. This architecture also provides TCO benefits by scaling the UPS with the deployment, i.e., reducing day-1 UPS cost. Additionally, locating the batteries in the rack on the same DC bus as the machines eliminates intermediate AC/DC power conversion steps that cause efficiency losses. In 2016 we shared the 48V rack power system spec with the Open Compute Project, including specs for the Li-ion BBUs.
Li-ion batteries have been crucial to ensuring the uninterrupted operation of Google Cloud data centers. By transitioning from lead-acid to Li-ion BBUs, we’ve significantly improved power availability, efficiency, and lifespan, even as we simultaneously address their critical safety risks. Our commitment to rigorous safety testing and adherence to standards and test methods like UL9540A has enabled us to deploy millions of Li-ion BBUs globally, providing our customers with the high level of reliability they expect from Google Cloud.
Getting to 100 million Li-ion batteries is just one of many examples of how we are building a reliable cloud and power-efficient AI. As data center power systems evolve to include new technologies including large battery energy storage systems (BESS) and new workload requirements (AI workloads), we remain dedicated to exploring and implementing innovative solutions to build the most efficient and safest cloud data centers.
The authors would like to acknowledge Vijay Boovaragavan, Matt Tamashiro, Sandeep Sebastian, Thibault Pelloux-Gervais, Ken Wong, Mike Meakins, Stanley Fung, and Scott Sharp for their contributions.
Many specialized vector databases today require you to create complex pipelines and applications to get the data you need. AlloyDB for PostgreSQL offers Google Research’s state-of-the-art vector search index, ScaNN, enabling you to optimize end-to-end retrieval of the freshest, most relevant data with a single SQL statement.
Today, we are introducing a set of new enhancements to help you get even more out of vector search in AlloyDB. First, we are launching inline filtering, a major performance enhancement to filtered vector search in AlloyDB. One of the most powerful features in AlloyDB is the ability to perform filtered vector search directly in the database, instead of post-processing on the application side. Inline filtering helps ensure that these types of searches are fast, accurate, and efficient — automatically combining the best of vector indexes and traditional indexes on metadata columns to achieve better query performance.
Second, we are launching enterprise-grade observability and management tooling for vector indexes to help you ensure stable performance and the highest quality search results. This includes a new recall evaluator, or built-in tooling for evaluating recall, a key metric of vector search quality. That means you no longer have to build your own measurement pipelines and processes for your applications to deliver good results. We’re also introducing vector index distribution statistics, allowing customers with rapidly changing real-time data to achieve more stable, consistent performance.
Together, these launches further strengthen our mission of providing performant, flexible, high-quality end-to-end solutions for vector search that enterprises can rely on.
A review of filtered vector search in AlloyDB
Many customers start their journey with vector search trying simple search on a single column. For example, a retailer might want to perform a semantic search on product descriptions to surface the right products to match end-user queries.
SELECT * FROM product
ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck shirt')::vector
LIMIT 50;
However, very quickly, as you look to productionize these solutions and improve the quality of your results, you may find that the queries themselves get more interesting. You might iterate — add filters, perform joins with other tables, and aggregate your data. For example, the retailer might want to allow users to filter by size, price, and more.
SELECT * FROM product
WHERE category = 'shirt' AND size = 'S' AND price < 100
ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck')::vector
LIMIT 50;
AlloyDB’s PostgreSQL interface provides a strong developer experience for these types of workloads. Because vector search is integrated into the SQL interface, developers can very easily query structured and unstructured data together in a single SQL statement, as opposed to writing complex application code that pulls data from multiple sources.
Moreover, changing requirements such as adding new query filters typically don’t require schema or index updates. If our retailer, for example, wants to only show in-stock items at the end user’s local store, they can very easily join their products table with an existing store inventory table via the SQL interface.
SELECT * FROM product p
JOIN product_inventory pi ON p.id = pi.product_id
WHERE p.category = 'shirt' AND pi.inventory > 0
ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck')::vector
LIMIT 50;
All of this, and more, is possible in AlloyDB!
Inline filtering
But as a developer, you don’t just want to execute the query — you also want excellent performance and recall. To deliver the best performance, the AlloyDB query optimizer makes choices about how to execute a query with filters. Inline filtering is a new query optimization technique that allows the AlloyDB query optimizer to evaluate both the metadata filter conditions and the vector search in tandem, leveraging both vector indexes and indexes on the metadata columns. Inline filtering is now available for the ScaNN index in AlloyDB, a search technology based on over a decade of Google research into semantic search algorithms.
AlloyDB intelligently and automatically employs this technique when it’s most beneficial. Depending on the query and the distribution of the underlying data, the query planner automatically chooses the execution plan with the best performance. When filters are very selective, i.e., when a very small number of rows matches the filter, the query planner typically executes a pre-filter. This can leverage an index on a metadata column to find the small subset of rows that match the filter, and then perform a nearest-neighbor search on only those rows. Alternatively, the query planner may decide to execute a post-filter in cases of low selectivity — i.e., if a large percentage of rows match the filtered condition. Here, the query planner starts with the vector index to come up with a list of relevant candidates, and then removes results that do not match the predicates on the metadata columns.
Inline filtering, on the other hand, is best for cases with medium selectivity. As AlloyDB searches through the vector index, it only computes distances for vectors that match the metadata filter conditions. This massively improves performance for these queries, complementing the advantages of pre-filtering and post-filtering. With this feature, AlloyDB delivers strong performance across the whole range of filter selectivities when combined with vector search.
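To see which strategy the planner picks for a given query, you can create the relevant indexes and inspect the plan. The sketch below uses psycopg2 against an AlloyDB instance; the connection details, index options such as num_leaves, and the table layout are assumptions based on the examples above.

import psycopg2

# Placeholder connection details for an AlloyDB instance.
conn = psycopg2.connect(host="10.0.0.5", dbname="retail", user="postgres", password="...")
cur = conn.cursor()

# A ScaNN vector index plus a btree index on the metadata column give the
# planner all three options: pre-filter, post-filter, or inline filter.
cur.execute(
    "CREATE INDEX IF NOT EXISTS product_embedding_idx "
    "ON product USING scann (embedding cosine) WITH (num_leaves = 100);"
)
cur.execute("CREATE INDEX IF NOT EXISTS product_category_idx ON product (category);")
conn.commit()

# EXPLAIN shows which strategy the planner chose for the filtered vector search.
cur.execute("""
    EXPLAIN ANALYZE
    SELECT * FROM product
    WHERE category = 'shirt'
    ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck')::vector
    LIMIT 50;
""")
for (plan_line,) in cur.fetchall():
    print(plan_line)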
Enterprise-grade observability
If you’re running similarity search or generative AI workloads in production, you need stable performance and quality of results, just as you do for any other database workload. Observability and manageability tooling are key to achieving that.
With the new recall evaluator, built directly into the database, you can now systematically measure, and ultimately tune, search quality with a single stored procedure rather than building custom evaluation pipelines.
Recall in similarity search is the fraction of relevant instances retrieved from a search, and is the most common metric for measuring search quality. One source of recall loss is the difference between approximate nearest neighbor (aNN) search and exact k-nearest neighbor (kNN) search. Vector indexes like AlloyDB’s ScaNN implement aNN algorithms, allowing you to speed up vector search on large datasets in exchange for a small tradeoff in recall. Now, AlloyDB lets you measure this tradeoff directly in the database for individual queries and ensure that it is stable over time. You can update query and index parameters in response to this information to achieve better results and performance. This management tooling is critical if you care deeply about stable, high-quality results.
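As a sketch of what this can look like in practice, the snippet below (reusing the psycopg2 connection from the previous sketch) passes the query text to a recall-evaluation stored procedure; the function name and signature here are assumptions to verify against the AlloyDB documentation.

# Evaluate aNN recall for a specific query relative to exact kNN results.
# (evaluate_query_recall is an assumed procedure name for illustration.)
cur.execute("""
    SELECT * FROM evaluate_query_recall($$
        SELECT id FROM product
        ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck')::vector
        LIMIT 50
    $$);
""")
# If the reported recall is below target, adjust index or query parameters
# (for example, how many index partitions are probed) and re-measure.
print(cur.fetchall())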
In addition to the recall evaluator, we’re also introducing vector index distribution statistics for the ScaNN index, allowing developers to see the distribution of vectors within the index. This is particularly useful for workloads with high write throughput or data change rates. In these scenarios, new real-time data is automatically added to the index and is ready for querying right away. Now, you can monitor changes in vector index distribution and ensure that performance stays robust through these data changes.
To learn more, check out our introduction to the ScaNN for AlloyDB index, or read our ScaNN for AlloyDB whitepaper for an introduction to vector search at large, followed by a deep dive into the ScaNN algorithm and how we implemented it in PostgreSQL and AlloyDB.
The cloud is evolving fast — and that means you need to evolve fast. With the explosion of AI, it’s not enough to build skills; you have to be able to prove you have them.
As more companies pursue digital transformation and AI adoption, validating skills quickly and effectively is more critical than ever. That’s not just the message everyone is hearing from recruiters and executives — we also have new research demonstrating the impact of getting Google Cloud certified. And recognizing just how valuable certification is, we’ve got three new ones to help drive your success in 2025.
A recent Ipsos study commissioned by Google Cloud surveyed more than 3,000 cloud practitioners, students, and decision-makers and confirmed that certifications not only increase career opportunities, they also drive efficiencies for digital businesses (which is pretty much every business these days). From seasoned professionals to those just starting out, as well as the organizations for which they work, everyone benefits from validating their cloud skills with trusted certifications.
Certifications: A catalyst for confidence and career advancement
Certifications are a key driver of career growth for engineers, data scientists, and other cloud professionals. For Google Cloud learners, certifications are considered the most valuable part of their learning journey. This is supported by the Ipsos data: eight in 10 Google Cloud learners report that certification equips them with the skills needed for in-demand roles, accelerates their promotion potential, and contributes to their overall professional success when they share their credentials online.
The Ipsos research also reveals the significant impact of Google Cloud certifications on students. Empowered by certifications, students report higher salaries and faster time-to-hire. An impressive nine in 10 Google Cloud certified students say their training made them more competitive in the job market, leading to better career opportunities.
Helping cloud leaders find and build more efficient teams
Furthermore, certifications build confidence and efficiency for cloud leaders and decision-makers. Leaders from organizations using Google Cloud report that certifications significantly improve the efficiency of their cloud operations. They cite increased confidence in on-time project completion, accelerated onboarding to roles and projects, and greater confidence in a candidate’s knowledge during the hiring process. In fact, more than six in 10 leaders say one of the most important resources for cloud learners is getting certified, and approximately 70% believe certified employees are more productive.
Explore—and prepare for—the latest certifications from Google Cloud, integrated with AI concepts
To get started or take your training to the next level, you can explore the full catalog of Google Cloud certifications, which now includes these newly launched certifications:
Associate Data Practitioner Certification: This certification is a great fit for data scientists who want to validate their Google Cloud data skills and knowledge, like ensuring data is clean, secure, and usable for AI and machine learning models. Follow this learning path to prepare for the exam.
Associate Google Workspace Administrator Certification: Validate your proficiency in the core skills required to successfully manage Google Workspace environments, including effectively managing the AI-powered assistant. Follow this learning path to prepare for the exam.
Professional Cloud Architect Certification [Renewal]: Prove your skills as a professional cloud architect with this new, streamlined recertification exam, focused on the application of generative AI solutions to solve real-world business challenges. Check out the exam guide to prepare for the exam.
How certification moves the needle: Hear from certified professionals
People are already transforming their careers and supercharging their teams with the help of certifications. Hear from a Chief Technology Officer, a senior cloud architect, a risk manager and a student in information systems about the difference a certification makes in their day-to-day:
The rapid pace of AI innovation has made skills validation an imperative — for cloud professionals and the companies they call home. Learn more about all your cloud credentialing options here or explore our full suite of Google Cloud learning tools at skills.google.
CloudWatch Database Insights announces support for databases hosted on Amazon Relational Database Service (Amazon RDS). Database Insights is a database observability solution that provides a curated experience designed for DevOps engineers, application developers, and database administrators (DBAs) to expedite database troubleshooting and gain a holistic view into their database fleet health.
Database Insights consolidates logs and metrics from your applications, your databases, and the operating systems on which they run into a unified view in the console. Using its pre-built dashboards, recommended alarms, and automated telemetry collection, you can monitor the health of your database fleets and use a guided troubleshooting experience to drill down to individual instances for root-cause analysis. Application developers can correlate the impact of database dependencies with the performance and availability of their business-critical applications. This is because they can drill down from the context of their application performance view in Amazon CloudWatch Application Signals to the specific dependent database in Database Insights.
You can get started with Database Insights by enabling it on your RDS databases using the RDS service console, AWS APIs, and SDKs. Database Insights delivers database health monitoring aggregated at the fleet level, as well as instance-level dashboards for detailed database and SQL query analysis.
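For example, a hedged boto3 sketch for turning on the advanced mode of Database Insights on an existing instance might look like this; the instance identifier is a placeholder, and the parameter names and Performance Insights prerequisites should be verified against the RDS documentation.

import boto3

rds = boto3.client("rds")

# Enable advanced Database Insights on a placeholder instance. Advanced mode
# is assumed to require Performance Insights with extended retention.
rds.modify_db_instance(
    DBInstanceIdentifier="my-app-db",
    DatabaseInsightsMode="advanced",
    PerformanceInsightsEnabled=True,
    PerformanceInsightsRetentionPeriod=465,  # assumed retention for advanced mode
    ApplyImmediately=True,
)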
Today, we’re announcing that Claude 3.7 Sonnet, Anthropic’s most intelligent model to date and the first hybrid reasoning model on the market, is available in preview on Vertex AI Model Garden. Claude 3.7 Sonnet can produce quick responses or extended, step-by-step thinking that is made visible to the user. Claude 3.7 Sonnet also brings improvements in coding, and is optimized for real-world, practical use cases that reflect customers’ needs.
“Claude 3.7 Sonnet represents an exciting breakthrough as the first hybrid reasoning model, combining rapid responses and reasoning in a single model,” said Kate Jensen, Head of Revenue at Anthropic. “By making Claude 3.7 Sonnet available through Vertex AI, Google Cloud customers can now apply this transformative technology across their organizations. Whether developing complex software solutions, delivering customer experiences, or conducting strategic analysis, Claude on Vertex AI helps teams to tackle their most challenging business problems with enterprise-grade reliability.”
We’re also announcing Vertex AI support for Anthropic’s new agentic coding tool, Claude Code. Claude Code lets developers delegate coding tasks to Claude directly from their terminal and is available through Anthropic’s limited research preview. For more information on Claude 3.7 Sonnet and Claude Code, including how to access Claude Code, check out Anthropic’s blog here.
Build on a unified AI platform with Vertex AI
To explore the full potential of foundational models like Claude, you’ll need advanced development tools and enterprise-grade reliability to use them in your applications. That’s what you get with Vertex AI, which is built on Google’s AI-optimized infrastructure, stringent security, and learnings from serving 300+ real-world use cases.
Vertex AI empowers you to take your Claude-powered applications from concept to production on a unified platform. With Vertex AI’s Model-as-a-Service (MaaS) offering, you benefit from simplified procurement, fully managed infrastructure, enterprise-grade security, and advanced developer tools.
Confidently deploy agents in production: Power production-grade AI agents with Claude 3.7 Sonnet, using Vertex AI’s full suite of agentic tools and services, including RAG Engine and Agent Engine (coming soon).
Optimize performance with fully managed infrastructure: Simplify how you deploy and scale Claude 3.7 Sonnet with Vertex AI’s fully managed infrastructure that’s tailored for AI workloads.
Accelerate development with powerful MLOps tools: Explore and evaluate Claude 3.7 Sonnet with fully integrated platform tools like Vertex AI Evaluation for model testing and evaluation and the LangChain integration for custom application building.
Build with enterprise-grade security, compliance, and data governance: Leverage Google Cloud’s robust built-in security, privacy, and compliance measures to securely scale your applications. Enterprise controls, such as Vertex AI Model Garden’s organization policy, provide the right access controls to make sure only approved models can be accessed.
Additional features to make the most of Claude on Vertex AI
To enhance your interaction and deployment of Claude models on Vertex AI, including Claude 3.7 Sonnet, we also offer advanced features designed to reduce latency and costs, increase throughput, and optimize Claude model utilization:
Count tokens (generally available): Make more informed decisions about your prompts and usage by determining the number of tokens in a message before sending it to Claude. Learn more on how to use count tokens with Claude models and which models are supported here; a minimal token-counting sketch follows this list.
Citations (generally available): Verify sources with detailed references to the exact sentences and passages Claude uses to generate responses, leading to more verifiable, trustworthy outputs. Claude 3.7 Sonnet, the upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku support Citations.
Batch predictions (preview): Process large volumes of requests asynchronously for cost savings. Popular applications include analyzing large datasets—such as customer databases—for risk assessment or fraud detection, and applications that require periodic updates—such as generating daily reports. Each batch job is processed in less than 24 hours and costs 50% less than standard Anthropic API calls. Learn more on how to use batch predictions with Claude models and which models are supported here.
Prompt caching (preview): Provide Claude with more background knowledge and example outputs to improve response accuracy—all while reducing costs. You can cache all or specific parts of your frequently used inputs, so that subsequent queries can use the cached results. Learn more on how to use prompt caching with Claude models and which models are supported here.
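As a minimal sketch of the token-counting flow using Anthropic’s Vertex SDK, the snippet below counts input tokens before sending a message; the project, region, and model identifier are assumptions to verify against the Claude on Vertex AI documentation.

from anthropic import AnthropicVertex  # pip install "anthropic[vertex]"

# Assumed project, region, and Vertex model ID for Claude 3.7 Sonnet.
client = AnthropicVertex(project_id="my-project", region="us-east5")

count = client.messages.count_tokens(
    model="claude-3-7-sonnet@20250219",  # assumed model identifier
    messages=[{"role": "user", "content": "Summarize our Q4 supply-chain risks."}],
)
print(count.input_tokens)  # decide whether to trim or cache the prompt before sending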
We’re also excited to share that Claude 3.5 Haiku, which is already available on Vertex AI Model Garden, now supports multi-modal image input. Claude 3.5 Haiku is Anthropic’s fastest and most cost-effective model.
Customers are driving business results with Anthropic on Google Cloud
AES, a global energy company, uses Claude on Vertex AI to significantly increase the accuracy and speed of the company’s health and safety audits:
“Our auditors previously spent 14 days completing each audit process. Now, with our Claude-powered agents on Vertex AI, the same work is completed in just one hour. I love the accuracy of Anthropic’s Claude models and the security and advanced AI tools that Google Cloud provides to utilize these models for our auditing process.” — Sean Otto, Senior Director of Data Science & Analytics at AES
Palo Alto Networks, a global cybersecurity company, is accelerating software development and security by deploying Anthropic’s Claude models on Vertex AI:
“With Claude running on Vertex AI, we saw a 20% to 30% increase in feature development and code implementation. Running Claude on Google Cloud’s Vertex AI not only accelerates development projects, it enables us to hardwire security into code before it ships.” — Gunjan Patel, Director of Engineering, Office of the CPO at Palo Alto Networks
Quora, the global knowledge-sharing platform, is harnessing Claude’s capabilities on Vertex AI to facilitate millions of daily interactions through Quora’s own AI-powered chat platform, Poe:
“We consistently hear from our users about how much they enjoy the intelligence, adaptability, and natural conversational abilities of Anthropic’s Claude models. They’re relying on these qualities for a wide variety of tasks, from the complex to the creative. By leveraging Claude with Vertex AI’s secure and scalable platform, we’re able to facilitate millions of daily interactions, ensuring both speed and reliability.” — Spencer Chan, Product Lead at Poe by Quora
Replit, a platform for software development and deployment, leverages Claude on Vertex AI to power Replit Agent, which empowers people across the world to use natural language prompts to turn their ideas into applications, regardless of coding experience.
“Our AI agent is made more powerful through Anthropic’s Claude models running on Vertex AI. This integration allows us to easily connect with other Google Cloud services, like Cloud Run, to work together behind the scenes to help customers turn their ideas into apps.” — Amjad Masad, Founder and CEO of Replit
Amazon Verified Permissions now supports the same JSON format for entity and context data as the Cedar SDK. Developers can use this simpler format in authorization requests. This aligns the Amazon Verified Permissions API more closely with the open-source Cedar SDK and simplifies moving from the SDK to Amazon Verified Permissions or vice versa.
Amazon Verified Permissions is a permissions management and fine-grained authorization service for the applications that you build. Using Cedar, an expressive and analyzable open-source policy language, developers and admins can define policy-based access controls using roles and attributes for more granular, context-aware access control. For example, an HR application might call Amazon Verified Permissions (AVP) to determine if Alice is permitted to access Bob’s performance evaluation, given that she is in the HR Managers group. Customers can use Cedar JSON format to pass entity data describing the principal (Alice) and the resource (Bob’s performance evaluation).
This change is available in all AWS Regions where Amazon Verified Permissions is supported. The service continues to support the old format, so the change does not break existing application integrations. To learn more about using the Cedar JSON format, see Cedar JSON entity in the Cedar user guide and the Verified Permissions user guide. To learn more about Amazon Verified Permissions, visit the product page.
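To illustrate the HR example above, the sketch below passes Cedar JSON entity data to an authorization check with boto3; the policy store ID, entity types, and action are placeholders, and the exact request shape should be checked against the Verified Permissions API reference.

import boto3, json

avp = boto3.client("verifiedpermissions")

# Cedar JSON entity format: a uid, attrs, and parents for each entity.
entities = json.dumps([
    {"uid": {"type": "User", "id": "alice"},
     "attrs": {},
     "parents": [{"type": "Group", "id": "HRManagers"}]},
    {"uid": {"type": "Document", "id": "bob-performance-eval"},
     "attrs": {"owner": "bob"},
     "parents": []},
])

response = avp.is_authorized(
    policyStoreId="PSEXAMPLEabcdefg111111",          # placeholder policy store
    principal={"entityType": "User", "entityId": "alice"},
    action={"actionType": "Action", "actionId": "viewPerformanceEval"},
    resource={"entityType": "Document", "entityId": "bob-performance-eval"},
    entities={"cedarJson": entities},                # the newly supported Cedar JSON format
)
print(response["decision"])  # ALLOW or DENY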
Anthropic’s Claude 3.7 Sonnet hybrid reasoning model, their most intelligent model to date, is now available in Amazon Bedrock. Claude 3.7 Sonnet represents a significant advancement in AI capabilities, offering both quick responses and extended, step-by-step thinking made visible to the user. This new model includes strong improvements in coding and brings enhanced performance across various tasks, like instruction following, math, and physics.
Claude 3.7 Sonnet introduces a unique approach to AI reasoning by integrating it seamlessly with other capabilities. Unlike traditional models that separate quick responses from those requiring deeper thought, Claude 3.7 Sonnet allows users to toggle between standard and extended thinking modes. In standard mode, it functions as an upgraded version of Claude 3.5 Sonnet, while in extended thinking mode, it employs self-reflection to achieve improved results across a wide range of tasks. Amazon Bedrock users can adjust how long the model thinks, offering a flexible trade-off between speed and answer quality. Additionally, users can control the reasoning budget by specifying a token limit, enabling more precise management of cost.
Anthropic has optimized Claude 3.7 Sonnet for real-world applications that align closely with typical language model use cases, rather than focusing solely on math and computer science competition problems. This approach ensures that the model is well-suited to address the diverse needs of customers across various industries and use cases.
Claude 3.7 Sonnet is now available in Amazon Bedrock in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions. To get started, visit the Amazon Bedrock console, then integrate the model into your applications using the Amazon Bedrock API or SDK. To learn more, read the AWS News Blog and the Claude in Amazon Bedrock product detail page.
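A minimal sketch of toggling extended thinking with a reasoning budget through the Bedrock API follows; the model ID and request field names are assumptions to verify against the Bedrock documentation.

import boto3, json

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 4096,
    # Extended thinking with an explicit token budget (assumed field names).
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [{"role": "user", "content": "Walk me through the physics of a siphon."}],
}

response = bedrock.invoke_model(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # assumed inference profile ID
    body=json.dumps(body),
)
result = json.loads(response["body"].read())

# The response is assumed to interleave visible "thinking" blocks with final "text" blocks.
for block in result["content"]:
    print(block["type"], ":", block.get("text", block.get("thinking", "")))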
AWS WAF now offers enhanced AWS Service Quotas integration, enabling organizations to proactively monitor and manage quotas for their cloud deployments.
AWS WAF is a web application firewall that helps protect your web applications or APIs against common web exploits and bots that may affect availability, compromise security, or consume excessive resources. By leveraging AWS Service Quotas, you can quickly understand your applied service quota values for these WAF resources and request increases when needed. This enhanced integration brings three key benefits. First, you can now monitor the current utilization of your account-level quotas for WAF resources such as web ACLs, rule groups, and IP sets in the Service Quotas console. Second, certain service quota increase requests will now be auto-approved, enabling customers to access higher quotas faster. For example, smaller increases are usually automatically approved while larger requests are submitted to AWS Support. Lastly, you can now create Amazon CloudWatch alarms to notify you when your utilization of a given quota exceeds a configurable threshold. This enables you to better adapt your utilization based on your applied quota values and automate your quota increase requests.
You can access AWS Service Quotas through the AWS console, AWS APIs, and the CLI. Integration with AWS Service Quotas is available in all AWS Regions where AWS WAF is offered. You can learn more about AWS WAF by visiting the Developer Guide.
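For example, a short boto3 sketch for listing WAF quota values and requesting an increase might look like the following; the service code and quota code are assumptions to confirm in the Service Quotas console.

import boto3

sq = boto3.client("service-quotas")

# List applied quotas for AWS WAF (assumed service code "wafv2").
for quota in sq.list_service_quotas(ServiceCode="wafv2")["Quotas"]:
    print(quota["QuotaName"], quota["Value"])

# Request an increase for a specific quota (placeholder quota code).
sq.request_service_quota_increase(
    ServiceCode="wafv2",
    QuotaCode="L-EXAMPLE1",
    DesiredValue=200,
)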
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C7gd instances with up to 3.8 TB of local NVMe-based SSD block-level storage are available in the AWS GovCloud (US-East) Region.
These Graviton3-based instances with DDR5 memory are built on the AWS Nitro System and are a great fit for applications that need access to high-speed, low-latency local storage, including those that need temporary storage of data for scratch space, temporary files, and caches. They offer up to 45% better real-time NVMe storage performance than comparable Graviton2-based instances. Graviton3-based instances also use up to 60% less energy than comparable EC2 instances for the same performance, enabling you to reduce your carbon footprint in the cloud.
C7gd instances are now available in the following AWS regions: US East (N. Virginia, Ohio), US West (Oregon, N. California), Europe (Spain, Stockholm, Ireland, Frankfurt), Asia Pacific (Tokyo, Mumbai, Singapore, Sydney, Malaysia) and AWS GovCloud (US-East).
Distributed tracing is a critical part of an observability stack, letting you troubleshoot latency and errors in your applications. Cloud Trace, part of Google Cloud Observability, is Google Cloud’s native tracing product, and we’ve made numerous improvements to the Trace explorer UI on top of a new analytics backend.
The new Trace explorer page contains:
A filter bar with options for users to choose a Google Cloud project-based trace scope, all/root spans and a custom attribute filter.
A visualization of matching spans including an interactive span duration heatmap (default), a span rate line chart, and a span duration percentile chart.
A table of matching spans that can be narrowed down further by selecting a cell of interest on the heatmap.
A tour of the new Trace explorer
Let’s take a closer look at these new features and how you can use them to troubleshoot your applications. Imagine you’re a developer working on the checkoutservice of a retail webstore application and you’ve been paged because there’s an ongoing incident.
This application is instrumented using OpenTelemetry and sends trace data to Google Cloud Trace, so you navigate to the Trace explorer page on the Google Cloud console with the context set to the Google Cloud project that hosts the checkoutservice.
Before starting your investigation, you remember that your admin recommended using the webstore-prod trace scope when investigating webstore app-wide prod issues. By using this Trace scope, you’ll be able to see spans stored in other Google Cloud projects that are relevant to your investigation.
You set the trace scope to webstore-prod and your queries will now include spans from all the projects included in this trace scope.
You select checkoutservice in Span filters (1) and the following updates load on the page:
Other sections such as Span name in the span filter pane (2) are updated with counts and percentages that take into account the selection made under service name. This can help you narrow down your search criteria to be more specific.
The span Filter bar (3) is updated to display the active filter.
The heatmap visualization (4) is updated to only display spans from the checkoutservice in the last 1 hour (default). You can change the time-range using the time-picker (5). The heatmap’s x-axis is time and the y-axis is span duration. It uses color shades to denote the number of spans in each cell with a legend that indicates the corresponding range.
The Spans table (6) is updated with matching spans sorted by duration (default).
Other Chart views (7) that you can switch to are also updated with the applied filter.
From looking at the heatmap, you can see that there are some spans in the >100s range which is abnormal and concerning. But first, you’re curious about the traffic and corresponding latency of calls handled by the checkoutservice.
Switching to the Span rate line chart gives you an idea of the traffic handled by your service. The x-axis is time and the y-axis is spans/second. The traffic handled by your service looks normal as you know from past experience that 1.5-2 spans/second is quite typical.
Switching to the Span duration percentile chart gives you p50/p90/p95/p99 span duration trends. While p50 looks fine, the p9x durations are greater than you expect for your service.
You switch back to the heatmap chart and select one of the outlier cells to investigate further. This particular cell has two matching spans with a duration of over 2 minutes, which is concerning.
You investigate one of those spans by viewing the full trace and notice that the orders publish span is the one taking up the majority of the time when servicing this request. Given this, you form a hypothesis that the checkoutservice is having issues handling these types of calls. To validate your hypothesis, you note the rpc.method attribute being PlaceOrder and exit this trace using the X button.
You add an attribute filter for key: rpc.method value:PlaceOrder using the Filter bar, which shows you that there is a clear latency issue with PlaceOrder calls handled by your service. You’ve seen this issue before and know that there is a runbook that addresses it, so you alert the SRE team with the appropriate action that needs to be taken to mitigate the incident.
Share your feedback with us via the Send feedback button.
Behind the scenes
This new experience is powered by BigQuery, using the same platform that backs Log Analytics. We plan to launch new features that take full advantage of this platform: SQL queries, flexible sampling, export, and regional storage.
In summary, you can use the new Cloud Trace explorer to perform service-oriented investigations with advanced querying and visualization of trace data. This allows developers and SREs to effectively troubleshoot production incidents and identify mitigating measures to restore normal operations.
The new Cloud Trace explorer is generally available to all users — try it out and share your feedback with us via the Send feedback button.
From transcribing customer calls and meetings, to analyzing research interviews and creating accessible content, audio transcription plays a vital role in extracting insights from spoken data. Our partners are collaborating with clients across industries to implement transcription solutions that enhance efficiency, accessibility, and data-driven decision-making.
Traditional audio transcription methods, such as manual transcription or basic speech-to-text tools, can be time-consuming, error-prone, and expensive. In this blog post, we show how Gemini offers a cutting-edge solution for scalable audio transcription by automating the process and delivering highly accurate results quickly – all in a cost-effective way.
The challenges of scaling audio transcription
As organizations scale their transcription needs, they might encounter challenges such as increasing costs, latency in handling large volumes of audio, and maintaining accuracy across diverse audio conditions. In particular, legacy solutions struggle with:
Handling complex audio with multiple speakers, accents, or background noise.
Maintaining accuracy in industry-specific terminology across healthcare, legal, and customer service domains.
Adapting to multilingual needs, especially in global business environments.
Optimizing processing time and cost, ensuring fast turnaround without excessive resource consumption.
A scalable solution must address these challenges efficiently, without compromising speed, accuracy, or customization — this is where Gemini excels.
How our partners put Gemini to work
Google Cloud Partners leverage audio transcription to help clients across various industries improve efficiency, compliance, and accessibility. Here are some examples:
Media and entertainment: Transcribe interviews, podcasts, and webinars for content creation, and generate subtitles to enhance accessibility and engagement.
Legal and compliance: Transcribe legal proceedings, contracts, and compliance-related communications to improve accuracy, streamline case management, and ensure regulatory adherence.
Healthcare: Convert medical dictations and clinical notes into structured records for better documentation, electronic health record (EHR) integration, and regulatory compliance.
Business and corporate: Transcribe meetings, interviews, and presentations to improve collaboration, knowledge sharing, and record-keeping.
Gemini redefines the possibilities of scalable audio transcription, offering a potent combination of advanced AI and seamless integration with Google Cloud. Here’s what sets it apart:
Efficient processing of large datasets: Gemini can handle large volumes of audio data with ease, making it ideal for organizations with high-throughput transcription needs.
Exceptional accuracy and contextual understanding: Backed by decades of Google research and development in speech recognition and natural language understanding, Gemini delivers highly accurate transcriptions that capture the nuances of conversations. This minimizes the need for manual review and correction, especially in cases with multiple speakers, accents, or challenging audio conditions.
Speaker diarization: Gemini can accurately identify and differentiate between speakers in an audio file, making it easier to follow conversations and attribute dialogue correctly.
Multilingual support: Gemini supports transcription in multiple languages and dialects, expanding its utility for global businesses and diverse content.
Customizable formatting: Gemini offers flexible formatting options, allowing users to tailor transcripts to their specific needs, including timestamps, speaker labels, and punctuation.
Introducing a differentiated solution
The Google Cloud Partner Engineering team worked together with System Integrators (SIs) to build a differentiated solution that allows customers to implement audio transcription at scale using Google’s Gemini on Google Cloud.
Gemini’s advanced multi-modal and reasoning capabilities have unlocked new possibilities for audio transcription. This solution allows audio files to be sent directly to Gemini for transcription. The reference architecture below illustrates how the solution is built:
gen AI powered audio transcription reference architecture
This architecture demonstrates a robust and scalable approach to audio transcription using Gemini. It can be modified for any audio transcription use case. Here’s how it works:
1. File upload and sorting: The Upload Cloud Storage bucket stores source audio files such as .wav, .mp3, and .mp4. When these files are uploaded, Eventarc triggers the Sort Cloud Run function, with the trigger event passed through Cloud Pub/Sub.
The Sort Cloud Run function (sketched in code after this walkthrough) manages incoming files by sorting and filtering them based on their file types (e.g., .wav, .mp3). Depending on the file type, the files are then stored in either the Recordings Cloud Storage bucket or the Archive Cloud Storage bucket.
2. Transcription: When audio files land in the Recordings Cloud Storage bucket, Eventarc uses Cloud Pub/Sub to trigger the Recording Cloud Run function, which sends the audio files to the Gemini 1.5 Flash model for transcription.
3. Gemini’s multi-faceted processing: Gemini performs three key tasks:
a. Analysis and formatting: It analyzes the audio file, extracting pertinent data and structuring it into JSON format based on the audio file schema.
b. Transcription and summarization: Gemini transcribes the audio content into text and generates a concise summary.
c. Output and evaluation: The summarized text is sent to a “TTS Output” Cloud Storage bucket, triggering the TTS Audio Generation function. This function executes a script from the “Golden Script” Cloud Storage bucket to generate sample audio, which is then used to evaluate the transcription quality against established metrics like Word Error Rate (WER), Character Error Rate (CER), Match Rate, etc.
This approach provides key benefits: dynamic scaling through a serverless, event-driven architecture (Cloud Run, Eventarc), simplified management via fully managed services (Cloud Storage), cost-effectiveness by consuming resources only when needed, and enhanced capabilities like advanced summarization and speaker diarization powered by Gemini.
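As a concrete, hedged illustration of the sort-and-route step described above, a CloudEvents-triggered Cloud Run function might look like the following sketch; the bucket names and the set of audio extensions are assumptions.

import functions_framework
from google.cloud import storage

AUDIO_EXTENSIONS = {".wav", ".mp3", ".mp4"}   # assumed supported types
RECORDINGS_BUCKET = "recordings-bucket"       # placeholder bucket names
ARCHIVE_BUCKET = "archive-bucket"

@functions_framework.cloud_event
def sort_file(cloud_event):
    """Route an uploaded file to the recordings or archive bucket by extension."""
    data = cloud_event.data
    name = data["name"]
    extension = "." + name.rsplit(".", 1)[-1].lower()

    client = storage.Client()
    source_bucket = client.bucket(data["bucket"])
    target = RECORDINGS_BUCKET if extension in AUDIO_EXTENSIONS else ARCHIVE_BUCKET
    source_bucket.copy_blob(source_bucket.blob(name), client.bucket(target), name)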
Design considerations
When designing audio transcription applications and services on Google Cloud with Gemini, several factors are crucial for optimal performance and scalability:
1. Efficient audio file handling: Avoid loading large audio files directly into memory for serverless transcription on Google Cloud. Instead, use a Google Cloud Storage URI to efficiently access and process audio without memory limitations.
2. Serverless function timeouts: To prevent premature termination when processing large audio files in Cloud Run, increase the function timeout up to 60 minutes. Also set the Pub/Sub subscription acknowledgement deadline to 300 seconds for Eventarc.
3. Model selection and context window: For gen AI audio transcription, audio file size and duration dictate model selection. Larger files and longer audio require models with large context windows like Gemini 1.5 Flash (1M tokens) and Gemini 1.5 Pro (2M tokens), overcoming the input limitations of earlier LLMs on the market today. Gemini 1.5’s extended context window and near-perfect retrieval capabilities open up many new possibilities:
Context lengths of leading foundation models
For audio transcription use cases, this means Gemini 1.5 Pro and Flash offer scalable transcription, processing up to 22 and 11 hours of audio respectively, based on customer needs.
4. Diarization prompting:
a. Use the latest Gemini SDK: Ensure your code utilizes the most up-to-date SDK for optimal diarization performance.
b. Design effective prompts: Craft prompts that clearly instruct Gemini on diarization and formatting requirements. The diagram below shows a code example of a diarization prompt.
Sample transcription & diarization prompt
This sample code prompts Gemini to transcribe an audio file from a Cloud Storage URI and displays the transcription.
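Since the original example is shown as a screenshot, here is a comparable sketch using the Vertex AI SDK; the project, location, model version, bucket path, and prompt wording are all illustrative assumptions.

import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-flash-002")

# Reference the audio by its Cloud Storage URI instead of loading it into memory.
audio = Part.from_uri("gs://recordings-bucket/call-0001.mp3", mime_type="audio/mpeg")
prompt = (
    "Transcribe this audio. Label each speaker as Speaker A, Speaker B, and so on, "
    "add a timestamp at every speaker change, and end with a two-sentence summary."
)

response = model.generate_content([audio, prompt])
print(response.text)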
5. Advanced diarization techniques: For complex scenarios with multiple speakers, accents, or overlapping speech, design prompts carefully to improve Gemini’s diarization accuracy. Consider separating diarization and transcription into separate functions; the snippet below shows an example of this.
Separate transcription & diarization function
From the screenshot above, the function highlighted in the red box contains the prompt that instructs Gemini to transcribe, and shows how we want the transcription to be formatted. This lets Gemini focus first on transcribing the audio into text and summarizing it.
The transcription function itself is straightforward and uses a zero-shot prompt. For the diarization function, we recommend designing your prompt with a few short examples. The code block highlighted in blue shows the diarization function with some examples that help the model diarize effectively and efficiently when there are multiple speakers.
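In the same spirit, a hedged sketch of a separate diarization call with a few short in-prompt examples might look like this, reusing the model and audio objects from the previous sketch; the output format and examples are assumptions.

# Few-shot examples show the model the exact diarization format we expect.
diarization_prompt = """Identify each distinct speaker in this audio and label their turns.
Format each line as: [mm:ss] Speaker <letter>: <text>

Examples of the expected format:
[00:04] Speaker A: Thanks for calling, how can I help?
[00:07] Speaker B: Hi, I'm calling about my order.
"""

diarized = model.generate_content([audio, diarization_prompt])
print(diarized.text)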
6. Evaluating transcription quality: When building gen AI-powered audio transcription systems on Google Cloud, we recommend implementing a mechanism to evaluate the transcribed responses to further ensure accuracy. Consider using tools like our Model Evaluation Service to assess and improve transcription quality.
Get started
Ready to unlock the power of scalable audio transcription with Gemini? Explore Gemini’s API documentation and discover how easy it is to integrate its advanced capabilities into your solutions. By implementing the best practices and design considerations discussed in this post, you can deliver exceptional transcription experiences to clients and drive innovation across various industries.
If you are an approved partner and require assistance, contact your Google Partner Engineer for deployment support.
Written by: Ashley Pearson, Ryan Rath, Gabriel Simches, Brian Timberlake, Ryan Magaw, Jessica Wilbur
Overview
Beginning in August 2024, Mandiant observed a notable increase in phishing attacks targeting the education industry, specifically U.S.-based universities. A separate investigation conducted by Google’s Workspace Trust and Safety team identified a long-term campaign, dating back to at least October 2022, with a noticeable pattern of shared filenames, targeting thousands of educational institution users per month.
These attacks exploit trust within academic institutions to deceive students, faculty, and staff, and have been timed to coincide with key dates in the academic calendar. The beginning of the school year, with its influx of new and returning students combined with a barrage of administrative tasks, as well as financial aid deadlines, can create opportunities for attackers to carry out phishing attacks. In these investigations, three distinct campaigns have emerged, attempting to take advantage of these factors.
In one campaign, attackers used compromised educational institutions to host phishing pages built on Google Forms. Mandiant has observed at least 15 universities targeted in these phishing campaigns. In this case, the malicious forms were reported and subsequently removed, and none of the identified phishing forms remain active. Another campaign involved scraping university login pages and re-hosting them on attacker-controlled infrastructure. Both campaigns exhibited tactics to obfuscate malicious activity while increasing their perceived legitimacy, ultimately to perform payment redirection attacks. These phishing methods employ various tactics to trick victims into revealing login credentials and financial information, including requests for school portal login verification, financial aid disbursement, refund verification, account deactivation, and urgent responses to campus medical inquiries.
Google takes steps to protect users from misuse of its products and to create an overall positive experience. However, awareness and education play a big role in staying secure online. To better protect yourself and others, be sure to report abuse.
Case Study 1: Google Forms Phishing Campaign
The first observed campaign was a two-pronged phishing operation. Attackers distributed phishing emails that contained a link to a malicious Google Form. These emails and their respective forms were designed to mimic legitimate university communications, but requested sensitive information, including login credentials and financial details.
Figure 1: Example phishing email
Figure 2: Another example phishing email
The email is just the initial stage of the attack. While there are legitimate URLs contained within the phish, there is also a request to visit an external link to provide “urgent” information. This external link leads victims to a Google Form that has been tailored to the targeted university, including a color scheme in the school colors, a header with the logo or mascot, and references to the university name. Mandiant has observed the creation and staging of several different Google Forms, all with different methods employed to trick victims into providing sensitive information. In one instance, the social engineering pretext is that a student’s account is “associated with logins from two separate university portals”, a conflict which, if not resolved, will lead to interruption in service at both universities.
Figure 3: Example Google Form phish
These Google Forms phishing campaigns are not just limited to targeting login credentials. In several instances, Mandiant observed threat actors attempting to obtain financial institution details.
Your school has collaborated with <redacted> to streamline fund
distribution to students. <redacted> ensures the quickest, most
dependable, and secure method for disbursing Emergency Grants
to eligible students. Unfortunately, we've identified an outstanding
issue regarding the distribution of your financial aid through <redacted>.
We kindly request that you review and, if necessary, update your
<redacted> information within the net 24 hours. Failing to address
this promptly may result in delays in receiving your funds.
Figure 4: Example Google Form phish
After compromising a university environment and propagating additional phishes from it, the threat actor then uses the victim’s infrastructure to host a similar campaign targeting future victims. In some cases, a Google Form link that had been shut down was repurposed to further the attacker’s objectives.
Case Study 2: Website Cloning and Redirection
This campaign involved a sophisticated phishing attack in which threat actors cloned a university website, mimicking the legitimate login portal. The cloned website, however, triggered a series of redirects specifically targeting mobile devices.
The embedded JavaScript performs a “mobile check” and user-agent string verification, then executes the following hex-encoded redirect:
if (window.mobileCheck()) {
    window.location.href = "\x68\x74\x74\x70\x3a\x2f\x2f\x63\x75\x74\x6c\x79\x2e\x74\x6f\x64\x61\x79\x2f\x4a\x4e\x78\x30\x72\x37";
}
Figure 5: JavaScript hex-encoded redirect
This JavaScript checks whether the user is on a mobile device. If they are, it redirects them to one of several possible follow-on URLs, including these two examples:
hxxp://cutly[.]today/JNx0r7
hxxp://kutly[.]win/Nyq0r4
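To see how the hex string in Figure 5 maps to the first of these URLs, a small decoding snippet (illustrative only, with the backslashes stripped for safety) works the encoded text back into readable characters:

# Decode the escaped hex string from Figure 5 into a plain URL.
encoded = "x68x74x74x70x3ax2fx2fx63x75x74x6cx79x2ex74x6fx64x61x79x2fx4ax4ex78x30x72x37"
url = "".join(chr(int(pair, 16)) for pair in encoded.split("x")[1:])
print(url)  # -> http://cutly.today/JNx0r7, defanged above as hxxp://cutly[.]today/JNx0r7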
Case Study 3: Two-Step Phishing Campaign Targeting Staff and Students
Google’s Workspace Trust and Safety team also observed a two-step phishing campaign targeting staff and students. First, attackers send a phishing email to faculty and staff. The emails are designed to entice faculty and staff to provide their login credentials in order to view a document about a raise or bonus.
Figure 6: Example of phishing email targeting faculty and staff
Next, attackers use the login credentials provided by faculty and staff to hijack their accounts and email phishing forms to students. These forms are designed to look like job applications, and phish for personal and financial information.
Figure 7: Example of phishing form emailed to students
Understanding Payment Redirection Attacks
Payment redirection attacks via Business Email Compromise (BEC) are a sophisticated form of financial fraud. In these attacks, cyber threat actors gain unauthorized access to a business email account and exploit it to redirect payments meant for legitimate recipients into their own accounts. While these attacks often involve the diversion of large transfers, there have been instances where attackers divert small amounts (typically 5-10%) to lower the likelihood of detection. This tactic allows them to steal funds gradually, making unauthorized transactions more challenging to detect.
Figure 8: Payment redirection attacks
Initial Compromise: Attackers often begin by gaining access to a legitimate email account through phishing, social engineering, or exploiting vulnerabilities. A common phishing technique involves using online surveys or other similar platforms to create convincing but fraudulent login pages or forms. When unsuspecting employees enter their credentials, attackers capture them and gain unauthorized access.
Reconnaissance: Once they have access to the email account, attackers closely monitor communications to understand the organization’s financial processes, the relationships with vendors, and the typical language used in financial transactions. This reconnaissance phase is crucial for the attackers to craft convincing fraudulent emails that appear authentic to their victims.
Impersonation and Execution: Armed with the information gathered during reconnaissance, attackers impersonate the compromised user or create look-alike email addresses. The threat actor then sends emails to employees, vendors, or clients, instructing them to change payment details for an upcoming transaction. Believing these requests to be legitimate, recipients comply, and the funds are redirected to accounts controlled by the attackers.
Withdrawal and Laundering: After the funds are diverted, attackers quickly withdraw or move the money across multiple accounts to make recovery difficult. The types of funds being stolen can vary widely and include financial aid such as FAFSA, refunds, scholarships, payroll, and other large transactions like vendor payments or grants. This diversity in targeted funds complicates efforts by organizations and law enforcement to trace and recover the stolen money, as each category may involve different institutions and processes.
The Impact of Payment Redirection Attacks
The consequences of a successful payment redirection attack can be severe:
Financial Losses: Organizations may lose substantial amounts of money, potentially running into millions of dollars, depending on the size of the transactions.
Reputational Damage: Clients and partners affected by these attacks may lose trust in the organization, which can harm long-term business relationships and brand reputation.
Operational Disruption: The aftermath of an attack often involves extensive investigations, coordination with financial institutions and law enforcement, and implementing enhanced security measures, all of which can disrupt normal business operations.
Mitigating Payment Redirection Attacks
To protect against payment redirection attacks, Mandiant recommends a multi-layered approach focusing on prevention, detection, and response:
Implement Multi-Factor Authentication (MFA): Requiring MFA for accessing email accounts adds an additional layer of security. Even if an attacker obtains a user’s credentials, they would still need the second factor to gain access, significantly reducing the risk of account compromise. Mandiant has observed many universities that require MFA for current faculty, staff, and students, but not for alumni accounts. While alumni accounts aren’t necessarily at risk of payment redirection attacks, Mandiant has identified instances where alumni accounts were leveraged to access other user accounts in the environment.
Conduct Employee Training: Regular training sessions can help employees recognize phishing attempts and suspicious emails. Training should emphasize vigilance against phishing forms hosted on platforms like Google Forms, and stress the importance of verifying unusual requests, especially those involving financial transactions or changes in payment details. If a Google Forms page seems suspicious, report it as phishing.
Establish Payment Verification Protocols: Organizations should have strict procedures for verifying changes in payment information. For example, a policy that requires confirmation of changes via a known phone number or a separate communication channel can help ensure that any alterations are legitimate.
Use Canary Tokens for Detection: Deploying canary tokens, which are unique identifiers embedded in web pages or documents, can serve as an early warning system. If attackers scrape legitimate web pages to host them maliciously on their own infrastructure, these tokens trigger alerts, notifying security teams of potential compromise or unauthorized data access (a minimal sketch follows this list).
Use Advanced Email Security Solutions: Deploying advanced email filtering and monitoring solutions can help detect and block malicious emails. These tools can analyze email metadata, check for domain anomalies, and identify patterns indicative of BEC attempts.
Use Built-in Protections in Gmail: Gmail employs AI, threat signals, and Safe Browsing to block more than 99.9% of spam, phishing, and malware, while also detecting more malware than traditional antivirus and preventing suspicious account sign-ins.
Develop a robust Incident Response Plan: A well-defined incident response plan specifically addressing BEC scenarios enables organizations to act swiftly when an attack is detected. This plan should include procedures for containing the breach, notifying affected parties, and collaborating with financial institutions and law enforcement to recover lost funds.
Limit the number of emails a standard user can send in a day: Implementing a policy that restricts how many emails a standard user can send daily provides an additional safeguard against the mass dissemination of phishing emails or malicious content from compromised accounts. This limit acts as a safety net, reducing the potential impact of a compromised account and making it harder for attackers to carry out large-scale phishing campaigns.
Context-Aware Access Monitoring: Utilize context-aware access monitoring to enhance security by analyzing the context of each login attempt, including factors such as the user’s location, device, and behavior patterns. If an access attempt deviates from established norms, such as an unusual login location or device, additional verification steps can be triggered. This helps detect and prevent unauthorized access, particularly in cases where credentials may have been compromised (a simple scoring sketch also follows below).
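To illustrate the canary-token recommendation above, here is a minimal, hypothetical sketch in Python: a tiny HTTP endpoint that serves a tracking pixel embedded in the legitimate login page. When a cloned copy of the page is served from attacker infrastructure, the pixel request arrives with an unexpected Referer header and fires an alert. The domain, port, and alerting mechanism are placeholder assumptions:

from http.server import BaseHTTPRequestHandler, HTTPServer

LEGIT_ORIGIN = "https://login.university.edu"  # hypothetical legitimate portal

class CanaryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        referer = self.headers.get("Referer", "")
        # A page embedding this pixel that is served from anywhere other
        # than the legitimate portal has likely been cloned.
        if not referer.startswith(LEGIT_ORIGIN):
            print(f"[ALERT] canary fired: referer={referer!r} "
                  f"client={self.client_address[0]}")
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.end_headers()
        self.wfile.write(b"GIF89a")  # placeholder body; a real 1x1 pixel would go here

HTTPServer(("0.0.0.0", 8080), CanaryHandler).serve_forever()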
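And to sketch the context-aware monitoring idea, the toy scorer below compares a login attempt against a per-user baseline of locations, devices, and hours, and triggers step-up verification when the attempt deviates. The baselines, weights, and threshold are illustrative assumptions, not a production policy:

from dataclasses import dataclass

@dataclass
class LoginAttempt:
    user: str
    country: str
    device_id: str
    hour: int  # local hour of day, 0-23

# Hypothetical per-user baselines built from historical sign-in telemetry.
BASELINES = {
    "jdoe": {"countries": {"US"}, "devices": {"laptop-123"}, "hours": range(7, 20)},
}

def risk_score(attempt: LoginAttempt) -> int:
    """Score a login against the user's established norms; higher = riskier."""
    base = BASELINES.get(attempt.user)
    if base is None:
        return 3  # no history for this user: treat as high risk
    score = 0
    if attempt.country not in base["countries"]:
        score += 2  # unusual location
    if attempt.device_id not in base["devices"]:
        score += 1  # unfamiliar device
    if attempt.hour not in base["hours"]:
        score += 1  # atypical time of day
    return score

attempt = LoginAttempt("jdoe", "RO", "phone-999", hour=3)
if risk_score(attempt) >= 2:
    print("step-up verification required")  # e.g., force an MFA re-challenge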
Detection
To assist the wider community in hunting and identifying activity outlined in this blog post, we have included a subset of these indicators of compromise (IOCs) in this post, and in a GTI Collection for registered users.
Amazon AppStream 2.0 improves the end-user experience by adding support for certificate-based authentication (CBA) on multi-session fleets running the Microsoft Windows operating system and joined to an Active Directory. This functionality helps administrators leverage the cost benefits of the multi-session model while providing an enhanced end-user experience. By combining these enhancements with the existing advantages of multi-session fleets, AppStream 2.0 offers a solution that helps balance cost-efficiency and user satisfaction.
By using certificate-based authentication, you can rely on the security and logon experience features of your SAML 2.0 identity provider, such as passwordless authentication, to access AppStream 2.0 resources. Certificate-based authentication with AppStream 2.0 enables a single sign-on logon experience to access domain-joined desktop and application streaming sessions without separate password prompts for Active Directory.
This feature is available at no additional cost in all AWS Regions where Amazon AppStream 2.0 is available. AppStream 2.0 offers pay-as-you-go pricing. To get started with AppStream 2.0, see Getting Started with Amazon AppStream 2.0.
To enable this feature for your users, your image must use an AppStream 2.0 agent released on or after February 7, 2025, or use Managed AppStream 2.0 image updates released on or after February 11, 2025.
AWS Database Migration Service (AWS DMS) now supports the Multi-ENI networking model and Credentials Vending System for DMS homogeneous migrations.
Customers can now choose the Multi-ENI connection type and use the Credentials Vending System, providing a simplified networking configuration experience for secure connectivity to their on-premises database instances.
Amazon Relational Database Service (RDS) for PostgreSQL now supports the latest minor versions 17.4, 16.8, 15.12, 14.17, and 13.20. Please note that this release supports the versions released by the PostgreSQL community on February 20, 2025, which address the regression that was part of the February 13, 2025 release. We recommend that you upgrade to the latest minor versions to fix known security vulnerabilities in prior versions of PostgreSQL and to benefit from the bug fixes added by the PostgreSQL community.
You can use automatic minor version upgrades to automatically upgrade your databases to more recent minor versions during scheduled maintenance windows. You can also use Amazon RDS Blue/Green deployments for RDS for PostgreSQL using physical replication for your minor version upgrades. Learn more about upgrading your database instances, including automatic minor version upgrades and Blue/Green Deployments, in the Amazon RDS User Guide.
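For example, with the AWS SDK for Python (boto3), you can opt an instance into automatic minor version upgrades or apply a specific minor version directly. The instance identifier below is hypothetical:

import boto3

rds = boto3.client("rds")

# Option 1: opt the instance into automatic minor version upgrades,
# applied during its scheduled maintenance window.
rds.modify_db_instance(
    DBInstanceIdentifier="my-postgres-db",  # hypothetical identifier
    AutoMinorVersionUpgrade=True,
)

# Option 2 (alternative): upgrade to the latest minor version directly.
# rds.modify_db_instance(
#     DBInstanceIdentifier="my-postgres-db",
#     EngineVersion="17.4",
#     ApplyImmediately=True,
# )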
Amazon RDS for PostgreSQL makes it simple to set up, operate, and scale PostgreSQL deployments in the cloud. See Amazon RDS for PostgreSQL Pricing for pricing details and regional availability. Create or update a fully managed Amazon RDS database in the Amazon RDS Management Console.
AWS CodePipeline introduces a new action to deploy to Amazon Elastic Compute Cloud (EC2). This action enables you to easily deploy your application to a group of EC2 instances behind load balancers.
Previously, if you wanted to deploy to EC2 instances, you had to use CodeDeploy with an AppSpec file to configure the deployment. Now, you can simply use the new EC2 deploy action in your pipeline to deploy to EC2 instances without having to manage CodeDeploy resources. This streamlined approach reduces operational overhead and simplifies your deployment process.
To learn more about using the EC2 deploy action in your pipeline, visit our tutorial and documentation. For more information about AWS CodePipeline, visit our product page. This new action is available in all regions where AWS CodePipeline is supported, except the AWS GovCloud (US) Regions and the China Regions.
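As a rough illustration, the new action slots into a pipeline definition like any other deploy action. The sketch below is hypothetical: the provider name and configuration keys are assumptions based on the action’s description, so consult the tutorial and documentation for the exact schema:

# Hypothetical declaration for the new EC2 deploy action; it would be
# included in a stage's "actions" list passed to codepipeline.create_pipeline
# or codepipeline.update_pipeline (boto3).
ec2_deploy_action = {
    "name": "DeployToEC2",
    "actionTypeId": {
        "category": "Deploy",
        "owner": "AWS",
        "provider": "EC2",  # assumed provider name
        "version": "1",
    },
    "configuration": {
        # Assumed keys: instances are selected by tag, and the build
        # artifact is copied to a target directory on each instance.
        "InstanceTagKey": "Environment",
        "InstanceTagValue": "Production",
        "TargetDirectory": "/opt/myapp",
    },
    "inputArtifacts": [{"name": "BuildOutput"}],
}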
Amazon Managed Streaming for Apache Kafka (Amazon MSK) now supports Apache Kafka version 3.8. You can now create new clusters using version 3.8 with either KRaft or ZooKeeper mode for metadata management, or upgrade your existing ZooKeeper-based clusters to version 3.8. Apache Kafka version 3.8 includes several bug fixes and new features that improve performance. Key new features include support for compression level configuration, which lets you change the default compression level to further optimize performance when using compression types such as lz4, zstd, and gzip. For more details and a complete list of improvements and bug fixes, see the Apache Kafka release notes for version 3.8.
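As a client-side illustration of this ratio-versus-CPU trade-off, the sketch below uses the confluent-kafka Python client, whose librdkafka compression.type and compression.level settings tune the same knob; the new per-codec level configurations in Kafka 3.8 itself (e.g., compression.zstd.level) are broker/topic-level settings. The broker address and topic are hypothetical:

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "b-1.mycluster.example.com:9092",  # hypothetical MSK broker
    "compression.type": "zstd",
    "compression.level": 6,  # higher = better ratio, more CPU
})

producer.produce("events", key=b"k", value=b"hello")
producer.flush()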
Amazon MSK is a fully managed service for Apache Kafka and Kafka Connect that makes it easier for you to build and run applications that use Apache Kafka as a data store. Amazon MSK is compatible with Apache Kafka, which enables you to quickly migrate your existing Apache Kafka workloads to Amazon MSK with confidence or build new ones from scratch. With Amazon MSK, you can spend more time innovating on streaming applications and less time managing Apache Kafka clusters. To learn how to get started, see the Amazon MSK Developer Guide.
Support for Apache Kafka version 3.8 is offered in all AWS Regions where Amazon MSK is available.
Amazon Web Services, Inc. now supports China UnionPay credit cards for creating new AWS accounts, eliminating the need for international credit cards for customers in China.
To use China UnionPay for creating your AWS account, enter your address and billing country in China, then provide your local China UnionPay credit card details and verify your personal identity or business license. All subsequent AWS charges will be billed in Chinese yuan, providing a convenient payment experience for customers in China.
To get started, select China UnionPay as your payment method when creating a new AWS account. For more information on using China UnionPay credit cards with AWS, visit Set up a Chinese yuan credit card.