AWS Elemental MediaConvert now integrates with Time-Addressable Media Store (TAMS), enabling customers to temporally reference and extract media asset segments. This capability allows MediaConvert customers to work more efficiently and meet quick-turnaround deadlines. With TAMS integration, customers can extract highlights from live events for near real-time social media publishing, repurpose archived broadcast content into fresh programming or documentaries, and streamline media operations by connecting directly to existing broadcast infrastructure and content management systems.
This integration is designed for customers who operate their own TAMS servers—MediaConvert does not host or manage a TAMS instance. By leveraging your own TAMS deployment, MediaConvert can ingest time-based media segments on demand and use them as inputs in your encoding workflows. Whether you’re modernizing a legacy archive, building automation around editorial workflows, or enabling UGC teams to clip and publish with precision, the combination of MediaConvert and TAMS provides a powerful foundation for flexible, high-performance media processing at scale.
Apache Spark is a fundamental part of most modern lakehouse architectures, and Google Cloud’s Dataproc provides a powerful, fully managed platform for running Spark applications. However, for data engineers and scientists, debugging failures and performance bottlenecks in distributed systems remains a universal challenge.
Manually troubleshooting a Spark job requires piecing together clues from disparate sources — driver and executor logs, Spark UI metrics, configuration files and infrastructure monitoring dashboards.
What if you had an expert assistant to perform this complex analysis for you in minutes?
Accessible directly in the Google Cloud console — either from the resource page (e.g., Serverless for Apache Spark Batch job list or Batch detail page) you are investigating or from the central Cloud Assist Investigations list — Gemini Cloud Assist offers several powerful capabilities:
For data engineers: Fix complex job failures faster. A prioritized list of intelligent summaries and cross-product root cause analyses helps in quickly narrowing down and resolving a problem.
For data scientists and ML engineers: Solve performance and environment issues without deep Spark knowledge. Gemini acts as your on-demand infrastructure and Spark expert so you can focus more on models.
For Site Reliability Engineers (SREs): Quickly determine if a failure is due to code or infrastructure. Gemini finds the root cause by correlating metrics and logs across different Google Cloud services, thereby reducing the time required to identify the problem.
For big data architects and technical managers: Boost team efficiency and platform reliability. Gemini helps new team members contribute faster, describe issues in natural language and easily create support cases.
Debugging Spark applications is inherently complex because failures can stem from anywhere in a highly distributed system. These issues generally fall into two categories. First are the outright job failures. Then, there are the more insidious, subtle performance bottlenecks. Additionally, cloud infrastructure issues can cause workload failures, complicating investigations.
Gemini Cloud Assist is designed to tackle all these challenges head-on:
Infrastructure problems: Gemini Cloud Assist analyzes and correlates a wide range of data, including metrics, configurations, and logs, across Google Cloud services, pinpoints the root cause of infrastructure issues, and provides a clear resolution.
Configuration problems: Gemini Cloud Assist automatically identifies incorrect or insufficient Spark and cluster configurations, and recommends the right settings for your workload.
Application problems (application-logic errors, inefficient code and algorithms): Gemini Cloud Assist analyzes application logs, Spark metrics, and performance data, diagnoses code errors and performance bottlenecks, and provides actionable recommendations to fix them.
Data problems (stage/task failures, data-related issues): Gemini Cloud Assist analyzes Spark metrics and logs, identifies data-related issues like data skew, and provides actionable recommendations to improve performance and stability.
Gemini Cloud Assist: Your AI-powered operational expert
Let’s explore how Gemini transforms the investigation process in common, real-world scenarios.
Example 1: The slow job with performance bottlenecks
Some of the most challenging issues are not outright failures but performance bottlenecks. A job that runs slowly can impact service-level objectives (SLOs) and increase costs, but without error logs, diagnosing the cause requires deep Spark expertise.
Say a critical batch job succeeds but takes much longer than expected. There are no failure messages, just poor performance.
Manual investigation requires a deep-dive analysis in the Spark UI. You would need to manually search for “straggler” tasks that are slowing down the job. The process also involves analyzing multiple task-level metrics to find signs of memory pressure or data skew.
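For context, here is a minimal sketch of that manual skew check in PySpark (the input path and the customer_id join key are hypothetical placeholders for your own job):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("skew-check").getOrCreate()

# Hypothetical input; substitute the table and join key from your own job.
df = spark.read.parquet("gs://your-bucket/events/")

# Count rows per join key and inspect the heaviest keys: a few keys holding most
# of the rows is the classic signature of data skew behind straggler tasks.
key_counts = df.groupBy("customer_id").count().orderBy(F.desc("count"))
key_counts.show(20, truncate=False)

# Compare the largest key to the average to quantify the imbalance.
stats = key_counts.agg(F.max("count").alias("mx"), F.avg("count").alias("avg")).first()
print(f"max/avg rows per key: {stats['mx'] / stats['avg']:.1f}x")
```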
With Gemini assistance
By clicking Investigate, Gemini automatically performs this complex analysis of performance metrics, presenting a summary of the bottleneck.
Gemini acts as an on-demand performance expert, augmenting a developer’s workflow and empowering them to tune workloads without needing to be a Spark internals specialist.
Example 2: The silent infrastructure failure
Sometimes, a Spark job or cluster fails due to issues in the underlying cloud infrastructure or integrated services. These problems are difficult to debug because the root cause is often not in the application logs but in a single, obscure log line from the underlying platform.
Say a cluster configured to use GPUs fails unexpectedly.
The manual investigation begins by checking the cluster logs for application errors. If no errors are found, the next step is to investigate other Google Cloud services. This involves searching Cloud Audit Logs and monitoring dashboards for platform issues, like exceeded resource quotas.
With Gemini assistance
A single click on the Investigate button triggers a cross-product analysis that looks beyond the cluster’s logs. Gemini quickly pinpoints the true root cause, such as an exhausted resource quota, and provides mitigation steps.
Gemini bridges the gap between the application and the platform, saving hours of broad, multi-service investigation.
Get started today!
Spend less time debugging and more time building and innovating. Let Gemini Cloud Assist in Dataproc on Compute Engine and Google Cloud Serverless for Apache Spark be your expert assistant for big data operations.
We’re excited to announce an expansion to our Compute Flexible Committed Use Discounts (Flex CUDs), providing you with greater flexibility across your cloud environment. Your spend commitments now stretch further and cover a wider array of Google Cloud services and VM families, translating into greater savings for your workloads.
Flex CUDs are spend-based commitments that provide deep discounts on Google Cloud compute resources in exchange for a one or three-year term. This model offers maximum flexibility, automatically applying savings across a broad pool of eligible VM families and regions without being tied to a single resource.
More power, more savings with expanded coverage
We understand that modern applications are built on a diverse mix of services, from massive databases to nimble serverless functions. To better support the way you build, we’re expanding Flex CUDs to cover more of the specialized and serverless solutions you use every day:
Memory-optimized VM families: We’re bringing enhanced discounts to our memory-optimized M1, M2, M3 and the new M4 VM families. Now you can get more value from critical workloads like SAP HANA, in-memory analytics platforms and high-performance databases.
High-performance computing (HPC) VM families: For compute-intensive workloads, Flex CUDs now apply to our HPC-optimized H3 and the new H4D VM families, perfect for complex simulations and scientific research.
Cloud Run and Cloud Functions: For developers and organizations that use Cloud Run’s fully managed platform, we are extending Flex CUDs’ coverage to Cloud Run request-based billing and Cloud Run functions.
Why this matters
This expansion of Compute Flex CUDs is designed with your growth and efficiency in mind:
Maximize your spend commitments: Instead of being tied to a specific resource type or region, your committed spend can now be applied across a larger portion of your Google Cloud usage. This means less “wasted” commitment and more active savings.
Enhanced financial predictability and control: With greater coverage, you gain a clearer picture of your anticipated cloud spend, making budgeting and financial planning more predictable.
Simplified cost management: A single, flexible commitment can now cover a more diverse set of services, streamlining your financial operations and reducing the complexity of managing multiple, granular commitments.
Fuel innovation: By reducing the cost of core compute and serverless services, you free up budget that can be reinvested into innovation.
An updated Billing model
Compute Flex CUDs’ expanded coverage is made possible by the new and improved spend-based CUDs model, which streamlines how discounts are applied and provides greater flexibility. Enabling this feature triggers some experience changes to the Billing user interface, Cloud Billing export to BigQuery schema, and Cloud Commerce Consumer Procurement API. This new billing model is simpler: we directly charge the discounted rate for CUD-eligible usage, reflecting the applicable discount, instead of using credits to offset usage and reflect savings. It’s also more flexible: we apply discounts to a wider range of products within spend-based CUDs. For more, this follow-up resource details the updates, including information on a sample export to preview your monthly bill in the new format, key CUD KPIs, new SKUs added to CUDs, and CUD product information. You can learn more about these changes in the documentation.
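To make the presentation change concrete, here is a small illustrative calculation (the $100 usage figure and 28% discount rate are hypothetical; actual Flex CUD rates vary by product and term):

```python
usage = 100.00        # hypothetical CUD-eligible usage, in USD
discount_rate = 0.28  # hypothetical Flex CUD discount; actual rates vary

# Previous credit-based model: the full usage charge appears, offset by a CUD credit line.
charge_old = usage
credit_old = usage * discount_rate
net_old = charge_old - credit_old

# New model: the discounted rate is charged directly, with no offsetting credit line.
net_new = usage * (1 - discount_rate)

print(f"old model: ${charge_old:.2f} charge - ${credit_old:.2f} credit = ${net_old:.2f}")
print(f"new model: ${net_new:.2f} charge")
assert abs(net_old - net_new) < 1e-9  # net cost is unchanged; only the presentation differs
```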
Availability and next steps
At Google Cloud, we’re committed to providing you with the most flexible and cost-effective solutions for your evolving cloud needs. This expansion of Compute Flex CUDs is a testament to that commitment, enabling you to build, deploy, and scale your applications with even greater financial efficiency. Starting today, you can opt-in and begin enjoying Compute Flex CUDs’ expanded scope and improved billing model.
If you don’t opt in to multi-price CUDs sooner, you will be automatically transitioned to the new spend-based model on January 21, 2026, so you can take advantage of these expanded Flex CUDs. New customers who create a Billing Account on or after July 15, 2025 are automatically under the new billing model for Flex CUDs. Stay tuned for more updates as we continue to enhance our offerings to support your success on Google Cloud.
For ten years, Google Kubernetes Engine (GKE) has been at the forefront of innovation, powering everything from microservices to cloud-native AI and edge computing. To honor this special birthday, we’re challenging you to catapult your microservices into the future with cutting-edge agentic AI. Are you ready to celebrate?
Hands-on learning with GKE: This is your shot to build the next evolution of applications by integrating agentic AI capabilities on GKE. We have everything you need to get started: our microservice applications, example agents on GitHub, documentation, quickstarts, tutorials, and a webinar hosted by our experts.
Showcase your skills: You’ll have the opportunity to elevate a sample microservices application into a unique use case. Feel free to get creative with non-traditional use cases and utilize Agent Development Kit (ADK), Model Context Protocol (MCP), and the Agent2Agent (A2A) protocol for extra powerful functionality!
Think you have what it takes to win? Build an app to showcase your agents and you could win:
Overall grand prize: $15,000 in USD, $3,000 in Google Cloud Credits for use with a Cloud Billing Account, a chance to win a maximum of two (2) KubeCon North America conference passes in Atlanta, Georgia (November 10-13, 2025), a one-year, no-cost Google Developer Program Premium subscription, a guest interview on the Kubernetes Podcast, a video feature with the GKE team, virtual coffee with a Google team member, and social promo
Regional winners: $8,000 in USD, $1,000 in Google Cloud Credits for use with a Cloud Billing Account, a video feature with the GKE team on a Google Cloud social media channel, virtual coffee with a Google team member, and social promo
Honorable mentions: $1,000 in USD and $500 in Google Cloud Credits for use with a Cloud Billing Account
Unleash the power of agentic AI on GKE
GKE is built on open-source Kubernetes, but is also tightly integrated with the Google Cloud ecosystem. This makes it easy to get started with a simple application, while having the control you need for more complex application orchestration and management.
When you join the GKE Turns 10 Hackathon, your mission is to take a pre-existing microservices application (either Bank of Anthos or Online Boutique) and integrate cutting-edge agentic AI capabilities. The goal is not to modify the core application code directly, but instead to build new components that interact with its established APIs! Here is some inspiration:
Optimize important processes: Add a sophisticated AI chatbot to the Online Boutique that can query inventory, provide personalized product recommendations, or even check a user’s financial balance via an integrated Bank of Anthos API.
Streamline maintenance and mitigation: Develop an agent that intelligently monitors microservice performance on GKE, suggests troubleshooting steps, and even automates remediation.
Crucial note: Your project must be built using GKE and Google AI models such as Gemini, focusing on how the agents interact with your chosen microservice application. As long as GKE is the foundation, feel free to enhance your project by integrating other Google Cloud technologies!
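As a rough sketch of what such a component could look like, the snippet below wires a single tool into an agent using the Agent Development Kit’s Python Agent class. The catalog endpoint, model name, and the assumption of an HTTP wrapper around Online Boutique’s product catalog are all hypothetical; adapt them to the actual APIs of the application you choose:

```python
import requests
from google.adk.agents import Agent  # assumes the ADK Python package (google-adk)

# Hypothetical HTTP endpoint fronting Online Boutique's product catalog service.
CATALOG_URL = "http://productcatalogservice.default.svc.cluster.local/products"

def list_products() -> list[dict]:
    """Fetch the current product catalog so the agent can answer inventory questions."""
    resp = requests.get(CATALOG_URL, timeout=5)
    resp.raise_for_status()
    return resp.json()

# The agent is a new component that talks to the existing microservice API,
# leaving the core application code untouched.
boutique_assistant = Agent(
    name="boutique_assistant",
    model="gemini-2.0-flash",  # any supported Gemini model
    instruction="Help shoppers find items by querying the product catalog tool.",
    tools=[list_products],
)
```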
Ready to start building?
Head over to our hackathon website and watch our webinar to learn more, review the rules, and register.
Tata Steel is one of the world’s largest steel producers, with an annual crude steel capacity exceeding 35 million tons. With such a large and global output, we needed a way to improve asset availability, product quality, operational safety, and environmental monitoring. By centralizing data from diverse sources and implementing advanced analytics with Google Cloud, we’re driving a more proactive and comprehensive approach to worker safety and environmental stewardship.
To achieve these objectives, we designed and implemented a robust multi-cloud architecture. This setup unifies manufacturing data across various platforms, establishing the Tata Steel Data Lake on Google Cloud as the centralized repository for seamless data aggregation and analytics.
High-level IIoT data integration architecture
Building a unified data foundation on Google Cloud
Our comprehensive data acquisition framework spans multiple plant locations, including Jamshedpur, in the eastern Indian state of Jharkhand, where we leverage Litmus and ClearBlade — both available on Google Cloud Marketplace — to collect real-time telemetry data from programmable logic controllers (PLCs) via LAN, SIM cards, and process networks.
As alternatives, we employ an internal data staging setup using SAP BusinessObjects Data Services (BODS) and Web APIs. We have also developed in-house smart sensors that use LoRaWAN and Web APIs to stage and upload data. These diverse approaches ensure seamless integration of both Operational Technology (OT) data from PLCs and Information Technology (IT) data from SAP into Google Cloud BigQuery, enabling unified and efficient data consumption.
Initially, Google Cloud IoT Core was used for ingesting crane data. Following its deprecation, we redesigned the data pipeline to integrate ClearBlade IoT Services, ensuring seamless and secure data ingestion into Google Cloud.
Our OT Data Lake is architected on Manufacturing Data Engine (MDE) and BigQuery, which provides decoupled storage and compute capabilities for scalable, cost-efficient data processing. We developed a visualization layer with hourly and daily table partitioning to support both real-time insights and long-term trend analysis, strategically archiving older datasets in Google Cloud Storage for cost optimization.
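For illustration, a partitioned telemetry table of this kind can be created with the BigQuery Python client along the lines of the sketch below (the project, dataset, and schema are hypothetical, not Tata Steel’s actual design):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table for OT sensor telemetry.
table_id = "your-project.ot_datalake.sensor_readings"
schema = [
    bigquery.SchemaField("device_id", "STRING"),
    bigquery.SchemaField("metric", "STRING"),
    bigquery.SchemaField("value", "FLOAT64"),
    bigquery.SchemaField("event_time", "TIMESTAMP"),
]

table = bigquery.Table(table_id, schema=schema)
# Hourly time partitioning on the event timestamp supports near-real-time queries,
# while older partitions can be exported to Cloud Storage for cost optimization.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.HOUR,
    field="event_time",
)
table = client.create_table(table)
print(f"Created {table.full_table_id}")
```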
We also implemented a secure, multi-path data ingestion architecture to ingest OT data with minimal latency, utilizing Litmus and ClearBlade IoT Core. Finally, we developed custom solutions to extract OPC Data Access (DA) and OPC Unified Architecture (UA) data from remote OPC servers, staging it through on-premise databases before secure transfer to Google Cloud.
Together, this comprehensive architecture provides immediate access to real-time device data while facilitating batch processing of information from SAP and other on-premise databases. This integrated approach to OT and IT data delivers a holistic view of operations, enabling more informed decision-making for critical initiatives like Asset Health Monitoring, Environment Canvas, and the Central Quality Management System, across all Tata Steel locations.
Crane health monitoring with IoT data
Monitoring health parameters of crane sub devices
Overcoming legacy challenges for real-time operations
Before deploying Industrial IoT with Google Cloud, high-velocity data was not readily accessible in our central storage. Instead, the data resided in local systems, such as mediation servers and IBA, where limited storage capacity led to automatic purging after a defined retention period. This approach, combined with legacy infrastructure, significantly constrained data availability and hindered informed business decision-making. Furthermore, edge analytics and visualization capabilities were limited, and data latency remained high due to processing bottlenecks at the mediation layer.
Our Google Cloud implementation has since enabled the seamless acquisition of high-volume and high-velocity data for analyzing manufacturing assets and processes, all while ensuring compliance with security protocols across both the IT and OT layers. This initiative has enhanced operational efficiency and delivered cost savings.
Our collaboration with Google Cloud to evaluate and implement secure, more resilient manufacturing operations solutions marks a key milestone in Tata Steel’s digital transformation journey. The new unified data foundation has empowered data-driven decision-making through AI-enabled capabilities, including:
Asset health monitoring
Event-based alerting mechanisms
Real-time data monitoring
Advanced data analytics for enhanced user experience
The iMEC: Powering predictive maintenance and efficiency
Tata Steel’s Integrated Maintenance Excellence Centre (iMEC) utilizes MDE to build and deploy monitoring solutions. This involves leveraging data analytics, predictive maintenance strategies, and real-time monitoring to enhance equipment reliability and enable proactive asset management.
MDE, which provides a zero-code, pre-configured set of Google Cloud infrastructure, acts as a central hub for ingesting, processing, and analyzing data from various sensors and systems across the steel plant, enabling the development and implementation of solutions for improved operational efficiency and reduced downtime.
With monitoring solutions helping to deliver real-time advice, maintenance teams can reduce the physical human footprint at hazardous shop floor locations while providing more ergonomic and comfortable working environments to employees compared to near-location control rooms. These solutions also help us centralize asset management and maintenance expertise, employing real-time data to enable significant operational improvements and cost-effectiveness goals, including:
Reducing unplanned outages and increasing equipment availability.
Transitioning from Time-Based Maintenance (TBM) to predictive maintenance.
Optimizing resource use, reducing power costs, and minimizing delays.
Driving safety with video analytics and cloud storage
To strengthen worker safety, we have also deployed a safety violation monitoring system powered by on-premise, in-house video analytics. Detected violation images are automatically uploaded to a Cloud Storage bucket for further analysis and reporting.
We developed and trained a video analytics model in-house, using specific samples of violations and non-violations tailored to each use case. This innovative approach has enabled us to efficiently store a growing catalog of safety violation images on Cloud Storage, harnessing its elastic storage capabilities.
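Uploading a detected-violation frame to that bucket is a one-call operation with the Cloud Storage Python client; the bucket name and object layout below are hypothetical examples, not our production naming scheme:

```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("safety-violations-archive")  # hypothetical bucket name

def upload_violation_image(local_path: str, camera_id: str, timestamp: str) -> str:
    """Upload a detected-violation frame and return the object path for reporting."""
    object_name = f"violations/{camera_id}/{timestamp}.jpg"
    blob = bucket.blob(object_name)
    blob.upload_from_filename(local_path, content_type="image/jpeg")
    return object_name

# Example: called by the on-premises video analytics pipeline after a detection.
print(upload_violation_image("/tmp/frame_001.jpg", "cam-42", "2025-08-20T10-15-00"))
```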
Our Central Quality Management System — which ensures our data is complete, accurate, consistent, and reliable — is also built on Google Cloud, utilizing BigQuery for scalable data storage and analysis, and Looker Studio for intuitive data visualization and reporting.
Google Cloud for environmental monitoring
Tata Steel’s commitment to sustainability is evident in our comprehensive environment monitoring system, which operates entirely on Google Cloud. Our Environment Canvas system captures a wide array of environmental Key Performance Indicators (KPIs), including stack emissions and fugitive emissions.
Environment Canvas – Data office & visualization architecture
Environmental parameters
We capture the data for these KPIs through sensors, SAP, and manual entries. While some sensor data from certain plants is initially sent to a different cloud or on-premises systems, we eventually transfer it to Google Cloud for unified consumption and visualization.
By leveraging the power of Google Cloud’s data and AI technologies, we are advancing operational monitoring and safety through a unified data foundation, real-time monitoring, and predictive maintenance — all enabled by iMEC. At the same time, we are reinforcing our commitment to environmental responsibility with a Google Cloud-based system that enables comprehensive monitoring and real-time reporting of environmental KPIs, delivering actionable insights for responsible operations.
Amazon Relational Database Service (RDS) Proxy now offers customers the option to use Internet Protocol version 6 (IPv6) addresses to pool and share database connections coming from an application. The existing endpoints supporting Internet Protocol version 4 (IPv4) will remain available for backwards compatibility. Additionally, customers now have the option to specify RDS Proxy target connections using either IPv4 or IPv6.
The continued growth of the Internet, particularly in the areas of mobile applications, connected devices, and IoT, has spurred an industry-wide move to IPv6. IPv6 increases the number of available addresses by several orders of magnitude so customers no longer need to manage overlapping address spaces in their VPCs.
Many applications, including those built on modern serverless architectures, may need to have a high number of open connections to the database or may frequently open and close database connections, exhausting the database memory and compute resources. Amazon RDS Proxy allows applications to pool and share database connections, improving your database efficiency and application scalability.
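Application code connects to an RDS Proxy the same way whether the endpoint resolves to IPv4 or IPv6. As a hedged sketch using only documented APIs, the example below obtains an IAM authentication token with boto3 and opens a pooled PostgreSQL connection through a hypothetical proxy endpoint:

```python
import boto3
import psycopg2

# Hypothetical proxy endpoint and database details; with IPv6 enabled, the endpoint
# can resolve to an IPv6 address, but the connection code is unchanged.
PROXY_ENDPOINT = "my-proxy.proxy-abc123.us-east-1.rds.amazonaws.com"
DB_USER = "app_user"
DB_NAME = "appdb"

rds = boto3.client("rds", region_name="us-east-1")
token = rds.generate_db_auth_token(
    DBHostname=PROXY_ENDPOINT, Port=5432, DBUsername=DB_USER
)

# The proxy pools this connection with others, so frequent connect/disconnect cycles
# from serverless functions do not exhaust database memory and compute.
conn = psycopg2.connect(
    host=PROXY_ENDPOINT, port=5432, dbname=DB_NAME,
    user=DB_USER, password=token, sslmode="require",
)
```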
For information on supported database engine versions and regional availability of RDS Proxy, refer to our RDS and Aurora documentation.
Today, AWS announced the general availability of Amazon GuardDuty custom threat detection using entity lists. This new feature enhances threat detection capabilities in GuardDuty by extending support to incorporate your own domain-based threat intelligence into the service, beyond the originally supported custom IP lists. You can now detect threats in GuardDuty using malicious domains or IP addresses defined in your custom threat list. As part of this update, GuardDuty introduces a new finding type, Impact:EC2/MaliciousDomainRequest.Custom, which is triggered when activity related to a domain in your custom threat list is detected. Additionally, you can use entity lists to suppress alerts from trusted sources, giving you greater control over your threat detection strategy.
Entity lists offer enhanced flexibility compared to the previous IP address lists. These new lists can include IP addresses, domains, or both, allowing for more comprehensive threat intelligence integration. Unlike the legacy IP list format, entity lists provide simplified permission management and avoid impacting IAM policy size limits across multiple AWS Regions, making it easier to implement and manage custom threat detection across your AWS environment.
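A typical workflow is to maintain the list as an object in Amazon S3 and point GuardDuty at it. The sketch below uploads a mixed list of domains and IP addresses; the bucket name, key, and one-entry-per-line format are assumptions to verify against the GuardDuty documentation:

```python
import boto3

# Hypothetical threat entries; confirm the exact entity list file format in the docs.
entries = [
    "malicious-example.com",
    "another-bad-domain.net",
    "203.0.113.57",
]

s3 = boto3.client("s3")
s3.put_object(
    Bucket="my-threat-intel-bucket",              # hypothetical bucket
    Key="guardduty/custom-entity-list.txt",
    Body="\n".join(entries).encode("utf-8"),
)
# Reference this S3 object when creating the entity list in GuardDuty so that matching
# activity raises findings such as Impact:EC2/MaliciousDomainRequest.Custom.
```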
GuardDuty custom entity list is available in all AWS Regions where GuardDuty is offered, excluding China Regions and GovCloud (US) Regions.
Amazon Aurora PostgreSQL Limitless Database is now available with PostgreSQL version 16.9 compatibility. This release contains product improvements and bug fixes made by the PostgreSQL community, along with Aurora Limitless-specific additions such as support for the hstore extension, the auto_explain extension, and various performance improvements. The hstore extension allows for storing sets of key/value pairs within a single PostgreSQL value, while the auto_explain extension logs execution plans of slow statements automatically.
Aurora PostgreSQL Limitless Database makes it easy for you to scale your relational database workloads by providing a serverless endpoint that automatically distributes data and queries across multiple Amazon Aurora Serverless instances while maintaining the transactional consistency of a single database. Aurora PostgreSQL Limitless Database offers capabilities such as distributed query planning and transaction management, removing the need for you to create custom solutions or manage multiple databases to scale. As your workloads increase, Aurora PostgreSQL Limitless Database adds compute resources while staying within your specified budget, so there is no need to provision for peak, and compute automatically scales down when demand is low.
Aurora PostgreSQL Limitless Database is available in the following AWS Regions: US East (N. Virginia, Ohio), US West (N. California, Oregon), Africa (Cape Town), Asia Pacific (Hong Kong, Hyderabad, Jakarta, Malaysia, Melbourne, Mumbai, Osaka, Seoul, Singapore, Sydney, Thailand, Tokyo), Canada (Central), Canada West (Calgary), Europe (Frankfurt, Ireland, London, Milan, Paris, Spain, Stockholm, Zurich), Israel (Tel Aviv), Mexico (Central), Middle East (Bahrain, UAE), and South America (Sao Paulo).
AWS Config now tracks resource tags for IAM policy resource types, enhancing the granularity of metadata you can capture to assess, audit, and evaluate configurations of your IAM policies.
With this enhancement, you can now track resource tags and their changes for IAM Policies directly in your Config recorder. This capability allows you to scope both Config-managed and custom rule evaluations based on resource tags, ensuring your IAM policies maintain desired configurations. Additionally, you can leverage Config aggregators to selectively aggregate IAM policies across multiple accounts using tags, streamlining your multi-account governance.
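For example, scoping a rule evaluation by tag is a one-call setup with boto3. The sketch below applies an AWS managed rule only to IAM policies tagged environment=production; the rule name and tag values are hypothetical:

```python
import boto3

config = boto3.client("config")

# Hypothetical rule: evaluate only resources tagged environment=production.
config.put_config_rule(
    ConfigRule={
        "ConfigRuleName": "prod-iam-policies-no-admin-access",
        "Scope": {
            "TagKey": "environment",
            "TagValue": "production",
        },
        "Source": {
            "Owner": "AWS",
            # Managed rule that flags policies allowing "Action": "*" on "Resource": "*".
            "SourceIdentifier": "IAM_POLICY_NO_STATEMENTS_WITH_ADMIN_ACCESS",
        },
    }
)
```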
This feature is now available across all supported AWS Regions at no additional cost. Resource tags are automatically populated in Config when you record IAM policy resource types. To record the IAM policy resource type in your Config recorder, please refer to our documentation.
Today, AWS announces the general availability of Organizational Notification Configurations for AWS User Notifications. This launch allows AWS Organizations users to centrally configure and view notifications across their organization. You can use the Management Account or Delegated Administrators (DAs) to configure and view notifications about accounts included in specific organizational units (OUs) or all accounts rolling up to an organization. Once configured, events from any of the member accounts will generate a notification in the Management Account. User Notifications supports up to 5 DAs.
You can use this capability to set up notifications for any supported Amazon EventBridge event. For example, you can set up a notification configuration to send a push notification to the AWS Console Mobile Application anytime a user in any of the member accounts in your organization signs in to the console without MFA. Notifications will also be available in the Admin’s Console Notifications Center.
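For that example, the notification configuration would match an EventBridge event pattern along these lines. This is a hedged sketch of the standard console sign-in event emitted via CloudTrail; verify the field names against your own events before relying on it:

```python
import json

# Event pattern matching AWS Management Console sign-ins performed without MFA.
event_pattern = {
    "source": ["aws.signin"],
    "detail-type": ["AWS Console Sign In via CloudTrail"],
    "detail": {
        "eventName": ["ConsoleLogin"],
        "additionalEventData": {"MFAUsed": ["No"]},
    },
}

print(json.dumps(event_pattern, indent=2))
```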
This new capability is available in all AWS Regions where AWS User Notifications is available.
To learn more about managing notifications across your organization with AWS User Notifications, please refer to the user guide.
AWS Backup Audit Manager now supports cross-account, cross-Region reports in Asia Pacific (Hyderabad, Jakarta, Melbourne), Europe (Spain, Zurich), and Middle East (UAE) Regions.
Now, you can use your AWS Organizations’ management or delegated administrator account to generate aggregated cross-account and cross-Region reports on your data protection policies and retrieve operational data about your backup and recovery activities. AWS Backup enables you to centralize and automate data protection policies across AWS services based on organizational best practices and regulatory standards. AWS Backup Audit Manager, a feature within the AWS Backup service, allows you to audit and report on the compliance of your data protection policies to help you meet your business and regulatory needs.
The GDR updates address vulnerabilities described in CVE-2025-49758, CVE-2025-24999, CVE-2025-49759, CVE-2025-53727, and CVE-2025-47954. For additional information on the improvements and fixes included in these updates, please see Microsoft documentation for KB5063757 and KB5063814. We recommend that you upgrade your Amazon RDS Custom for SQL Server instances to apply these updates using the Amazon RDS Management Console, or by using the AWS SDK or CLI. You can learn more about upgrading your database instance in the Amazon RDS Custom User Guide.
In Episode #6 of the Agent Factory podcast, Vlad Kolesnikov and I were joined by Keith Ballinger, VP and General Manager at Google Cloud, for a deep dive into the transformative future of software development with AI. We explore how AI agents are reshaping the developer’s role and boosting team productivity.
This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.
Keith Ballinger kicked off the discussion by redefining a term from his personal blog: “Impossible Computing.” For him, it isn’t about solving intractable computer science problems, but rather about making difficult, time-consuming tasks feel seamless and even joyful for developers.
He described it as a way to “make things that were impossible or at least really, really hard for people, much more easy and almost seamless for them.”
The conversation explored how AI’s impact extends beyond the individual developer to the entire team. Keith shared a practical example of how his teams at Google Cloud use the Gemini CLI as a GitHub action to triage issues and conduct initial reviews on pull requests, showcasing Google Cloud’s commitment to AI-powered software development.
This approach delegates the more mundane tasks, freeing up human developers to focus on higher-level logic and quality control, ultimately breaking down bottlenecks and increasing the team’s overall velocity.
The Developer’s New Role: A Conductor of an Orchestra
A central theme of the conversation was the evolution of the developer’s role. Keith suggested that developers are shifting from being coders who write every line to becoming “conductors of an orchestra.”
In this view, the developer holds the high-level vision (the system architecture) and directs a symphony of AI agents to execute the specific tasks. This paradigm elevates the developer’s most critical skills to high-level design and “context engineering”—the craft of providing AI agents with the right information at the right time for efficient software development.
The Factory Floor
The Factory Floor is our segment for getting hands-on. Here, we moved from high-level concepts to practical code with live demos from both Keith and Vlad.
Keith shared two of his open-source projects as tangible “demonstration[s] of vibe coding intended to provide a trustworthy and verifiable example that developers and researchers can use.”
Terminus: A Go framework for building web applications with a terminal-style interface. Keith described it as a fun, exploratory project he built over a weekend.
Aether: An experimental programming language designed specifically for LLMs. He explained his thesis that a language built for machines—highly explicit and deterministic—could allow an AI to generate code more effectively than with languages designed for human readability.
Keith provided a live demonstration of his vibe coding workflow. Starting with a single plain-English sentence, he guided the Gemini CLI to generate a user guide, technical architecture, and a step-by-step plan. This resulted in a functional command-line markdown viewer in under 15 minutes.
Vlad showcased a different application of AI agents: creative, multi-modal content generation. He walked through a workflow that used Gemini 2.5 Flash Image (also known as Nano Banana) and other AI tools to generate a viral video of a capybara for a fictional ad campaign. This demonstrated how to go from a simple prompt to a final video.
Inspired by Vlad’s Demo?
If you’re interested in learning how to build and deploy creative AI projects like the one Vlad showcased, the Accelerate AI with Cloud Run program is designed to help you take your ideas from prototype to production with workshops, labs, and more.
Keith explained that he sees a role for both major cloud providers and a “healthy ecosystem of startups” in solving challenges like GPU utilization. He was especially excited about how serverless platforms are adapting, highlighting that Cloud Run now offers GPUs to provide the same fast, elastic experience for AI workloads that developers expect for other applications.
In response to a question about a high-level service for orchestrating AI across multi-cloud and edge deployment, Keith was candid that he hasn’t heard a lot of direct customer demand for it yet. However, he called the area “untapped” and invited the question-asker to email him, showing a clear interest in exploring its potential.
Calling it the “billion-dollar question,” Keith emphasized that as AI accelerates development, the need for a mature and robust compliance regime becomes even more critical. His key advice was that the human review piece is more important than ever. He suggested the best place to start is using AI to assist and validate human work. For example, brainstorm a legal brief with an AI rather than having the AI write the final brief for court submission.
Baseten is one of a growing number of AI infrastructure providers, helping other startups run their models and experiments at speed and scale. Given the importance of those two factors to its customers, Baseten has just passed a significant milestone.
By leveraging the latest Google Cloud A4 virtual machines (VMs) based on NVIDIA Blackwell and Google Cloud’s Dynamic Workload Scheduler (DWS), Baseten has achieved 225% better cost-performance for high-throughput inference and 25% better cost-performance for latency-sensitive inference.
Why it matters: This breakthrough in performance and efficiency enables companies to move powerful agentic AI and reasoning models out of the lab and into production affordably. For technical leaders, this provides a blueprint for building next-generation AI products — such as real-time voice AI, search, and agentic workflows — at a scale and cost-efficiency that has been previously unattainable.
The big picture: Inference is the cornerstone of enterprise AI. As models for multi-step reasoning and decision-making demand exponentially greater compute, the challenge of serving them efficiently has become the primary bottleneck. Enter Baseten, a six-year-old Series C company that partners with Google Cloud and NVIDIA to provide enterprise companies a scalable inference platform for their proprietary models as well as open models like Gemma, DeepSeek, and Llama, with an emphasis on performance and cost efficiency. Their success hinges on a dual strategy: maximizing the potential of cutting-edge hardware and orchestrating it with a highly optimized, open software stack.
We wanted to share more about how Baseten architected its stack — and what this new level of cost-efficiency can unlock for your inference applications.
Hardware optimization with the latest NVIDIA GPUs
Baseten delivers production-grade inference by leveraging a wide range of NVIDIA GPUs on Google Cloud, from NVIDIA T4s through the recent A4 VMs (NVIDIA HGX B200). This access to the latest hardware is critical for achieving new levels of performance.
With A4 VMs, Baseten now serves three of the most popular open-source models — DeepSeek V3, DeepSeek R1, and Llama 4 Maverick — directly on their Model APIs with over 225% better cost-performance for high-throughput inference, and 25% better cost-performance for latency-sensitive inference.
In addition to its production-ready model APIs, Baseten provides additional flexibility with NVIDIA B200-powered dedicated deployments for customers seeking to run their own custom AI models with the same reliability and efficiency.
Advanced software for peak performance
Baseten’s approach is rooted in coupling the latest accelerated hardware with leading and open-source software to extract the most value possible from every chip. This integration is made possible with Google Cloud’s AI Hypercomputer, which includes a broad suite of advanced inference frameworks, including NVIDIA’s open-source software stack — NVIDIA Dynamo and TensorRT-LLM — as well as SGLang and vLLM.
Using TensorRT-LLM, Baseten optimizes and compiles custom LLMs for one of its largest AI customers, Writer. This has boosted their throughput by more than 60% for Writer’s Palmyra LLMs. The flexibility of TensorRT-LLM also enabled Baseten to develop a custom model builder that speeds up model compilation.
To serve reasoning models like DeepSeek R1 and Llama 4 on NVIDIA Blackwell GPUs, Baseten uses NVIDIA Dynamo. The combination of NVIDIA’s HGX B200 and Dynamo dramatically lowered latency and increased throughput, propelling Baseten to the top GPU performance spot on OpenRouter’s LLM ranking leaderboard.
The team leverages techniques such as kernel fusion, memory hierarchy optimization, and custom attention kernels to increase tokens per second, reduce time to first token, and support longer context windows and larger batch sizes — all while maintaining low latency and high throughput.
Building a backbone for high availability and redundancy
For mission-critical AI services, resilience is non-negotiable. Baseten runs globally across multiple clouds and regions, requiring an infrastructure that can handle ad hoc demand and outages. Flexible consumption models, such as the Dynamic Workload Scheduler within the AI Hypercomputer, help Baseten manage capacity similar to on-demand with additional price benefits. This allows them to scale up on Google Cloud if there are outages across other clouds.
“Baseten runs globally across multi-clouds and Dynamic Workload Scheduler has saved us more than once when we encounter a failure,” said Colin McGrath, head of infrastructure at Baseten. “Our automated system moves affected workloads to other resources including Google Cloud Dynamic Workload scheduler and within minutes, everyone is up and running again. It is impressive — by the time we’re paged and check-in, everything is back and healthy. This is amazing and would not be possible without DWS. It has been the backbone for us to run our business.”
Baseten’s collaboration with Google Cloud and NVIDIA demonstrates how a powerful combination of cutting-edge hardware and flexible, scalable cloud infrastructure can solve the most pressing challenges in AI inference through Google Cloud’s AI Hypercomputer.
This unique combination enables end-users across industries to bring new applications to market, such as powering agentic workflows in financial services, generating real-time audio and video content in media, and accelerating document processing in healthcare. And it’s all happening at a scale and cost that was previously unattainable.
Amazon EC2 introduces AMI Usage, providing new capabilities that allow you to track AMI consumption across AWS accounts and identify resources in your account that are dependent on particular AMIs. This enhanced visibility helps you monitor AMI utilization patterns across your AWS infrastructure and safely manage AMI deregistrations.
Up until today, you had to write custom scripts to track the use of AMIs across accounts and resources, leading to operational overhead. Now, with AMI Usage, you can generate a report that lists the accounts that are using your AMIs in EC2 instances and launch templates. You can also check utilization of any AMI within your account across multiple resources, including instances, launch templates, Image Builder recipes, and SSM parameters. These new capabilities empower you to maintain clear oversight of AMI usage across your AWS ecosystem, better manage the lifecycle of your AMIs, and optimize costs.
AMI Usage is available to all customers at no additional cost in all AWS Regions, including the AWS China (Beijing) Region, operated by Sinnet, the AWS China (Ningxia) Region, operated by NWCD, and AWS GovCloud (US). To learn more, please visit our documentation.
Amazon Neptune Database, a fully managed graph database service, now supports Public Endpoints, allowing developers to connect directly to Neptune databases from their development desktops without complex networking configurations.
With Public Endpoints, developers can securely access their Neptune databases from outside the VPC, eliminating the need for VPN connections, bastion hosts, or other networking configurations. This feature streamlines the development process while maintaining security through existing controls like IAM authentication, VPC security groups, and encryption in transit.
Public Endpoints can be enabled for new or existing Neptune clusters, with engine version 1.4.6 or above, through the AWS Management Console, AWS CLI, or AWS SDK. When enabled, Neptune generates a publicly accessible endpoint that developers can use with standard Neptune connection methods from their development machines. This feature is available at no additional cost beyond standard Neptune pricing and is available today in all AWS Regions where Neptune Database is offered. To learn more, visit the Amazon Neptune documentation.
AWS Systems Manager Configuration Manager now supports SAP HANA, allowing you to automatically test your SAP HANA databases running on AWS against best practices defined in the AWS Well-Architected Framework SAP Lens.
Keeping SAP optimally configured requires SAP administrators to stay current with best practices from multiple sources, including AWS, SAP, and operating system vendors, and to manually check their configurations to validate adherence. AWS Systems Manager Configuration Manager automatically assesses SAP applications running on AWS against these standards, proactively identifying misconfigurations and recommending specific remediation steps, allowing you to make the necessary changes before potential impacts to business operations. Configuration checks can be scheduled or run on-demand.
SSM for SAP Configuration Manager is available in all commercial AWS Regions.
Amazon Managed Service for Prometheus, a fully managed Prometheus-compatible monitoring service, now allows you to view applied quota values and their utilization for your Amazon Managed Service for Prometheus workspaces using AWS Service Quotas and Amazon CloudWatch. This update gives you a comprehensive view of quota utilization across your workspaces.
AWS Service Quotas allows you to quickly understand your applied service quota values and request increases in a few clicks. With Amazon CloudWatch usage metrics, you can create alarms to be notified when your Amazon Managed Service for Prometheus workspaces approach applied limits and visualize usage in CloudWatch dashboards.
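As a hedged sketch of the alarm side, the snippet below uses boto3 to alarm on a usage metric in the AWS/Usage namespace. The dimension values for the Prometheus service and resource, the threshold, and the SNS topic are placeholders; check the metrics actually published for your workspace in CloudWatch before using it:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="amp-usage-near-quota",
    Namespace="AWS/Usage",
    MetricName="ResourceCount",
    Dimensions=[
        {"Name": "Service", "Value": "Prometheus"},     # placeholder value
        {"Name": "Resource", "Value": "ActiveSeries"},  # placeholder value
        {"Name": "Type", "Value": "Resource"},
        {"Name": "Class", "Value": "None"},
    ],
    Statistic="Maximum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=9_000_000,  # e.g., alarm at 90% of a hypothetical 10M-series quota
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:quota-alerts"],  # placeholder
)
```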
Usage metrics for Amazon Managed Service for Prometheus service limits are available at no additional cost and are always enabled. You can access Service Quotas and usage metrics in CloudWatch through the AWS console, AWS APIs, and CLI. These features are available in all AWS regions where Amazon Managed Service for Prometheus is generally available.
Ever worry about your applications going down just when you need them most? The talk at Cloud Next 2025, Run high-availability multi-region services with Cloud Run, dives deep into building fault tolerant and reliable applications using Google Cloud’s serverless container platform: Cloud Run.
Google experts Shane Ouchi and Taylor Money, along with Seenuvasan Devasenan from Commerzbank, pull back the curtain on Cloud Run’s built-in resilience and walk you through a real-world scenario with the upcoming Cloud Run feature called Service Health.
For the Cloud Next 2025 presentation, Shane kicked things off by discussing the baseline resilience of Cloud Run through autoscaling, a decoupled data and control plane, and N+1 zonal redundancy. Let’s break that down, starting with autoscaling.
Autoscaling to Make Sure Capacity Meets Demand
Cloud Run automatically adds and removes instances based on the incoming request load, ensuring that the capacity of a Cloud Run service meets the demand. Shane calls this hyper-elasticity, referring to Cloud Run’s ability to rapidly add container instances. Rapid autoscaling prevents the failure mode where your application doesn’t have enough server instances to handle all requests.
Note: Cloud Run lets you prevent runaway scaling by limiting the maximum number of instances.
Decoupled Data and Control Planes Increase Resiliency
The control plane in Cloud Run is the part of the system responsible for management operations, such as deploying new revisions, configuring services, and managing infrastructure resources. It’s decoupled from the data plane. The data plane is responsible for receiving incoming user requests, routing them to container instances, and executing the application code. Because the data plane operates independently from the control plane, issues in the control plane typically don’t impact running services.
N+1 Redundancy for Both Control and Data Plane
Cloud Run is a regional service, and Cloud Run provides N+1 zonal redundancy by default. That means if any of the zones in a region experiences failures, the Cloud Run infrastructure has sufficient failover capacity (that’s the “+1”) in the same region to continue serving all workloads. This isolates your application from zone failures.
Container Probes Increase Availability
If you’re concerned with application availability, you should definitely configure liveness probes to make sure failing instances are shut down. You can configure two distinct types of container instance health checks on Cloud Run.
Startup probe: Confirms that a new instance has successfully started and is ready to receive requests
Liveness probe: Monitors if a running instance remains healthy and able to continue processing requests. This probe is optional, but enabling it allows Cloud Run to automatically remove faulty instances
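If your container exposes lightweight HTTP endpoints for these probes to target, they can be as simple as the following sketch (Flask, with hypothetical /startup and /healthz paths; the probes themselves are configured on the Cloud Run service, not in application code):

```python
import os
from flask import Flask

app = Flask(__name__)
ready = True  # flip to False while dependencies warm up or during graceful shutdown

@app.get("/startup")
def startup():
    # Startup probe target: return 200 once one-time initialization has finished.
    return "ok", 200

@app.get("/healthz")
def liveness():
    # Liveness probe target: a non-200 response lets Cloud Run replace the instance.
    return ("ok", 200) if ready else ("unhealthy", 503)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```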
100% Availability is Unrealistic
Some applications are so important that you want them to always be available. While 100% availability is unrealistic, you can make them as fault tolerant as possible. Getting that right depends on your application architecture and on the underlying platforms and services you use. Cloud Run has several features that increase its baseline resilience, but there’s more you can do to make your application more resilient.
Going Beyond Zonal Redundancy
Since Cloud Run is a regional service, providing zonal redundancy, developers have to actively architect their application to be resilient against regional outages. Fortunately, Cloud Run already supports multi-regional deployments. Here’s how that works:
Deploy a Cloud Run service to multiple regions, each using the same container image and configuration.
Create a global external application load balancer, with one backend and a Serverless Network Endpoint Group (NEG) per Cloud Run service.
Use a single entrypoint with one global external IP address.
Here’s how that looks in a diagram:
In case you’re not familiar, a Serverless Network Endpoint Group (NEG) is a load balancer backend configuration resource that points to a Cloud Run service or an App Engine app.
Architecting Applications for Regional Redundancy Can Be Challenging
While deploying in multiple regions is straightforward with Cloud Run, the challenge lies in architecting your application in such a way that individual regional services can fail without losing data or impacting services in other regions.
A Preview of Service Health for Automated Regional Failover
If you set up a multi-regional Cloud Run architecture today, requests are always routed to the region closest to them, but they are not automatically routed away if a Cloud Run service becomes unavailable, as shown in the following illustration:
The upcoming Service Health feature adds automatic failover of traffic from one region to another if a service in one region becomes unavailable:
Enabling Service Health
As of August 2025, Service Health is not yet publicly available (it’s in private preview), but I’m hopeful that’ll change soon. One thing to keep in mind is that the feature might still change until it’s generally available. You can sign up to get access by filling in this request form.
Once you have access, you can enable Service Health on a multi-regional service in two steps:
Add a container instance readiness probe to each Cloud Run service.
Set minimum instances to 1 on each Cloud Run service.
That’s really all there is to it. No additional load balancer configuration is required.
Readiness Probes Are Coming to Cloud Run
As part of Service Health, readiness probes are introduced to Cloud Run. A readiness probe periodically checks each container instance via HTTP. If a readiness probe fails, Cloud Run stops routing traffic to that instance until the probe succeeds again. In contrast, a failing liveness probe causes Cloud Run to shut down the unhealthy instance.
Service Health uses the aggregate readiness state of all container instances in a service to determine if the service itself is healthy or not. If a large percentage of the containers is failing, it marks the service as unhealthy and routes traffic to a different region.
A Live Demo at Cloud Next 2025
In a live demo, Taylor deployed the same service to two regions (one near, one far away). He then sent a request via a Global External Application Load Balancer (ALB). The ALB correctly routed the request to the service in the closest region.
After configuring the closest service to flip between failing and healthy every 30 seconds, he demonstrated that the traffic didn’t fail over. That’s the current behavior – so far nothing new.
The next step in his demo was enabling Service Health through enabling minimum instances and a readiness probe on each service. For deploying the config changes to the two services, Taylor used a new flag in the Cloud Run gcloud interface: the --regions flag in gcloud run deploy. It’s a great way to deploy the same container image and configuration to multiple regions at the same time.
With the readiness probes in place and minimum instances set, Service Health started detecting service failure and moved over the traffic to the healthy service in the other region. I thought that was a great demo!
Next Steps
In this post, you learned about Cloud Run’s built-in fault tolerance mechanisms, such as autoscaling and zonal redundancy, how to architect multi-region services for higher availability, and got a preview of the upcoming Service Health feature for automated regional failover.
While the Service Health feature is still in private preview, you can sign up to get access by filling in this request form.
AWS today launched three new condition keys that help administrators govern API keys for Amazon Bedrock. The new condition keys help you control the generation, expiration, and the type of API keys allowed. Amazon Bedrock supports two types of API keys: short-term API keys valid for up to 12 hours, or long-term API keys, which are IAM service-specific credentials for use with Bedrock only.
The new iam:ServiceSpecificCredentialServiceName condition key lets you control what target AWS services are allowed when creating IAM service-specific credentials. For example, you could allow the creation of Bedrock long-term API keys but not credentials for AWS CodeCommit or Amazon Keyspaces. The new iam:ServiceSpecificCredentialAgeDays condition key lets you control the maximum duration of Bedrock long-term API keys at creation. The new bedrock:BearerTokenType condition key lets you allow or deny Bedrock requests based on whether the API key is short-term or long-term.
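Put together, an identity-based policy using these keys might look like the hedged sketch below, created here with boto3. The service-name string, the "long-term" token-type value, and the 30-day cap are assumptions to confirm in the IAM and Bedrock user guides:

```python
import json
import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Only allow service-specific credentials targeted at Bedrock.
            "Effect": "Deny",
            "Action": "iam:CreateServiceSpecificCredential",
            "Resource": "*",
            "Condition": {"StringNotEquals": {
                "iam:ServiceSpecificCredentialServiceName": "bedrock.amazonaws.com"  # assumed value
            }},
        },
        {   # Cap long-term API keys at 30 days when they are created.
            "Effect": "Deny",
            "Action": "iam:CreateServiceSpecificCredential",
            "Resource": "*",
            "Condition": {"NumericGreaterThan": {"iam:ServiceSpecificCredentialAgeDays": "30"}},
        },
        {   # Block Bedrock requests authenticated with long-term API keys.
            "Effect": "Deny",
            "Action": "bedrock:*",
            "Resource": "*",
            "Condition": {"StringEquals": {"bedrock:BearerTokenType": "long-term"}},  # assumed value
        },
    ],
}

iam.create_policy(
    PolicyName="bedrock-api-key-guardrails",  # hypothetical policy name
    PolicyDocument=json.dumps(policy_document),
)
```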
These new condition keys are available in all AWS Regions. To learn more about using the new condition keys, visit the IAM User Guide or Amazon Bedrock User Guide.