AWS Key Management Service (KMS) is announcing support for on-demand rotation of symmetric encryption KMS keys with imported key material. This new capability enables you to rotate the cryptographic key material of Bring Your Own Keys (BYOK) keys without changing the key identifier (key ARN). Rotating keys helps you meet compliance requirements and security best practices that mandate periodic key rotation.
Organizations can now better align key rotation with their internal security policies when using imported keys in AWS KMS. The new on-demand rotation capability supports both immediate and scheduled rotation. As with flexible rotation for standard KMS keys, it offers a seamless transition to new key material within an existing KMS key ARN and key alias, with zero downtime and full backward compatibility with existing data protected under the key.
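For example, once new key material has been imported for a BYOK key, an on-demand rotation can be triggered from the AWS CLI. The commands below are a rough sketch; the key ARN is a placeholder, and the exact prerequisites for keys with imported key material are described in the KMS documentation.

# Trigger an immediate, on-demand rotation of a symmetric KMS key.
# For keys with imported (BYOK) key material, the new key material must be
# imported before the rotation can take effect.
aws kms rotate-key-on-demand \
    --key-id arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab

# Check the key's rotation status and configuration.
aws kms get-key-rotation-status \
    --key-id arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab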
The pace of innovation in open-source AI is breathtaking, with models like Meta's Llama4 and DeepSeek AI's DeepSeek family. However, deploying and optimizing large, powerful models can be complex and resource-intensive. Developers and machine learning (ML) engineers need reproducible, verified recipes that lay out the steps for trying these models on available accelerators.
Today, we're excited to announce enhanced support and new, optimized recipes for the latest Llama4 and DeepSeek models, leveraging our cutting-edge AI Hypercomputer platform. AI Hypercomputer helps build a strong AI infrastructure foundation using a set of purpose-built infrastructure components that are designed to work well together for AI workloads like training and inference. It is a systems-level approach that draws from our years of experience serving AI experiences to billions of users, and combines purpose-built hardware, optimized software and frameworks, and flexible consumption models. Our AI Hypercomputer resources repository on GitHub, your hub for these recipes, continues to grow.
In this blog, we’ll show you how to access Llama4 and DeepSeek models today on AI Hypercomputer.
Added support for new Llama4 models
Meta recently released the Scout and Maverick models in the Llama4 herd of models. Llama 4 Scout is a 17-billion-active-parameter model with 16 experts, and Llama 4 Maverick is a 17-billion-active-parameter model with 128 experts. Both deliver innovations and optimizations built on a Mixture of Experts (MoE) architecture, and both support multimodal inputs and long context lengths.
But serving these models can present challenges in terms of deployment and resource management. To help simplify this process, we’re releasing new recipes for serving Llama4 models on Google Cloud Trillium TPUs and A3 Mega and A3 Ultra GPUs.
JetStream, Google's throughput- and memory-optimized engine for LLM inference on XLA devices, now supports Llama-4-Scout-17B-16E and Llama-4-Maverick-17B-128E inference on Trillium, the sixth-generation TPU. New recipes provide the steps to deploy these models using JetStream and MaxText on a Trillium TPU GKE cluster. vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs; new recipes demonstrate how to use vLLM to serve the Llama4 Scout and Maverick models on A3 Mega and A3 Ultra GPU GKE clusters.
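As a rough illustration of the GPU serving path, the recipes boil down to launching a vLLM server against the published checkpoints; the model ID, parallelism, and context-length flags below are assumptions, and the published recipe is the validated reference.

# Illustrative only: serve Llama 4 Scout with vLLM on a single 8-GPU A3 node.
pip install vllm
vllm serve meta-llama/Llama-4-Scout-17B-16E-Instruct \
    --tensor-parallel-size 8 \
    --max-model-len 8192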
For serving the Maverick model on TPUs, we utilize Pathways on Google Cloud. Pathways is a system which simplifies large-scale machine learning computations by enabling a single JAX client to orchestrate workloads across multiple large TPU slices. In the context of inference, Pathways enables multi-host serving across multiple TPU slices. Pathways is used internally at Google to train and serve large models like Gemini.
MaxText provides high-performance, highly scalable, open-source LLM reference implementations for OSS models, written in pure Python/JAX and targeting Google Cloud TPUs and GPUs for training and inference. MaxText now includes reference implementations for the Llama4 Scout and Maverick models, along with information on how to perform checkpoint conversion, training, and decoding for Llama4 models.
Added support for DeepSeek Models
Earlier this year, DeepSeek released two open-source models: DeepSeek-V3, followed by DeepSeek-R1. The V3 model delivers innovations and optimizations built on an MoE architecture. The R1 model provides reasoning capabilities through a chain-of-thought thinking process.
To help simplify deployment and resource management, we’re releasing new recipes for serving DeepSeek models on Google Cloud Trillium TPUs and A3 Mega and A3 Ultra GPUs.
JetStream now supports DeepSeek-R1-Distill-Llama-70B inference on Trillium. A new recipe provides the steps to deploy DeepSeek-R1-Distill-Llama-70B using JetStream and MaxText on a Trillium TPU VM. With the recent ability to work with Google Cloud TPUs, vLLM users can leverage the performance-cost benefits of TPUs with a few configuration changes. vLLM on TPU now supports all DeepSeek R1 distilled models on Trillium. Here's a recipe that demonstrates how to use vLLM, a high-throughput inference engine, to serve the DeepSeek distilled Llama model on Trillium TPUs.
You can also deploy DeepSeek models using the SGLang inference stack on our A3 Ultra VMs, powered by eight NVIDIA H200 GPUs, with this recipe. A recipe for A3 Mega VMs with SGLang is also available, showing how to deploy multihost inference across two A3 Mega nodes. Cloud GPU users on the vLLM inference engine can also deploy DeepSeek models on A3 Mega (recipe) and A3 Ultra (recipe) VMs.
MaxText now also includes support for architectural innovations from DeepSeek, such as Multi-Head Latent Attention (MLA), MoE shared and routed experts with loss-free load balancing, dropless expert parallelism, mixed decoder layers (dense and MoE), and YaRN RoPE embeddings. The reference implementations for the DeepSeek family of models allow you to rapidly experiment with your models by incorporating some of these newer architectural enhancements.
Recipe example
The reproducible recipes show the steps to deploy and benchmark inference with the new Llama4 and DeepSeek models. For example, this TPU recipe outlines the steps to deploy the Llama-4-Scout-17B-16E model with the JetStream MaxText engine on Trillium TPUs. The recipe shows how to provision the TPU cluster, download the model weights, and set up JetStream and MaxText. It then shows you how to convert the checkpoint to a format compatible with MaxText, deploy it on a JetStream server, and run your benchmarks.
You can deploy the Llama4 Scout and Maverick models or the DeepSeek V3/R1 models today using inference recipes from the AI Hypercomputer GitHub repository. These recipes provide a starting point for deploying and experimenting with these models on Google Cloud. Explore the recipes and resources linked below, and stay tuned for future updates. We hope you have fun building, and please share your feedback!
When you deploy open models like DeepSeek and Llama, you are responsible for their security and legal compliance. You should follow responsible AI best practices, adhere to each model's specific licensing terms, and ensure your deployment is secure and compliant with all regulations in your area.
Looking to fine-tune multimodal AI models for your specific domain but facing infrastructure and implementation challenges? This guide demonstrates how to overcome the multimodal implementation gap using Google Cloud and Axolotl, with a complete hands-on example fine-tuning Gemma 3 on the SIIM-ISIC Melanoma dataset. Learn how to scale from concept to production while addressing the typical challenges of managing GPU resources, data preparation, and distributed training.
Filling in the Gap
Organizations across industries are rapidly adopting multimodal AI to transform their operations and customer experiences. Gartner analysts predict 40% of generative AI solutions will be multimodal (text, image, audio and video) by 2027, up from just 1% in 2023, highlighting the accelerating demand for solutions that can process and understand multiple types of data simultaneously.
Healthcare providers are already using these systems to analyze medical images alongside patient records, speeding up diagnosis. Retailers are building shopping experiences where customers can search with images and get personalized recommendations. Manufacturing teams are spotting quality issues by combining visual inspections with technical data. Customer service teams are deploying agents that process screenshots and photos alongside questions, reducing resolution times.
Multimodal AI applications powerfully mirror human thinking. We don’t experience the world in isolated data types – we combine visual cues, text, sound, and context to understand what’s happening. Training multimodal models on your specific business data helps bridge the gap between how your teams work and how your AI systems operate.
Key challenges organizations face in production deployment
Moving from prototype to production with multimodal AI isn’t easy. PwC survey data shows that while companies are actively experimenting, most expect fewer than 30% of their current experiments to reach full scale in the next six months. The adoption rate for customized models remains particularly low, with only 20-25% of organizations actively using custom models in production.
The following technical challenges consistently stand in the way of success:
Infrastructure complexity: Multimodal fine-tuning demands substantial GPU resources – often 4-8x more than text-only models. Many organizations lack access to the necessary hardware and struggle to configure distributed training environments efficiently.
Data preparation hurdles: Preparing multimodal training data is fundamentally different from text-only preparation. Organizations struggle with properly formatting image-text pairs, handling diverse file formats, and creating effective training examples that maintain the relationship between visual and textual elements.
Training workflow management: Configuring and monitoring distributed training across multiple GPUs requires specialized expertise most teams don’t have. Parameter tuning, checkpoint management, and optimization for multimodal models introduce additional layers of complexity.
These technical barriers create what we call “the multimodal implementation gap” – the difference between recognizing the potential business value and successfully delivering it in production.
How Google Cloud and Axolotl together solve these challenges
Our collaboration brings together complementary strengths to directly address these challenges. Google Cloud provides the enterprise-grade infrastructure foundation necessary for demanding multimodal workloads. Our specialized hardware accelerators, such as NVIDIA B200 Tensor Core GPUs and Ironwood TPUs, are optimized for these tasks, while our managed services like Google Cloud Batch, Vertex AI Training, and GKE Autopilot minimize the complexities of provisioning and orchestrating multi-GPU environments. This infrastructure seamlessly integrates with the broader ML ecosystem, creating smooth end-to-end workflows while maintaining the security and compliance controls required for production deployments.
Axolotl complements this foundation with a streamlined fine-tuning framework that simplifies implementation. Its configuration-driven approach abstracts away technical complexity, allowing teams to focus on outcomes rather than infrastructure details. Axolotl supports multiple open source and open weight foundation models and efficient fine-tuning methods like QLoRA. This framework includes optimized implementations of performance-enhancing techniques, backed by community-tested best practices that continuously evolve through real-world usage.
Together, we enable organizations to implement production-grade multimodal fine-tuning without reinventing complex infrastructure or developing custom training code. This combination accelerates time-to-value, turning what previously required months of specialized development into weeks of standardized implementation.
Solution Overview
Our multimodal fine-tuning pipeline consists of five essential components:
Foundational model: Choose a base model that meets your task requirements. Axolotl supports a variety of open source and open weight multimodal models including Llama 4, Pixtral, LLaVA-1.5, Mistral-Small-3.1, Qwen2-VL, and others. For this example, we’ll use Gemma 3, our latest open and multimodal model family.
Data preparation: Create properly formatted multimodal training data that maintains the relationship between images and text. This includes organizing image-text pairs, handling file formats, and splitting data into training/validation sets.
Training configuration: Define your fine-tuning parameters using Axolotl’s YAML-based approach, which simplifies settings for adapters like QLoRA, learning rates, and model-specific optimizations.
Infrastructure orchestration: Select the appropriate compute environment based on your scale and operational requirements. Options include Google Cloud Batch for simplicity, Google Kubernetes Engine for flexibility, or Vertex AI Custom Training for MLOps integration.
Production integration: Streamlined pathways from fine-tuning to deployment.
The pipeline structure above represents the conceptual components of a complete multimodal fine-tuning system. In our hands-on example later in this guide, we’ll demonstrate these concepts through a specific implementation tailored to the SIIM-ISIC Melanoma dataset, using GKE for orchestration. While the exact implementation details may vary based on your specific dataset characteristics and requirements, the core components remain consistent.
Selecting the Right Google Cloud Environment
Google Cloud offers multiple approaches to orchestrating multimodal fine-tuning workloads. Let’s explore three options with different tradeoffs in simplicity, flexibility, and integration:
Google Cloud Batch
Google Cloud Batch is best for teams seeking maximum simplicity for GPU-intensive training jobs with minimal infrastructure management. It handles all resource provisioning, scheduling, and dependencies automatically, eliminating the need for container orchestration or complex setup. This fully managed service balances performance and cost effectiveness, making it ideal for teams who need powerful computing capabilities without operational overhead.
Vertex AI Custom Training
Vertex AI Custom Training is best for teams prioritizing integration with Google Cloud’s MLOps ecosystem and managed experiment tracking. Vertex AI Custom Training jobs automatically integrate with Experiments for tracking metrics, the Model Registry for versioning, Pipelines for workflow orchestration, and Endpoints for deployment.
Google Kubernetes Engine (GKE)
GKE is best for teams seeking flexible integration with containerized workloads. It enables unified management of training jobs alongside other services in your container ecosystem while leveraging Kubernetes' sophisticated scheduling capabilities. GKE offers fine-grained control over resource allocation, making it ideal for complex ML pipelines. For our hands-on example, we'll use GKE in Autopilot mode, which maintains these integration benefits while Google Cloud automates infrastructure management including node provisioning and scaling. This lets you focus on your ML tasks rather than cluster administration, combining the flexibility of Kubernetes with the operational simplicity of a managed service.
Take a look at our code sample here for a complete implementation that demonstrates how to orchestrate a multimodal fine-tuning job on GKE:
This repository includes ready-to-use Kubernetes manifests for deploying Axolotl training jobs on GKE in Autopilot mode, covering automated cluster setup with GPUs, persistent storage configuration, job specifications, and monitoring integration.
Hands-on example: Fine-tuning Gemma 3 on the SIIM-ISIC Melanoma dataset
Our hands-on example uses the SIIM-ISIC Melanoma Classification dataset, which contains dermoscopic images of skin lesions labeled as malignant or benign. With melanoma accounting for 75% of skin cancer deaths despite its relative rarity, early and accurate detection is critical for patient survival. By applying multimodal AI to this challenge, we unlock the potential to help dermatologists improve diagnostic accuracy and potentially save lives through faster, more reliable identification of dangerous lesions. So, let's walk through a complete example of fine-tuning Gemma 3 on this dataset.
For this implementation, we’ll leverage GKE in Autopilot mode to orchestrate our training job and monitoring, allowing us to focus on the ML workflow while Google Cloud handles the infrastructure management.
Data Preparation
The SIIM-ISIC Melanoma Classification dataset requires specific formatting for multimodal fine-tuning with Axolotl. Our data preparation process involves two main steps: (1) efficiently transferring the dataset to Cloud Storage using Storage Transfer Service, and (2) processing the raw data into the format required by Axolotl. To start, transfer the dataset.
Create a TSV file that contains the URLs for the ISIC dataset files:
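A Storage Transfer Service URL list is a TSV file that begins with a format header, followed by one object URL per line (optionally with size and MD5 columns). The entries below are illustrative placeholders rather than the actual ISIC download URLs:

TsvHttpData-1.0
https://example.com/isic/ISIC_2020_Training_JPEG.zip
https://example.com/isic/ISIC_2020_Training_GroundTruth.csv
https://example.com/isic/ISIC_2020_Test_JPEG.zip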
Set up appropriate IAM permissions for the Storage Transfer Service:
# Get your current project ID
export PROJECT_ID=$(gcloud config get-value project)

# Get your project number
export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT_ID} --format="value(projectNumber)")

# Enable the Storage Transfer API
echo "Enabling Storage Transfer API..."
gcloud services enable storagetransfer.googleapis.com --project=${PROJECT_ID}

# Important: The Storage Transfer Service account is created only after you access the service.
# Access the Storage Transfer Service in the Google Cloud Console to trigger its creation:
# https://console.cloud.google.com/transfer/cloud
echo "IMPORTANT: Before continuing, please visit the Storage Transfer Service page in the Google Cloud Console"
echo "Go to: https://console.cloud.google.com/transfer/cloud"
echo "This ensures the Storage Transfer Service account is properly created."
echo "After visiting the page, wait approximately 60 seconds for account propagation, then continue."
echo ""
echo "Press Enter once you've completed this step..."
read -p ""

# Grant Storage Transfer Service the necessary permissions
export STS_SERVICE_ACCOUNT_EMAIL="project-${PROJECT_NUMBER}@storage-transfer-service.iam.gserviceaccount.com"
echo "Granting permissions to Storage Transfer Service account: ${STS_SERVICE_ACCOUNT_EMAIL}"

gcloud storage buckets add-iam-policy-binding gs://${GCS_BUCKET_NAME} \
  --member=serviceAccount:${STS_SERVICE_ACCOUNT_EMAIL} \
  --role=roles/storage.objectViewer \
  --condition=None

gcloud storage buckets add-iam-policy-binding gs://${GCS_BUCKET_NAME} \
  --member=serviceAccount:${STS_SERVICE_ACCOUNT_EMAIL} \
  --role=roles/storage.objectUser \
  --condition=None
Set up a storage transfer job using the URL list:
Navigate to Cloud Storage > Transfer
Click “Create Transfer Job”
Select “URL list” as Source type and “Google Cloud Storage” as Destination type
Enter the path to your TSV file: gs://<GCS_BUCKET_NAME>/melanoma_dataset_urls.tsv
Select your destination bucket
Use the default job settings and click Create
The transfer will download approximately 32 GB of data from the ISIC Challenge repository directly to your Cloud Storage bucket. Once the transfer is complete, you'll need to extract the ZIP files before proceeding to the next step, where we'll format this data for Axolotl. See the notebook in the GitHub repository for a full walkthrough of how to format the data for Axolotl.
Preparing Multimodal Training Data
For multimodal models like Gemma 3, we need to structure our data following the extended chat_template format, which defines conversations as a series of messages with both text and image content.
Below is an example of a single training input example:
{
  "messages": [
    {
      "role": "system",
      "content": [
        {"type": "text", "text": "You are a dermatology assistant that helps identify potential melanoma from skin lesion images."}
      ]
    },
    {
      "role": "user",
      "content": [
        {"type": "image", "path": "/path/to/image.jpg"},
        {"type": "text", "text": "Does this appear to be malignant melanoma?"}
      ]
    },
    {
      "role": "assistant",
      "content": [
        {"type": "text", "text": "Yes, this appears to be malignant melanoma."}
      ]
    }
  ]
}
We split the data into training (80%), validation (10%), and test (10%) sets, while maintaining the class distribution in each split using stratified sampling.
This format allows Axolotl to properly process both the images and their corresponding labels, maintaining the relationship between visual and textual elements during training.
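As a minimal sketch of these two steps, the snippet below performs a stratified 80/10/10 split and writes chat_template-style records; the CSV column names and file paths are illustrative, and the repository notebook contains the full preparation logic.

import json
import pandas as pd
from sklearn.model_selection import train_test_split

# Illustrative: ground-truth CSV with image names and benign/malignant labels.
df = pd.read_csv("ISIC_2020_Training_GroundTruth.csv")

# 80% train, 10% validation, 10% test, preserving the class distribution.
train_df, holdout_df = train_test_split(
    df, test_size=0.2, stratify=df["target"], random_state=42
)
val_df, test_df = train_test_split(
    holdout_df, test_size=0.5, stratify=holdout_df["target"], random_state=42
)

def to_messages(row):
    """Convert one labeled image into the chat_template format shown above."""
    label = ("Yes, this appears to be malignant melanoma."
             if row["target"] == 1
             else "No, this appears to be a benign lesion.")
    return {"messages": [
        {"role": "system", "content": [{"type": "text", "text":
            "You are a dermatology assistant that helps identify potential melanoma from skin lesion images."}]},
        {"role": "user", "content": [
            {"type": "image", "path": f"images/{row['image_name']}.jpg"},
            {"type": "text", "text": "Does this appear to be malignant melanoma?"}]},
        {"role": "assistant", "content": [{"type": "text", "text": label}]},
    ]}

with open("train.jsonl", "w") as f:
    for _, row in train_df.iterrows():
        f.write(json.dumps(to_messages(row)) + "\n")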
Creating the Axolotl Configuration File
Next, we’ll create a configuration file for Axolotl that defines how we’ll fine-tune Gemma 3. We’ll use QLoRA (Quantized Low-Rank Adaptation) with 4-bit quantization to efficiently fine-tune the model while keeping memory requirements manageable. While A100 40GB GPUs have substantial memory, the 4-bit quantization with QLoRA allows us to train with larger batch sizes or sequence lengths if needed, providing additional flexibility for our melanoma classification task. The slight reduction in precision is typically an acceptable tradeoff, especially for fine-tuning tasks where we’re adapting a pre-trained model rather than training from scratch.
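The full gemma3-melanoma.yaml lives in the accompanying repository; the sketch below shows the general shape of such a config. The key names follow common Axolotl conventions, but the model ID, template name, and multimodal-specific settings are assumptions, so treat the repository version as authoritative.

# Illustrative Axolotl config sketch, not the exact file from the repository.
base_model: google/gemma-3-4b-it        # assumed model ID; choose the Gemma 3 size you need
load_in_4bit: true                      # 4-bit quantization for QLoRA
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true

chat_template: gemma3                   # assumed template name
datasets:
  - path: data/train.jsonl
    type: chat_template
sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 0.0002
optimizer: adamw_torch
bf16: true
output_dir: /workspace/outputs/gemma3-melanoma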
This configuration sets up QLoRA fine-tuning with parameters optimized for our melanoma classification task. Next, we’ll set up our GKE Autopilot environment to run the training.
Setting up GKE Autopilot for GPU Training
Now that we have our configuration file ready, let’s set up the GKE Autopilot cluster we’ll use for training. As mentioned earlier, Autopilot mode lets us focus on our ML task while Google Cloud handles the infrastructure management.
Let’s create our GKE Autopilot cluster:
# Set up environment variables for cluster configuration
export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1
export CLUSTER_NAME=melanoma-training-cluster
export RELEASE_CHANNEL=regular

# Enable required Google APIs
echo "Enabling required Google APIs..."
gcloud services enable container.googleapis.com --project=${PROJECT_ID}
gcloud services enable compute.googleapis.com --project=${PROJECT_ID}

# Create a GKE Autopilot cluster in the same region as your data
echo "Creating GKE Autopilot cluster ${CLUSTER_NAME}..."
gcloud container clusters create-auto ${CLUSTER_NAME} \
  --location=${REGION} \
  --project=${PROJECT_ID} \
  --release-channel=${RELEASE_CHANNEL}

# Install kubectl if not already installed
if ! command -v kubectl &> /dev/null; then
  echo "Installing kubectl..."
  gcloud components install kubectl
fi

# Install the GKE auth plugin required for kubectl
echo "Installing GKE auth plugin..."
gcloud components install gke-gcloud-auth-plugin

# Configure kubectl to use the cluster
echo "Configuring kubectl to use the cluster..."
gcloud container clusters get-credentials ${CLUSTER_NAME} \
  --location=${REGION} \
  --project=${PROJECT_ID}

# Verify kubectl is working correctly
echo "Verifying kubectl connection to cluster..."
kubectl get nodes
Now set up Workload Identity Federation for GKE to securely authenticate with Google Cloud APIs without using service account keys:
# Set variables for Workload Identity Federation
export PROJECT_ID=$(gcloud config get-value project)
export NAMESPACE="axolotl-training"
export KSA_NAME="axolotl-training-sa"
export GSA_NAME="axolotl-training-sa"

# Create a Kubernetes namespace for the training job
kubectl create namespace ${NAMESPACE} || echo "Namespace ${NAMESPACE} already exists"

# Create a Kubernetes ServiceAccount
kubectl create serviceaccount ${KSA_NAME} \
  --namespace=${NAMESPACE} || echo "ServiceAccount ${KSA_NAME} already exists"

# Create an IAM service account
if ! gcloud iam service-accounts describe ${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com &>/dev/null; then
  echo "Creating IAM service account ${GSA_NAME}..."
  gcloud iam service-accounts create ${GSA_NAME} \
    --display-name="Axolotl Training Service Account"

  # Wait for IAM propagation
  echo "Waiting for IAM service account creation to propagate..."
  sleep 15
else
  echo "IAM service account ${GSA_NAME} already exists"
fi

# Grant necessary permissions to the IAM service account
echo "Granting storage.objectAdmin role to IAM service account..."
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member="serviceAccount:${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Wait for IAM propagation
echo "Waiting for IAM policy binding to propagate..."
sleep 10

# Allow the Kubernetes ServiceAccount to impersonate the IAM service account
echo "Binding Kubernetes ServiceAccount to IAM service account..."
gcloud iam service-accounts add-iam-policy-binding ${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:${PROJECT_ID}.svc.id.goog[${NAMESPACE}/${KSA_NAME}]"

# Annotate the Kubernetes ServiceAccount
echo "Annotating Kubernetes ServiceAccount..."
kubectl annotate serviceaccount ${KSA_NAME} \
  --namespace=${NAMESPACE} \
  iam.gke.io/gcp-service-account=${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com --overwrite

# Verify the configuration
echo "Verifying Workload Identity Federation setup..."
kubectl get serviceaccount ${KSA_NAME} -n ${NAMESPACE} -o yaml
Now create a PersistentVolumeClaim for our model outputs. In Autopilot mode, Google Cloud manages the underlying storage classes, so we don’t need to create our own:
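A minimal sketch of what model-storage-pvc.yaml might contain follows; the claim name, namespace, and size are illustrative.

# model-storage-pvc.yaml (illustrative)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage
  namespace: axolotl-training
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi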
# Apply the PVC configuration
kubectl apply -f model-storage-pvc.yaml
Deploying the Training Job to GKE Autopilot
In Autopilot mode, we specify our GPU requirements using annotations and resource requests within the Pod template section of our Job definition. We’ll create a Kubernetes Job that requests a single A100 40GB GPU:
Create a ConfigMap with our Axolotl configuration:
# Create the ConfigMap
kubectl create configmap axolotl-config --from-file=gemma3-melanoma.yaml -n ${NAMESPACE}
Create a Secret with Hugging Face credentials:
# Create a Secret with your Hugging Face token
# This token is required to access the Gemma 3 model from Hugging Face Hub
# Generate a Hugging Face token at https://huggingface.co/settings/tokens if you don't have one
kubectl create secret generic huggingface-credentials -n ${NAMESPACE} --from-literal=token=YOUR_HUGGING_FACE_TOKEN
Apply training job YAML to start the training process:
# Start the training job
kubectl apply -f axolotl-training-job.yaml
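For reference, here is a hedged sketch of the kind of manifest axolotl-training-job.yaml defines. The container image, entrypoint, and mount paths are assumptions; the manifests in the repository are the ones to use.

# axolotl-training-job.yaml (illustrative sketch)
apiVersion: batch/v1
kind: Job
metadata:
  name: gemma3-melanoma-training
  namespace: axolotl-training
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: axolotl-training-sa
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-a100   # request an A100 40GB in Autopilot
      containers:
        - name: axolotl
          image: axolotlai/axolotl:main-latest                # assumed image tag
          command: ["axolotl", "train", "/config/gemma3-melanoma.yaml"]   # assumed entrypoint
          resources:
            limits:
              nvidia.com/gpu: 1
          env:
            - name: HF_TOKEN
              valueFrom:
                secretKeyRef:
                  name: huggingface-credentials
                  key: token
          volumeMounts:
            - name: config
              mountPath: /config
            - name: model-storage
              mountPath: /workspace/outputs
      volumes:
        - name: config
          configMap:
            name: axolotl-config
        - name: model-storage
          persistentVolumeClaim:
            claimName: model-storage
      restartPolicy: Never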
Monitor the Training Process
Fetch the pod name to monitor progress:
# Get the pod name for the training job
POD_NAME=$(kubectl get pods -n ${NAMESPACE} --selector=job-name=gemma3-melanoma-training -o jsonpath='{.items[0].metadata.name}')

# Monitor logs in real-time
kubectl describe pod $POD_NAME -n ${NAMESPACE}
kubectl logs -f $POD_NAME -n ${NAMESPACE}
To visualize training metrics, deploy TensorBoard and retrieve its external IP:

# Deploy TensorBoard
kubectl apply -f tensorboard.yaml

# Get the external IP to access TensorBoard
kubectl get service tensorboard -n ${NAMESPACE}
Model Export and Evaluation Setup
After training completes, we need to export our fine-tuned model and evaluate its performance against the base model. First, let’s export the model from our training environment to Cloud Storage:
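One common pattern is a short-lived Job that mounts the same PersistentVolumeClaim and copies the output directory to Cloud Storage. A hedged sketch of what model-export.yaml could look like follows; the image, paths, and bucket name are illustrative.

# model-export.yaml (illustrative sketch)
apiVersion: batch/v1
kind: Job
metadata:
  name: gemma3-melanoma-export
  namespace: axolotl-training
spec:
  template:
    spec:
      serviceAccountName: axolotl-training-sa    # uses Workload Identity for bucket access
      containers:
        - name: export
          image: google/cloud-sdk:latest
          command: ["bash", "-c"]
          args:
            - gcloud storage cp -r /workspace/outputs/gemma3-melanoma gs://YOUR_BUCKET_NAME/models/
          volumeMounts:
            - name: model-storage
              mountPath: /workspace/outputs
      volumes:
        - name: model-storage
          persistentVolumeClaim:
            claimName: model-storage
      restartPolicy: Never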
After creating the model-export.yaml file, apply it:
# Export the model
kubectl apply -f model-export.yaml
This will start the export process, which copies the fine-tuned model from the Kubernetes PersistentVolumeClaim to your Cloud Storage bucket for easier access and evaluation.
Once exported, we have several options for evaluating our fine-tuned model. You can deploy both the base and fine-tuned models to their own Vertex AI endpoints for systematic testing via API calls, which works well for high-volume automated testing and production-like evaluation. Alternatively, for exploratory analysis and visualization, a GPU-enabled notebook environment such as a Vertex AI Workbench instance or Colab Enterprise offers significant advantages, allowing real-time visualization of results, interactive debugging, and rapid iteration on evaluation metrics.
In this example, we use a notebook environment to leverage its visualization capabilities and interactive nature. Our evaluation approach involves:
Loading both the base and fine-tuned models
Running inference on a test set of dermatological images from the SIIM-ISIC dataset
Computing standard classification metrics (accuracy, precision, recall, etc.)
Analyzing the confusion matrices to understand error patterns
Generating visualizations to highlight performance differences
For the complete evaluation code and implementation details, check out our evaluation notebook in the GitHub repository.
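As a rough sketch of the metrics step (model loading and inference are elided, and labels and predictions are assumed to be 0/1 for benign/malignant):

from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             confusion_matrix)

def summarize(y_true, y_pred, name):
    """Print the classification metrics we compare across models."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    specificity = tn / (tn + fp)
    print(f"{name}: accuracy={accuracy_score(y_true, y_pred):.3f} "
          f"precision={precision_score(y_true, y_pred):.3f} "
          f"recall={recall_score(y_true, y_pred):.3f} "
          f"specificity={specificity:.3f}")

# y_true: ground-truth labels from the test split
# base_preds / tuned_preds: parsed yes/no answers from each model's responses
# summarize(y_true, base_preds, "base Gemma 3")
# summarize(y_true, tuned_preds, "fine-tuned Gemma 3")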
Performance Results
Our evaluation demonstrated that domain-specific fine-tuning can transform a general-purpose multimodal model into a much more effective tool for specialized tasks like medical image classification. The improvements were significant across multiple dimensions of model performance.
The most notable finding was the base model’s tendency to over-diagnose melanoma. It showed perfect recall (1.000) but extremely poor specificity (0.011), essentially labeling almost every lesion as melanoma. This behavior is problematic in clinical settings where false positives lead to unnecessary procedures, patient anxiety, and increased healthcare costs.
Fine-tuning significantly improved the model’s ability to correctly identify benign lesions, reducing false positives from 3,219 to 1,438. While this came with a decrease in recall (from 1.000 to 0.603), the tradeoff resulted in much better overall diagnostic capability, with balanced accuracy improving substantially.
In our evaluation, we also included results from the newly announced MedGemma, a collection of Gemma 3 variants trained specifically for medical text and image comprehension that was recently released at Google I/O. These results further contribute to our understanding of how different model starting points affect performance on specialized healthcare tasks.
Below we can see the performance metrics across all three models:
Accuracy jumped from a mere 0.028 for base Gemma 3 to 0.559 for our tuned Gemma 3 model, representing an astounding 1870.2% improvement. MedGemma achieved 0.893 accuracy without any task-specific fine-tuning—a 3048.9% improvement over the base model and substantially better than our custom-tuned version.
While precision saw a significant 34.2% increase in our tuned model (from 0.018 to 0.024), MedGemma delivered a substantial 112.5% improvement (to 0.038). The most remarkable transformation occurred in specificity—the model’s ability to correctly identify non-melanoma cases. Our tuned model’s specificity increased from 0.011 to 0.558 (a 4947.2% improvement), while MedGemma reached 0.906 (an 8088.9% improvement over the base model).
These numbers highlight how fine-tuning helped our model develop a more nuanced understanding of skin lesion characteristics rather than simply defaulting to melanoma as a prediction. MedGemma’s results demonstrate that starting with a medically-trained foundation model provides considerable advantages for healthcare applications.
The confusion matrices further illustrate these differences:
Looking at the base Gemma 3 matrix (left), we can see it correctly identified all 58 actual positive cases (perfect recall) but also incorrectly classified 3,219 negative cases as positive (poor specificity). Our fine-tuned model (center) shows a more balanced distribution, correctly identifying 1,817 true negatives while still catching 35 of the 58 true positives. MedGemma (right) shows strong performance in correctly identifying 2,948 true negatives, though with more false negatives (46 missed melanoma cases) than the other models.
To illustrate the practical impact of these differences, let’s examine a real example, image ISIC_4908873, from our test set:
Disclaimer: Image for example case use only.
The base model incorrectly classified it as melanoma. Its rationale focused on general warning signs, citing its “significant variation in color,” “irregular, poorly defined border,” and “asymmetry” as definitive indicators of malignancy, without fully contextualizing these within broader benign patterns.
In contrast, our fine-tuned model correctly identified it as benign. While acknowledging a “heterogeneous mix of colors” and “irregular borders,” it astutely noted that such color mixes can be “common in benign nevi.” Crucially, it interpreted the lesion’s overall “mottled appearance with many small, distinct color variations” as being “more characteristic of a common mole rather than melanoma.”
Interestingly, MedGemma also misclassified this lesion as melanoma, stating, “The lesion shows a concerning appearance with irregular borders, uneven coloration, and a somewhat raised surface. These features are suggestive of melanoma. Yes, this appears to be malignant melanoma.” Despite MedGemma’s overall strong statistical performance, this example illustrates that even domain-specialized models can benefit from task-specific fine-tuning for particular diagnostic challenges.
These results underscore a critical insight for organizations building domain-specific AI systems: while foundation models provide powerful starting capabilities, targeted fine-tuning is often essential to achieve the precision and reliability required for specialized applications. The significant performance improvements we achieved—transforming a model that essentially labeled everything as melanoma into one that makes clinically useful distinctions—highlight the value of combining the right infrastructure, training methodology, and domain-specific data.
MedGemma’s strong statistical performance demonstrates that starting with a domain-focused foundation model significantly improves baseline capabilities and can reduce the data and computation needed for building effective medical AI applications. However, our example case also shows that even these specialized models would benefit from task-specific fine-tuning for optimal diagnostic accuracy in clinical contexts.
Next steps for your multimodal journey
By combining Google Cloud’s enterprise infrastructure with Axolotl’s configuration-driven approach, you can transform what previously required months of specialized development into weeks of standardized implementation, bringing custom multimodal AI capabilities from concept to production with greater efficiency and reliability.
For deeper exploration, check out these resources:
AWS WAF now supports matching incoming requests against Autonomous System Numbers (ASNs). By monitoring and restricting traffic from specific ASNs, you can mitigate risks associated with malicious actors, comply with regulatory requirements, and optimize the performance and availability of your web applications. The new ASN match statement integrates seamlessly with existing WAF rules, making it easy to incorporate ASN-based security controls into your overall web application defense strategy.
You can specify a list of ASNs to match against incoming requests and take an appropriate action, such as blocking or allowing the request. You can also use ASNs in your rate-based rule statements, which aggregate requests according to your criteria, then count and rate-limit them based on the rule's evaluation window, request limit, and action settings.
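As an illustration, an ASN-based rule in a web ACL might look like the JSON below. The statement and field names reflect our reading of the announcement and should be checked against the current AWS WAF API reference; the ASNs shown are from the documentation-reserved range.

{
  "Name": "BlockListedASNs",
  "Priority": 0,
  "Statement": {
    "AsnMatchStatement": {
      "AsnList": [64496, 64497]
    }
  },
  "Action": { "Block": {} },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "BlockListedASNs"
  }
}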
The ASN match statement is available in all Regions where AWS WAF is available. Rate-based rule support with ASNs is available in Regions where enhanced rate-based rules are currently supported. There is no additional cost for using ASNs in match statements and rate-based rules; however, standard AWS WAF charges still apply. For more information about the service, visit the AWS WAF page. For more information about pricing, visit the AWS WAF Pricing page.
Today, AWS announces the general availability of the Invoice Summary API, which allows you to retrieve your AWS invoice summary details programmatically via the SDK. You can retrieve multiple invoice summary details with a single API call that accepts input parameters such as AWS account ID, AWS invoice ID, billing period, or a date range.
The output of the Invoice Summary API includes data elements such as the invoice amount in base currency and tax currency, the purchase order number, and other metadata, which can be found at this link. You can integrate the API with your accounts payable systems to automate invoice processing and improve efficiency.
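A sketch of what programmatic access could look like with the AWS SDK for Python follows; the client name, operation, and parameter shape are assumptions based on the announcement, so verify them against the published SDK reference before use.

import boto3

# Assumed client and operation names; check the SDK documentation for the
# Invoice Summary API before relying on this.
client = boto3.client("invoicing")

response = client.list_invoice_summaries(
    Selector={"ResourceType": "ACCOUNT_ID", "Value": "111122223333"}
)

for summary in response.get("InvoiceSummaries", []):
    print(summary)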
Invoice Summary API is available in all AWS Regions, except the AWS GovCloud (US) Regions and the China Regions.
Today, Amazon Q Developer announces support for the agentic coding experience within the JetBrains and Visual Studio IDEs. This experience, already available in Visual Studio Code and the Amazon Q Developer CLI, redefines how you write, modify, and maintain code by leveraging natural language understanding to seamlessly run complex workflows.
Agentic coding provides intelligent task execution, enabling Q Developer to perform actions beyond code suggestions, such as reading files, generating code diffs, and running command-line tasks. To get started, simply type your prompt in your preferred spoken language. As Q Developer works through your tasks, it provides continuous status updates, instantly applying your changes and feedback along the way. This allows you to seamlessly complete tasks while improving and streamlining the development process.
The agentic coding experience is available in all AWS regions where Q Developer is supported. To learn more about agentic coding in Visual Studio and JetBrains, read our blog.
Amazon EC2 now enables you to automatically delete underlying Amazon EBS snapshots when deregistering Amazon Machine Images (AMIs), allowing you to better manage your storage costs and simplify your AMI cleanup workflow.
Previously, when deregistering an AMI, you had to separately delete its associated EBS snapshots, which required additional steps. This process could lead to abandoned snapshots, resulting in unnecessary storage costs and resource management overhead. Now you can automatically delete EBS snapshots at the time of AMI deregistration.
This capability is available to all customers at no additional cost, and is enabled in all AWS commercial Regions as well as AWS GovCloud (US), the AWS China (Beijing) Region, operated by Sinnet, and the AWS China (Ningxia) Region, operated by NWCD.
You can deregister AMIs from the EC2 Console, CLI, API, or SDK, and learn more in the AMI documentation.
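For example, from the AWS CLI the new behavior can be requested at deregistration time; the option name below reflects our understanding of the feature, and the AMI ID is a placeholder.

# Deregister an AMI and delete its associated EBS snapshots in one call.
aws ec2 deregister-image \
    --image-id ami-0123456789abcdef0 \
    --delete-associated-snapshots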
Amazon Relational Database Service (Amazon RDS) for MariaDB now supports new community MariaDB minor versions 10.11.13 and 11.4.7. We recommend that you upgrade to the latest minor versions to fix known security vulnerabilities in prior versions of MariaDB, and to benefit from the bug fixes, performance improvements, and new functionality added by the MariaDB community.
You can leverage automatic minor version upgrades to automatically upgrade your databases to more recent minor versions during scheduled maintenance windows. You can also leverage Amazon RDS Managed Blue/Green deployments for safer, simpler, and faster updates to your MariaDB instances. Learn more about upgrading your database instances, including automatic minor version upgrades and Blue/Green Deployments, in the Amazon RDS User Guide.
Amazon RDS for MariaDB makes it straightforward to set up, operate, and scale MariaDB deployments in the cloud. Learn more about pricing details and regional availability at Amazon RDS for MariaDB. Create or update a fully managed Amazon RDS database in the Amazon RDS Management Console.
Amazon Managed Service for Prometheus is now available in Africa (Cape Town), Asia Pacific (Thailand), Asia Pacific (Hong Kong), Asia Pacific (Malaysia), Europe (Milan), Europe (Zurich), and Middle East (UAE). Amazon Managed Service for Prometheus is a fully managed Prometheus-compatible monitoring service that makes it easy to monitor and alarm on operational metrics at scale.
The list of all supported regions where Amazon Managed Service for Prometheus is generally available can be found on the user guide. Customers can send up to 1 billion active metrics to a single workspace and can create many workspaces per account, where a workspace is a logical space dedicated to the storage and querying of Prometheus metrics.
To learn more about Amazon Managed Service for Prometheus collector, visit the user guide or product page.
Amazon CloudWatch Logs Insights launches Query Results Summarization and OpenSearch PPL enhancements to help accelerate your logs analysis.
The new logs summarizer generates a natural language summary of query results, providing users with clear, actionable insights. Interpreting log entries can be time-consuming, and this natural language summarization capability transforms complex query results into clear, concise summaries that help you quickly identify issues and gain actionable insights from your log data.
With CloudWatch Logs Insights, you can interactively search and analyze your logs with the Logs Insights query language, OpenSearch Service Piped Processing Language (PPL), and OpenSearch Service Structured Query Language (SQL). Customers using OpenSearch PPL can now analyze their logs more efficiently with new PPL commands and functions such as JOIN, SubQuery, Fillnull, Expand, Flatten, Cidrmatch, and JSON functions. These new capabilities help accelerate your troubleshooting. For example, you can use a subquery to find the services that had more than 20 errors in the last day with an inner query, and then use the results of that inner query to get the average response times of those services from a different log group, as sketched below.
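A hedged sketch of that pattern in PPL is shown below; the log group names, field names, and exact subquery syntax are illustrative, so consult the CloudWatch Logs PPL reference for the supported form.

source = `my-service-logs`
| where service_name in [
    source = `my-error-logs`
    | where level = 'ERROR'
    | stats count() as error_count by service_name
    | where error_count > 20
    | fields service_name
  ]
| stats avg(response_time_ms) as avg_response_time by service_name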
The logs summarizer is available in the US East (N. Virginia) region. OpenSearch PPL query enhancements are available in regions where OpenSearch Service direct query is available.
To learn about the log summarizer in CloudWatch Logs Insights, visit the Amazon CloudWatch Logs documentation. To learn about the new PPL commands and functions, visit the CloudWatch Logs documentation.
AWS Wickr announces the launch of Wickr File Previews. This new feature empowers organizations to protect sensitive files and lower the risk of data loss by allowing network administrators to configure a "view-only" mode in the Security Groups section of the AWS Management Console for Wickr. Users within these security groups will be restricted to viewing the supported files and will be unable to download them.
AWS Wickr is a security-first messaging and collaboration service with features designed to help keep your communications secure, private, and compliant. AWS Wickr protects one-to-one and group messaging, voice and video calling, file sharing, screen sharing, and location sharing with end-to-end encryption. Customers have full administrative control over data, which includes addressing information governance policies, configuring ephemeral messaging options, and deleting credentials for lost or stolen devices. You can log internal and external conversations in an AWS Wickr network to a private data store that you manage for data retention and auditing purposes.
AWS Wickr is available in commercial AWS Regions that include US East (N. Virginia), AWS Canada (Central), AWS Asia Pacific (Malaysia, Singapore, Sydney, and Tokyo), and AWS Europe (London, Frankfurt, Stockholm, and Zurich). It is also available in AWS GovCloud (US-West) as FedRAMP High and Department of Defense Impact Level 5 (DoD IL5)-authorized AWS WickrGov.
To learn more and get started, see the following resources:
Today, AWS HealthOmics announces automatic parameter interpolation for Workflow Description Language (WDL) workflows to help streamline the workflow creation process. This new capability automatically identifies and extracts required and optional parameters along with their descriptions directly from WDL workflow definitions, eliminating the need for customers to manually create input parameter templates. AWS HealthOmics is a HIPAA-eligible service that helps healthcare and life sciences customers accelerate scientific breakthroughs with fully managed biological data stores and workflows.
Removing the need to define parameters simplifies and accelerates building and deploying bioinformatics workflows. Customers can now onboard new WDL workflows more rapidly while retaining complete flexibility through optional customization. For organizations with extensive WDL workflow libraries, this feature significantly reduces the time required to migrate or deploy new workflows. Additionally, HealthOmics customers still maintain full control by providing custom input parameter templates when needed to override the automatic interpolation.
Input parameter interpolation for WDL workflows is now supported in all regions where AWS HealthOmics is available: US East (N. Virginia), US West (Oregon), Europe (Frankfurt, Ireland, London), Asia Pacific (Singapore), and Israel (Tel Aviv).
To learn more about automatic parameter interpolation and how to implement WDL workflows, see the AWS HealthOmics documentation.
We are excited to announce that Amazon OpenSearch Serverless is now available in Asia Pacific (Hyderabad) and Asia Pacific (Osaka) regions. OpenSearch Serverless is a serverless deployment option for Amazon OpenSearch Service that makes it simple to run search and analytics workloads without the complexities of infrastructure management. OpenSearch Serverless’ compute capacity used for data ingestion, search, and query is measured in OpenSearch Compute Units (OCUs). To control costs, customers can configure the maximum number of OCUs per account.
Amazon Connect external voice transfer is now available in the Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), and Europe (London) AWS Regions.
Amazon Connect external voice transfer enables Amazon Connect to directly transfer voice calls and metadata to other voice systems without using the public telephone network. You can use Amazon Connect telephony and Interactive Voice Response (IVR) with your existing voice systems to help improve customer experience and reduce costs.
AWS announces two updates to Amazon EC2 instances accelerated by NVIDIA GPUs.
Availability of Savings Plans for Amazon EC2 P6-B200 instances, which at launch were available only through EC2 Capacity Blocks for ML.
Reduced pricing for Amazon EC2 P5 and P5en instances and Amazon EC2 P4d and P4de instances. The following pricing reductions apply to On-Demand pricing beginning June 1, 2025, and to Savings Plan purchases effective after June 4, 2025: P5, up to 45% reduction; P5en, up to 26% reduction; and P4d and P4de, up to 33% reduction. These percentages apply to instances running Amazon Linux; slightly smaller reductions apply to instances running other operating systems. To provide increased access to the reduced pricing, we are also making at-scale, On-Demand capacity available for:
P4d in Asia Pacific (Seoul), Asia Pacific (Sydney), Canada (Central), Europe (London) Regions
P4de in US East (N. Virginia) Region
P5 in Asia Pacific (Mumbai), Asia Pacific (Tokyo), Asia Pacific (Jakarta), South America (São Paulo) Regions
P5en in Asia Pacific (Mumbai), Asia Pacific (Tokyo), Asia Pacific (Jakarta) Regions
The new pricing reflects AWS's commitment to making advanced GPU computing more accessible while passing cost savings directly to customers. To learn more about the updated pricing, please consult the EC2 pricing page.
Amazon Q Developer plugin for the Eclipse IDE is now generally available. With this launch, developers can leverage the power of Amazon Q Developer, the most capable generative AI-powered assistant for software development, within the Eclipse IDE.
Within the Eclipse IDE, you’ll now be able to utilize Amazon Q Developer’s agentic coding experience to seamlessly execute complex workflows. With this coding experience, Q Developer can intelligently take actions on your behalf. It can read your project files to intelligently build the context it needs, suggest code diffs, and run shell commands. As Q Developer works through your tasks, it provides continuous status updates, instantly applying your changes and feedback along the way. This helps Q Developer create code, generate unit tests, and perform code reviews significantly faster, streamlining development workflows across the entire software development lifecycle.
Amazon Managed Workflows for Apache Airflow (MWAA) is now available in AWS Region Asia Pacific (Malaysia).
Amazon MWAA is a managed service for Apache Airflow that lets you use the same familiar Apache Airflow platform as you do today to orchestrate your workflows and enjoy improved scalability, availability, and security without the operational burden of having to manage the underlying infrastructure. Learn more about using Amazon MWAA on the product page. Please visit the AWS region table for more information on AWS regions and services. To learn more about Amazon MWAA visit the Amazon MWAA documentation.
Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
Today, AWS announced the opening of a new AWS Direct Connect location within the Chief Telecom HD data center near Taipei, ROC. By connecting your network to AWS at the new location, you gain private, direct access to all public AWS Regions (except those in China), AWS GovCloud Regions, and AWS Local Zones. This site is the third AWS Direct Connect location within ROC. This Direct Connect location offers dedicated 10 Gbps and 100 Gbps connections with MACsec encryption available.
AWS also announced the addition of 10 Gbps and 100 Gbps MACsec services in the existing Chunghwa Telecom data center near Taipei, ROC.
The Direct Connect service enables you to establish a private, physical network connection between AWS and your data center, office, or colocation environment. These private connections can provide a more consistent network experience than those made over the public internet.
For more information on the Direct Connect locations worldwide, visit the locations section of the Direct Connect product detail pages. Or, visit our getting started page to learn more about how to purchase and deploy Direct Connect.
BigQuery provides a powerful platform for analyzing large-scale datasets with high performance. However, as data volumes and query complexity increase, maintaining operational efficiency is essential. BigQuery workload management provides comprehensive control mechanisms to optimize workloads and resource allocation, preventing performance issues and resource contention, especially in high-volume environments. And today, we’re excited to announce several updates to BigQuery workload management that make it more effective and easy to use.
But first, what exactly is BigQuery workload management?
At its core, BigQuery workload management is a suite of features that allows you to prioritize, isolate, and manage the execution of queries and other operations (aka workloads) within your BigQuery project. It provides granular control over how BigQuery resources are allocated and consumed, enabling you to:
Ensure critical workloads get the resources they need:
Reservations facilitate dedicated BigQuery slots, representing defined compute capacity.
Control and optimize cost with:
Slot commitments: Establish a predictable expenditure for BigQuery compute capacity in a specific Edition.
Spend-based commitments: Hourly spend-based commitments with one-year and three-year discount options for BigQuery compute, working across Editions.
Auto-scaling, which allows reservations to dynamically adjust their slot capacity in response to demand fluctuations, operating within predefined parameters. This lets you accommodate peak workloads while preventing over-provisioning during periods of reduced activity.
Enjoy reliability and availability:
Dedicated reservations and commitments provide predictable performance for critical workloads by reducing resource contention.
Help ensure business continuity through managed disaster recovery, providing compute and data availability resilience.
Implementing BigQuery workload management is crucial for organizations seeking to maximize the efficiency, reliability, and cost-effectiveness of their cloud-based data analytics infrastructure.
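For example, reservations and assignments can be managed with SQL DDL in the admin project. The names and sizes below are illustrative; option names follow the BigQuery reservation DDL reference.

-- Create an Enterprise edition reservation with a 100-slot baseline
-- that can autoscale by up to 300 additional slots.
CREATE RESERVATION `admin-project.region-us.analytics`
OPTIONS (
  edition = 'ENTERPRISE',
  slot_capacity = 100,
  autoscale_max_slots = 300
);

-- Route query jobs from a production project to that reservation.
CREATE ASSIGNMENT `admin-project.region-us.analytics.prod-queries`
OPTIONS (
  assignee = 'projects/my-prod-project',
  job_type = 'QUERY'
);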
Updates to BigQuery workload management
BigQuery workload management is focused on providing efficiency and control. The newest features and updates provide better resource allocation and optimized performance. Key improvements include reservation fairness for optimal slot distribution, reservation predictability for consistent performance, runtime reservation specification for flexibility, reservation labels for enhanced visibility, and autoscaler improvements for rapid and granular scalability.
Reservation fairness
Previously, using the fair-sharing method, BigQuery distributed capacity equally across projects. With reservation fairness, BigQuery prioritizes and allocates idle slots equally across all reservations within the same admin project, regardless of the number of projects running jobs in each reservation. Each reservation receives a similar share of available capacity in the idle slot pool, and then its slots are distributed fairly within its projects. Note: allocation assumes presence of demand. Idle slots are not allocated to reservations if no queries are running. This feature is only applicable to BigQuery Enterprise or Enterprise Plus editions, as Standard edition does not support idle slots.
Figure 1: Project-based fairness
Configurations represent reservations with 0 baseline: The “Number” under the reservation is the total slots the projects in that reservation get through (Project) fair sharing. Note: Allocation assumes presence of demand. Idle slots are not allocated if no queries are running.
Figure 2: Reservation fairness enabled
Here, configurations represent reservations with 0 baseline: Under the reservation, you can see the total slots the projects in that reservation gets through (Reservation) fair-sharing. Note: Allocation assumes presence of demand. Idle slots are not allocated if no queries are running.
Reservation predictability
This feature allows you to set an absolute maximum number of consumed slots on a reservation, enhancing control over cost and performance fluctuations in your slot consumption. BigQuery offers baseline slots, idle slots, and autoscaling slots as potential capacity resources. When you create a reservation with a maximum size, confirm the number of baseline slots and the appropriate configuration of autoscaling and idle slots based on your past workloads. Note: To use predictable reservations, you must enable reservation fairness. Baselines are optional.
Reservation flexibility and securability
BigQuery lets you specify which reservation a query should run on at runtime. These flexibility and securability features provide greater control over resource allocation, including the ability to grant role-based access to reservations. You can specify a reservation at runtime using the CLI, UI, SQL, or API, overriding the default reservation assignment for your project, folder, or organization. The assigned reservation must be in the same region as the query you are running.
Reservation labels
When you add labels to your reservations, they are included in your billing data. This adds granular visibility into BigQuery slot consumption for specific workloads or teams, making tracking and optimization easier. You can then use these labels to filter your Cloud Billing data by the Analysis Slots Attribution SKU, giving you a powerful tool to track and analyze your spending on BigQuery slots based on the specific labels you have assigned.
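As a sketch of that billing analysis, the query below assumes the standard Cloud Billing BigQuery export schema; the billing table name, the label key ('team'), and the SKU filter are placeholders to adapt to your own export.

```python
# A sketch, assuming the standard Cloud Billing BigQuery export schema.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT
  (SELECT value FROM UNNEST(labels) WHERE key = 'team') AS team,          -- placeholder label key
  SUM(cost) AS slot_cost
FROM `my-billing-project.billing_export.gcp_billing_export_v1_XXXXXX`     -- placeholder table
WHERE sku.description LIKE '%Analysis Slots Attribution%'
  AND usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY team
ORDER BY slot_cost DESC
"""

for row in client.query(query).result():
    print(row.team, row.slot_cost)
```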
Autoscaler improvements
Last but not least, the BigQuery autoscaler now delivers enhanced performance and adaptability for resource management. You get near-instant scale-up, improved granularity (from 100-slot increments down to 50-slot increments), and faster scale-down. These improvements provide rapid capacity adjustments to meet workload demands, along with greater predictability and insight into usage. The 50-slot increment also applies when setting baseline and maximum reservation capacities.
BigQuery workload management is an essential tool for optimizing both your performance and costs. By using reservations, spend-based commitments, and new features such as reservation predictability and fairness, you can significantly improve your data analysis performance. This leads to better data-driven decision-making by optimizing resource allocation and cutting costs, allowing your team to gain more meaningful insights from their data and experience consistent performance.
Today, we are excited to announce that Gartner® has named Google as a Leader in the 2025 Magic Quadrant™ for Data Science and Machine Learning Platforms (DSML) report. We believe that this recognition is a reflection of continued innovations to address the needs of data science and machine learning teams, as well as new types of practitioners working alongside data scientists in the dynamic space of generative AI.
Download the complimentary 2025 Gartner Magic Quadrant™ for Data Science and Machine Learning Platforms.
AI is driving a radical transformation in how organizations operate, compete, and innovate. Working closely with customers, we’re delivering the innovations for a unified data and AI platform to meet the demands of the AI era, including data engineering and analysis, data science, MLOps, gen AI application and agent development tools, and a central layer of governance.
Unified AI platform with best-in-class multimodal AI
Google Cloud offers a wide spectrum of AI capabilities, starting from the foundational hardware like Tensor Processing Units (TPUs) to AI agents, and the tools for building them. These capabilities are powered by our pioneering AI research and development, and our expertise in taking AI to production with large-scale applications such as YouTube, Maps, Search, Ads, Workspace, Photos, and more.
All of this research and experience fuels Vertex AI, our unified AI platform for MLOps tooling and for predictive and gen AI use cases, which sits at the heart of Google’s DSML offering. Vertex AI provides a comprehensive suite of tools covering the entire AI lifecycle, including data engineering and analysis tools, data science workbenches, MLOps capabilities for deploying and managing models, and specialized features for developing gen AI applications and agents. Moreover, our Self-Deploy capability enables our partners to not only build and host their models within Vertex AI for internal users, but also distribute and commercialize those models. Customer use of Vertex AI has grown 20x in the last year, driven by Gemini, Imagen, and Veo models.
Vertex AI Model Garden offers a curated selection of over 200 enterprise-ready models from Google like Gemini, partners like Anthropic, and the open ecosystem. Model Garden helps customers access the highest performing foundation models suited for their business needs and easily customize them with their own data, deploy to applications with just one click, and scale with end-to-end MLOps built-in.
Building on Google DeepMind research, we recently announced Gemini 2.5, our most intelligent AI model yet. Gemini 2.5 models are now thinking models, capable of reasoning (and showing their reasoning) before responding, resulting in dramatically improved performance. Transparent, step-by-step reasoning is crucial for enterprise trust and compliance. We also launched Gemini 2.5 Flash, our cost-effective, low-latency workhorse model. Gemini 2.5 Flash will be generally available for all Vertex AI users in early June, with 2.5 Pro generally available soon after.
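For illustration, here is a minimal sketch of calling a Gemini 2.5 model on Vertex AI with the Vertex AI Python SDK; the project ID, region, and exact model ID string are assumptions to verify against the current Vertex AI model list.

```python
# A minimal sketch, assuming the Vertex AI Python SDK (google-cloud-aiplatform).
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project and region

model = GenerativeModel("gemini-2.5-flash")  # assumed model ID for Gemini 2.5 Flash
response = model.generate_content("Draft three headline options for our product launch email.")
print(response.text)
```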
Vertex AI is now the only platform with generative media models across all modalities — video, image, speech, and music. At Google I/O, we announced several innovations in this portfolio, including the availability of Veo 3, Lyria 2, and Imagen 4 on Vertex AI. Veo 3 combines video and audio generation, taking content generation to a new level. The state-of-the-art model features improved quality when generating videos from text and image prompts. In addition, Veo 3 also generates videos with speech (dialogue and voice-overs) and audio (music and sound effects). Lyria 2, Google’s latest music generation model, features high-fidelity music across a range of styles. And Imagen 4, Google’s highest-quality image generation model, delivers outstanding text rendering and prompt adherence, higher overall image quality across all styles, and multilingual prompt support to help creators globally. Imagen 4 also supports multiple model variants to help customers optimize around quality, speed and cost.
All of this innovation resides on Vertex AI, so that AI projects can reach production and deliver business value while teams collaborate to improve models throughout the development lifecycle.
For instance, customers like Radisson Hotel Group have redefined personalized marketing with Google Cloud. Partnering with Accenture, the global hotel chain leveraged BigQuery, Vertex AI, Google Ads, and Google’s multimodal Gemini models to build a generative AI agent to help create locally relevant ad content and translate it into more than 30 languages — reducing content creation time from weeks to hours. This AI-driven approach has increased team productivity by 50%, boosted return on ad spend by 35%, and driven a 22% increase in ad-driven revenue.
A new era of multi-agent management
Eventually, we believe that every enterprise will rely on multi-agent systems, including those built on different frameworks or providers. We recently announced multiple enhancements to Vertex AI so you can build agents with an open approach and deploy them with enterprise-grade controls. This includes an Agent Development Kit (ADK), available for Python and Java: an open-source framework for designing agents, the same one that powers Google Agentspace and Google Customer Engagement Suite agents. Many powerful examples and extensible sample agents are readily available in Agent Garden. You can also take advantage of Agent Engine, a fully managed runtime in Vertex AI that helps you deploy your custom agents to production with built-in testing, release, and reliability at global scale.
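As a sketch of what building with ADK looks like, the snippet below assumes the open-source google-adk Python package; the tool function, agent name, and model ID are illustrative placeholders.

```python
# A sketch, assuming the google-adk package; not a complete application.
from google.adk.agents import Agent

def lookup_order_status(order_id: str) -> dict:
    """Illustrative tool: replace with a call to your real order system."""
    return {"order_id": order_id, "status": "shipped"}

root_agent = Agent(
    name="support_agent",
    model="gemini-2.0-flash",  # assumed model ID
    instruction="Answer order questions, using the lookup tool when an order ID is given.",
    tools=[lookup_order_status],
)

# The agent can then be run locally with the ADK tooling, or deployed to
# Vertex AI Agent Engine as a managed runtime.
```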
Connecting all your data to AI
Enterprise agents need to be grounded in relevant data to be successful. Whether helping a customer learn more about a product catalog or helping an employee navigate company policies, agents are only as effective as the data they are connected to. At Google Cloud, we do this by making it easy to leverage any data source. Whether it’s structured data in a relational database or unstructured content like presentations and videos, Google Cloud tools let customers easily use their existing data architectures as retrieval-augmented generation (RAG) solutions. With this approach, developers get the benefits of Google’s decades of search experience from out-of-the-box offerings, or can build their own RAG system with best-in-class components.
For RAG on an enterprise corpus, Vertex AI Search is our out-of-the-box solution that delivers high quality at scale, with minimal development or maintenance overhead. Customers who prefer to fully customize their solution can use our suite of individual components including the Layout Parser to prepare unstructured data, Vertex embedding models to create multimodal embeddings, Vertex Vector Search to index and serve the embeddings at scale, and the Ranking API to optimize the results. And RAG Engine provides an easy way for developers to orchestrate these components, or mix and match with third-party and open-source tools. BigQuery customers can also use its built-in vector search capabilities for RAG, or leverage the new connector with Vertex Vector Search to get the best of both worlds, by combining the data in BigQuery with a purpose-built high performance vector search tool.
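For example, here is a sketch of BigQuery’s built-in VECTOR_SEARCH function used for RAG-style retrieval from Python; the dataset, table, and column names are placeholders, and it assumes you have already materialized document embeddings and a query embedding.

```python
# A sketch of RAG retrieval with BigQuery's built-in vector search; table and
# column names are placeholders for your own embeddings tables.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT base.chunk_text, distance
FROM VECTOR_SEARCH(
  TABLE `my-project.docs.chunk_embeddings`,                   -- ARRAY<FLOAT64> 'embedding' column
  'embedding',
  (SELECT embedding FROM `my-project.docs.query_embedding`),  -- pre-computed query embedding
  top_k => 5,
  distance_type => 'COSINE'
)
ORDER BY distance
"""

for row in client.query(query).result():
    print(round(row.distance, 4), row.chunk_text[:80])
```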
Unified data and AI governance
With built-in governance, customers can simplify how they discover, manage, monitor, govern, and use their data and AI assets. Dataplex Universal Catalog brings together a data catalog and a fully managed, serverless metastore, enabling interoperability across Vertex AI, BigQuery, and open-source engines and formats such as Apache Spark and Apache Iceberg through a common metadata layer. Customers can also use a business glossary to build a shared understanding of data and define company terms, creating a consistent foundation for AI.
At Google Cloud, we’re committed to helping organizations build and deploy AI and we are investing heavily in bringing new predictive and gen AI capabilities to Vertex AI. For more, download the full 2025 Gartner Magic Quadrant™ for Data Science and Machine Learning Platforms report.
Gartner Magic Quadrant for Data Science and Machine Learning Platforms – Afraz Jaffri, Maryam Hassanlou, Tong Zhang, Deepak Seth, Yogesh Bhatt, May 28, 2025
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates, and both are used herein with permission. All rights reserved.