Cloud

2025 07 08

AWS – AWS Site-to-Site VPN now supports IPv6 addresses on outer tunnel IPs

AWS Site-to-Site VPN now supports IPv6 addresses on outer tunnel IPs, making it easier for customers to build or transition to IPv6-only networks. Customers with mandates to deploy IPv6 networks can now build IPv6-only VPN connections to meet their regulatory and compliance needs.

AWS Site-to-Site VPN is a fully managed service that allows you to create a secure connection between your data center or branch office and your AWS resources using IP Security (IPSec) tunnels. Until now, customers could use IPv6 addresses on the inner tunnels of their VPN connections, but the outer tunnels still required public IPv4 addresses. With this launch, customers can configure IPv6 addresses on both the inner and outer tunnels of their VPN connection, eliminating the complexity of managing a mixed IPv4/IPv6 addressing scheme. This feature also helps customers reduce their public IPv4 costs, as there is no charge for using IPv6 addresses on the outer tunnel IPs.

This capability is available in all AWS commercial Regions and AWS GovCloud (US) Regions where AWS Site-to-Site VPN is available, except the Europe (Milan) Region. To learn more and get started, visit the AWS Site-to-Site VPN documentation.

Read More for the details.

2025 07 08

AWS – AWS Network Firewall: Native AWS Transit Gateway support in all regions

Tibor Kiss AWS, Cloud AWS

AWS Network Firewall now supports native integration with AWS Transit Gateway for centralized traffic inspection in all AWS Regions where both services are available. This integration enables customers to directly attach a network firewall to a transit gateway and easily route traffic between these services for consistent traffic inspection. The new feature eliminates the need to manage dedicated VPC subnets and route tables when connecting these services.

You can use this capability to protect traffic across your entire AWS network including VPCs and on-premises networks connected via AWS Site-to-Site VPN or AWS Direct Connect. The integration improves network security and resiliency through automatic multi-AZ redundancy, ensuring continuous service availability across regions.

Native integration is available in all AWS Regions where both AWS Network Firewall and AWS Transit Gateway are supported.

To learn more, visit the AWS Network Firewall service documentation.


2025 07 08

AWS – Amazon SageMaker AI is now available in AWS Asia Pacific (Taipei) Region

Tibor Kiss Cloud

Starting today, you can build, train, and deploy machine learning (ML) models in Asia Pacific (Taipei).

Amazon SageMaker AI is a fully managed platform that provides every developer and data scientist with the ability to build, train, and deploy machine learning (ML) models quickly. SageMaker AI removes the heavy lifting from each step of the machine learning process to make it easier to develop high quality models.

To learn more and get started, see the SageMaker AI documentation and pricing page.


2025 07 08

AWS – Amazon Bedrock introduces API keys for streamlined development

Tibor Kiss AWS, Cloud AWS

Today, we’re announcing the launch of API keys for Amazon Bedrock to simplify the getting started experience and accelerate generative AI development. API keys for Amazon Bedrock enable developers to quickly generate access credentials directly within the Amazon Bedrock console or AWS SDK without needing to manually configure IAM principals and policies.

With the introduction of API keys for Amazon Bedrock, developers can generate short-term and long-term API keys directly from the Amazon Bedrock console or API to authenticate API calls to Amazon Bedrock models. Short-term API keys are valid for the duration of your console session, or up to 12 hours, whichever is shorter. Long-term API keys give you the flexibility to define key validity duration and manage the keys from the AWS IAM console.
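The short-term key rule above (console session or 12 hours, whichever is shorter) can be sketched as a small calculation. This is only an illustration: the function names `short_term_key_expiry` and `is_key_valid` are hypothetical helpers, not part of any AWS SDK, and the timestamps are made up.

```python
from datetime import datetime, timedelta, timezone

# Upper bound on short-term key lifetime stated in the announcement.
MAX_SHORT_TERM_LIFETIME = timedelta(hours=12)


def short_term_key_expiry(issued_at: datetime, session_ends_at: datetime) -> datetime:
    """Return when a short-term key stops being valid.

    The key lasts until the console session ends or 12 hours after
    issuance, whichever comes first.
    """
    return min(session_ends_at, issued_at + MAX_SHORT_TERM_LIFETIME)


def is_key_valid(now: datetime, issued_at: datetime, session_ends_at: datetime) -> bool:
    """Hypothetical helper: True while the key has not yet expired."""
    return now < short_term_key_expiry(issued_at, session_ends_at)


issued = datetime(2025, 7, 8, 9, 0, tzinfo=timezone.utc)

# A session that would outlive the 12-hour cap: the cap wins.
long_session_end = issued + timedelta(hours=20)
print(short_term_key_expiry(issued, long_session_end))

# A session that ends after 3 hours: the session end wins.
short_session_end = issued + timedelta(hours=3)
print(short_term_key_expiry(issued, short_session_end))
```

Long-term keys are not modeled here, since their validity period is chosen by the user.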

Amazon Bedrock API key authentication is available in 20 AWS Regions where Amazon Bedrock is available: Asia Pacific (Hyderabad, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Spain, Stockholm, Zurich), South America (São Paulo), US East (N. Virginia, Ohio), and US West (Oregon).

To learn more about API keys in Amazon Bedrock, visit the API Keys documentation in the Amazon Bedrock user guide, or check out our blog for code snippets and implementation examples.


2025 07 08

AWS – Amazon SNS now supports sending SMS in the Mexico (Central) Region

Tibor Kiss AWS, Cloud AWS

Customers that use Amazon Simple Notification Service (Amazon SNS) in the Mexico (Central) Region can now send text messages (SMS) to subscribers in more than 200 countries and territories.

With this launch, customers using SNS in the Mexico (Central) Region can send SMS messages via AWS End User Messaging. Amazon SNS is a fully managed pub/sub messaging service that enables message delivery to multiple endpoints including AWS Lambda, Amazon SQS, Amazon Data Firehose, mobile devices, and email.

Amazon SNS now supports the ability to send SMS in 30 AWS Regions.
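A minimal sketch of sending one SMS through SNS follows. It assumes a client object that behaves like `boto3.client("sns")`, whose `publish` call accepts `PhoneNumber` and `Message`. The `send_sms` helper, the `FakeSnsClient` stand-in, and the phone number are hypothetical and exist only so the example runs without AWS credentials.

```python
import re

# E.164 format: "+" followed by up to 15 digits, first digit non-zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def send_sms(sns_client, phone_number: str, message: str) -> dict:
    """Publish one SMS through an SNS client passed in by the caller.

    `sns_client` is expected to behave like boto3.client("sns"); its
    publish() call accepts PhoneNumber and Message keyword arguments.
    """
    if not E164_PATTERN.match(phone_number):
        raise ValueError(f"not an E.164 phone number: {phone_number!r}")
    return sns_client.publish(PhoneNumber=phone_number, Message=message)


class FakeSnsClient:
    """Stand-in used here so the sketch runs without AWS credentials."""

    def __init__(self):
        self.sent = []

    def publish(self, **kwargs):
        # Record the call instead of contacting AWS.
        self.sent.append(kwargs)
        return {"MessageId": "fake-message-id"}


client = FakeSnsClient()
response = send_sms(client, "+525512345678", "Your order has shipped.")
print(response["MessageId"])
```

With boto3 installed, the fake client would be replaced by a real one created for the Mexico (Central) Region. Check the Region code against the AWS Regions list before using it.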

More information:

To learn more about sending SMS messages with SNS, visit Mobile text messaging (SMS).
For the list of supported countries and regions, visit SMS Supported Regions and Countries.


2025 07 08

AWS – Amazon Neptune Analytics now integrates with Mem0 for graph-native memory in GenAI applications

Tibor Kiss AWS, Cloud AWS

Today, we’re announcing the integration of Amazon Neptune Analytics with Mem0, an open-source, agentic memory system purpose-built for generative AI (GenAI) applications. With this launch, customers can use Neptune as the graph store for memory retrieval and reasoning, enabling long-term memory for AI agents across richly connected graphs—powering more personalized and context-aware AI experiences.

This integration allows Mem0 users to store and query memory graphs at scale, unlocking advanced use cases where LLMs need to learn from each interaction, becoming more personalized and effective over time. It enables graph-native long-term memory for LLMs by using Neptune as an external memory store, improving response quality through multi-hop graph reasoning, and supporting hybrid retrieval across graph, vector, and keyword modalities.

Mem0 is a self-improving memory layer that powers personalized, cost-efficient GenAI experiences. To learn more about the Neptune–Mem0 integration, visit the User Guide.


2025 07 08

AWS – Oracle Database@AWS announces general availability, expands networking capabilities

Tibor Kiss AWS, Cloud AWS

Today, as we announce general availability of Oracle Database@AWS, we are launching a set of new AWS Networking capabilities that allow customers to easily build enterprise applications and services using Oracle Database@AWS. With this launch, customers can seamlessly connect their Oracle Database@AWS (ODB) networks with VPCs and on-premises networks and can establish secure access to AWS services natively from their ODB networks.

First, customers want to easily connect their Oracle Exadata database workloads in the ODB network to applications in VPCs and on-premises networks. Integrations with AWS Transit Gateway (to simplify connectivity at scale), AWS Cloud WAN (to connect global networks), and Amazon VPC Lattice (for secure service-to-service connectivity) simplify hybrid network connectivity with minimal networking changes. Second, customers want the ability to publish their database backups to their S3 buckets. With native connectivity between the ODB network and AWS services, powered by Amazon VPC Lattice, customers can set up private and secure access from the ODB network to Amazon S3. Last, customers want to maintain robust security and resource isolation by launching their Oracle Exadata database and applications in different AWS accounts while peering them for low-latency connectivity. Customers can now use the AWS RAM integration for cross-account peering support. These integrations enable customers to migrate and run production workloads on Oracle Exadata Database Service and Oracle Autonomous Database on Dedicated Exadata Infrastructure within AWS.

Oracle Database@AWS networking capabilities are available in the US East (N. Virginia) and US West (Oregon) AWS Regions. To get started, customers can use the AWS Management Console to provision and manage their Oracle Database@AWS resources. To learn more about the networking capabilities, see the blog and documentation. For pricing, see the product page for each integration.


2025 07 08

AWS – Amazon RDS Custom now supports Cumulative Update 19 for Microsoft SQL Server 2022

Tibor Kiss AWS, Cloud AWS

Amazon Relational Database Service (Amazon RDS) Custom for SQL Server now supports Cumulative Update (CU) 19 for Microsoft SQL Server 2022. This update is available for SQL Server Developer, Web, Standard, and Enterprise editions, and includes performance improvements, security fixes, and bug fixes. For more details about the improvements in this update, please review Microsoft KB5054531 release notes.

We recommend upgrading to this CU as it includes security fixes. You can upgrade with just a few clicks in the Amazon RDS Management Console or by using the AWS SDK or CLI. Learn more about upgrading your database instances from the Amazon RDS Custom User Guide.

This CU is available in all AWS Regions where Amazon RDS Custom for SQL Server is available.

RDS Custom is a managed database service that allows customization of the underlying operating system and database environment. RDS Custom for SQL Server supports two licensing models: License Included (LI) and Bring Your Own Media (BYOM). By using Bring Your Own Media (BYOM), customers can use their existing SQL Server licenses with Amazon RDS Custom for SQL Server. See Amazon RDS Custom Pricing for pricing details and regional availability.


2025 07 08

AWS – Oracle Database@AWS is now generally available

Tibor Kiss AWS, Cloud AWS

Oracle Database@AWS is now generally available in the US East (N. Virginia) and US West (Oregon) Regions. It is a joint offering from AWS and Oracle that gives you access to Oracle Exadata Database Service and Oracle Autonomous Database on Dedicated Exadata Infrastructure within AWS data centers. You can benefit from a unified experience between AWS and Oracle with collaborative support, purchasing, management, and operations. Use of Oracle Database@AWS qualifies for AWS commitments as well as Oracle Support Rewards.

With Oracle Database@AWS, you can easily and quickly migrate your Oracle Exadata Database workloads, including Oracle Real Application Clusters (RAC), with minimal to no modifications to databases and associated applications. You can establish low-latency connections between AWS applications and Oracle Database@AWS. Zero-ETL integration with Amazon Redshift enables near real-time analytics and machine learning (ML) on transactional data stored in Oracle Database@AWS. You can store both Oracle-managed backups and backups you manually take to Amazon Simple Storage Service (Amazon S3), which is designed to provide 11 nines of durability. Oracle Database@AWS also integrates with AWS services such as AWS IAM for authentication and authorization, Amazon CloudWatch for monitoring, AWS CloudFormation for infrastructure as code, AWS CloudTrail for governance and compliance, Amazon EventBridge for event management, and Amazon VPC Lattice for simplified connectivity to AWS services.

Oracle Database@AWS will expand to 20 more regions across the Americas, Europe, and Asia Pacific including: US East (Ohio), US West (N. California), Asia Pacific (Hyderabad), Asia Pacific (Melbourne), Asia Pacific (Mumbai), Asia Pacific (Osaka), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Milan), Europe (Paris), Europe (Spain), Europe (Stockholm), Europe (Zurich), and South America (São Paulo).

To learn more, visit Oracle Database@AWS and its documentation. You can request a private offer from Oracle through AWS Marketplace and use the AWS Management Console to provision and manage your resources.


2025 07 08

AWS – Amazon VPC Lattice announces support for Oracle Database@AWS

Tibor Kiss AWS, Cloud AWS

With VPC Lattice support for Oracle Database@AWS (ODB), you can now connect your applications in VPCs and on-premises networks to your ODB network. You can also leverage VPC Lattice to privately and securely access Amazon S3 and Amazon Redshift from your Oracle Exadata workloads.

With this launch, your ODB databases can easily connect to AWS services, HTTP APIs, and TCP applications across thousands of VPCs and on-premises networks, without the need to set up complex networking. VPC Lattice simplifies network management and provides centralized visibility. You can also use ODB managed integrations (powered by VPC Lattice) to privately and securely access Amazon S3 and Amazon Redshift. With a few clicks, you can enable OCI managed backup of your ODB databases to Amazon S3, or configure your own Amazon S3 backup. Additionally, the Zero-ETL integration connects ODB databases to Amazon Redshift to analyze data across multiple databases.

VPC Lattice support is available in all AWS Regions where Oracle Database@AWS is generally available.

To get started, use the AWS Management Console to provision Oracle Database@AWS resources. You can use the AWS CLI, SDK, or AWS Management Console to configure VPC Lattice resources. To learn more, read the launch blog and the Amazon VPC Lattice and Oracle Database@AWS documentation.


2025 07 08

AWS – Amazon EC2 C8g, M8g and R8g instances now available in Asia Pacific (Singapore)

Tibor Kiss AWS, Cloud AWS

Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C8g, M8g, and R8g instances are available in AWS Asia Pacific (Singapore) region. These instances are powered by AWS Graviton4 processors and deliver up to 30% better performance compared to AWS Graviton3-based instances. Amazon EC2 C8g instances are built for compute-intensive workloads, such as high performance computing (HPC), batch processing, gaming, video encoding, scientific modeling, distributed analytics, CPU-based machine learning (ML) inference, and ad serving. Amazon EC2 M8g instances are built for general-purpose workloads, such as application servers, microservices, gaming servers, midsize data stores, and caching fleets. Amazon EC2 R8g instances are ideal for memory-intensive workloads such as databases, in-memory caches, and real-time big data analytics. These instances are built on the AWS Nitro System.

AWS Graviton4-based Amazon EC2 instances deliver the best performance and energy efficiency for a broad range of workloads running on Amazon EC2. These instances offer larger instance sizes with up to 3x more vCPUs and memory compared to Graviton3-based instances. AWS Graviton4 processors are up to 40% faster for databases, 30% faster for web applications, and 45% faster for large Java applications than AWS Graviton3 processors.

To learn more, see Amazon EC2 C8g Instances, Amazon EC2 M8g Instances, and Amazon EC2 R8g Instances. To explore how to migrate your workloads to Graviton-based instances, see the AWS Graviton Fast Start program and Porting Advisor for Graviton. To get started, see the AWS Management Console.


2025 07 08

AWS – Amazon CloudWatch and Application Signals MCP servers for AI-assisted troubleshooting

Tibor Kiss AWS, Cloud AWS

Today, AWS announces two new Model Context Protocol (MCP) servers in the AWS Labs MCP open-source repository: CloudWatch MCP server and Application Signals MCP server. These servers enable AI agents to leverage comprehensive observability capabilities for automated troubleshooting and monitoring. The MCP servers allow AI assistants to analyze metrics, alarms, logs, traces, and service health data across your AWS environment to quickly identify and diagnose issues through simple conversational interfaces.

The MCP servers provide curated sets of tools designed specifically for operational troubleshooting scenarios. The CloudWatch MCP server supports alarm-based incident response, metric analysis, and log pattern detection, while the Application Signals MCP server enables service health monitoring through Service Level Objectives (SLOs) and automated root cause analysis using OpenTelemetry data. By leveraging the MCP standard, AI agents can perform complex troubleshooting workflows through natural language interactions, from analyzing alarm patterns and metric anomalies to investigating service health issues and querying logs and traces. Rather than requiring developers to manually navigate multiple AWS consoles and APIs, these MCP servers enable AI agents to orchestrate these interactions intelligently while reducing the development time typically required for API integrations.

The CloudWatch MCP server can be used with CloudWatch in all AWS regions, and the Application Signals MCP server can be used in all regions where Application Signals is available.

To download and try out these open-source MCP servers locally with your AI-enabled IDE of choice, visit the AWS Labs MCP open-source repository.


2025 07 08

GCP – Accelerate your AI workloads with the Google Cloud Managed Lustre

Tibor Kiss Cloud, Google Cloud gcp

Today, we’re making it even easier to achieve breakthrough performance for your AI/ML workloads: Google Cloud Managed Lustre is now GA and available in four performance tiers that deliver 125, 250, 500, or 1000 MB/s of throughput per TiB of capacity, with the ability to scale up to 8 PB of storage capacity. The Managed Lustre solution is powered by DDN’s EXAScaler, combining DDN’s decades of leadership in high-performance storage with Google Cloud’s expertise in cloud infrastructure.

Managed Lustre provides a POSIX-compliant, parallel file system that delivers consistently high throughput and low latency, essential for:

High-throughput inference: For applications that require near-real-time inference on large datasets, Lustre provides high parallel throughput and sub-millisecond read latency.
Large-scale model training: Accelerate the training cycles of deep learning models by providing rapid access to petabyte-sized datasets. Lustre’s parallel architecture ensures GPUs and TPUs are fed with data, minimizing idle time.
Checkpointing and restarting large models: Save and restore the state of large models during training faster, improving goodput and allowing for more efficient experimentation.
Data preprocessing and feature engineering: Process raw data, extract features, and prepare datasets for training, reducing the time spent on data pipelines.
Scientific simulations and research: Beyond AI/ML, Lustre excels in traditional HPC scenarios like computational fluid dynamics, genomic sequencing, and climate modeling, where massive datasets and high-concurrency access are critical.

Lustre is designed for the highly parallel and random I/O that characterizes many AI/ML training and inference tasks. This parallel processing capability across multiple clients ensures your compute resources are never starved for data.

Performance tiers and pricing

Managed Lustre offers flexible pricing and performance tiers designed to meet the diverse needs of your workloads, whether you’re focused on capacity or highest throughput density.

Throughput (MB/s per TiB of storage capacity)	Storage price (USD per GiB per month)
125	$0.145
250	$0.21
500	$0.34
1000	$0.60

Please see more details at the Managed Lustre pricing page.
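The tier table translates into aggregate throughput and storage cost with simple arithmetic. The sketch below assumes that throughput scales linearly with provisioned capacity, that 1 TiB equals 1024 GiB, and that only the listed storage price applies. The `estimate` function and the 100 TiB capacity are hypothetical and chosen for illustration.

```python
# Price per GiB per month for each tier, keyed by MB/s per TiB (from the table above).
PRICE_PER_GIB_MONTH = {125: 0.145, 250: 0.21, 500: 0.34, 1000: 0.60}

GIB_PER_TIB = 1024


def estimate(capacity_tib: float, tier_mbps_per_tib: int) -> tuple[float, float]:
    """Return (aggregate throughput in MB/s, storage cost in USD per month)."""
    price = PRICE_PER_GIB_MONTH[tier_mbps_per_tib]
    throughput_mbps = capacity_tib * tier_mbps_per_tib
    monthly_cost = capacity_tib * GIB_PER_TIB * price
    return throughput_mbps, monthly_cost


# Compare all four tiers for the same hypothetical 100 TiB instance.
for tier in PRICE_PER_GIB_MONTH:
    throughput, cost = estimate(100, tier)
    print(f"{tier:>4} MB/s per TiB: {throughput:>9,.0f} MB/s, ${cost:,.2f}/month")
```

Under these assumptions, a 100 TiB instance on the 500 MB/s tier works out to 50,000 MB/s of aggregate throughput and roughly $34,816 per month in storage charges. The pricing page remains the authoritative source.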

Irrespective of the aggregate throughput, all tiers offer sub-millisecond read latency and high single-stream throughput, and are well suited to parallel access to many small files.

Driving innovation together: partnering with DDN

Google Cloud’s Managed Lustre is powered by DDN’s EXAScaler, bringing together two industry leaders in high-performance computing and elastic cloud infrastructure. This partnership represents a joint commitment to simplifying the deployment and management of large-scale AI and HPC workloads in the cloud, thanks to:

Trusted leaders: By combining DDN’s decades of expertise in high-performance Lustre with Google Cloud’s global infrastructure and AI ecosystem, we are delivering a foundational capability that removes storage bottlenecks and helps our customers solve their most complex challenges in AI and HPC.
Fully managed and supported solution: Enjoy the benefits of a fully managed service from Google, with comprehensive support from both Google and DDN, for seamless operations and peace of mind.
Global availability and ecosystem integration: Managed Lustre is now globally accessible in multiple Google Cloud regions and integrates with the broader Google Cloud ecosystem, including Google Kubernetes Engine (GKE) and TPUs.

These benefits caught the attention of one of our largest partners, NVIDIA, who is looking forward to having it as part of its NVIDIA AI platform.

“Enterprises today demand AI infrastructure that combines accelerated computing with high-performance storage solutions to deliver uncompromising speed, seamless scalability and cost efficiency at scale. Google and DDN’s collaboration on Google Cloud Managed Lustre creates a better-together solution uniquely suited to meet these needs. By integrating DDN’s enterprise-grade data platforms and Google’s global cloud capabilities, organizations can readily access vast amounts of data and unlock the full potential of AI with the NVIDIA AI platform (or NVIDIA accelerated computing platform) on Google Cloud — reducing time-to-insight, maximizing GPU utilization, and lowering total cost of ownership.” – Dave Salvator, Director of Accelerated Computing Products, NVIDIA

Get started today!

Ready to supercharge your AI/ML and HPC workloads? Getting started with Managed Lustre is simple:

Navigate to Managed Lustre in the Google Cloud console.
Provision your Managed Lustre instance, choosing the performance tier and size that best fits your needs.
Connect your compute instances or GKE clusters to your new high-performance file system.

For detailed instructions and documentation, please visit the Managed Lustre documentation. And if needed, reach out to Google Cloud sales specialists.

Watch the Fireside Chat

Don’t miss the opportunity to learn more about the strategic partnership between Google Cloud and DDN, and the unique capabilities of Managed Lustre. Read the official DDN press release here.

Watch the fireside chat with Sameet Agarwal, VP/GM Storage and Sven Oehme, CTO of DDN, here.


2025 07 08

AWS – AWS Parallel Computing Service (PCS) is now available in the AWS Europe (London) Region

Tibor Kiss AWS, Cloud AWS

Today, AWS launches AWS Parallel Computing Service (PCS) in the AWS Europe (London) region, enabling you to easily build and manage High Performance Computing (HPC) clusters using the Slurm workload manager.

AWS PCS is a managed service that makes it easier for you to run and scale your high performance computing (HPC) workloads and build scientific and engineering models on AWS using Slurm. You can use AWS PCS to build complete, elastic environments that integrate compute, storage, networking, and visualization tools. AWS PCS simplifies cluster operations with managed updates and built-in observability features, helping to remove the burden of maintenance. You can work in a familiar environment, focusing on your research and innovation instead of worrying about infrastructure.

To get started, visit the AWS PCS page and the AWS PCS documentation.


2025 07 08

GCP – Expanding Z3 family with 9 new VMs and a bare metal instance for storage and I/O intensive workloads

Tibor Kiss Cloud, Google Cloud gcp

Today, we are thrilled to announce the expansion of the Z3 Storage Optimized VM family with the general availability of nine new Z3 virtual machines that offer Local SSD capacity ranging from 3 TiB to 18 TiB per VM, complementing existing Z3 VMs, which offer 36 TiB of Local SSD per VM. We are also very pleased to launch a Z3 bare metal instance, which includes up to 72 TiB of Local SSD. Z3 VMs enable customers like Shopify, Tenderly, and ScyllaDB to achieve impressive performance improvements for their high-performance storage workloads by reducing I/O access latency by up to 35% compared to VM instances using previous-generation Local SSDs.

Z3 VMs are designed to run I/O-intensive workloads that require large local storage capacity and high storage performance, including SQL, NoSQL, and vector databases, data analytics, semantic data search and retrieval, and distributed file systems. The Z3 bare metal instance provides direct access to the physical server CPUs and is ideal for workloads that require low-level system access like private and hybrid cloud platforms, custom hypervisors, container platforms, or applications with specialized performance or licensing needs.

Both Z3 VMs and the bare metal instance are based on Titanium SSDs, which offload local storage processing from CPU resources to deliver real-time data processing, low-latency, high-throughput storage performance and enhanced storage security. Z3 VMs with Titanium SSD offer up to 36 GiB/s of read throughput and up to 9M IOPS, increasing write storage performance by up to 25% compared to previous generation Local SSDs¹.


Based on the 4th Gen Intel Xeon Scalable processor, Z3 VMs come with up to 176 vCPUs, 1,408 GiB of memory, and 36 TiB of local storage in 11 virtual machine shapes. The Z3 bare metal instance offers 192 vCPUs, 1,536 GiB of memory, and 72 TiB of local storage. Z3 VMs and the bare metal instance deliver the connectivity and storage performance that enterprise workloads need, with up to 100 Gbps in standard bandwidth and up to 200 Gbps with Tier1 networking for high-traffic applications.

The expanded Z3 virtual machine portfolio lets you rightsize your infrastructure and scale your clusters to meet workload requirements by providing larger total local SSD capacity and higher local SSD capacity per vCPU. Z3 offers two VM types. The standardlssd type includes five VM shapes that offer about 200 GiB of local SSD per vCPU and is optimized for data analytics (OLAP) and SQL databases such as MySQL and Postgres. The highlssd type includes six VM shapes and the Z3 bare metal instance; these offer about 400 GiB of local SSD per vCPU and are optimized for distributed databases, data streaming, large parallel file systems, and data search.
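The per-vCPU ratios quoted above can be checked against the headline figures in this post. The sketch assumes 1 TiB equals 1024 GiB, and it assumes the 176-vCPU and 36 TiB maximums belong to the same VM shape, which the post does not state explicitly. The helper name `local_ssd_gib_per_vcpu` is hypothetical.

```python
GIB_PER_TIB = 1024


def local_ssd_gib_per_vcpu(local_ssd_tib: float, vcpus: int) -> float:
    """Local SSD capacity per vCPU, in GiB."""
    return local_ssd_tib * GIB_PER_TIB / vcpus


# Figures quoted in this post: the largest Z3 VM and the bare metal instance.
largest_vm = local_ssd_gib_per_vcpu(36, 176)
bare_metal = local_ssd_gib_per_vcpu(72, 192)
print(f"176 vCPUs, 36 TiB: {largest_vm:.0f} GiB per vCPU")
print(f"192 vCPUs, 72 TiB: {bare_metal:.0f} GiB per vCPU")
```

The first ratio lands near the "about 200 GiB" class and the second near the "about 400 GiB" class, matching the bare metal instance's place among the highlssd shapes.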

What our customers and partners are saying

“We are thrilled to announce Nutanix Cloud Clusters coming to Google Cloud at the end of CY25 as part of Nutanix’s commitment to delivering flexible, hybrid cloud solutions. Google Cloud’s Z3 instance types represent a perfect foundation for Nutanix to enable performance and resilience for enterprise applications. We’re excited about our partnership with Google Cloud in empowering our joint customers with greater choice and simplicity in their cloud journey.” – Saveen Pakala, Vice President of Product Management, Nutanix

“OP Labs contributes to the Optimism protocol, which enables orders of magnitude of improved performance and scalability for Ethereum. Z3 reduces p99 block insertion tail latencies by 30-50% for our most I/O-demanding blockchain nodes compared to N2. By migrating our solution to Z3, we will be able to scale our blockchain nodes to handle L2 state growth in a more performant and cost-effective way.” – Zach Howard, Senior Staff Engineer, OP Labs

“The launch of Google Cloud’s Z3 storage optimized instances with smaller VM shapes represents a leap forward in performance for high-traffic NoSQL environments. In internal tests and customer projects, ScyllaDB has impressively leveraged the advantages of Z3 including extremely low latencies under high read and write loads, high IOPS capacity enabling the processing of massive amounts of data and excellent cost-performance ratio for large-scale production systems. We are very excited to offer Z3 family servers in ScyllaDB Cloud, including Bring Your Own Account (BYOA).” – Avi Kivity, Co-founder and CTO, ScyllaDB

“Shopify has found Z3s to be an excellent platform to build our most performance sensitive storage systems on. We experienced a critical need for both large data volumes while remaining sensitive to latency and throughput on the storage side. While Google has a lot of options, local SSD was really the best fit, and Z3s allowed us to achieve the best price/performance along with enhanced stability appropriate for a source of truth Storage workload. Right now we see these storage optimized VMs as our platform of choice for the future.” – Mattie Toia, VP Infrastructure, Shopify

“Tenderly is built to be your go-to for Web3 production and development, bringing all the necessary infrastructure into one place. This allows teams to operate with speed and confidence, making blockchain technology accessible. We’ve seen impressive results running blockchain workloads on Z3 instances, with a 40% improvement on read latency compared to N2 and N2D instances.” – Ilija Petrovic, SRE Lead, Tenderly

“The VAST AI Operating System gives organizations a unified platform to reason over all of their data – structured, unstructured, and streaming through a global namespace that spans cloud and on-prem environments – enabling intelligent agents and applications to operate with full context and real-time speed. For customers running on Google Cloud, Z3 VMs complement this vision by providing the ideal storage infrastructure to accelerate these workloads, ensuring AI pipelines run fast and scale effortlessly in the cloud.” – Renen Hallak, Founder & CEO, VAST Data

Z3 VMs are also the physical foundation of AlloyDB, our flagship PostgreSQL-compatible database service, delivering sophisticated multi-level caching. AlloyDB uses Z3’s expansive local SSDs as an ultra-fast cache, holding datasets up to 25x larger than can be stored in memory. Database queries can access these large, cached datasets at latencies that closely approach in-memory performance, particularly when factoring in overall end-to-end application response times. This is a significant advantage for very large databases, including real-time analytical workloads, as AlloyDB’s high-performance columnar engine operates entirely within this massive cache. AlloyDB on Z3 VMs will soon be available in preview, delivering up to 3x better performance than N-series VMs for transactional workloads, particularly for large datasets.

Enhanced maintenance experience

Z3 instances make it easier for you to plan ahead and schedule maintenance operations at a time of your choosing by providing notice from the system several days in advance of a required maintenance. The new Z3 VMs further enhance the maintenance experience by allowing you to live-migrate an instance during maintenance events for VMs with 18 TiB or less of local SSD storage. For Z3 VMs with 36 TiB of local SSD and for Z3 bare metal instances, you’ll also receive in-place upgrades that preserve your data through the planned maintenance events.

Support for Hyperdisk

Z3 VMs support Hyperdisk, Google Cloud’s workload-optimized block storage that lets you optimize the performance for each workload by independently tuning the storage performance and capacity for each instance.

Z3 VMs are compatible with Hyperdisk Balanced, Hyperdisk Throughput, and Hyperdisk Extreme storage for scalable, high-performance network-attached storage, supporting up to 512 TiB of capacity per instance. For general-purpose workloads, Hyperdisk Balanced, with up to 160K IOPS per instance, offers a mix of performance and cost-efficiency. Hyperdisk Extreme delivers ultra-low latency and supports up to 350K IOPS and 5,000 MiB/s throughput per Z3 VM instance and up to 500K IOPS and 10,000 MiB/s throughput for the Z3 bare metal instance, making it well-suited for demanding workloads like databases. Using Hyperdisk for persistent storage and Z3 Local SSD for caching creates an optimal storage architecture for high-end databases and mission-critical workloads.

Get started with Z3 today

Z3 VMs and bare metal instances are available today in most regions worldwide. To start using Z3 instances, select Z3 under the new Storage-Optimized machine family when creating a new VM or GKE node pool in the Google Cloud console. Learn more at the Z3 machine series page. Contact your Google Cloud sales representative for more information on regional availability.

1. Results are based on Google Cloud’s internal benchmarking.


2025 07 08

GCP – Announcing Vertex AI Agent Engine Memory Bank available for everyone in preview

Tibor Kiss Cloud, Google Cloud gcp

Developers are racing to productize agents, but a common limitation is the absence of memory. Without memory, agents treat each interaction as the first, asking repetitive questions and failing to recall user preferences. This lack of contextual awareness makes it difficult for an agent to personalize its assistance, and it leaves developers frustrated.

How we normally mitigate memory problems: So far, a common approach to this problem has been to leverage the LLM’s context window. However, directly inserting entire session dialogues into an LLM’s context window is both expensive and computationally inefficient, leading to higher inference costs and slower response times. Also, as the amount of information fed into an LLM grows, especially with irrelevant or misleading details, the quality of the model’s output significantly declines, leading to issues like “lost in the middle” and “context rot”.

How we can solve it now: Today, we’re excited to announce the public preview of Memory Bank, the newest managed service of the Vertex AI Agent Engine, to help you build highly personalized conversational agents to facilitate more natural, contextual, and continuous engagements. Memory Bank helps us address memory problems in four ways:

Personalize interactions: Go beyond generic scripts. Remember user preferences, key events, and past choices to tailor every response.
Maintain continuity: Pick up conversations seamlessly where they left off across multiple sessions, even if days or weeks have passed.
Provide better context: Arm your agent with the necessary background on a user, leading to more relevant, insightful, and helpful responses.
Improve user experience: Eliminate the frustration of repeating information and create more natural, efficient, and engaging conversations.


Where you can access it: Memory Bank is integrated with the Agent Development Kit (ADK) and Agent Engine Sessions. You can define an agent using ADK and enable Agent Engine Sessions to store and manage conversation history within individual sessions. Now, you can enable Memory Bank to give agents long-term memory for storing, retrieving, and managing relevant information across multiple sessions. You can also use Memory Bank to manage your memories with other agent frameworks, including LangGraph and CrewAI.

Here’s how Memory Bank works

It understands and extracts memories from interactions: Using Gemini models, Memory Bank can analyze a user’s conversation history with the agent (stored in Agent Engine Sessions) to extract key facts, preferences, and context to generate new memories. This happens asynchronously in the background, without you needing to build complex extraction pipelines.
It stores and updates memories intelligently: Key information, such as “My preferred temperature is 71 degrees” or “I prefer aisle seats on flights,” is stored persistently and organized by your defined scope, such as user ID. When new information arises, Memory Bank (using Gemini) can consolidate it with existing memories, resolving contradictions and keeping the memories up to date.
It recalls relevant information: When a user starts a new conversation (session), the agent can retrieve these stored memories. This can be a simple retrieval of all facts or a more advanced similarity search (using embeddings) to find the memories most relevant to the current topic, ensuring the agent is always equipped with the right context.
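The extract, store, and recall steps above can be sketched as a toy in-memory store. This is not the Memory Bank API: `ToyMemoryStore`, `embed`, and `cosine` are hypothetical names, the bag-of-words vector stands in for real embeddings, "newest fact per topic wins" stands in for Gemini-based consolidation, and the 68-degree value is invented only to show an update.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy bag-of-words vector: lowercase token counts (stand-in for a real embedding model)."""
    tokens = [t.strip(".,!?\"'").lower() for t in text.split()]
    return Counter(t for t in tokens if t)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyMemoryStore:
    """Hypothetical store: memories are organized by scope (here, a user ID) and topic."""

    def __init__(self):
        self._memories = defaultdict(dict)  # scope -> {topic: fact}

    def upsert(self, scope, topic, fact):
        # Simplified consolidation: a newer fact on the same topic replaces the older one.
        self._memories[scope][topic] = fact

    def all_facts(self, scope):
        # Simple retrieval: every stored fact for this scope.
        return list(self._memories[scope].values())

    def search(self, scope, query, top_k=1):
        # Similarity retrieval: rank this scope's facts against the query text.
        q = embed(query)
        facts = self._memories[scope].values()
        ranked = sorted(facts, key=lambda f: cosine(q, embed(f)), reverse=True)
        return ranked[:top_k]


store = ToyMemoryStore()
# The 68-degree fact is an invented earlier preference that the next line supersedes.
store.upsert("user-123", "temperature", "My preferred temperature is 68 degrees")
store.upsert("user-123", "temperature", "My preferred temperature is 71 degrees")
store.upsert("user-123", "seating", "I prefer aisle seats on flights")

print(store.all_facts("user-123"))
print(store.search("user-123", "Which seats should I book on my flights?"))
```

Scoping by user ID keeps one user's memories out of another user's retrieval, and the search call returns only the fact closest to the current topic rather than the full list.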

1 - Memory Bank system — A diagram illustrating how an AI agent uses conversation history from Agent Engine Sessions to generate and retrieve persistent memories about the user from Memory Bank.

This entire process is grounded in Google Research’s novel research method (accepted by ACL 2025), which enables an intelligent, topic-based approach to how agents learn and recall information, setting a new standard for agent memory performance.

Let’s take an example. Imagine you’re a retailer in the beauty industry. You have a personal beauty companion equipped with memory that recommends products and skincare routines.

As shown in the illustration, the agent remembers the user’s skin type (maintaining context) even after it evolves over time, and it can make personalized recommendations. This is the power of an agent with long-term memory.

Get started today with Memory Bank

You can integrate Memory Bank into your agent in two primary ways:

Develop an agent with Google Agent Development Kit (ADK) for an out-of-the-box experience
Develop an agent that orchestrates API calls to Memory Bank if you are building your agent with any other framework.

To get started, please refer to the official user guide and the developer blog. For hands-on examples, the Google Cloud Generative AI repository on GitHub offers a variety of sample notebooks, including integration with ADK and deployment to the Agent Engine runtime. For those wishing to try Memory Bank with third-party frameworks, we also provide notebook samples for LangGraph and CrewAI.

If you’re a developer using Agent Development Kit (ADK) but have never used Google Cloud before, you can still start by using our new express mode registration for Agent Engine Sessions and Memory Bank. Here’s how it works:

Sign up with your Gmail account to receive an API key
Use the key to access Agent Engine Sessions and Memory Bank
Build and test your agent within the free tier usage quotas
Seamlessly upgrade to a full Google Cloud project when you are ready for production

If you want to know more about Memory Bank, join the Vertex AI Google Cloud community to share your experiences, ask questions, and collaborate on new projects.


2025 07 08

GCP – Google is a Leader in the 2025 Gartner® Magic Quadrant™ for Search and Product Discovery

Tibor Kiss Cloud, Google Cloud gcp

We’re thrilled to announce that Google has been named a Leader in the Gartner® Magic Quadrant™ for Search and Product Discovery.

We believe this recognition affirms Google’s evolving commitment to delivering AI solutions tuned to unique industry challenges, empowering businesses to transform the digital commerce experience and deliver high ROI-generating product discovery, through the use of AI in relevance, ranking, and personalization.

1 - MQ Graphic — Download the complimentary 2025 Gartner Magic Quadrant for Search and Product Discovery

Built upon Google’s deep expertise & knowledge of the way users search, interact with & purchase products across a broad landscape of commerce domains, Vertex AI Search for commerce is a fully-managed, AI-first solution tailored for e-commerce.

Redefining product discovery with generative AI

As digital channels continue to increase in traffic and adoption, businesses in e-commerce need to navigate the challenges involved in providing great digital experiences. Customers expect relevance, convenience, and personalization.

Vertex AI Search for commerce uses the best of Google AI to drive personalized product discovery at scale, optimized for revenue per visitor. This enables businesses not only to understand user intent, but also to proactively surface the right products and content to shoppers through a variety of multimodal capabilities that leverage the latest advancements in generative AI.

Search and browse: Powering digital commerce sites and applications with Google-quality capabilities highly tuned for e-commerce and revenue maximization.
Conversational search: Enabling real-time, back and forth conversation, powered by Gemini models to guide users through their shopping journeys, while optimizing for conversion.
Recommendations: Deliver highly personalized recommendations at scale.
Semantic image search: Users can search by image (“find this blouse”) or by a combination of text & image (“find the shoes that would look great with this blouse”).

In particular, our customers have noted our leadership in conversational search through one of our latest offerings, Conversational Commerce, which helps retailers provide assisted shopping on any digital channel, to engage with customers in a more natural and human-like conversation. This includes helping customers find their desired products online or helping a store associate answer questions using data from multiple sources to increase buying confidence for customers worldwide.


Intuitive search at the speed of intent, tuned for ecommerce

Search is changing as consumers are leveraging longer queries, seeking more comprehensive answers to questions.

Over the past decade, we’ve made significant strides in understanding user intent and refining our proprietary algorithms, now enhanced with gen AI. This has become increasingly important as user search behavior and the shopping journey have evolved, transforming the retail landscape. Unlike traditional models that are not AI-native, Vertex AI Search for commerce nearly eliminates the need for manual overriding to re-rank results and compensate for suboptimal search quality. Our solution is developed to help businesses, from digitally native e-commerce companies to traditional retailers, adapt to this new era of buying and selling by leveraging AI to maximize revenue.

Access to cutting edge AI/ML models: Vertex AI Search for commerce redefines e-commerce search using powerful AI / ML models like Gemini for unparalleled accuracy and relevance.
Leverage Google’s proprietary intelligence: Vertex AI Search for commerce utilizes Google Shopping’s vast query & click datasets, along with our most advanced knowledge graphs, to train our industry-leading AI Relevance models.
Advanced intent detection: Interprets complex queries and nuances in user intent beyond simple keyword matching, focusing on semantic meaning for intuitive results.
AI-based catalog optimization: Retail catalog search results are enhanced by combining Google Shopping’s effective web crawlers with Google’s extensive understanding of web content and our unique AI models.
Deep and scalable personalization: Analyzes individual shopper behavior, preferences, and historical data to deliver tailored product recommendations and search rankings, boosting satisfaction, conversions, and loyalty.
Managed deployment: Offers a fully managed solution for seamless integration and launch, minimizing engineering overhead.
Flexible customization and control: Businesses can customize the search experience with specific business rules and optimization objectives, aligning with unique KPI goals and other unique metrics.

The future of e-commerce is personalized and simplified

We believe being positioned as a Leader in the Gartner® Magic Quadrant™ for Search and Product Discovery underscores Google’s proven ability to deliver real business value. Vertex AI Search for commerce provides a comprehensive, AI-first solution that guides your customers through the buying journey, ensuring they find exactly what they need, every time.

To download the full 2025 Gartner Magic Quadrant for Search and Product Discovery report, click here, and for more information on Vertex AI Search for commerce, see here or register for our upcoming product roadmap webinar.

Gartner, Magic Quadrant for Search and Product Discovery – Mike Lowndes, Noam Dorros, Sandy Shen, Aditya Vasudevan, June 24, 2025

Disclaimer: Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google.

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.


2025 07 08

GCP – Google Public Sector supports AI-optimized HPC infrastructure for researchers at Caltech

Tibor Kiss Cloud, Google Cloud gcp

For decades, institutions like Caltech have been at the forefront of large-scale artificial intelligence (AI) research. As high-performance computing (HPC) clusters continue to evolve, researchers across disciplines have been increasingly equipped to process massive datasets, run complex simulations, and train generative models with billions of parameters. With this powerful combination, researchers have accelerated scientific discovery across diverse use cases including genomic analysis, drug discovery, weather forecasting, and beyond.

Modern research workloads, driven by AI and HPC, demand processing of structured and unstructured data at an unprecedented scale, while maintaining sub-millisecond storage latency, enterprise-level security and compliance, and reproducibility despite varying hardware and software configurations. This presents significant technical challenges for both researchers and supporting departments.

To accelerate scientific discovery in the AI era, Google Public Sector has announced it will support AI-optimized HPC infrastructure for researchers at Caltech. This new initiative furthers Caltech’s mission to expand human knowledge and benefit society through research integrated with education.

This support provides Caltech researchers four key resources:

Diverse workhorse processor types, including Cloud GPUs, Google’s custom-designed Arm-based processors (Axion), and Cloud Tensor Processing Units (TPUs), for AI acceleration and intense workloads.
Access to Google’s first party datasets like AlphaFold, Earth Engine, Google Maps Platform and VirusTotal to accelerate discoveries across all disciplines.
A fully-managed, unified AI development platform, Vertex AI Platform, which includes Vertex AI Agent Builder and 200+ first-party (Gemini, Imagen 3), third-party, and open (Gemma) foundation models in Model Garden.
Dedicated campus training and workshops for students, researchers, and supporting teams, enabling them to increase AI literacy and adoption.

Google Public Sector and Caltech will integrate this AI-optimized infrastructure with Caltech’s existing HPC research environments to provide researchers instant access while maintaining their existing data and workloads.

One of the first initiatives that will be powered by this new AI infrastructure will be led by Dr. Babak Hassibi, Mose and Lillian S. Bohn Professor of Electrical Engineering and Computing and Mathematical Sciences at Caltech. Dr. Hassibi’s research focuses on making AI models more efficient, which is crucial for the advancement of the field. Current large language models (LLMs) can have billions or trillions of parameters. In an effort to make models even more efficient and useful, Dr. Hassibi’s proposal uses Vertex AI and has the potential to make AI more accessible and sustainable.

“We will be using Vertex AI to develop training methods on TPUs that incorporate pruning, quantization, and distillation, as well as considerations regarding resilience to attacks, into the training process itself. The former has the potential to significantly reduce inference time costs of the trained models, making AI much more accessible and sustainable. The latter can markedly improve the safety of systems that employ AI. Both will allow AI to move to the edge. In addition to the practical benefits, the work will inform the theoretical studies of AI models, in particular, their generalization performance and the limits of their compressibility, which is a major focus of my research group,” said Dr. Babak Hassibi.

By providing access to advanced AI and planet-scale infrastructure, this support enables Caltech researchers to continue to push scientific boundaries, investigate complex problems, and develop innovative solutions.

“We are living in a time when we need to answer bigger questions faster, and do more with less. Google Public Sector is excited to support Caltech to build an AI-optimized infrastructure that will lead scientific discoveries across all domains for the best of all our constituents,” said Reymund Dumlao, Director of State & Local Government and Education at Google Public Sector.

At Google Public Sector, we’re passionate about applying the latest cloud, AI and security innovations to help you meet your mission. Subscribe to our Google Public Sector Newsletter to stay informed and stay ahead with the latest updates, announcements, events and more.

For more information about Caltech’s research heritage and centers, visit https://www.caltech.edu/research


2025 07 08

GCP – Formula E accelerates its work with Google Cloud Storage and Google Workspace

Tibor Kiss Cloud, Google Cloud gcp

In the high-speed world of global motorsport, operational efficiency and technological innovation are as critical off the track as they are on it. And when it comes to innovating in the field, Formula E, with its focus on the future of mobility and sustainability, regularly takes the checkered flag.

Formula E orchestrates thrilling races with super-electric-power cars in cities worldwide. A central part of this experience is the management and distribution of immense volumes of media content. Over the course of a decade of high-octane racing, Formula E has accumulated a vast media archive, comprising billions of files and petabytes of data.

This valuable collection of content, previously housed in AWS S3 storage, necessitated a more robust, high-performance, and globally accessible solution to fully harness its potential for remote production and content delivery. These needs became especially acute as AI offers more opportunities to leverage archives and historic data.

At the same time, the organization faced internal operational challenges, including common issues like disconnected systems, which limited collaboration, alongside escalating costs from multiple software subscriptions. Formula E sought a unified solution to foster greater integration and prepare for the future of work.

This led to the widespread adoption of Google Cloud’s ecosystem, including Google Workspace, to address both its large-scale data storage needs and internal collaboration and productivity workflows.

A decade of data and disconnected operations

Formula E’s media archive represented a significant challenge due to its sheer scale. With 10 years of racing history captured, the archive contained billions of files and petabytes of data, making any migration a formidable task.

The demands of its global operations, which include taking production to challenging race locations across the globe, require seamless connectivity and high throughput. The organization had previously encountered difficulties connecting to its S3 storage from remote geographical locations. These experiences prompted successful experiments with Google Cloud, underscoring the critical need for a cloud solution that could deliver consistent performance, even in areas with less reliable internet infrastructure.

Internally, Formula E’s use of disparate systems, including Office 365 and multiple communication tools, resulted in disjointed workflows and limited collaborative capabilities. The reliance on multiple software subscriptions for communication and collaboration was also driving up operational costs. The team needed a more integrated and cost-effective environment that could streamline processes and foster a more collaborative culture, setting the stage for future advancements in workplace technology.

A pitstop in the cloud drives better performance

To address these multifaceted challenges, Formula E embarked on a strategic, two-pronged migration. This involved moving its massive media archive to Google Cloud Storage and transitioning its entire staff to Google Workspace.

For the monumental task of migrating its petabyte-scale archive, Formula E collaborated closely with Google Cloud and Mimir, its media management system provider. Following meticulous planning with Mimir and technical support from Google Cloud, the team chose the Google Cloud Storage Transfer Service (STS).

STS is a fully managed service engineered for rapid data transfer, which crucially required no effort from Formula E’s team during the migration itself. A pivotal element that ensured a smooth transition was Cloud Storage’s comprehensive support for S3 APIs. This compatibility enabled Formula E to switch cloud providers without any interruption to its business, guaranteeing continuity for its critical media operations.

To revolutionize its internal operations, Formula E partnered with Cobry, a Google Workspace migration specialist, for a two-phased migration process. The initial phase focused on migrating Google Drive content, which was soon followed by the full migration to Workspace. Cobry not only helped with the technical migration but also provided support for change management and training.

To ensure a smooth transition and encourage strong user adoption, Google champions from across the business received in-person training at Google’s offices. The project successfully migrated 220 staff members along with 9 million emails and calendar invites.

The need for speed, collaboration, and savings

The migration of the media archive to Google Cloud proved as successful as a clever hairpin maneuver at the track.

The Storage Transfer Service performed beyond expectations, moving data out of S3 at an impressive rate of 240 gigabits per second — which translates to approximately 100 terabytes per hour. The entire sprawling archive was transferred in less than a day, a level of throughput that impressed even Google Cloud’s internal teams and confirmed that STS could deliver results at major scale. Such efficiency meant that Formula E experienced no business interruption and maintained continuous access to its valuable media assets.
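The conversion behind that figure can be checked with a few lines of arithmetic. This is a minimal sketch assuming decimal units (1 TB = 1,000 GB) and a sustained rate; the 2 PB archive size is a hypothetical value chosen for illustration, since the post only says "petabytes".

```python
GIGABITS_PER_SECOND = 240  # transfer rate quoted in the paragraph above


def terabytes_per_hour(gbps):
    """Convert gigabits per second to decimal terabytes per hour."""
    gigabytes_per_second = gbps / 8               # 8 bits per byte
    gigabytes_per_hour = gigabytes_per_second * 3600
    return gigabytes_per_hour / 1000              # 1 TB = 1000 GB


def hours_to_transfer(petabytes, gbps):
    """Hours needed to move an archive of the given size at a sustained rate."""
    return petabytes * 1000 / terabytes_per_hour(gbps)


rate = terabytes_per_hour(GIGABITS_PER_SECOND)
print(f"{rate:.0f} TB/hour")
# Hypothetical archive size, for illustration only.
print(f"{hours_to_transfer(2, GIGABITS_PER_SECOND):.1f} hours for 2 PB")
```

At 240 gigabits per second the rate works out to 108 TB per hour, which matches the "approximately 100 terabytes per hour" figure and is consistent with a multi-petabyte archive moving in under a day.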

Beyond the rapid migration, Formula E now benefits significantly from Google Cloud’s global network. By leveraging this infrastructure, the organization enjoys lower latency and higher throughput, which are critical for its remote production studios when operating worldwide.

A core technology behind this enhanced performance is Google Cloud’s “Cold Potato Routing.” This strategy keeps data on Google’s private, internal backbone network for as long as possible, using the public internet for only the shortest final leg of the journey. This approach guarantees improved throughput and latency, effectively resolving the connectivity challenges Formula E previously faced in remote race locations.

Google Cloud’s commitment to an open cloud ecosystem, demonstrated by its full support for S3 APIs, was instrumental in facilitating a smooth transition without vendor lock-in.

The transition to Google Workspace, meanwhile, has transformed Formula E’s internal operations, fostering a more integrated and collaborative work environment. Developed with collaboration at its core, Google Workspace has seamlessly integrated Gemini into the team’s workflow. Initially adopted by office-based teams for automating repetitive tasks, formatting, and summarizing data, Gemini is now accessible across the entire business, empowering staff to work more intelligently.

“Moving from Office 365 to Google Workspace has been a great step for our team in productivity,” said Hayley Mann, chief people officer at Formula E. “The enhanced collaboration features have been really beneficial, and we’re excited about how Google’s integrated AI capabilities will empower and enable our people to work smarter and even more efficiently every day.”

The migration is also delivering financial benefits: Formula E was able to replace its usage of Slack and Zoom with Chat and Meet in Google Workspace. Further savings are anticipated as other existing software contracts expire.

Racing toward an open ecosystem

Looking to the future, Formula E is also positioned to realize significant cost savings by using Cloud Storage’s Autoclass feature, which intelligently manages storage classes based on access patterns to optimize costs.

With its petabyte-scale media archive now residing on Google Cloud, Formula E is well positioned to continue innovating in media management, ensuring its content is high-performance, cost-effective, and globally accessible for years to come. This includes leveraging new AI capabilities emerging across the Google ecosystem.

And by embracing integrated AI through Gemini and fostering a truly collaborative environment with Google Workspace, Formula E is accelerating its journey towards peak operational efficiency — mirroring the innovation it champions on the racetrack.


2025 07 07

GCP – This migration from Snowflake to BigQuery accelerated model building and cut costs in half

Tibor Kiss Cloud, Google Cloud gcp

In 2024, retail sales for consumer packaged goods were worth $7.5 trillion globally. Their sheer variety — from cosmetics to clothing, frozen vegetables to vitamins — is hard to fathom. And distribution channels have multiplied: Think big box stores in the brick-and-mortar world and mega ecommerce sites online. Most importantly, jury-rigged digital tools can no longer keep pace with the ever-growing web of regulations designed to protect consumers and the environment.

SmarterX uses AI to untangle that web. Our Smarter1 AI model aggregates and triangulates publicly available datapoints (hundreds of millions of UPCs and SKUs, as well as product composition and safety information) from across the internet. By matching specific products to applicable regulatory information and continuously updating our models for a particular industry or client, SmarterX helps retailers make fully compliant decisions about selling, shipping, and disposing of regulated consumer packaged goods.

And just like our clients, we needed to accelerate and expand our capabilities to keep pace with that data deluge and build better AI models faster. Migrating to Google Cloud and BigQuery gave us the power, speed, and agility to do so.

Embracing BigQuery: a flexible, easy-to-use, AI-enabled data platform

Because we deal with data from so many sources, we needed a cloud-based enterprise data platform to handle multiple formats and schemas natively. That’s exactly what BigQuery gives us. Since data is the foundation of our company and products, we began by migrating all our data — including the data housed in Snowflake — to BigQuery.

With other data platforms, the data has to be massaged before you can work with it: a time-consuming, often manual process. BigQuery is built to quickly ingest and aggregate data in many different formats, and its query engine allows us to work with data right away in whatever format it lands. Coupled with Looker, we can create easy-to-understand visualizations of the complex data in BigQuery without ever leaving Google Cloud.

In addition, because Gemini Code Assist is integrated with BigQuery, even our less-technical team members can do computational and analytical work.


An integrated tech stack unleashes productivity and creativity

After 10 years in business, SmarterX was also suffering from system sprawl.

Just as migrating data between platforms is inefficient, developers become less efficient when they have to bounce around different tools. Even with the increase in AI agents to help with coding and development, the tools struggle, too: When hopping among multiple systems, they pick up noise along the way. And governing identity and access management (IAM) individually for all those systems was time-consuming and left us vulnerable to potential security risks caused by misapplied access privileges.

Google Cloud provides a fully integrated tech stack that consolidates our databases, data ingestion and processing pipelines, models, compute power — even our email, documents, and office apps — in a single, unified ecosystem. And its LLMs are integrated throughout that ecosystem, from the Chrome browser to the SQL variants themselves. This obviates building custom pipelines for most new data sets and allows us to work more efficiently and coherently:

We’re now releasing new products 10 times faster than we were prior to migrating to Google Cloud.
We onboard new customers in less than a week instead of six months.
Our data pipelines handle 100 times the data they did previously, so we can maintain hundreds of customer deployments with a relatively small staff.

Consolidating on Google Cloud also lowered our overhead by 50% because we deprecated several other SaaS platforms and teams can easily engage with Google’s tools without specialized expertise. Our entire team now lives in Google Cloud: Not an hour goes by that we aren’t using some form of the platform’s technology.

Eliminating system sprawl also means we no longer need to maintain security protocols for separate platforms. Permissioning and identity and access management are handled centrally, and Google Cloud makes it easy to stay current on compliance requirements like SOC 2.

A vision for AI in tune with our own: Gemini

The value SmarterX provides our customers relies heavily on our platform’s AI-driven capabilities. Finding the right AI model development platform and AI models was therefore one of the driving forces behind our choice of a new data platform. And when it comes to creating AI models, philosophy matters.

Google’s philosophy dovetails with our own because they’ve always been at the forefront of understanding how people want to access information. Since the company’s expertise makes web data searchable on an enterprise scale, its Gemini models are tuned beautifully to do what SmarterX needs them to. Before switching to Vertex AI and Gemini, it took us months to release a new model; we can now do the same work in a matter of weeks.

When SmarterX hires new team members, we look for creative thinkers, not speakers of a specific coding language. And we want to give our developers the brainspace to focus on complex problem-solving rather than puzzling over syntax for SQL coding. Gemini Code Assist in BigQuery is easy to learn and can accurately handle the syntax for them. That leaves our developers more time for finding innovative solutions.

A smooth migration by a team that knows its stuff

We couldn’t have completed our migration without the support of the Google Technical Onboarding Center. They really know their way around their technology and had spelunking tools at the ready for tricky scenarios we encountered along the way.

In less than a month, we migrated terabytes of data from Snowflake to BigQuery: more than 80 databases and thousands of tables from 21 different sources. We used a two-pronged approach that leveraged external tables for rapid access to data and native tables for optimized query performance.
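As a rough illustration of the two-pronged approach, the sketch below generates BigQuery DDL for one table: first an external table that reads exported files in place, then a native table materialized from it. The dataset name, table name, Cloud Storage path, Parquet format, and `_ext` suffix are all assumptions made for the example; the source does not say how SmarterX staged its exports or named its tables.

```python
from typing import List


def external_table_ddl(dataset: str, table: str, uris: List[str],
                       fmt: str = "PARQUET") -> str:
    """Build DDL for an external table that queries exported files in place.

    The "_ext" suffix and the file format are illustrative assumptions.
    """
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{dataset}.{table}_ext`\n"
        f"OPTIONS (format = '{fmt}', uris = [{uri_list}]);"
    )


def native_table_ddl(dataset: str, table: str) -> str:
    """Build DDL that copies the external table into native storage."""
    return (
        f"CREATE TABLE `{dataset}.{table}` AS\n"
        f"SELECT * FROM `{dataset}.{table}_ext`;"
    )


def migration_plan(dataset: str, table: str, uris: List[str]) -> List[str]:
    """Return the two statements in the order they would be run."""
    return [external_table_ddl(dataset, table, uris),
            native_table_ddl(dataset, table)]


if __name__ == "__main__":
    # Hypothetical dataset, table, and bucket path.
    for stmt in migration_plan("retail", "products",
                               ["gs://example-bucket/products/*.parquet"]):
        print(stmt, end="\n\n")
```

In a setup like this, the external table makes data queryable as soon as the files land, while the native copy can be built afterward for workloads where query performance matters most.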

Prior to the migration, Google provided foundational training for managing and operating the Google Cloud Platform. They also took the time to understand SmarterX technology. So instead of being constrained by a cookie-cutter migration plan, the Google team helped us design and schedule a migration, with minimal disruptions or downtime, in the way that made the most sense for SmarterX and our customers. Google’s expertise in best practices for security and identity and access management further enhanced the security of our new cloud environment.

Even though we’re not a huge customer pumping petabytes of data through Google Cloud daily, the team treated us as if we were on par with larger organizations. When you’re literally moving the foundation of your entire business, it feels good to know that Google has your back.

Snowflake felt like a traditional enterprise data warehouse grafted onto the cloud, completely uninfluenced by the AI revolution, with a database that forced us to work in a specific, predetermined way. With BigQuery, we have a real information production system: a computing cloud with a built-in SQL-friendly data platform, a wide-ranging toolset, embedded AI and model development, and a single user interface for developing products our own way.

Unlimited imagination, not roadmaps

Many people are surprised when I tell them that SmarterX doesn’t have roadmaps — we make bets. We’re wagering that companies want AI to solve whatever real-world use cases arise. Rather than telling you what to do, AI has the ability to understand things that were previously impossible to understand, and to help people express ideas that were previously impossible to express.

SmarterX works with over 2,000 brands. Ultimately, what they’re purchasing is the speed at which we can help them solve their business challenges with artificial intelligence. In much the same way, Google Cloud is solving our own technology challenges, sometimes before we even know we have them, so we can deliver top-notch products to our customers.

Instead of doing battle with a growing sprawl of outdated technology, BigQuery and the rest of the Google Cloud integrated toolset are allowing us to relentlessly reinvent ourselves. Not a week goes by when I don’t hear someone say, “Oh, wow, we can do that with Google Cloud too?”

Company description: SmarterX helps retailers, manufacturers, and logistics companies minimize regulatory risk, maximize sales, and protect consumers and the environment by giving them AI-driven tools to safely and compliantly sell, ship, store, and dispose of their products. Its clients include global brands that are household names all across the world.
