GCP – Zero-downtime migrations to Memorystore for Redis Cluster
With the announcement of Memorystore for Redis Cluster at Google Next, many Redis administrators and developers have asked how they can migrate from their existing Redis cluster environments. We understand that these are business-critical applications that need a zero-downtime migration.
Adopting Memorystore helps remove repetitive tasks like scaling, patching, backing up and configuring observability. This frees up Redis developers and administrators to focus on activities that provide direct value to their users like releasing features and applications. It can also reduce costs.
Memorystore for Redis Cluster is a managed service offering that is fully OSS compatible and easy to set up. Memorystore for Redis Cluster serves the most demanding use cases like caching, leaderboards and stream processing. Memorystore for Redis Cluster provides automatic zonal distribution of nodes for high availability, automated replica management and promotion, and zero-downtime scale in and out with automatic key redistribution. You can migrate from a variety of standalone node or clustered Redis sources, including from self-managed Redis on Compute Engine, Google Kubernetes Engine, or from third-party platforms like Redis Enterprise or ElastiCache. You can learn more about Memorystore for Redis Cluster in our documentation.
In this blog, we will describe how to use RIOT, the “Redis Input/Output Tool”, for an online migration from an existing Redis cluster to a fully managed Memorystore for Redis Cluster. We will also provide some guidelines to enable a problem-free migration.
What is RIOT?
RIOT is an open-source tool developed by Julien Ruaux, Principal Field Engineer at Redis. RIOT migrates data between various sources and targets, including files, relational databases and Redis instances. We will focus on how it facilitates a hassle-free migration from one Redis cluster to another. Note that RIOT does not currently work with Redis 7.0 as a source. If you wish to migrate from Redis 7.0, please look at type-based replication.
Ensuring a smooth migration
We recommend the following additional efforts to ensure a smooth migration:
Planning – Write a detailed migration project plan including dependencies, time estimates and task owners.
Automation – All actions should be scripted.
Testing – Test the migration and incorporate lessons learned into the migration plan and automation. Iterate the tests several times.
This methodology will help eliminate downtime and human error. Further information about migration planning can be found in this blog.
Migration workflow overview
Before we get started, let’s review a high-level plan for the migration. The following steps are a logical overview of using RIOT for a no-downtime migration, though there may be some backfill as replication catches up at cutover.
1. Deploy a Memorystore for Redis Cluster instance sized similarly to your existing cluster.
2. Deploy a Compute Engine VM with Java Virtual Machine (JVM) and RIOT installed to manage the data movement. When you start RIOT, it will take a full snapshot of the current production Redis cluster instance and write the snapshot to your new Memorystore for Redis Cluster instance. This could take some time depending on the size of the cluster and the network connectivity.
3. RIOT propagates new changes from your existing Redis cluster to your new Memorystore for Redis Cluster instance while your application is live. Replication lag can range from milliseconds to seconds depending on the rate of change and network connectivity. On the GCP network, a typical migration whose source and target both have adequate resources sees replication latency measured in milliseconds.
4. When ready for cutover, stop traffic to the existing Redis cluster. Reconfigure the application to point to the new Memorystore for Redis Cluster instance.
Now that we’ve discussed the process at a high level, let’s get into the finer details.
Guide: Performing the migration
The following step-by-step instructions can be used as a guide for your near-zero-downtime migration.
Step 1: Create a VM to run RIOT
You can create the RIOT VM from the console or with a gcloud command. Edit the project, zone, network and service account as needed.
Note: The RIOT VM needs network connectivity to both the source Redis cluster and the target Memorystore instance on their Redis ports. Memorystore uses the default Redis port 6379.
Step 2: Install RIOT and the JVM on a GCP VM
Run the following command to install the Java Virtual Machine (JVM) on your VM. This works on Debian and may need to be adjusted for other distributions.
Run the following command to download RIOT. Check for the latest version before you download.
Extract RIOT:
Install the Redis CLI:
Let’s set up the environment. We need to edit host and port variables for the Memorystore target and the Redis source. You can get the Memorystore information from the console.
Both commands should return PONG.
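If you prefer to script this connectivity check, the following is a minimal sketch that sends PING over a plain TCP socket using the Redis wire protocol. It assumes an unauthenticated, non-TLS endpoint; the function names and the address in the usage note are hypothetical, and instances with AUTH or in-transit encryption enabled need a full Redis client instead.

```python
import socket


def encode_resp(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def is_pong(reply: bytes) -> bool:
    """A healthy node answers PING with the simple string +PONG."""
    return reply.startswith(b"+PONG")


def ping(host: str, port: int = 6379, timeout: float = 3.0) -> bool:
    """Open a TCP connection, send PING and report whether PONG came back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_resp("PING"))
        return is_pong(sock.recv(64))
```

Calling `ping("10.0.0.5")` (a placeholder address) from the RIOT VM against both the source and the target confirms that the firewall rules and routes are in place before you start the migration.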
Enable keyspace notifications on the source Redis instance
RIOT uses keyspace notifications to capture any updates to the database for replication.
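Keyspace notifications are controlled by the `notify-keyspace-events` setting, whose value is a string of flag characters. The sketch below checks a value returned by `CONFIG GET notify-keyspace-events` and builds a value for `CONFIG SET` that keeps any flags already in use. It assumes that the flags `K` (keyspace channel) and `A` (all event classes) are what live replication needs; verify the exact value against the RIOT documentation for your version. Both function names are hypothetical.

```python
def keyspace_events_enabled(flags: str) -> bool:
    """True if the flag string has K (keyspace channel) and A (all classes).

    Assumption: "KA" is sufficient for live replication.
    """
    return "K" in flags and "A" in flags


def merged_flags(current: str, required: str = "KA") -> str:
    """Append any missing required flags without dropping existing ones."""
    missing = "".join(flag for flag in required if flag not in current)
    return current + missing
```

For example, a source that already publishes expired-key events with `Ex` would be set to `merged_flags("Ex")`, which preserves the existing behavior for other consumers of those events.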
Step 3: Use RIOT to begin the migration
Start RIOT
RIOT will provide the status of the initial sync (Scanning) and the changes being streamed in real time (Listening)
Step 4: Validation
There are many ways to validate the success of your migration, such as dumping each database and comparing the contents, or checking the total number of keys. For this walkthrough, we will validate by comparing key counts between source and target to confirm that replication has caught up. Note: On instances with a high rate of change, the counts shift while you measure them, so an exact match can be hard to obtain.
First get all of the Memorystore for Redis Cluster ports:
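One way to collect the node endpoints is to run `CLUSTER NODES` against any node and parse the reply. The following is a minimal sketch that extracts the host and port of every primary from that text; the function name and the sample addresses are illustrative only.

```python
def primary_endpoints(cluster_nodes_output: str) -> list[tuple[str, int]]:
    """Return (host, port) for each healthy primary in CLUSTER NODES output."""
    endpoints = []
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        flags = fields[2].split(",")
        if "master" not in flags or "fail" in flags:
            continue  # replicas and failed nodes are not counted
        address = fields[1].split("@")[0]  # drop the cluster bus port
        host, _, port = address.rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints


sample = (
    "07c3 10.0.0.3:6379@16379 slave e7d1 0 1426238317239 4 connected\n"
    "e7d1 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-5460\n"
    "67ed 10.0.0.2:6379@16379 master - 0 1426238316232 2 connected 5461-10922\n"
)
print(primary_endpoints(sample))
```

Only primaries are returned because replicas hold copies of the same keys and would double the count.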
Loop through each Memorystore for Redis Cluster slot to obtain the key count per slot:
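The comparison itself can be scripted. The sketch below assumes you have already captured the text of `INFO keyspace` from each primary node on both sides; it sums the `keys=` values and reports the difference. The function names and sample values are hypothetical.

```python
import re


def keys_in_info(info_keyspace: str) -> int:
    """Sum the keys=N values in one node's INFO keyspace output."""
    counts = re.findall(r"^db\d+:keys=(\d+)", info_keyspace, flags=re.MULTILINE)
    return sum(int(count) for count in counts)


def compare_key_counts(source_infos, target_infos):
    """Return (source_total, target_total, difference) across all primaries."""
    source_total = sum(keys_in_info(info) for info in source_infos)
    target_total = sum(keys_in_info(info) for info in target_infos)
    return source_total, target_total, source_total - target_total


source = ["# Keyspace\r\ndb0:keys=120,expires=3,avg_ttl=0\r\n",
          "# Keyspace\r\ndb0:keys=180,expires=0,avg_ttl=0\r\n"]
target = ["# Keyspace\r\ndb0:keys=295,expires=3,avg_ttl=0\r\n"]
print(compare_key_counts(source, target))
```

Totals are compared rather than per-node counts because the source and target may have a different number of shards. Matching totals indicate that replication has caught up, but they do not prove that every value is identical.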
Step 5: Cut over production traffic and decommission the old instance
You have two options for production cutover:
1. If your application does not require strong consistency between the source and destination, simply modify your Redis client to point to the new Memorystore for Redis Cluster instance and go live. This will result in zero downtime and your databases will be eventually consistent.
2. For use cases where strong consistency is required, stop write traffic to your Redis database. Wait for RIOT to complete the replication of the remaining changes to the new Memorystore for Redis Cluster instance. Update your Redis client configuration to point to the new Memorystore for Redis Cluster instance and go live. This will result in a few seconds to a few minutes of downtime based on write frequency and replication lag, but will provide strong consistency.
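For the strong-consistency option, the wait can be automated. The following is a minimal sketch, not part of RIOT, that polls two caller-supplied functions returning the total key count on each side and reports success once the counts have matched for several consecutive polls. All names and defaults are assumptions for illustration.

```python
import time
from typing import Callable


def wait_until_caught_up(
    source_count: Callable[[], int],
    target_count: Callable[[], int],
    stable_polls: int = 3,
    max_polls: int = 300,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once counts match for stable_polls consecutive polls."""
    streak = 0
    for _ in range(max_polls):
        if source_count() == target_count():
            streak += 1
            if streak >= stable_polls:
                return True
        else:
            streak = 0  # counts diverged again, start over
        sleep(interval_s)
    return False  # gave up; replication did not settle in time
```

Requiring several matching polls in a row guards against a momentary match while changes are still in flight. Run it only after write traffic to the source has stopped, and treat a `False` result as a signal to investigate before switching clients.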
You are now live on Memorystore for Redis Cluster. You can now review the Monitoring tab in the console to see usage metrics of your production workload.
Final considerations
The launch of Memorystore for Redis Cluster will allow you to take your applications to the highest scale while providing microsecond latency. Memorystore for Redis Cluster removes the burden of managing Redis, so that you can focus on shipping new features and applications that provide value to your users.
With this migration guide, you have a framework for an easy zero-downtime migration, with some backfill if there is replication lag at cutover. GCP is here to support your adoption of Memorystore. To learn about the latest releases for Memorystore for Redis Cluster, we suggest following our Release Notes.