GCP – Lighter lift-and-shifts with the new Database Migration Service
“Database migrations are super fun!” – No one ever.
There can be considerable friction in moving databases from platform to platform. If you’re doing a lift-and-shift to Google Cloud, your ability to take advantage of cloud features slows down when you have to handle all the intricacies of:
- Devising a migration strategy for safely and efficiently moving data (while managing downtime)
- Assessing the impact of the migration
- Database preparation
- Secure connectivity setup
- Data replication
- And migration validation
Beyond that, there might be manual work to rewrite database code for a new engine or rewrite applications, and deep validation of aspects like performance impact.
It goes without saying that migration to the cloud is a complex process with many moving parts and personas involved, including a network administrator to account for infrastructure/security requirements like VPN. Most DBAs know that one of the largest risks of migrating a database is downtime, which often prevents companies from taking on the heavy task. Typically you shutdown the application, create a backup of the current database schema, perform all required update operations using migration tools, restart the application, and hope that everything works fine.
But that changes if you can’t accept any downtime. PostgreSQL users, for example, often have very large databases, meaning they are facing hours of downtime, which for most companies isn’t realistic.
Migration tools as a fast track
A number of tools are available to help you move data from one type of database to another or to move data to another destination like a data warehouse or data lake. Moving critical datasets — entire databases — requires the latest-generation cloud-based migration tools that can handle data replication with ease, while providing enhanced security. While we’ve seen cloud-based migration tools, like Alooma, Matillion, and Snaplogic, we also know cloud migration tools need to integrate well with both the source and the target systems, enabling you to migrate databases with minimal effort, downtime, and overhead.
In 2019 Alooma joined Google Cloud, bringing Alooma one step closer to delivering a full self-service database migration experience bolstered by Google Cloud technology. Alooma helps enterprise companies streamline database migration in the cloud with a data pipeline tool that simplifies moving data from multiple sources to a single data warehouse. The addition of Alooma and their expertise in enterprise and open source databases has been critical to bringing additional migration capabilities to Google Cloud. Then, in November 2020, Google Cloud launched the new, serverless Database Migration Service (DMS) as part of our vision for meeting these modern needs in a user-friendly, fast, and reliable way for migration to Cloud SQL. While Alooma is an ETL platform for data engineers to build a flexible streaming data pipeline to a cloud data warehouse for analytics, DMS is a database migration service for DBAs and IT professionals to migrate their databases to the cloud as part of their larger migration goals.
Database Migration Service is now GA
Database Migration Service makes it easy for you to migrate your data to Google Cloud. It’s a fully managed service that helps you lift and shift your MySQL and PostgreSQL workloads into Cloud SQL. You can migrate from on-premises, self-hosted in Google Cloud, or from another cloud, and get a direct path to unlocking the benefits of Cloud SQL. The focus of DMS is to manage the migration of your database schema, metadata, and the data itself. It streamlines the networking workflow, manages the initial snapshot and ongoing replication, provides a status of the migration operation, and supports one-time and continuous migrations. This lets you cut over with minimal downtime.
That’s a lot to absorb, but here are three main things I want you to take away:
- With DMS, you get a simple, integrated experience to guide you through every step of the migration (not just a combination of tools to perform the assessment and migration).
- It’s serverless. You don’t need to deploy, manage, or monitor instances that run the migration. The onus of deciding on appropriate sizing, monitoring the instance, ensuring that compute / storage are sufficient is on Google Cloud. Serverless migrations eliminate surprises and are performant at scale.
- It’s free.
Supported source databases include:
- AWS RDS 5.6, 5.7, 8.0
- Self-managed (on-prem, on any cloud VM) 5.5, 5.6, 5.7, 8.0
- Cloud SQL 5.6, 5.7, 8.0
Supported destination databases include:
- Cloud SQL for MySQL 5.6, 5.7, 8.0
Gabe Weiss, Developer Advocate for Cloud SQL, has gone in-depth around the various migration scenarios, how DMS works, and how to prepare Postgres instances for migration, so check out his content, as well as best practices around homogeneous migrations. For now, I’ll give you a quick sneak peak by walking you through the basic DMS experience for a new migration job.
A stroll through DMS
You can access DMS from the Google Cloud console under Databases. To get started, you’ll create a migration job. This represents the end-to-end process of moving your source database to a Cloud SQL destination.
Define your migration job
First, let’s define what kind of migration the job will run. I want to move a MySQL database I am running on Google Compute Engine to Cloud SQL. I can choose between one-time or a continuous replication. For minimal downtime, I select continuous. Once I define my job, DMS shows me the source and connectivity configuration required to be able to connect to and migrate the source database.
Define your source
First you have to create a connection profile, a resource that represents the information needed to connect to a database. These profiles aren’t locked to an individual migration. This means you can reuse it if you want to first test out a migration or if someone else in your organization is in charge of connecting to the database.
- If you’re replicating from a self-hosted database, enter the Hostname or IP address (domain or IP) and Port to access the host.
- If you’re replicating from a Cloud SQL database select the Cloud SQL instance from the dropdown list.
Create a destination
The next step is to create the Cloud SQL instance to migrate the database to. This will feel familiar to anyone who has created a Cloud SQL instance before. You’ll see many of the same options, like connectivity and machine configuration. Since DMS relies on the replication technology of the Database Engine, you don’t have to create resources, aside from the Cloud SQL instance.
Connect your source to your target
To me this is where the magic is because establishing connectivity is often viewed as pretty hard. Depending on the type of your source database and its location, you can choose among four types of connectivity methods:
- IP allowlists – Use this when your source database is external to Google Cloud.
- Reverse SSH tunnel – Use this to set up private connectivity through a reverse SSH tunnel on a VM in your project.
- VPCs through VPNs – Use this if your source database is inside a VPN (i.e. in AWS or your on-premises VPN).
- VPC peering – Use this if your source database is in the same VPC on Google Cloud.
Since my source VM is in Google Cloud, I set up VPC Peering. Just be sure to enable the Service Networking API to do this, which provides automatic management of network configurations necessary for certain services.
Validate the migration job
And that’s it. I configured my source, created a Cloud SQL destination, and established connectivity between them. All that’s left is to validate the migration job setup and start my migration. Once it’s validated, I can trust that my migration will run smoothly.
You can start the job and run it immediately. Once the migration job has started, you can monitor its progress and see if it encounters any issues. DMS will first transfer the initial snapshot of existing data, then continuously replicate changes as they happen.
When the initial snapshot is migrated and continuous replication is keeping up, you can promote the Cloud SQL instance to be your primary instance and point your applications to work directly against it.
DMS guides you through the process and manages the connection between your source and Cloud SQL instance with flexible options. You can test the migration and get a reliable way to migrate with minimal downtime to a managed Cloud SQL instance. It’s serverless, highly performant, and free to use. If you want to give it a spin, check out the quickstart and migrations solutions guide, and let me know your thoughts and any feedback.
Looking for SQL Server migrations? You can request access to participate in the SQL Server preview.
Find me on Twitter: @stephr_wong
Read More for the details.