Keeping students, universities and employers connected with Cloud SQL
Editor’s note: Today we’re hearing from Handshake, an innovative startup and platform that partners with universities and employers to ensure that college students have equal access to meaningful career opportunities. With over 7 million active student users, 1,000 university partners, and 500,000 employer partners, it’s now the leading early career community in the U.S. Here’s how they migrated to Google Cloud SQL.
At Handshake, we serve students and employers across the country, so our technology infrastructure has to be reliable and flexible to make sure our users can access our platform when they need it. In 2020, we expanded our online presence, adding virtual solutions and establishing new partnerships with community colleges and bootcamps to increase career opportunities for our student users.
These changes and our overall growth would have been harder to implement on Heroku, our previous cloud service platform. Our website application, running on Rails, uses a sizable cluster and PostgreSQL as our primary data store. As we grew, we were finding Heroku to be increasingly expensive at scale.
To reduce maintenance costs, boost reliability, and provide our teams with increased flexibility and resources, Handshake migrated to Google Cloud in 2018, choosing to have our data managed through Google Cloud SQL.
Cloud SQL freed up time and resources for new solutions
This migration proved to be the right decision. After a relatively smooth migration over a six-month period, our databases are now completely off Heroku, and Cloud SQL is at the heart of our business. We rely on it for nearly every use case, continuing with a sizable cluster and with PostgreSQL as the sole owner of our data and our source of truth. Virtually all of our data, including information about our students, employers, and universities, is in PostgreSQL. Anything on our website is translated to a data model that’s reflected in our database.
Our main web application uses a monolithic database architecture. It runs on an instance with one primary and one read replica, with 60 CPUs, almost 400 GB of memory, and 2 TB of storage, of which 80 percent is utilized.
Cloud SQL is at the heart of our business, providing our startup with enterprise-level features.
Several Handshake teams use the database, including the Infrastructure, Data, Student, Education, and Employer teams. The Data team usually works with the transactional data, writing pipelines that pull data out of PostgreSQL and load it into BigQuery or Snowflake. We run a separate replica of each of our databases specifically for the Data team, so they can export without a performance hit.
With most managed services, there will always be maintenance that requires downtime, but with Cloud SQL, any necessary maintenance is easy to schedule. If the Data team needs more memory, capacity, or disk space, our Infrastructure team can coordinate and decide if we need a maintenance window or a similar approach that involves zero downtime.
We also use Memorystore as a cache and heavily leverage Elasticsearch. Our Elasticsearch index system uses a separate PostgreSQL instance for batch processing. Whenever records change inside our main application, we publish a Pub/Sub message that the indexers consume. The indexers use that database to help with processing, then put the resulting information into Elasticsearch and create the indices.
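The flow described above can be sketched roughly as follows. This is a minimal illustration, not Handshake's implementation: an in-memory `queue.Queue` stands in for the Pub/Sub topic, and plain dictionaries stand in for the batch-processing PostgreSQL instance and for Elasticsearch. The function names, the message fields, and the assumption that the processing database holds the document to index are all hypothetical.

```python
import json
import queue


def publish_change(topic, record_type, record_id):
    """Application side: announce that a record changed.

    `topic` is an in-memory stand-in for a Pub/Sub topic (assumption).
    """
    topic.put(json.dumps({"type": record_type, "id": record_id}))


def run_indexer(topic, processing_db, search_index):
    """Indexer side: drain pending messages and index each changed record.

    `processing_db` and `search_index` are dictionaries standing in for the
    separate PostgreSQL instance and Elasticsearch (assumptions).
    Returns the number of documents written to the index.
    """
    indexed = 0
    while True:
        try:
            message = topic.get_nowait()
        except queue.Empty:
            return indexed
        change = json.loads(message)
        document = processing_db.get((change["type"], change["id"]))
        if document is None:
            continue  # The record is not in the processing store; skip it.
        search_index[f"{change['type']}:{change['id']}"] = document
        indexed += 1
```

Keeping the message small (just a type and an ID) and looking the record up at indexing time means the indexer always works from the stored state rather than from a possibly stale copy carried in the message.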
Nimble, flexible and planning for the future
With Cloud SQL managing our databases, we can devote resources toward creating new services and solutions. If we had to run our own PostgreSQL cluster, we’d need to hire a database administrator. If we were running PostgreSQL ourselves on a Compute Engine virtual machine, without the guarantees of Cloud SQL’s service-level agreement (SLA), our team would have to double in size to handle the work that Google Cloud now manages. Cloud SQL also offers automatic provisioning and storage capacity management, which saves us additional time.
We’re generally far more read-heavy than write-heavy. Our future plans for our data with Cloud SQL include offloading more of our reads to read replicas and reserving the primary for writes, using PgBouncer in front of the database to decide where to send which query.
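To make the read/write split concrete, here is a minimal sketch of the routing decision itself. It is written as application-level Python purely for illustration; the post does not describe how the routing would be configured, and the class name, host labels, and classification rule are all assumptions. The rule is deliberately conservative: only plain `SELECT` statements outside a transaction go to a replica, and everything else goes to the primary.

```python
import itertools


class QueryRouter:
    """Send plain reads to replicas (round-robin) and everything else to the primary.

    Hypothetical helper for illustration; the host labels are placeholders.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas so reads are spread evenly across them.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        # SELECT ... FOR UPDATE takes row locks, so it is treated as a write.
        is_read = first_word == "SELECT" and "FOR UPDATE" not in sql.upper()
        if in_transaction or not is_read or self._replicas is None:
            return self.primary
        return next(self._replicas)
```

For example, `QueryRouter("primary", ["replica-1", "replica-2"])` would alternate `SELECT` statements between the two replicas while sending an `UPDATE` to the primary. Because replicas can lag behind the primary, a real setup would also need a policy for reads that must see a write made moments earlier.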
We are also exploring committed use discounts to cover a good baseline of our usage. We still want the flexibility to cut costs and reduce our usage where possible, and to realize some of those initial savings right away. We’d also like to split the monolith into smaller databases, both to reduce the blast radius of any failure and so that each database can be tuned more effectively to its use case.
With Cloud SQL and related services from Google Cloud freeing time and resources for Handshake, we can continue to adapt and meet the evolving needs of students, colleges, and employers.
Read more about Handshake and the solutions we found in Cloud SQL.