GCP – Navigating Google Cloud: a decision tree for storage workloads
Google Cloud offers a wide range of storage options to meet your workload needs: block storage for high-performance applications, object storage for workloads such as content distribution and AI/ML, file storage for workloads that need multi-writer access, and specialized storage for data lakes and warehouses.
As the second installment in our decision-tree series (take a look at data & analytics as well!), we’ve created a guide to help you research and select the storage services that best match your specific workload needs.
Figure: A decision tree for storage workloads on Google Cloud.
Let’s take a closer look at these storage types and the options Google Cloud provides in each category:
Block storage
Block storage stores and accesses data in fixed-size blocks. It is ideal for databases and other applications that require very low latency (<1 ms), high IOPS, high throughput, or a combination of the three.
Within Google Cloud, there are several block storage options available:
Local SSD: Provides high-performance, temporary storage that is directly attached to virtual machine instances. It is suitable for demanding workloads that require low latency and high IOPS.
Persistent Disk: Offers durable, reliable storage that can be dynamically attached to and detached from virtual machines.
Hyperdisk: Offers the best fit for scale-out data analytics and other cost-sensitive throughput- or IOPS-bound workloads.
Object storage
Object storage stores data as discrete objects, along with metadata and a unique identifier. It is typically used for applications that require high scalability (think multi-PB to EB scale) and durability. Sample use cases include: content delivery networks, disaster recovery, backup targets for applications not using Cloud Storage, data analysis, and AI/ML workloads.
Cloud Storage is Google Cloud’s scalable, highly available object storage service. It is designed for 11 nines (99.999999999%) of annual durability and offers redundancy and accessibility across regions.
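To put the 11 nines figure in perspective, here is a minimal arithmetic sketch. It assumes the durability number can be read as a 99.999999999% chance that any single object survives a given year; the function names are hypothetical and only illustrate the calculation.

```python
from fractions import Fraction

# Assumption: "11 nines" means each object has a 1-in-10^11 chance of being
# lost in any given year. Fractions keep the arithmetic exact.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year under the assumption above."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    # With 10 million stored objects, this prints 10000.
    print(years_per_expected_loss(10_000_000))
```

Under that reading, a bucket holding 10 million objects would expect to lose a single object roughly once every 10,000 years.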
NFS and Multi-Writer File Storage
Network File System (NFS) and Multi-Writer File Storage are file systems that allow multiple users and VMs to access and modify files. NFS allows concurrent read and write access to shared files among multiple clients while Multi-Writer File Storage enables simultaneous writing by multiple clients or nodes, suitable for real-time collaboration, version control systems, and distributed computing.
Within Google Cloud, there are a few options available:
Filestore: A fully managed file storage service that combines NFS compatibility and multi-writer capabilities. With Filestore, you can create highly available, scalable NFS file shares that integrate with Google Cloud’s compute offerings, including Google Kubernetes Engine (GKE) and Google Cloud VMware Engine, so multiple containers and pods can share files and write to the same file system simultaneously.
Google Cloud NetApp Volumes: A fully managed, high-performance file storage service that lets you migrate and run demanding enterprise applications and workloads in Google Cloud without any refactoring.
SMB storage
SMB (Server Message Block) is a network file-sharing protocol commonly used in Windows environments. SMB storage uses it to enable seamless file sharing among networked VMs.
Google Cloud NetApp Volumes: In addition to providing NFS and dual protocol support, NetApp Volumes enables SMB support for provisioning file storage for your application environment.
Data lakes and data warehouses
Data lakes and data warehouses enable the consolidation of data from multiple sources, facilitating data integration and providing a unified view of the organization’s data assets. They allow organizations to perform complex analytics and gain insights from large volumes of structured and unstructured data, as well as serve as a foundation for machine learning workloads.
Google Cloud has solutions for both:
Data warehouse: If your focus is on structured data analysis, business intelligence, and reporting, look at BigQuery. BigQuery is a fully managed, serverless data warehouse that lets you store, query, and analyze massive datasets with high performance and scalability using SQL.
Data lake: Cloud Storage serves as a scalable and durable store for raw data, supporting the ingestion and storage of structured, semi-structured, and unstructured data in its original format. It integrates with other Google Cloud services. For example, BigQuery can directly query data stored in Cloud Storage, acting as the analytics engine for your data lake.
Take a look at our previous blog post on data & analytics workloads to learn more about how these solutions integrate across ingesting, processing, storing, governing, orchestrating, and consuming data in Google Cloud.
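The categories described in this post can be summarized as a simple lookup. The sketch below is only an illustration of that mapping: the dictionary keys and the `suggest_storage` function are hypothetical names, not part of any Google Cloud API, and the real decision tree weighs more factors than a single workload label.

```python
# Hypothetical lookup that mirrors the storage categories in this post.
# Keys and function name are illustrative only, not a Google Cloud API.
STORAGE_OPTIONS = {
    "block": ["Local SSD", "Persistent Disk", "Hyperdisk"],
    "object": ["Cloud Storage"],
    "nfs": ["Filestore", "Google Cloud NetApp Volumes"],
    "smb": ["Google Cloud NetApp Volumes"],
    "data_warehouse": ["BigQuery"],
    "data_lake": ["Cloud Storage"],
}


def suggest_storage(workload: str) -> list[str]:
    """Return the services this post lists for a workload category."""
    key = workload.strip().lower().replace(" ", "_")
    if key not in STORAGE_OPTIONS:
        known = ", ".join(sorted(STORAGE_OPTIONS))
        raise ValueError(f"Unknown workload {workload!r}; expected one of: {known}")
    # Return a copy so callers cannot mutate the shared table.
    return list(STORAGE_OPTIONS[key])


if __name__ == "__main__":
    for category in ("block", "SMB", "data lake"):
        print(category, "->", suggest_storage(category))
```

A lookup like this is only a starting point; latency, IOPS, throughput, and cost requirements still decide between the options within a category.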
Next steps
Whether you need block storage, object storage, file storage, or data warehousing, Google Cloud has you covered. We hope this decision tree helps point you in the right direction for meeting the storage needs of your workload. Bookmark it, and keep a lookout as we publish more decision trees for other cloud workloads in the future.
Let us know what you think of this post and the decision tree by heading over to the Cloud Discord channel! Just make sure you’ve joined Innovators and the Google Developers Discord.