Azure – Public preview: .NET 7 support in Linux Consumption Plan
.NET 7 support for Azure Functions running in the isolated worker process is now in public preview on the Linux Consumption plan.
Read More for the details.
General availability: Azure App Service Environment v3 support for custom domain suffix
Read More for the details.
You can now create Windows-based node pools with FIPS 140-2 enabled.
Read More for the details.
You can now completely stop specific user node pools and later pick up where you left off, saving time and costs.
Read More for the details.
Use an Azure Policy to block the deployment of vulnerable images on AKS.
Read More for the details.
Azure Load Testing is in public preview in West US 2.
Read More for the details.
Gain powerful tools for working with JSON-formatted data in Redis through the RedisJSON module.
Read More for the details.
Audit your restore actions on continuous mode in Azure Cosmos DB accounts.
Read More for the details.
Use the new migration tool to migrate workloads from Single to Flexible Server on Azure Database for PostgreSQL, a managed service running the open-source Postgres database on Azure.
Read More for the details.
Public preview enhancements and updates released for Azure SQL in early August 2022
Read More for the details.
We’re excited to announce the general availability of AWS Compute Optimizer in 5 additional regions — Asia Pacific (Osaka), Asia Pacific (Hong Kong), Middle East (Bahrain), Africa (Cape Town), and Europe (Milan).
Read More for the details.
As enterprise and public sector cloud adoption continues to accelerate, having an accurate picture of who did what in your cloud environment is important for security and compliance purposes. Logs are critical when you are attempting to detect a breach, investigating ongoing security issues, or performing forensic investigations. These five must-know Cloud Logging security and compliance features can help customers produce and manage the logs they need to conduct security audits. The first three features launched recently in 2022, while the last two have been available for some time.
Google Cloud’s Assured Workloads helps customers meet compliance requirements with a software-defined community cloud. Cloud Logging and external log data are in scope for many regulations, which is why Cloud Logging is now part of Assured Workloads. Cloud Logging with Assured Workloads can make it even easier for customers to meet the log retention and audit requirements of NIST 800-53 and other supported frameworks.
Learn how to get started by referring to this documentation.
FedRAMP is a U.S. government program that promotes the adoption of secure cloud services by providing a standardized approach to security and risk assessment for federal agencies adopting cloud technologies. The Cloud Logging team has received certification for implementing the controls required for compliance with FedRAMP at the High Baseline level. This certification will allow customers to store sensitive data in cloud logs and use Cloud Logging to meet their own compliance control requirements.
Below are the controls that Cloud Logging has implemented as required by NIST for this certification. In parentheses, we’ve included example mappings of controls to capabilities:
Event Logging (AU-2) – A wide variety of events are captured. Examples include password changes, failed logons or failed accesses related to systems, security or privacy attribute changes, administrative privilege usage, Personal Identity Verification (PIV) credential usage, data action changes, query parameters, and external credential usage.
Making Audits Easy (AU-3) – To provide users with all the information needed for an audit, we capture the type of event, the time it occurred, the location of the event, the source of the event, the outcome of the event, and identity information.
Extended Log Retention (AU-4) – We support the outlined policy for log storage capacity and retention to provide support for after-the-fact investigations of incidents. We help customers meet their regulatory and organizational information retention requirements by allowing them to configure their retention period.
Alerts for Log Failures (AU-5) – A customer can create alerts when a log failure occurs.
Create Evidence (AU-16) – A system-wide (logical or physical) audit trail composed of audit records in a standardized format is captured. Cross-organizational auditing capabilities can be enabled.
Check out this webinar to learn how Assured Workloads can help support your FedRAMP compliance efforts.
For customers with specific encryption requirements, Cloud Logging now supports CMEK via Cloud KMS. CMEK can be applied to individual logging buckets and can be used with the log router. Cloud Logging can be configured to centralize all logs for the organization into a single bucket and router if desired, which makes applying CMEK to the organization’s log storage simple.
Learn how to enable CMEK for Cloud Logging Buckets here.
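As a rough sketch of the workflow (the bucket, key ring, and key names below are placeholders, and this assumes bucket-level CMEK is applied via the gcloud logging buckets command), applying a customer-managed key to an individual log bucket looks something like this:

```bash
# Grant the Logging service account permission to use the key, then
# apply the key to a log bucket. All names here are placeholders.
gcloud kms keys add-iam-policy-binding my-logging-key \
  --keyring=my-keyring \
  --location=us-central1 \
  --member="serviceAccount:LOGGING_SERVICE_ACCOUNT" \
  --role="roles/cloudkms.cryptoKeyEncrypterDecrypter"

gcloud logging buckets update my-bucket \
  --location=us-central1 \
  --cmek-kms-key-name="projects/PROJECT_ID/locations/us-central1/keyRings/my-keyring/cryptoKeys/my-logging-key"
```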
Access Transparency logs can help you to audit actions taken by Google personnel on your content, and can be integrated with your existing security information and event management (SIEM) tools to help automate your audits on the rare occasions that Google personnel may access your content. While Cloud Audit logs tell you who in your organization accessed data in Google Cloud, Access Transparency logs tell you if any Google personnel accessed your data.
These Access Transparency logs can help you:
Verify that Google personnel are accessing your content only for valid business reasons, such as fixing an outage or attending to your support requests.
Review actual actions taken by personnel when access is approved.
Verify and track Assured Workloads support compliance with legal or regulatory obligations.
Learn how to enable Access Transparency for your organization here.
Access Approvals can help you restrict Google personnel’s access to your content according to predefined characteristics. While this is not a logging-specific feature, it is one that many customers ask about. If a Google support person or engineer needs to access your content for support or debugging purposes (in the event a service request is created), you would use the access approval tool to approve or reject the request.
Learn about how to set up access approvals here.
We hope that these capabilities make adoption and use of Cloud Logging easier, more secure, and more compliant. With additional features on the way, your feedback on how Cloud Logging can help meet additional security or compliance obligations is important to us.
Learn more about Cloud Logging with our qwiklab quest and join us in our discussion forum. As always, we welcome your feedback. To share feedback, contact us here.
Read More for the details.
The use of digital certificates to establish trust across our digital infrastructure continues to grow at a rapid pace, driven by development and deployment of cloud-based, containerized, microservice-based applications and the proliferation of connected Internet of Things and smart devices.
Google Cloud Certificate Authority Service (CAS) provides a highly scalable and available private CA to help organizations address the growing need for certificates. With CAS, you can offload time-consuming tasks associated with operating a private CA, like hardware provisioning, infrastructure security, software deployment, high-availability configuration, disaster recovery, backups, and more to the cloud.
While a cloud-based CA is uniquely suited to the scalability and availability requirements of cloud-native environments, organizations that have adopted cloud-based CAs increasingly want to extend their capabilities and value to their on-premises environments as well. There, certificates continue to be the primary mechanism for identifying and securing enterprise endpoints, and existing on-prem CA options remain complex and costly to operate and manage.
To get started on this converged public key infrastructure (PKI), enterprises can now deploy a private CA through Google Cloud CAS along with a partner solution that simplifies, manages, and automates the digital certificate operations in on-prem use cases such as issuing certificates to routers, printers, or users. ISV partners with Google Cloud CAS integration include AppviewX, Venafi (which includes JetStack), KeyFactor, and SmallStep.
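As a sketch of getting started on the CAS side (the resource names, location, and key algorithm below are illustrative; your tier and key choices will differ), creating a CA pool and a root CA with the gcloud CLI looks roughly like this:

```bash
# Illustrative only: create a CA pool, then a root CA inside it.
gcloud privateca pools create my-pool \
  --location=us-west1 \
  --tier=enterprise

gcloud privateca roots create my-root-ca \
  --pool=my-pool \
  --location=us-west1 \
  --subject="CN=Example Root CA, O=Example Org" \
  --key-algorithm=ec-p256-sha256
```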
One of the most commonly requested features for on-prem certificate enrollment is Windows auto-enrollment: today, organizations with on-prem deployments of a private CA can auto-enroll client certificates using Windows Active Directory Certificate Services (ADCS). Windows auto-enrollment helps automate the registration and renewal of endpoint/client certificates. With the integration of partner solutions such as AppviewX PKIaaS and CLMaaS, and Keyfactor, Google Cloud can now offer an alternative to Microsoft’s CA service that integrates into Windows environments.
In addition to addressing the scalability and management issues of digital certificates, the converged PKI deployment in the public cloud offers these benefits:
Simplified and automated certificate management compliance
Centralized policy definition and decentralized certificate enrollment
Improved visibility through partner solutions for Certificate Lifecycle Management (CLM)
Service level agreements for large scale deployments
Reduction in CapEx
We discuss these in greater detail in our papers on deploying a secure and reliable PKI with Google Cloud CAS, and scaling certificate management with Google Cloud CAS.
Google Cloud CAS with an integrated partner solution can help simplify enterprise PKI deployments and provide a highly available, comprehensive, and converged private CA. And now, on-prem private CA deployments with Windows CA and auto-enrollment are supported through partner solutions. To get started, visit the CAS product page or one of the partner links above. If you have additional questions, you can also contact cas-support@google.com.
Read More for the details.
As consumer data privacy regulations tighten and the end of third-party cookies looms, organizations of all sizes may be looking to carve a path toward consent-positive, privacy-centric ways of working. Consumer-facing brands should look more closely at the customer data they’re collecting, and learn to embrace a first-party data-driven approach to doing business.
But while brands today recognize the privacy and consumer consent imperative, many may not know where to start. What’s worse is that many don’t know what consumers really want when it comes to data privacy. Today, 40% of consumers do not trust brands to use their data ethically (KPMG)1. There is room, however, for improvement.
Although the gap between how brands and consumers think about privacy is evident, it doesn’t need to continue to widen. Organizations must begin to treat consumer data privacy as a pillar of their business: as a core value that guides the way data is used, processes are run, and teams behave. By implementing a cross-functional data advocacy panel, brands can ensure that the protection of consumer data is always top of mind — and that a dedicated team remains accountable for guaranteeing privacy-centricity throughout the organization.
Winning brands see the customer data privacy imperative as an opportunity, not a threat. Consumers today are clear about what they want, and it’s simply up to brands to deliver. First and foremost, transparency is key. Most consumers are already demanding more transparency from the brands they frequent, but as many as 40% of consumers would willingly share personal information if they knew how it would be used (KPMG)1. This simple value exchange could be the key to a first-party data-driven future. So what’s deterring businesses from taking action?
Organizational change can seem like a daunting undertaking, and many businesses who do recognize the importance of consumer data privacy simply don’t know how to move forward. What steps can they take? Where do they start? How can they prepare? In addition, how can they make sure the improvements they invest in have an impact now and in the long term? A data advocacy panel, woven into the DNA of the organization, can serve as a north star for consent-positivity.
A data advocacy panel’s mission focuses on building and maintaining a consent-positive culture across an organization. It can serve as a way to hone the power of customer data for your business while also giving the power of consent to your customers. And in an era when most business decisions are (and should be) driven by data, having a data advocacy panel makes a world of sense.
But what exactly does a data advocacy panel look like? Importantly, you need to include the right players. Your data advocacy panel should include representatives from every business unit that has responsibility for protecting, collecting, creating, sharing, or accessing data of any kind. These members might include marketing, IT/security, legal, HR, accounting, customer service, sales, and/or partner relations.
These team members should then come together to tackle two key goals: to set the strategy and policies for how data is handled throughout the organization, and to react quickly to new data developments such as:
a potential data breach,
shifting market sentiments, or
new compliance requirements
It should be the responsibility of the data advocacy panel to help decide how, when, why, and where data is used in your business.
Once a company has established a data advocacy panel, it has also built a foundation for a new, future-ready organizational structure: one that provides maximal transparency, auditability, and explainability, and expanded support for how first-party consumer data is collected, joined, stored, managed, and activated.
The value of consumer consent, data advocacy, and privacy-centricity
According to Google Cloud’s VP of Consumer Packaged Goods, Giusy Buonfantino, “The changing privacy landscape and shifting consumer expectations mean companies must fundamentally rethink how they collect, analyze, store, and manage consumer data to drive better business performance and provide customers with personalized experiences.”
Companies are adopting Customer Data Platforms (CDP) to enable a privacy-centric engagement with their customers. Lytics’ customer data platform solution is built with Google Cloud BigQuery to help enterprises continue to evolve how they capture and use consumer data in our changing environment. Lytics on BigQuery helps businesses collect and interpret first-party behavioral customer data on a secure and scalable platform with built-in machine learning.
Reggie Wideman, Head of Strategy at Lytics, noted, “By design, traditional CDPs create enormous risk by asking the company to collect and unify customer data in the CDP system, separate from all other customer data and outside of your internal controls and governance. We think there’s a better, smarter way. By taking a composable CDP approach we enable companies to layer our tools into their existing martech ecosystems, rather than being forced to build around a proprietary, external CDP toolset.”
Wideman continues, “This allows us to enable developers and data managers to create a ‘Customer 360’ in their own secure, privacy-compliant data warehouse. The advantage of this approach is a privacy-centric architecture that collects and unifies customer data creating a profile schema that is synced to the customer’s data warehouse, helping internal data teams build and manage a secure, persistent ‘Customer 360’, while also providing a direct sync to marketing tools and channels for targeted advertising and activation.”
In short, it’s not just about small tweaks and minor changes. It’s about starting with a catalyst that will drive larger-scale change. That catalyst is your data advocacy panel, and it is and will continue to be at the center of the value exchange between your brand and your customers.
“The importance of ‘consumer consent’ and ‘value exchange’ is front and center in the conversations we are having with our customers,” shares Buonfantino, “And first-party data done right can help drive more meaningful high-value consumer engagement.”
For more information on data advocacy panels, how to implement them, and how they bring the power of your customer data to every employee, read the whitepaper. In addition, we invite you to explore the Lytics CDP on the Google Cloud Marketplace.
Read More for the details.
Today we are announcing Data Studio, our self-service business intelligence and data visualization product, as a Google Cloud service. Customers can now get Data Studio under the Google Cloud terms of service, simplifying product acquisition and integration into their company’s technology stack.
Google Cloud customers of all types widely use Data Studio today as a critical piece of their business intelligence measurement and reporting workflow. Many of our customers have asked for Data Studio on Google Cloud terms, to ensure Google supports the same privacy and security commitments for Data Studio as for other Google Cloud products. Now, that’s possible.
Data Studio now supports additional compliance standards for internal auditing, controls and information system security, including SOC 1, SOC 2, SOC 3 and PCI DSS, with more compliance certifications coming soon. Data Studio can be used under the same terms as other Google Cloud services, reducing procurement complexity and enabling it to be covered by customers’ existing Cloud Master Agreement.
If customers are subject to HIPAA and have signed a Google Cloud Business Associate Amendment (BAA), it will apply to Data Studio as well. Data Studio is still free to use, although as a free offering, it is not currently supported through Google Cloud support.
This additional certification does not change a single pixel of the end-user experience for Data Studio. Customers can still analyze their data, create beautiful reports, and share insights using all of Data Studio’s self-service BI functionality with no disruption. For customers who aren’t yet using Google Cloud, Data Studio will continue to be available under our existing terms and conditions as well.
When everyone is empowered to dig into data, the results can be transformational. This is just the beginning of our investment in making the power of Google Cloud accessible to everyone through easy-to-use cloud BI.
To switch Data Studio to the Google Cloud terms, follow these simple steps. Visualize on.
Read More for the details.
Today’s consumers expect incredible feats of speed and service delivered through easy-to-use apps and personalized interactions. Modern conveniences have taught consumers that their experience is paramount—no matter the size of the company, complexity of the problem, or regulations in the industry.
“Cloud-native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach. These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow developers to make high-impact changes frequently and predictably with minimal toil.” (per Cloud Native Computing Foundation)
Containers are a better way to develop and deploy modern cloud applications. Containers are more lightweight, faster, more portable, and easier to manage than virtual machines. Containers help developers to build more testable, secure systems while the operations team can isolate workloads inside cost-effective clusters. In a climate where IT needs are rapidly changing, driven by evolving customer demands, building and managing modern cloud applications means so much more than having a managed service platform. Modern cloud has become synonymous with containers and having a Kubernetes strategy is essential to success in IT.
A managed container platform like Kubernetes can extend the advantages of containers even further. Think of Kubernetes as the way to build customized platforms that enforce rules your enterprise cares about through controls over project creation, the nodes you use, and the libraries and repositories you pull from. Background controls are not typically managed by app developers; rather, they provide developers with a governed and secure framework to operate within.
Kubernetes is not just a technology — it’s a model for creating and scaling value for your business, a way of developing reliable apps and services, and a means to secure and develop cloud-based IT capabilities for innovation.
Google invented Kubernetes and continues to be the leading committer to this open source project. By betting on open source, you get the freedom to run where you want to. And the ecosystem around open source projects like Kubernetes means you get standardized plugins and extensions to create a developer-friendly, comprehensive platform. You can build best-in-class modern applications using open source that can seamlessly and securely be moved to Google Cloud when they are ready to deploy in the cloud.
Open source gives you freedom, while managed services based on open source give you the built-in best practices for deploying and running that software. Created by the same developers that built Kubernetes, Google Kubernetes Engine (GKE) is the best of both. Use standard Kubernetes, expertly operated by the company that knows it best. GKE lets you realize the benefits of innovation initiatives without getting bogged down troubleshooting infrastructure issues and managing day-to-day operations related to enterprise-scale container deployment. The recipe for long-term success with Kubernetes is two-fold: automation that matters and scale that saves.
For cloud-based companies, the only constant is change. That means you need to be able to adapt quickly to changing conditions. This applies to your platforms too! Your application platform needs to be elastic and able to absorb changes without downtime. GKE delivers automation across multiple dimensions to efficiently and easily operate your applications. With the fully managed Autopilot mode of operation combined with multi-dimensional auto-scaling capabilities, you can get started with a production-ready, secured cluster in minutes and retain complete control over configurations and maintenance.
Day 2 operations: With GKE, you have the option to automate node provisioning, node upgrades, and control plane upgrades, with a choice of selective node auto-upgrades and configurations. These capabilities give you the flexibility to automate your infrastructure the way you want, gain significant time savings, and alleviate maintenance requirements. Moreover, with GKE release channels, you have the power to decide not only when, but also how and what to upgrade in your clusters and nodes.
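For example, enrolling a cluster in a release channel at creation time hands upgrade management to Google within that channel; a minimal sketch (the cluster name and zone are placeholders):

```bash
# Create a cluster on the "regular" release channel; the control plane
# and nodes are then kept up to date within that channel.
gcloud container clusters create my-cluster \
  --zone=us-central1-a \
  --release-channel=regular
```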
Modern cloud stack: You can install service mesh and config management solutions with the click of a button, and leave the provisioning and operations of these solutions to us. Google Cloud provisions, scales, secures and updates both the control and data planes, giving you all the benefits of a service mesh with none of the operational burden. You can let Google manage the upgrade and lifecycle tasks for both your cluster and your service mesh. In addition, you can take advantage of advanced telemetry, security and Layer 7 network policies provided by the mesh.
Cost optimization: You can optimize your Kubernetes resources with actionable insights: use GKE cost optimization insights, workload rightsizing, and the cost estimator, built right into the Google Cloud console. Read how a robotics startup switched clouds and reduced its Kubernetes ops costs with GKE Autopilot: fewer pages at night as clusters are scaled and maintained by Google Cloud, reduced costs, a better and more secure experience for customers, and developer time freed from managing Kubernetes.
Partner solutions: You can use your favorite DevOps and security solutions with GKE Autopilot out of the box. Despite being a fully managed Kubernetes platform that provides you with a hands-off approach to nodes, GKE Autopilot still supports the ability to run node agents using DaemonSets. This allows you to do things like collect node-level metrics without needing to run a sidecar in every Pod.
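For instance, a node-level metrics agent can be deployed on Autopilot as an ordinary DaemonSet; a minimal sketch (the image and names below are illustrative, not a specific vendor’s agent):

```yaml
# Illustrative DaemonSet that runs a node-level agent on every node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-metrics-agent
spec:
  selector:
    matchLabels:
      app: node-metrics-agent
  template:
    metadata:
      labels:
        app: node-metrics-agent
    spec:
      containers:
      - name: agent
        image: example.com/metrics-agent:1.0  # placeholder image
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
```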
Whether your organization is scaling up to meet a sudden surge in demand or scaling down to manage costs, modern cloud applications have never been more important. Only GKE can run 15,000-node clusters, outscaling other cloud providers by up to 10X, letting you run applications effectively and reliably at scale. Organizations like Kitabisa and IoTex are already experiencing the benefits of running their modern cloud applications on the most scalable Kubernetes platform.
“The transformative value of GKE became apparent when severe flooding hit Sumatra in November 2021, affecting 25,000 people. Our system easily handled the 30% spike in donations.” – Kitabisa
“We regularly experience massive scaling surges from random places in the crypto universe. In the future, the IoTeX platform will secure billions of connected devices feeding their data snapshot to the blockchain. With GKE Autopilot and Cloud Load Balancing, we can easily absorb any load no matter how much or how fast we grow.” – Larry Pang, Head of Ecosystem, IoTeX
Want to learn how to incorporate GKE into your own cloud environment? Register now to learn helpful strategies and best practices to power your business with modern cloud apps.
Read More for the details.
Today, to accelerate research in the bio-pharma space, from the creation of treatments for diseases to the production of new synthetic biomaterials, we are announcing a new Vertex AI solution that demonstrates how to use Vertex AI Pipelines to run DeepMind’s AlphaFold protein structure predictions at scale.
Once a protein’s structure is determined and its role within the cell is understood, scientists can develop drugs that can modulate the protein function based on its role in the cell. DeepMind, an AI research organization within Alphabet, created the AlphaFold system to advance this area of research by helping data scientists and other researchers to accurately predict protein geometries at scale.
In 2020, in the Critical Assessment of Techniques for Protein Structure Prediction (CASP14) experiment, DeepMind presented a version of AlphaFold that predicted protein structures so accurately, experts declared the “protein-folding problem” solved. The next year, DeepMind open sourced the AlphaFold 2.0 system. Soon after, Google Cloud released a solution that integrated AlphaFold with Vertex AI Workbench to facilitate interactive experimentation. This made it easier for many data scientists to efficiently work with AlphaFold, and today’s announcement builds on that foundation.
Last week, AlphaFold took another significant step forward when DeepMind, in partnership with the European Bioinformatics Institute (EMBL-EBI), released predicted structures for nearly all cataloged proteins known to science. This release expands the AlphaFold database from nearly 1 million structures to over 200 million structures—and potentially increases our understanding of biology to a profound degree. Between this continued growth in the AlphaFold database and the efficiency of Vertex AI, we look forward to the discoveries researchers around the world will make.
In this article, we’ll explain how you can start experimenting with this solution, and we’ll also survey its benefits, which include offering lower costs through optimized selection of hardware, reproducibility through experiment tracking, lineage and metadata management, and faster run time through parallelization.
Generating a protein structure prediction is a computationally intensive task. It requires significant CPU and ML accelerator resources and can take hours or even days to compute. Running inference workflows at scale can be challenging—these challenges include optimizing inference elapsed time, optimizing hardware resource utilization, and managing experiments. Our new Vertex AI solution is meant to address these challenges.
To better understand how the solution addresses these challenges, let’s review the AlphaFold inference workflow:
Feature preprocessing. You use the input protein sequence (in the FASTA format) to search through genetic sequences across organisms and protein template databases using common open source tools. These tools include JackHMMER with MGnify and UniRef90, HHBlits with Uniclust30 and BFD, and HHSearch with PDB70. The outputs of the search (which consist of multiple sequence alignments (MSAs) and structural templates) and the input sequences are processed as inputs to an inference model. You can run the feature preprocessing steps only on a CPU platform. If you’re using full-size databases, the process can take a few hours to complete.
Model inference. The AlphaFold structure prediction system includes a set of pretrained models, including models for predicting monomer structures, models for predicting multimer structures, and models that have been fine-tuned for CASP. At inference time, you independently run the five models of a given type (such as monomer models) on the same set of inputs. By default, one prediction is generated per model when folding monomer models, and five predictions are generated per model when folding multimers. This step of the inference workflow is computationally very intensive and requires GPU or TPU acceleration.
(Optional) Structure relaxation. In order to resolve any structural violations and clashes that are in the structure returned by the inference models, you can perform a structure relaxation step. In the AlphaFold system, you use the OpenMM molecular mechanics simulation package to perform a restrained energy minimization procedure. Relaxation is also very computationally intensive, and although you can run the step on a CPU-only platform, you can also accelerate the process by using GPUs.
The AlphaFold batch inference with the Vertex AI solution lets you efficiently run AlphaFold inference at scale by focusing on the following optimizations:
Optimizing inference workflow by parallelizing independent steps.
Optimizing hardware utilization (and as a result, costs) by running each step on the optimal hardware platform. As part of this optimization, the solution automatically provisions and deprovisions the compute resources required for a step.
Describing a robust and flexible experiment tracking approach that simplifies the process of running and analyzing hundreds of concurrent inference workflows.
The following diagram shows the architecture of the solution.
The solution encompasses the following:
A strategy for managing genetic databases. The solution includes high-performance, fully managed file storage. In this solution, Cloud Filestore is used to manage multiple versions of the databases and to provide high throughput and low-latency access.
An orchestrator to parallelize, orchestrate, and efficiently run steps in the workflow. Predictions, relaxations, and some feature engineering can be parallelized. In this solution, Vertex AI Pipelines is used as the orchestrator and runtime execution engine for the workflow steps.
Optimized hardware platform selection for each step. The prediction and relaxation steps run on GPUs, and feature engineering runs on CPUs. The prediction and relaxation steps can use multi-GPU node configurations. This is especially important for the prediction step because the memory usage is approximately quadratic with the number of residues. Therefore, predicting a large protein structure can exceed the memory of a single GPU device.
Metadata and artifact management. The solution includes management for running and analyzing experiments at scale. In this solution, Vertex AI Metadata is used to manage metadata and artifacts.
The basis of the solution is a set of reusable Vertex AI Pipelines components that encapsulate core steps in the AlphaFold inference workflow: feature preprocessing, prediction, and relaxation. In addition to those components, there are auxiliary components that break down the feature engineering step into tools, and helper components that aid in the organization and orchestration of the workflow.
The solution includes two sample pipelines: the universal pipeline and a monomer pipeline. The universal pipeline mirrors the settings and functionality of the inference script in the AlphaFold Github repository. It tracks elapsed time and optimizes compute resources utilization. The monomer pipeline further optimizes the workflow by making feature engineering more efficient. You can customize the pipeline by plugging in your own databases.
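To give a feel for how such a pipeline is structured (the component implementations below are placeholders, not the repository’s actual components), a Vertex AI Pipelines definition along these lines fans the five model predictions out in parallel after a single preprocessing step:

```python
# Sketch only; the real, reusable components live in the GitHub repository.
from kfp import dsl

@dsl.component
def preprocess(fasta_path: str) -> str:
    # Placeholder for the CPU-only feature-preprocessing component.
    return fasta_path + ".features"

@dsl.component
def predict(features: str, model_name: str) -> str:
    # Placeholder for the GPU-accelerated model-inference component.
    return features + "." + model_name + ".pdb"

@dsl.component
def relax(prediction: str) -> str:
    # Placeholder for the optional GPU-accelerated relaxation component.
    return prediction + ".relaxed"

@dsl.pipeline(name="alphafold-inference-sketch")
def alphafold_pipeline(fasta_path: str):
    features = preprocess(fasta_path=fasta_path)
    # The five monomer models are independent, so they can run in
    # parallel, each on automatically provisioned hardware.
    with dsl.ParallelFor(
        ["model_1", "model_2", "model_3", "model_4", "model_5"]
    ) as model_name:
        prediction = predict(features=features.output, model_name=model_name)
        relax(prediction=prediction.output)
```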
To learn more and to try out this solution, check our GitHub repository, which contains the components and universal and monomer pipelines. The artifacts in the repository are designed so that you can customize them. In addition, you can integrate this solution into your upstream and downstream workflows for further analysis. To learn more about Vertex AI, visit our product page.
Acknowledgements
We would like to thank the following people for their collaboration: Shweta Maniar, Sampath Koppole, Mikhail Chrestkha, Jasper Wong, Alex Burdenko, Meera Lakhavani, Joan Kallogjeri, Dong Meng (NVIDIA), Mike Thomas (NVIDIA), and Jill Milton (NVIDIA).
Finally and most importantly, we would like to thank our Solution Manager Donna Schut for managing this solution from start to finish. This would not have been possible without Donna.
Read More for the details.
Amazon CloudWatch custom metrics now supports 50x higher capacity, allowing you to send up to 1,000 metrics per call at a 3x faster default call rate and to specify 3x more dimensions (up to 30) per metric. Customers rely on CloudWatch custom metrics to capture application-specific data that complements the automatic metrics provided by CloudWatch based on the AWS services you are using. With these improvements, customers can send the same volume of data with fewer API requests, leading to reduced costs.
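As an illustration (the namespace, metric, and dimension names below are made up), batching custom metrics into a single PutMetricData call with boto3 might look like this:

```python
# Send a batch of custom metrics in one API call; with the new limits,
# a single call can carry up to 1,000 metric entries.
import boto3

cloudwatch = boto3.client("cloudwatch")

metric_data = [
    {
        "MetricName": "OrderLatencyMs",
        "Dimensions": [
            {"Name": "Service", "Value": "checkout"},
            {"Name": "Region", "Value": "us-east-1"},
        ],
        "Value": 42.0,
        "Unit": "Milliseconds",
    }
    # ...append up to 1,000 metric entries per call.
]

cloudwatch.put_metric_data(Namespace="MyApp", MetricData=metric_data)
```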
Read More for the details.
Vertex AI Training delivers a serverless approach to simplify the ML model training experience for customers. As such, training data does not persist on the compute clusters by design. In the past, customers had only Cloud Storage (GCS) or BigQuery (BQ) as storage options. Now, you can also use NFS shares, such as Filestore, for training jobs and access data in the NFS share as you would files in a local file system.
Built-in NFS support for custom training jobs provides the following benefits:
Delivers an easy way to store and access large datasets for Vertex AI Training, with less of the cumbersome work of moving training data around.
Training jobs execute faster by eliminating the data download steps.
Data streams over the network with higher throughput compared to using alternative storage solutions.
This article demonstrates how to create a Filestore instance and how to use the data that’s stored in the instance to train a model with your custom training code.
First let’s create a Filestore instance as our NFS file server.
In the Cloud Console, go to the Filestore Instances page and click Create instance.
Configure the instance based on your needs, noting the following:
For this tutorial, we used the “default” VPC network for simplicity. You may choose any network you want, but save the network name as we will need it later.
Ensure that you are using “private service access” as the connection mode.
For in-depth instructions, see Creating instances.
Your new instance will show on the dashboard page. Click on the name of the instance to view the details of the instance.
Save the NFS mount point information, which is in the form of SERVER:PATH. We will use it later.
Copy data to your instance by following the instructions from the official guide.
Since we chose “private service access” mode for our Filestore instance as mentioned above, we already have VPC peering established between our network and Google services. If you’re using a third party NFS solution, you may need to set up the peering yourself as instructed in Set up VPC Network Peering.
Once you have the NFS share and VPC peering set up, you are ready to use it with your custom training jobs. In this section, we will use the gcloud CLI to create a custom training job that can access the files in your NFS share.
To be specific, the process can be simplified into the following general steps:
Decide on a mount point directory under the path /mnt/nfs/. Your NFS share will be mounted to this directory when you submit jobs.
In your custom code, you can access your NFS file share via the local path to your mount point directory.
Specify the “nfsMount” field and network fields in your training job request and submit it.
For example, suppose we make my_mount the "Mount Point" folder. Then in our custom code, we can read from /mnt/nfs/my_mount to get the data stored in our Filestore instance; a minimal sketch (file names are illustrative):
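```python
# A minimal sketch: the NFS share appears as a local directory inside
# the job. Directory and file names here are illustrative.
import os

DATA_DIR = "/mnt/nfs/my_mount"
print(os.listdir(DATA_DIR))  # browse the share like a local file system

with open(os.path.join(DATA_DIR, "train.csv")) as f:
    train_rows = f.readlines()
```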
We may also write to the Filestore instance via that same local path; for example:
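```python
# Again illustrative: write job outputs back to the share via the same path.
import os

OUT_DIR = "/mnt/nfs/my_mount/output"
os.makedirs(OUT_DIR, exist_ok=True)
with open(os.path.join(OUT_DIR, "result.txt"), "w") as f:
    f.write("training complete\n")
```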
Here, suppose that we built a custom container image gcr.io/PROJECT_ID/nfs-demo containing the above code for submitting our training job. We can run commands like the following:
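```bash
# Illustrative command; the region, display name, and config path are
# placeholders.
gcloud ai custom-jobs create \
  --region=us-central1 \
  --display-name=nfs-demo-job \
  --config=config.yaml
```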
The config.yaml file describes the CustomJobSpec and should include the network and NFS mount settings, like the following:
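```yaml
# Illustrative CustomJobSpec; the server address, network, and image are
# placeholders. SERVER and PATH come from the Filestore mount point we
# saved earlier (SERVER:PATH).
network: projects/PROJECT_NUMBER/global/networks/default
workerPoolSpecs:
  - machineSpec:
      machineType: n1-standard-8
    replicaCount: 1
    containerSpec:
      imageUri: gcr.io/PROJECT_ID/nfs-demo:latest
    nfsMounts:
      - server: 10.76.0.10   # SERVER portion of the mount point
        path: /vol1          # PATH portion of the mount point
        mountPoint: my_mount # mounted at /mnt/nfs/my_mount
```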
Then you can check the status of your training job and see how it successfully reads and writes data from your NFS file shares.
In this article, we used Filestore to demonstrate how to access files in an NFS share by mounting it to Vertex AI. We created a Filestore instance and VPC peering connections, and then submitted a job that can directly read from Filestore as a local directory.
Streaming data from NFS shares such as Filestore brings performance and throughput benefits that simplify and accelerate running training jobs on Vertex AI, empowering users to train even better models with more data.
To learn more about using NFS file systems with Vertex AI, see NFS support on Vertex AI training.
To learn more about Vertex AI, check out this blog post from our developer advocates.
Read More for the details.
Connections to AWS Secrets Manager now support hybrid post-quantum key establishment for transport layer security (TLS) using Kyber, a Round 3 finalist in the NIST Post-Quantum Cryptography (PQC) selection process. This allows you to measure the potential performance impact of the post-quantum algorithm. You can also benefit from the longer-term confidentiality afforded by hybrid post-quantum TLS.
Read More for the details.