Cloud

2021 09 16

AWS – Amazon RDS now supports R5b instances for MySQL and PostgreSQL databases

Amazon Relational Database Service (Amazon RDS) now supports R5b database (DB) instances for MySQL and PostgreSQL databases. R5b DB instances support up to 3x the I/O operations per second (IOPS) and 3x the bandwidth on Amazon Elastic Block Store (Amazon EBS) compared to the latest x86-based memory-optimized DB instances (R5) available in Amazon RDS for MySQL and PostgreSQL databases. R5b DB instances are a great choice for IO-intensive DB workloads.

Read More for the details.

2021 09 16

AWS – Route 53 Resolver DNS Firewall Now Available in Asia Pacific (Osaka) Region

AWS, Cloud AWS

Today, we are pleased to announce that the Route 53 Resolver DNS Firewall is now generally available in the Asia Pacific (Osaka) Region. The Route 53 Resolver DNS Firewall is a managed firewall that allows customers to block DNS queries made for known malicious domains and to allow queries for trusted domains.


2021 09 16

AWS – Amazon RDS now supports T4g instances for MySQL, MariaDB, and PostgreSQL databases

AWS, Cloud AWS

Amazon Relational Database Service (Amazon RDS) now supports AWS Graviton2-based T4g database (DB) instances for MySQL, MariaDB, and PostgreSQL databases. T4g DB instances offer up to 36% better price performance over comparable current generation x86-based T3 DB instances depending on the workload characteristics.


2021 09 16

GCP – Upgrade Postgres with pglogical and Database Migration Service

Cloud, Google Cloud gcp

As many of you are probably aware, Postgres is ending long term support for version 9.6 in November 2021. However, if you’re still using version 9.6, there’s no need to panic! Cloud SQL will continue to support version 9.6 for one more year after in-place major version upgrades become available. But if you would still like to upgrade right now, Google Cloud’s Database Migration Service (DMS) makes major version upgrades for Cloud SQL simple with low downtime.

This method can be used to upgrade from any Postgres version, 9.6 or later. In addition, your source doesn’t have to be a Cloud SQL instance. You can set your source to be on-prem, self-managed Postgres, or an AWS source, and migrate to Cloud SQL and upgrade Postgres at the same time!

DMS also supports MySQL migrations and upgrades, but this blog post will focus on Postgres. If you’re looking to upgrade a MySQL instance, check out Gabe Weiss’s post on the topic.

Why are we here?

You’re probably here because Postgres 9.6 will soon reach end of life. Otherwise, you might want to take advantage of the latest Postgres 13 features, like incremental sorting and parallelized vacuuming for indexes. Finally, you might be looking to migrate to Google Cloud SQL, and thinking that you might as well upgrade to the latest major version at the same time.

Addressing version incompatibilities

First, before upgrading, we’ll want to look at the breaking changes between major versions. Especially if your goal is to bump up multiple versions at once (for example, upgrading from version 9.6 to version 13) you’ll need to account for all of the changes between those versions. You can find these changes by looking at the Release Notes for each version after your current version, up to your target version.

For example, before you begin upgrading a Postgres 9.6 instance, you’ll first need to address the incompatibilities introduced in version 10. These include the renaming of any SQL functions, tools, and options that reference “xlog” to “wal”, the removal of the ability to store unencrypted passwords on the server, and the removal of support for floating point timestamps and intervals.
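As a rough aid for the version 10 renames, a script can flag identifiers in your SQL files that still contain “xlog”. The sketch below is illustrative only: the function name find_xlog_references and the small KNOWN_RENAMES table are assumptions made for this example, and the table lists just a handful of the renamed functions, so a missing suggestion simply means you should check the release notes yourself.

```python
import re

# A few of the PostgreSQL 10 function renames (9.6 name -> 10+ name).
# This table is deliberately incomplete; it only illustrates the idea.
KNOWN_RENAMES = {
    "pg_current_xlog_location": "pg_current_wal_lsn",
    "pg_switch_xlog": "pg_switch_wal",
    "pg_xlogfile_name": "pg_walfile_name",
    "pg_xlog_location_diff": "pg_wal_lsn_diff",
}

# Any word-like token that contains "xlog", regardless of case.
XLOG_TOKEN = re.compile(r"\b\w*xlog\w*\b", re.IGNORECASE)


def find_xlog_references(sql_text):
    """Return (line_number, token, suggestion) for each xlog-style token.

    suggestion is None when the token is not in KNOWN_RENAMES.
    """
    findings = []
    for line_number, line in enumerate(sql_text.splitlines(), start=1):
        for match in XLOG_TOKEN.finditer(line):
            token = match.group(0)
            findings.append((line_number, token, KNOWN_RENAMES.get(token.lower())))
    return findings


if __name__ == "__main__":
    sample = "SELECT pg_current_xlog_location();\nSELECT 1;\nSELECT pg_switch_xlog();"
    for line_number, token, suggestion in find_xlog_references(sample):
        print(line_number, token, "->", suggestion or "check the release notes")
```

A scan like this only finds candidates; each hit still needs a human decision, since the same token may appear in comments or string literals.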

Preparing the source for migration

There are a few steps we’ll need to take before our source database engine is ready for a DMS migration. A more detailed overview of these steps can be found in this guide.

First, you must create a database named “postgres” on the source instance. This database may already exist if your source is a Cloud SQL instance.

Next, install the pglogical package on your source instance. DMS relies on pglogical to transfer data between your source and target instances. If your source is a Cloud SQL instance, this step is as easy as setting the cloudsql.logical_decoding and cloudsql.enable_pglogical flags to on. Once you have set these flags, restart your instance for them to take effect.
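If you prefer scripting over the console for a Cloud SQL source, the two flags can be set with gcloud sql instances patch and its --database-flags option. The sketch below only builds and prints the command; the helper name build_pglogical_flag_command and the instance name my-source-instance are placeholders invented for this example. Keep in mind that --database-flags sets the complete list of flags for the instance, so include any flags the instance already relies on.

```python
import shlex


def build_pglogical_flag_command(instance_name):
    """Return the gcloud argument list that turns on both pglogical flags."""
    flags = {
        "cloudsql.logical_decoding": "on",
        "cloudsql.enable_pglogical": "on",
    }
    flag_argument = ",".join(f"{name}={value}" for name, value in flags.items())
    return [
        "gcloud", "sql", "instances", "patch", instance_name,
        f"--database-flags={flag_argument}",
    ]


if __name__ == "__main__":
    # "my-source-instance" is a placeholder, not a real instance.
    command = build_pglogical_flag_command("my-source-instance")
    # Print the command for review instead of running it.
    print(shlex.join(command))
```

Printing first and running by hand (or via subprocess.run once reviewed) keeps the restart of the instance a deliberate step.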

This post will focus on using a Cloud SQL instance as the source. If your source is a self-managed instance (i.e. on Compute Engine), an on-premises instance, or an Amazon RDS/Aurora instance, this process is a little more involved: you can find instructions for RDS instances here, and for on-prem/self-managed instances here.

Once you have enabled the pglogical flags on the instance, you will need to install the extension on each of your source databases that is not one of the following template databases: template0 and template1. If you are using a source other than Cloud SQL, you can check here to see what source databases need to be excluded. If you’re running Postgres 9.6 or later on your source instance, run CREATE EXTENSION IF NOT EXISTS pglogical; on each database in the source instance that will be migrated.
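Because CREATE EXTENSION applies only to the database you are connected to, the statement has to be run once per database. A minimal sketch of that planning step follows; the helper name plan_extension_installs and the sample database names are assumptions for illustration, and actually executing the statements (for example with psql) is left out.

```python
# Template databases named in the text above; they are skipped.
TEMPLATE_DATABASES = {"template0", "template1"}
CREATE_EXTENSION_SQL = "CREATE EXTENSION IF NOT EXISTS pglogical;"


def plan_extension_installs(database_names):
    """Map each non-template database name to the statement to run in it."""
    return {
        name: CREATE_EXTENSION_SQL
        for name in database_names
        if name not in TEMPLATE_DATABASES
    }


if __name__ == "__main__":
    # Hypothetical database list for a source instance.
    plan = plan_extension_installs(["postgres", "template0", "template1", "orders"])
    for database, statement in plan.items():
        print(f"{database}: {statement}")
```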

Next, you’ll need to grant privileges on the to-be-migrated databases to the user that you’ll be using to connect to the source instance during migration. Instructions on how to do this can be found here. You will enter the username and password for this user when creating a connection profile for the migration job.

Creating the migration job in DMS

The first steps for creating a migration job in DMS are to define a source and destination. When defining a source, you’ll need to create a connection profile by providing the username and password of the migration user that you granted privileges to earlier, as well as the IP address for the source instance. The latter will be auto-populated if your source is a Cloud SQL instance:

Next, when creating the destination, you’ll want to make sure that you have selected your target version of Postgres:

After selecting your source and destination, you choose a connectivity method (see this very detailed post by Gabe Weiss for a deep-dive on connectivity methods) and then run a test to make sure your source can connect to your destination. Once your test is successful, you’re ready to upgrade! Once you start the migration job, data stored in the two instances will begin to sync. It might take some time until the two instances are completely synced. You can periodically check to see whether all of your data has synced by following the steps linked here. All the while, you can keep serving traffic to your source database until you’re ready to promote your upgraded destination instance.

Promoting your destination instance and finishing touches

Once you’ve run the migration, there are still a few things you need to do before your destination instance is production-ready. First, make sure any settings you have enabled on your source instance are also applied to your destination instance. For example, if your organization requires that production instances only accept SSL connections, you can turn on the enforce-SSL flag for your instance.

Some system configurations, such as high availability and read replicas, can only be set up after promoting your instance.

To reduce downtime, DMS migrations run continuously while applications still use your source database. However, before you promote your target to the primary instance, you must first shut down all client connections to the source to prevent further changes. Once all changes have been replicated to the destination instance, you can promote the destination, ending the migration job. More details on best practices when promoting can be found here.

Finally, because DMS depends on pglogical to migrate data, there are a few limitations of pglogical that DMS inherits:

The first is that pglogical only migrates tables that have a primary key. Any other tables will need to be migrated manually. To identify tables that are missing a primary key, you can run this query. There are a few strategies you can use for migrating tables without a primary key, which are described here.
Next, pglogical only migrates the schema for materialized views, but not the data. To migrate over the data, first run SELECT schemaname, matviewname FROM pg_matviews; to list all of the materialized view names. Then, for each view, run REFRESH MATERIALIZED VIEW <view_name>; to populate it.
Third, pglogical cannot migrate large objects. Tables with large objects need to be transferred manually. One way to transfer large objects is to use pg_dump to export the table or tables that contain the large objects and import them into Cloud SQL. The safest time to do this is when you know that the tables containing large objects won’t change. It’s recommended to import the large objects after your target instance has been promoted, but you can perform the dump and import steps at any time.
Finally, pglogical does not automatically migrate users. To list all users on your source instance, run \du in psql. Then follow the instructions linked here to create each of those users on your target instance.
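The materialized view step lends itself to a small script: take the rows returned by the pg_matviews query and emit one REFRESH statement per view. The sketch below assumes you already have those rows as (schemaname, matviewname) pairs; the function names and sample rows are made up for this example, and running the statements against the destination is left to your client of choice.

```python
def quote_identifier(name):
    """Double-quote a PostgreSQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_refresh_statements(matview_rows):
    """Turn (schemaname, matviewname) rows into REFRESH statements."""
    return [
        f"REFRESH MATERIALIZED VIEW {quote_identifier(schema)}.{quote_identifier(view)};"
        for schema, view in matview_rows
    ]


if __name__ == "__main__":
    # Hypothetical rows, as returned by:
    #   SELECT schemaname, matviewname FROM pg_matviews;
    rows = [("public", "daily_totals"), ("reports", "Weekly Sales")]
    for statement in build_refresh_statements(rows):
        print(statement)
```

Quoting both parts of the name keeps views with mixed case or spaces from breaking the generated statements.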

After promoting your target and performing any manual steps required, you’ll want to update any applications, services, load balancers, etc to point to your new instance. If possible, test this out with a dev/staging version of your application to make sure everything works as expected.

If you’re migrating from a self-managed or on-prem instance, you may have to adjust your applications to account for the increased latency of working with a Cloud SQL database that isn’t right next to your application. You may also need to figure out how you can connect to your Cloud SQL instance. There are many paths to connecting to Cloud SQL, including the Cloud SQL Auth proxy, libraries for connecting with Python, Java, and Go, and using a private IP address with a VPC connector. You can find more info on all of these connection strategies in the Cloud SQL Connection Overview docs.

We did it! (cue fireworks)

If you made it this far, congratulations! Hopefully you now have a working, upgraded Cloud SQL Postgres instance. If you’re looking for more detailed information on using DMS with Postgres, take a look at our documentation.


2021 09 16

AWS – AWS IQ now supports AWS Certified experts and consulting firms located in the UK & France

AWS, Cloud AWS

AWS IQ now supports AWS Certified experts and consulting firms located in the UK & France. Quickly find, engage, & get help from experts and consulting firms in the UK and France for on-demand work.


2021 09 16

GCP – Network security threat detection – Comparison of analytics methods

Cloud, Google Cloud gcp

Jaliesha is responsible for cybersecurity within the DevOps team at her cloud-native software service company – they call it DevSecOps. She has several requirements pressing down on her as their offering explodes in popularity and they take in their second round of VC funding:

Meet compliance requirements for Intrusion Detection System and Intrusion Prevention System (IDS / IPS) on the PCI DSS in-scope infrastructure, and produce artifacts for their upcoming SOC2 audit;

Continuously advance toward their goal to fulfill 90% of the Cloud Controls Matrix on their Cloud Security Alliance CAIQ;

Collaborate with the Network Operations Center (NOC) team to leverage their gathered telemetry for network performance monitoring (NPM) to achieve some of her security monitoring goals;

Retain twelve months of network metadata as may be required for threat hunting or legal evidence in case of a serious security incident;

Partner with the organization’s Chief Information Security Officer (CISO) team in improving threat detection and response.

It’s a tall order with a small staff, her and two others. Having recently migrated the majority of their workloads over from another public cloud provider, and having done this three years ago in a private environment with both physical and virtual workloads, she is familiar with the mechanisms she needs, but is still learning how to accomplish those capabilities in Google Cloud. Of course, she’s always looking for ways to get better visibility coverage and detections, simplify her team’s workflow, or both. Jaliesha’s story is a common one we hear around the Meet rooms at Google Cloud.

Google Cloud Network-based Threat Detection

We get it. The Google Cloud cybersecurity product management and engineering teams have been around the industry for decades. We have watched the industry mature from access control to the addition of intrusion prevention and, more recently, analytics-based detection and automated response. As such, we provide several signal types that DevSecOps pros need in network-based threat detection efforts:

IPFIX (NetFlow) records

VPC Flow Logs

Packet Mirroring

Cloud IDS

Network Forensics and Telemetry blueprint

We’ll describe each one of these five offerings and their signal types emitted in brief form, provide links to full details on each, call out the differences, and offer guidance on when to use each one. A quick summary follows in Table 1.

Table 1. Comparison of methods for ingesting signal for Network and Security Analytics

Flow / Session

Let’s start by defining a “network flow,” sometimes called “session,” because the terms will show up repeatedly throughout the remainder of this blog:

Flow noun: a series of packets originating from one network endpoint, the source, to another network endpoint, the destination, bounded by the opening and closing of that single, discrete connection session between the two.

An example would be a simple HTTP connection using TCP, which is IP protocol number 6, between endpoint A, the originator of the connection, with source IP:port of 10.1.1.2:5832, and endpoint B, destination of the connection, with IP:port 10.200.200.12:80. The “5-tuple” that describes any IP flow would be:

IP Protocol #, source IP:port, destination IP:port

In this case:

6, 10.1.1.2:5832, 10.200.200.12:80.
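To make the 5-tuple concrete, here is a minimal sketch that parses the notation used above into a typed record. The names FiveTuple and parse_five_tuple are invented for this example, and the parser handles only IPv4 addresses written exactly in the “protocol, source IP:port, destination IP:port” form.

```python
import ipaddress
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """The five values that identify an IP flow."""
    protocol: int
    source_ip: ipaddress.IPv4Address
    source_port: int
    destination_ip: ipaddress.IPv4Address
    destination_port: int


def parse_five_tuple(text):
    """Parse 'protocol, src_ip:port, dst_ip:port' (IPv4 only in this sketch)."""
    protocol_text, source_text, destination_text = (
        part.strip().rstrip(".") for part in text.split(",")
    )
    source_ip, source_port = source_text.rsplit(":", 1)
    destination_ip, destination_port = destination_text.rsplit(":", 1)
    return FiveTuple(
        int(protocol_text),
        ipaddress.IPv4Address(source_ip),
        int(source_port),
        ipaddress.IPv4Address(destination_ip),
        int(destination_port),
    )


if __name__ == "__main__":
    flow = parse_five_tuple("6, 10.1.1.2:5832, 10.200.200.12:80.")
    print(flow)
```

Because a NamedTuple is hashable, a value like this can serve directly as a dictionary key when counting packets or bytes per flow.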

IPFIX (was NetFlow)

What is it:

IPFIX is an international standard protocol from the IETF used to export IP flow information from network devices in a common format. It is based on NetFlow, a protocol originally designed by Cisco and published in the IETF as Informational.

IPFIX describes the network flow’s 5-tuple, and also offers counters and other basic information about the connection that can be used for measurement, monitoring, accounting, billing, etc. It is the foundation of almost all network monitoring today. Almost all modern mediating network devices support the export of IPFIX, including routers, switches, probes, next-gen firewalls (NGFWs), NAT gateways, etc. Many now perform the IPFIX data parsing, tracking, message creation and sending in hardware silicon to enable IPFIX export without impacting throughput.

IPFIX / NetFlow export can be configured on a number of different networking vendors’ images that operate on Google Cloud, e.g. an NGFW, and are available in the Cloud Marketplace. Those records can be sent to Chronicle, or to any number of 3rd party network performance monitoring (NPM), network traffic analytics (NTA), security information & event management (SIEM), network- or extended threat detection & response (NDR / XDR), or other 3rd party detection and response systems. Our recently announced Autonomic Security Operations solution, alongside integrations with Google Cloud’s business intelligence (BI) and analytics platform, Looker, and Chronicle (read the blog), makes querying this data set a task available to any SOC user via human language query, not just data analysts or those with SQL scripting expertise.

Graphic 1. Flow logs provide basic description, but not contents or intent

When to use it:

Jaliesha’s firm’s Network Operations Center (NOC) uses IPFIX to monitor network and endpoint connectivity, identify outages, measure utilization (for capacity planning, accounting, and billing), and to determine connectivity graphs. Using the latter, they established the baseline communication patterns for a network of endpoints. From that baseline, they derived a statistical characterization of the communication patterns between all endpoints. This became the basis for many things; capacity planning is one example.

From a security perspective, the baseline was also the foundation for migrating from a default-allow-explicit-block environment to the opposite, a zero-trust environment, i.e. default-deny-explicit-allow. This flow data allowed them to:

characterize the connectivity graph for a given endpoint, set of endpoints, each microservice, and an entire app;

determine if that pattern was currently in or out of policy;

draft an access control, authentication, and authorization policy for connectivity;

hypothetically evaluate that policy against historical and current traffic patterns, and

adjust the enforced policy (via Cloud Armor, access control, IAM, Cloud Firewall, Cloud NAT, and VPC Service Controls).
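The “hypothetically evaluate” step in the list above can be sketched as replaying recorded flows against a draft default-deny allow list and reporting what would have been blocked. Everything in the example is an assumption made for illustration: the Flow and AllowRule shapes, the function names, and the sample addresses do not correspond to any Google Cloud API or to a real firewall rule format.

```python
import ipaddress
from typing import NamedTuple


class Flow(NamedTuple):
    protocol: int
    source_ip: str
    destination_ip: str
    destination_port: int


class AllowRule(NamedTuple):
    protocol: int
    source_network: str       # CIDR, e.g. "10.1.1.0/24"
    destination_network: str  # CIDR
    destination_port: int


def is_allowed(flow, rules):
    """Default deny: a flow passes only if some rule explicitly matches it."""
    source = ipaddress.ip_address(flow.source_ip)
    destination = ipaddress.ip_address(flow.destination_ip)
    for rule in rules:
        if (
            flow.protocol == rule.protocol
            and flow.destination_port == rule.destination_port
            and source in ipaddress.ip_network(rule.source_network)
            and destination in ipaddress.ip_network(rule.destination_network)
        ):
            return True
    return False


def flows_denied_by_policy(historical_flows, rules):
    """Return the recorded flows the draft policy would have blocked."""
    return [flow for flow in historical_flows if not is_allowed(flow, rules)]


if __name__ == "__main__":
    draft_rules = [AllowRule(6, "10.1.1.0/24", "10.200.200.0/24", 80)]
    history = [
        Flow(6, "10.1.1.2", "10.200.200.12", 80),  # matches the rule
        Flow(6, "10.1.1.2", "10.200.200.12", 22),  # port not allowed
        Flow(6, "10.9.9.9", "10.200.200.12", 80),  # source not allowed
    ]
    for flow in flows_denied_by_policy(history, draft_rules):
        print("would be denied:", flow)
```

Reviewing the denied list before enforcement shows whether each entry is a gap in the draft policy or traffic that should indeed be blocked.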

As pointed out in Graphic 1 above, flow logs cannot determine anything about the contents or intent of a connection. For that Jaliesha needs metadata about the contents of the flow. We’ll address this in the VPC Flow Logs, Cloud IDS and Network Forensics & Telemetry Blueprint sections below.

VPC Flow Logs

What is it:

These logs record network flows sent from, and received by, workloads in a VPC, including both VM instances, and Google Kubernetes Engine nodes. They sample each workload’s TCP, UDP, ICMP, ESP, and GRE flows. They are a superset of IPFIX/NetFlow, adding attributes about each flow, including several attributes that are Google Cloud environment specific, like GCE & GKE context. They are based on sampled packets, interpolated to improve performance, and optionally filtered to reduce clutter and volume.

Not every packet is captured into a log record. About 1 out of every 10 packets is captured, but this sampling rate might be lower depending on the VM’s load. The method compensates for missed packets by interpolating from the captured packets. This happens for packets missed because of initial and user-configurable sampling settings. All packets collected for a given connection are aggregated over a period of time (the aggregation interval). If a flow is captured by sampling, VPC Flow Logs generates a log for that flow; it follows that, because of sampling, not every flow will be recorded. Each flow record includes the information described in the Record format, as listed here in Table 2 (where they are also contrasted to Cloud IDS threat log formats).
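A toy model helps show why sampling plus interpolation gives good estimates for large flows while short flows can be missed entirely. The sketch below is not how VPC Flow Logs is implemented: it keeps every tenth packet of a stream deterministically and scales the counts back up, and the function name and packet representation are assumptions for this example.

```python
from collections import defaultdict


def sample_and_aggregate(packets, sample_every=10):
    """Keep every Nth packet of the stream and scale counts up by N.

    packets is an iterable of (flow_key, byte_count) pairs. The result maps
    each sampled flow_key to estimated packet and byte totals. Flows with
    no sampled packet do not appear in the result at all.
    """
    totals = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for index, (flow_key, byte_count) in enumerate(packets):
        if index % sample_every != 0:
            continue  # packet not sampled
        totals[flow_key]["packets"] += sample_every
        totals[flow_key]["bytes"] += byte_count * sample_every
    return dict(totals)


if __name__ == "__main__":
    # One long flow "A" (101 packets) followed by a short flow "B" (5 packets).
    stream = [("A", 100)] * 101 + [("B", 50)] * 5
    print(sample_and_aggregate(stream))
```

In this run the long flow is estimated at 110 packets against a true 101, while the short flow never lands on a sampled position and so produces no record.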

By default, VPC Flow Log entries are annotated with metadata information, such as the names of the source and destination VMs or the geographic region of external sources and destinations. These are also represented in Table 2. Metadata annotations can be turned off, or you can specify only certain annotations, to save storage space.

VPC Flow logs are aggregated by connection and exported in real time. They can be filtered by user-specified criteria, and a second sampling of logs can be taken according to a configurable sample rate parameter, again, to save space.

The logs are then sent to Cloud Logging (by default), where the data can be viewed. From there logs may be exported to any destination that Cloud Logging export supports. Alternatively, the logs can be excluded from Cloud Logging and sent to Pub/Sub, to analyze them using real-time streaming APIs or 3rd party tools.

VPC Flow Logs introduce no delay or performance penalty to customer workloads when enabled.

When to use it:

Jaliesha’s firm uses VPC Flow Logs for NOC-type network monitoring, utilization (for expenses), and capacity planning, along with DevSecOps / SOC-type monitoring.

Their NOC teams’ data scientists run predictive models against historical data, and use a Pub/Sub function to feed the logs in a real-time stream to a 3rd party tool that does the same, for more advanced, notify-prior-to-outage operations. For utilization and expenses, the NOC characterizes traffic between regions and zones, to specific countries or carriers on the Internet, and top talkers.
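A “top talkers” report of the kind mentioned above boils down to summing bytes per source and sorting. In the sketch below, the flat src_ip and bytes_sent keys are a simplification assumed for this example rather than the exact exported record layout, and the function name is hypothetical.

```python
from collections import Counter


def top_talkers(flow_records, limit=3):
    """Return the (source IP, total bytes) pairs with the highest byte counts."""
    bytes_by_source = Counter()
    for record in flow_records:
        bytes_by_source[record["src_ip"]] += record["bytes_sent"]
    return bytes_by_source.most_common(limit)


if __name__ == "__main__":
    # Hypothetical, simplified flow records.
    records = [
        {"src_ip": "10.1.1.2", "bytes_sent": 100},
        {"src_ip": "10.1.1.3", "bytes_sent": 500},
        {"src_ip": "10.1.1.2", "bytes_sent": 300},
        {"src_ip": "10.1.1.4", "bytes_sent": 50},
    ]
    for source, total in top_talkers(records, limit=2):
        print(source, total)
```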

For the DevSecOps team, they filter the VPC Flow Logs by VMs and by applications to understand connectivity graphs, and changes within the graph, when moving to a zero-trust environment, and when setting or changing access control and micro-segmentation rules, building on what was originally accomplished with IPFIX records. They also accomplish some measure of network forensics, like which endpoint talked with whom and when, and analyze all flows to/from a compromised workload, to aid in root cause analysis and mitigation. Last, they leverage the real-time streaming APIs (through Pub/Sub), and integrate with their commercially available SIEM.

VPC Flow Logs provide enough signal for some real-time security analysis, as described above. They can be used to detect some attacks or undesirable traffic, like DDoS, interaction with known malicious IPs, etc. Other tools we offer from Google Cloud are better suited to network-based advanced threat detection, like Cloud IDS and the Network Forensics & Telemetry Blueprint, as discussed below.
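One of the simpler checks mentioned above, interaction with known malicious IPs, can be sketched as matching each flow’s endpoints against a block list of networks. The record keys src_ip and dest_ip are a simplified assumption for this example, the function name is hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress


def flows_touching_blocklist(flow_records, blocked_networks):
    """Return records whose source or destination falls in a blocked network."""
    networks = [ipaddress.ip_network(network) for network in blocked_networks]
    hits = []
    for record in flow_records:
        endpoints = (record["src_ip"], record["dest_ip"])
        if any(
            ipaddress.ip_address(ip) in network
            for ip in endpoints
            for network in networks
        ):
            hits.append(record)
    return hits


if __name__ == "__main__":
    # Hypothetical block list and flow records.
    blocklist = ["203.0.113.0/24"]
    records = [
        {"src_ip": "10.1.1.2", "dest_ip": "198.51.100.7"},
        {"src_ip": "10.1.1.3", "dest_ip": "203.0.113.50"},
        {"src_ip": "203.0.113.9", "dest_ip": "10.1.1.4"},
    ]
    for record in flows_touching_blocklist(records, blocklist):
        print("suspicious flow:", record)
```

Since the logs are sampled, a check like this can confirm contact with a listed address but cannot prove that no contact occurred.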

Packet Mirroring

What is it:

Packet Mirroring is a product available on Google Cloud that clones traffic of customer-specified workloads in your VPC network and forwards it for examination. Packet Mirroring captures the full packet data, including payloads and headers. The customer configured system(s) that receive the packet stream from Packet Mirroring, called “backend collectors,” can be a number of different 3rd party, or Google Cloud, solutions:

Parsing tool (to create and store structured metadata), e.g. Zeek open source tool

Data lake, for both structured and unstructured data, e.g. Elastic

NTA workload

Detection engine, like an IDS, NDR, or next-gen firewall (NGFW) workload

Once Packet Mirroring is enabled, Google Cloud sets up the overlay encapsulation and forwarding for all the packets from the mirrored workload(s) to the packet receivers (back-end collectors, network sensors, etc.); the customer simply configures an Internal TCP/UDP Load Balancer, ILB, the service, and the backend collectors that will receive those packets. We take care of shuttling the mirrored traffic, encrypted and cryptographically authenticated, thus maintaining privacy, packet integrity and satisfying compliance standards for sensitive information. Policies provide Jaliesha’s team with highly customizable filtering so that they pay to mirror only the traffic they need in each use case, allowing them to manage their cost to meet their budget. All this activity occurs with zero compute performance impact on workloads.

Packet Mirroring is available on all GCE instances and GKE clusters, in all Google Cloud regions.

When to use it:

Packet Mirroring is useful when you need full-packet data – not just flow data (IPFIX / NetFlow), nor sampled flow data (VPC Flow Logs) – to monitor and analyze your network for performance issues (NPM, NTA), security incidents (IDS, SIEM, NDR, XDR), connection or application troubleshooting, application performance monitoring (APM), or lawful intercept. Often, the recipient of the mirrored packets will be 3rd party, commercially available products.

Cloud IDS

What is it:

Google Cloud IDS (currently in Preview) is a network-based threat detection technology that helps identify applications in use, and many classes of exploits, including app masquerading, malware, spyware, command-and-control attacks, to name a few. Cloud IDS is built with Palo Alto Networks’ industry-leading threat detection technologies, backed by Unit 42’s high-quality threat research for high security efficacy. Customers can enjoy ease of deployment and a Google Cloud integrated experience with high performance.

Cloud IDS deploys in just a few clicks and is easily managed with either the Google Cloud browser-based GUI, CLI, or APIs. It leverages Cloud’s Packet Mirroring capability under the hood to move desired flows (defined by policy) to the detection engines. With Palo Alto Networks technology embedded, Cloud IDS helps ensure that your network is free of malicious applications masquerading as legitimate ones through App-ID™ technology. What’s more, there is no need to spend time crafting detection signatures; it’s built-in already. For known attacks, threat detection mechanisms, libraries, and signatures are continually updated and applied automatically – as you would expect of a managed cloud offering; for zero-day attacks, anomaly detections that alert on malicious behavior are similarly updated and applied. Cloud IDS delivers all this without the need to architect or manage all the pieces, removing an enormous operational burden, and freeing your SecOps.

Cloud IDS emits threat detection alert logs. See Table 2, for a comparative listing of the fields in these alerts and VPC Flow Logs.

Table 2. Comparison of fields from Cloud IDS and VPC Flow Logs

When to use it:

Going beyond VPC Flow Logs, Cloud IDS provides near-real-time visibility into observed, network-based threats. Unlike an NGFW sitting at a VPC edge, Cloud IDS provides full East-West visibility and detection, within VPCs, within GKE, and even between container pods.

Because Jaliesha works in a highly regulated industry, she uses Cloud IDS to help support their compliance requirements and goals for visibility and threat detection.

Since her DevSecOps team is stretched thin on both time and money, Jaliesha appreciates the infusion of security research and expertise her team gains from automated detections being shown to them, rather than having to construct and script their own. Other customers are less into DIY detecting, and appreciate how our system covers all the basics automatically for them. For the DevSecOps team, Cloud IDS is a force multiplier, automating the bulk of the detection work, freeing them to do the more advanced detection and threat hunting specific to their environment.

Network Forensics & Telemetry Blueprint

What is it:

Jaliesha’s team has tools and methods they use to monitor and analyze network traffic in their own data centers, including detecting and hunting for threats both manually and with automation. Because these tools were designed for physical networks, and private clouds, they were not applicable when they migrated workloads to the cloud. The Network Forensics and Telemetry Blueprint addresses this need; it translates network packet data into logs so customers can use it in Google Chronicle, their existing SIEM, or other 3rd party analytics dashboards and tools, like NDR and XDR tools. The Network Forensics and Telemetry Blueprint is also natively offered through our Autonomic Security Operations solution.

Following this document, the DevSecOps team only needed to decide which workloads’ flows – which environments – they wanted to monitor. From there, Terraform scripts created the entire environment for them, which accomplished the following:

Copying packets for all those flows, with zero performance impact to the workloads

Translating the copied packets’ raw data into metadata

Pushing structured and unstructured metadata into the analysis tools of their choice and/or storing it for post mortem investigation, e.g. in a data lake, Chronicle, a SIEM, Elastic, etc.

The recipe uses Packet Mirroring, ILB, Managed Instance Groups, Zeek, Pub/Sub, and (optionally) Google Chronicle:

Packet Mirroring – Sends copies of packets from all desired flows to/from any workloads desired

ILB & Managed Instance Groups – load balances the traffic to allow easy scaling of the Zeek instances

Zeek – Open source software for parsing raw packets into metadata

Pub/Sub – To export the metadata from Zeek to your data lake or analysis tool, if you already have one, e.g. an Elastic cluster. In Jaliesha’s case, they wanted something purpose built for threat detection and handling, so they chose Chronicle, which didn’t need Pub/Sub, because it is already integrated…

Chronicle – Allows security teams to cost effectively store, analyze, and write automated responses from all their security data to aid in the investigation and detection of threats. Customers author detections in Chronicle.

With Chronicle, customers build the required data pipeline for security analytics by sending log and metadata streams to it. Chronicle then provides a single pane of glass for analytics and detections, including support for aggregating data from a hybrid, multi-cloud architecture. Table 3 provides a few examples of rules that could be defined in Chronicle in order to detect web attacks, such as remote code execution and ransomware:

Table 3.

With these tools in place, the DevSecOps team scripts automation to take remediating actions in the environment when detections fire. An example would be the automated disconnection of network interfaces of a VM instance that is found to have command & control connections to an external, unfamiliar and unauthorized site, combined with observations of scanning internal networks.
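The correlation in that example (act only when the same instance shows both command & control traffic and internal scanning) can be sketched as a small grouping step. The signal names, the detection tuple format, and the function name below are all hypothetical; a real pipeline would take its detections from Chronicle or a SIEM and would call the relevant cloud API to disconnect the interface, which this sketch does not do.

```python
from collections import defaultdict

# Both signals must be present before an instance is isolated.
REQUIRED_SIGNALS = {"c2_connection", "internal_scan"}


def instances_to_isolate(detections):
    """Return sorted names of instances that raised every required signal.

    detections is an iterable of (instance_name, signal_name) pairs.
    """
    signals_by_instance = defaultdict(set)
    for instance_name, signal_name in detections:
        signals_by_instance[instance_name].add(signal_name)
    return sorted(
        name
        for name, signals in signals_by_instance.items()
        if REQUIRED_SIGNALS <= signals
    )


if __name__ == "__main__":
    # Hypothetical detections.
    detections = [
        ("vm-a", "c2_connection"),
        ("vm-b", "internal_scan"),
        ("vm-a", "internal_scan"),
        ("vm-c", "c2_connection"),
    ]
    print(instances_to_isolate(detections))
```

Requiring both signals before acting is a design choice that trades some response speed for fewer disruptive false positives.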

Another example would be observing and mapping the healthy external services that internal workloads connect to in order to do their job, and then alerting on any connections made to unfamiliar domains by opening a ticket in their ServiceNow system for the team to address with the instance owner. Learn more about this exact use case from another customer in this Security Talks session.

When to use it:

The DevSecOps team uses this blueprint because they are going one step beyond Cloud IDS’s detections: they are threat hunting and authoring detections specific to their environment and volumes, and then automating remediation actions accordingly.

To do this, they want more than log data, they want packet and flow (PCAP) metadata. They also want to control the backend collector VMs that are deployed in their VPCs, and aggregate these collectors’ data into one location, combining them with other devices’ log data. They even add in endpoint logs and activity metadata.

They also have an outsourced managed security service provider (MSSP) that assists with tier 1 monitoring and validation of detections. Together with the MSSP, they write scripts and conduct manual threat hunting using flow metadata, device logs, and connection information placed into Chronicle from VPC Flow Logs, Cloud IDS threat logs, device logs (such as those from their NGFW), DNS logs, and their Cloud Armor DDoS and WAF alerts.

Further, they store the network metadata and log data for 12 months, in order (a) to investigate security incidents, i.e. perform post mortem analysis to understand security impact and root cause once they’ve identified a breach or loss, and (b) to stockpile a data set on which to train their data science-based detections, some of which are ML-based (they use Google’s rich suite of AI/ML/Analytics tools to make this easy).

Table 4 shows the differences specifically between Cloud IDS and the Network Forensics & Telemetry Blueprint.

Table 4.

Summary

Jaliesha’s DevSecOps team has come a long way, and is using a variety of tools to keep their environment clean, meet compliance requirements, and minimize loss due to cyber attacks, as seen in Table 1 above. Some of these tools are cloud-native services from Google Cloud, like VPC Flow Logs, Packet Mirroring, Cloud IDS, and Chronicle. Others are integrations of our services, combined with both 3rd party, commercial tools and open source software. In this way, their threat detection environment is very similar to many of Google Cloud’s more security-focused customers.

Jaliesha is gratefully using these tools, but she’s hungry for more. We, the product management and engineering team for Network Security, meet with her regularly to hear her needs and collaborate on future offerings, because we aren’t done yet, not even close. We’ve got a lot more hybrid, multi-cloud, security services coming. If you’re like her, and have ideas on what we could do to be a force multiplier for your cybersecurity DevSecOps practice, please reach out to us through your account team; it would be great to add you and your team to our list of design partners.

If you are interested in learning more about the Network Forensics & Telemetry Blueprint, please read our technical blog here.

Read More for the details.

2021 09 16

GCP – Google Cloud announces new Cloud Digital Leader training and certification

Cloud, Google Cloud gcp

You asked for it, we listened! Today we’re announcing the Cloud Digital Leader learning pathway, our first offering for business professionals that includes both training and certification.

The Cloud Digital Leader learning pathway is designed to skill up individuals and teams that work with technical Google Cloud practitioners so they can contribute to strategic cloud-related business decisions.

A Cloud Digital Leader understands and can distinguish not only the various capabilities of Google Cloud core products and services, but also how they can be used to achieve desired business goals.

We asked one of our customers that participated in the beta why they are excited about this new offering and they said:

“ANZ is transforming its technology landscape by addressing the size and complexity of our current estate and fully embracing cloud. Our strategic advantage has always been our people; they are crucial to the transformation. One of the best ways to ensure they are set up for success is to provide relevant learning opportunities. The benefit of the Google Cloud Digital Leader certification is it provides general cloud knowledge and a shared language across the bank so no one is left behind. This means our technology teams as well as our business and enablement teams.” – Michelle Dobson, Head of Cloud COE & Enablement, Australia and New Zealand Banking Group Limited

Cloud Digital Leader Training

The Cloud Digital Leader training courses are designed to increase your team’s cloud confidence so they can collaborate with colleagues in technical cloud roles and contribute to informed cloud-related business decisions.

The courses provide customers with fundamental knowledge related to digital transformation with Google Cloud. The four courses are:

1: Introduction to Digital Transformation with Google Cloud

2: Innovating with Data and Google Cloud

3: Infrastructure and Application Modernization with Google Cloud

4: Understanding Google Cloud Security and Operations

Completion of these courses is recommended (not required) as one of the steps to prepare for the Google Cloud Digital Leader Certification exam.

Cloud Digital Leader Certification

Acquiring the Google Cloud Digital Leader Certification is an opportunity for your entire team to demonstrate its strong understanding of cloud capabilities, which can enhance organizational innovation with Google Cloud.

The Cloud Digital Leader exam is role-independent and does not require hands-on experience with Google Cloud. The Cloud Digital Leader exam assesses your knowledge in three areas:

General cloud knowledge

General Google Cloud knowledge

Google Cloud products and services

This certification is a new offering and additional resources will be available soon. Check back on the learning path page.

Start Innovating with Google Cloud

Get your team started on their Cloud Digital Leader learning journey!

Speak to your sales representative about skilling up your team.

Review the Cloud Digital Leader Certification exam using the exam guide.

Take the Cloud Digital Leader learning path

Read More for the details.

2021 09 16

AWS – Amazon RDS now supports X2g instances for MySQL, MariaDB, and PostgreSQL databases.

AWS, Cloud AWS

Amazon Relational Database Service (Amazon RDS) now supports AWS Graviton2-based X2g database (DB) instances for MySQL, MariaDB, and PostgreSQL databases. X2g DB instances offer double the memory per vCPU compared to R6g/R5 instances and the lowest cost per GiB of memory in Amazon RDS for MySQL, MariaDB, and PostgreSQL databases. The X2g.16xl DB instance has 33% more memory than previously available in Amazon RDS DB instances for MySQL, MariaDB, and PostgreSQL databases and is a great choice for memory-intensive DB workloads.

Read More for the details.

2021 09 16

AWS – Amazon CloudWatch Application Insights adds account application auto-discovery and new health dashboard

AWS, Cloud AWS

Setting up monitoring and managing the health of your business applications is now even easier. CloudWatch Application Insights can discover the applications and resources in your account even without a Resource Group, automatically set up monitoring for them, and show their health at a glance in a summary health dashboard presented when you complete setup or open CloudWatch Application Insights. CloudWatch Application Insights is a service that helps customers easily set up monitoring and troubleshoot their enterprise applications running on AWS resources. The new feature makes setting up monitoring for all the resources in your account a truly one-step process.

Read More for the details.

2021 09 16

AWS – Announcing Build on AWS for Startups

AWS, Cloud AWS

Amazon Web Services (AWS) announces the general availability of Build on AWS, a new offering from AWS Activate designed to help startups build their infrastructure on AWS in minutes. Build on AWS is a collection of infrastructure templates and reference architectures covering a wide variety of solutions curated specifically for startups. These solutions are built by experts at AWS and based on AWS best practices. This enables startups to focus on building their core product knowing they’re using AWS best practices for their underlying cloud infrastructure. With the launch of Build on AWS, we’ve simplified the first steps of launching scalable, reliable, secure, and optimized infrastructure tailored to startups’ industry or use case.

Read More for the details.

2021 09 16

GCP – To serve and protect: New storage features help ensure data is never lost

Cloud, Google Cloud gcp

Data is a company’s most important input and asset and one of the most critical tasks is protecting it. At the same time, customers tell us they wish it were easier to get the right level of protection for their data. That can be hard to do, whether your data is on-premises or in the cloud. For example, configuring synchronous replication on-prem for a tier-one application requires you to set up, manage and monitor both networking and storage. Additionally, many organizations struggle to extend the standard protection policies they have in place for VMs to container infrastructure.

Today, we are adding extensions to our popular Cloud Storage offering and introducing two new services: Filestore Enterprise and Backup for Google Kubernetes Engine (GKE). Together, these new capabilities will make it easier for you to protect your data out of the box, across a wide variety of applications and use cases.

“Enterprise customers want to spend less time managing storage,” said Matt Eastwood, Senior Vice President, Enterprise Infrastructure and Cloud, IDC. “Although public cloud is growing fast, it is still new in many organizations, and skill gaps are something we know slows cloud adoption. It is great to see Google Cloud focusing on ease of use as they bring new services to market.”

Extending Cloud Storage dual-region buckets

One of Cloud Storage’s most important features is dual-region and multi-region buckets. Relying on technologies such as Colossus and Spanner, Cloud Storage dual-region buckets enable what we like to call a “continent-sized storage system”—a true single namespace (a.k.a. bucket) that spans regions. A dual-region bucket is not a simple load balancer or access point in the network tier sitting on top of two independent buckets. It is a true single namespace bucket, active-active for read/write/delete, which offers some important strong consistency properties. As a result, developers can treat a continent like a single bucket, dramatically simplifying the application programming model. Google Cloud is unique in offering this capability among major public cloud vendors.

As our customers are aware, it’s super easy to create a dual-region bucket. Setting up a dual-region bucket is a simple matter of selecting the dual-region option when you first create it:

Today we are extending our dual-region buckets in two important ways:

Custom region selection for dual-region buckets – Previously, Google Cloud assigned dual region pairs for you to choose from. In an upcoming release, you’ll be able to select your own region pairs that meet your regulatory or compliance requirements, or optimize your app performance. Are you a financial company working with market data in Frankfurt and London? West coast US media firm that wants to use Los Angeles and Las Vegas? All this will be possible with this new capability.

Optional SLA-backed 15 min Recovery Point Objective (RPO) – Object storage is increasingly being used for important applications that can’t tolerate any data loss. Getting 99% of your data back after a regional outage just doesn’t cut it. We are excited to announce Turbo Replication for dual-region buckets, which replicates 100% of your data between regions in 15 minutes or less, backed by a Service Level Agreement, a first from a leading cloud provider.

To learn more about your data locality options, check out this guide to Cloud Storage bucket locations.

New Backup for GKE protects data in containers

Google Cloud users continue to adopt GKE in droves, benefitting from the greater application-development velocity it provides. And they’re no longer just running stateless applications in containers; they’re also running databases such as MySQL and PostgreSQL inside containers, as well as other stateful workloads. To help further accelerate this trend, we introduced Backup for GKE, a new native GKE service that makes it easier to protect your critical container-based data.

To learn more about Backup for GKE and how customers such as Broadcom and Atos are using it, check out the dedicated blog.

Filestore Enterprise offers protection from zonal outages

Filestore Enterprise is a brand new member of the fully managed Filestore family targeted at traditional tier-one enterprise applications (e.g., SAP) that need to share files. In addition to high-performance reads and writes, Filestore Enterprise offers high availability via synchronous replication across multiple zones in a region. When any zone within the region becomes unavailable, Filestore continues to transparently serve data to the application without any operational intervention.

To find out more about Filestore Enterprise and hear how Sabre is using it for their SAP deployment and their stateful GKE apps, check out this blog.

Data protection is in our DNA

Here at Google Cloud, we take data protection very seriously, and we’ll continue to focus on using Google’s global technology platform to give you powerful and easy-to-use capabilities to solve your most pressing storage challenges.

To learn more about Google Cloud’s family of storage and data protection products, register for our October 6th webinar: What’s New with Storage at Google Cloud, and attend Google Cloud Next ‘21.

Read More for the details.

2021 09 16

GCP – Announcing Filestore Enterprise, for your most demanding apps

Cloud, Google Cloud gcp

As more organizations move to Google Cloud, they need to consider their storage architecture—particularly for applications that they plan to lift and shift. The fact is that many legacy applications can’t use cloud-based object storage like Cloud Storage, and instead require a file- or block-based offering.

Today, we’re announcing Filestore Enterprise, a fully managed cloud-native NFS solution that lets you confidently deploy critical file-based applications in Google Cloud, backed by a Service Level Agreement that delivers 99.99% regional availability. Originally launched in 2018, the Filestore product family now includes:

Filestore Basic for file sharing, software development, and GKE workloads

Filestore High Scale for high performance computing (HPC) application requirements such as genome sequencing, and financial-services trading analysis

Filestore Enterprise for critical applications (e.g., SAP) and GKE workloads

Figure 1 Filestore product family

With its 99.99% regional-availability SLA, Filestore Enterprise is designed for applications that demand high availability. With a few mouse clicks (or a few lines of gcloud CLI or API calls), you can provision NFS shares that are synchronously replicated across three zones within a region. In the event that any zone within the region becomes unavailable, Filestore Enterprise continues to transparently serve data to the application with no operational intervention on your part.

To further protect critical data, Filestore also lets you take periodic snapshots of the file system and retain a desired number of recovery points. With Filestore, you can easily recover an individual file or an entire file system in less than 10 minutes from any of the prior snapshot recovery points.

Filestore Enterprise in action with SAP

SAP is a good example of a critical application that relies on both block (database tier) and file (application tier) storage. Recently Google and SAP announced our joint partnership to help organizations migrate SAP to Google Cloud, and the availability of Filestore Enterprise is an important enabler of that partnership.

Last year, Google and Sabre announced a 10-year partnership to build the future of travel.

“As we continue our IT application modernization and migration projects, we intend to retire our data centers and run our applications and infrastructure in Google Cloud. To do so we need highly reliable storage for SAP, the stateful applications we’re deploying on GKE, and to support our travel and hospitality solution businesses,” said Patrick Uckermark, Sr. Director Enterprise Architecture at Sabre. “I’m looking forward to migrating off our legacy on-premises storage vendors that require complex storage configuration and management. I plan to utilize Filestore Enterprise for its ease of use, regional availability, and snapshot capabilities to support SAP, GKE, and our other critical applications.”

For critical applications like SAP, both the database and application tiers need to be highly available. To satisfy this requirement, you can deploy the SAP database tier to Regional Persistent Disk, which is synchronously replicated to multiple zones, while using Filestore Enterprise for the application tier. Similarly, the NetWeaver application tier, which requires shared executables across many VMs, can be deployed to Filestore Enterprise, which replicates the data across multiple zones within a region. The end result is a highly available three-tier mission-critical application architecture.

Figure 2 Filestore and Highly Available SAP

IT organizations are also increasingly deploying stateful applications in containers on Google Kubernetes Engine (GKE). This often causes them to rethink which storage infrastructure to use to support those applications. Based on application requirements, today you can use Persistent Disk (block), Filestore Basic or Enterprise (file), and/or Cloud Storage (object). Filestore Enterprise, with its managed Kubernetes Container Storage Interface (CSI) driver, allows organizations that require multiple GKE Pods to have shared file access, providing an increased level of availability for mission-critical workloads. (We also announced Backup for GKE in Preview today.)

SADA Systems is a three-time Google Cloud Reseller Partner of the Year and has migrated thousands of companies to Google Cloud. “We work with many organizations embarking on a cloud migration strategy and they require a combination of block, object, and file storage solutions to satisfy their diverse application requirements,” said Miles Ward, CTO, SADA. “Our team is looking forward to working with our customers to leverage Filestore Enterprise for their GKE workloads and experience the simplicity of a cloud-native NFS solution. Additionally, the 99.99% regional availability will be extremely valuable to our customers looking to run more mission-critical applications in Google Cloud.”

To get started with Filestore Enterprise, register for our October 6th webinar, Explore What’s New with Google Cloud Storage, and the Google Cloud Next ‘21 INF206 breakout session, where Sabre’s Patrick Uckermark will share more about how they intend to use Filestore to support their modernization journey.

Read More for the details.

2021 09 16

GCP – Announcing Backup for GKE: the easiest way to protect GKE workloads

Cloud, Google Cloud gcp

Organizations everywhere have been choosing to build on Google Kubernetes Engine (GKE), driven by benefits like higher developer productivity and lower infrastructure costs. And one of the fastest growing GKE architectures is the deployment of stateful workloads, like relational databases, inside GKE containers. Stateful workloads have additional requirements over stateless workloads, including the need for data protection and storage management.

Today, we are announcing the Preview for Backup for GKE, a simple, cloud-native way for you to protect, manage, and restore your containerized applications and data. With Backup for GKE, you can more easily meet your service-level objectives, automate common backup and recovery tasks, and show reporting for compliance and audit purposes.

Best of all, this means more applications deployed in GKE, making it easier for our largest customers, like Broadcom, to expand their use of GKE and manage these new, more demanding workloads. Google Cloud is the first cloud provider to offer a simple, first-party backup for Kubernetes.

“Backup for GKE makes it easier for us to protect our stateful workloads in GKE, and it makes restoring those stateful workloads much simpler and faster,” said Jose Chavez, SaaS Platform and Delivery Engineer at Broadcom. “We see integrated backup as another sign of GKE’s maturity for stateful workloads, and we look forward to using it to serve our worldwide internal customers at Broadcom.”

Protecting containers: how Backup for GKE works

Prior to Backup for GKE, many GKE customers backed up their stateful application data separately from GKE cluster state data. Application data could be protected via a storage-based backup, while cluster state data might be captured occasionally using custom scripts and stored in a separate customer bucket. Customers with ongoing backup requirements relied on homegrown solutions to perform regular backups and to demonstrate compliance. In the event of a restore, customers had to perform more complex orchestration. Storage management tasks, like creating a clone for testing purposes, or migrating data from one cluster to another, meant additional operational overhead.

Backup for GKE orchestrates data protection and restores for you, so that you can manage data at the container level. With Backup for GKE, you can create a backup plan to schedule periodic backups of both application data and GKE cluster state data. You can also restore each backup to a cluster in the same region or, alternately, to a cluster in a different region. You can even customize your backups to ensure application consistency for the most demanding, tier-one database workloads. The result is a feature that drives down the operational cost for infrastructure teams at companies like Atos, while also making it easier for architects and developers to use GKE for their most critical applications.

“Over the past several months, we have been impressed by Backup for GKE and how it reduces our operational workload when protecting GKE clusters,” said Jaroslaw Gajewski, Digital Cloud Services Lead Architect and Distinguished Expert at Atos. “This feature supports our continued adoption of infrastructure-as-code as part of Digital Cloud Services landing zones delivery with our joint customers and, more importantly, ensures that we can deliver the demanding service levels our customers require to run mission-critical applications.”

Another sign of GKE maturity and momentum

Integrated, first-party backup functionality has long been a milestone for leading infrastructure software vendors on their way to mass adoption. Relational database vendors delivered their first-party backup tools over twenty years ago, and hypervisor vendors followed up with standardized backup APIs over ten years ago. Today, GKE’s first-party backup offering is ready for our customers.

We’re thrilled that more organizations are turning to GKE for more of their mission-critical workloads, including stateful applications. Our team has worked hard to deliver the best Kubernetes service for all workloads, and we’re energized by what our customers have created on our platform. We invite everyone interested in simplifying your backup and storage management tasks to sign up for the Preview of Backup for GKE.

To sign up for the Backup for GKE Preview, please reach out to your account team or contact our sales representatives. If you are interested in learning more about how customers are using Backup for GKE and other new Google Cloud storage capabilities, be sure to register for our webinar, Explore What’s New with Storage at Google Cloud, scheduled for October 6, and for Google Cloud Next ‘21, scheduled for October 12-14.

Read More for the details.

2021 09 16

Azure – Azure VMware Solution achieves FedRAMP High Authorization

Azure, Cloud Azure

With this certification, U.S. government and public sector customers can now use Azure VMware Solution as a compliant FedRAMP cloud computing environment, ensuring it meets the demanding standards for security and information protection.

Read More for the details.

2021 09 15

Azure – JetStream Disaster Recovery for Azure VMware Solution now in public preview

Azure, Cloud Azure

JetStream Disaster Recovery is now available on Azure VMware Solution in public preview, enabling DR protection needed for business and mission-critical applications. JetStream Disaster Recovery on Azure VMware Solution is also cost-effective, as it uses minimal resources at the DR site by leveraging cloud storage, such as Azure Blob Storage.

Read More for the details.

2021 09 15

AWS – AWS Lake Formation is now available in Asia Pacific (Osaka)

AWS, Cloud AWS

You can now use AWS Lake Formation in the Asia Pacific (Osaka) AWS region.

Read More for the details.

2021 09 15

AWS – Extract custom entities from documents in their native format with Amazon Comprehend

AWS, Cloud AWS

Amazon Comprehend, a natural-language processing (NLP) service that uses machine learning to uncover information in text, now allows you to extract custom entities from documents in a variety of formats (PDF, Word, plain text) and layouts (e.g., bullets, lists). This enables you to more easily extract insights and further automate your document processing workflows.

Read More for the details.

2021 09 15

GCP – PyTorch on Google Cloud: How to deploy PyTorch models on Vertex AI

Cloud, Google Cloud gcp

This article is the next step in the series of PyTorch on Google Cloud using Vertex AI. In the preceding article, we fine-tuned a Hugging Face Transformers model for a sentiment classification task using PyTorch on Vertex Training service. In this post, we show how to deploy a PyTorch model on the Vertex Prediction service for serving predictions from trained model artifacts.

Now let’s walk through the deployment of a PyTorch model using TorchServe as a custom container by deploying the model artifacts to a Vertex Endpoint. You can find the accompanying code for this blog post on the GitHub repository and the Jupyter Notebook.

Deploying a PyTorch Model on Vertex Prediction Service

Vertex Prediction service is Google Cloud’s managed model serving platform. As a managed service, the platform handles infrastructure setup, maintenance, and management. Vertex Prediction supports both CPU and GPU inferencing and offers a selection of n1-standard machine shapes in Compute Engine, letting you customize the scale unit to fit your requirements. Vertex Prediction service is the most effective way to deploy your models to serve predictions for the following reasons:

Simple: Vertex Prediction service simplifies model serving with pre-built containers for prediction that require you only to specify where you store your model artifacts.
Flexible: With custom containers, Vertex Prediction offers flexibility by lowering the abstraction level so that you can choose whichever ML framework, model server, preprocessing, and post-processing that you need.
Assistive: Built-in tooling to track performance of models and explain or understand predictions.

TorchServe is the recommended framework to deploy PyTorch models in production. TorchServe’s CLI makes it easy to deploy a PyTorch model locally, and the model server can also be packaged as a container that the Vertex Prediction service scales out. The custom container capability of Vertex Prediction provides a flexible way to define the environment where the TorchServe model server runs.

In this blog post, we deploy a container running a TorchServe model server on the Vertex Prediction service to serve predictions from a fine-tuned transformer model from Hugging Face for the sentiment classification task. You can then send input requests with text to a Vertex Endpoint to classify sentiment as positive or negative.

Figure 1. Serving with custom containers on Vertex Prediction service

Following are the steps to deploy a PyTorch model on Vertex Prediction:

Download the trained model artifacts.
Package the trained model artifacts including default or custom handlers by creating an archive file using the Torch Model Archiver tool.
Build a custom container (Docker) compatible with the Vertex Prediction service to serve the model using TorchServe.
Upload the model with the custom container image as a Vertex Model resource.
Create a Vertex Endpoint and deploy the model resource to the endpoint to serve predictions.

1. Download the trained model artifacts

Model artifacts are the files created by the training application code that are required to serve predictions. TorchServe expects model artifacts to be in either a saved model binary (.bin) format or a traced model (.pth or .pt) format. In the previous post, we trained a Hugging Face Transformer model on the Vertex Training service, saved the model as a model binary (.bin) by calling the .save_model() method, and then saved the model artifacts to a Cloud Storage bucket.

Based on the training job name, you can get the location of model artifacts from Vertex Training using the Cloud Console or gcloud ai custom-jobs describe command and then download the artifacts from the Cloud Storage bucket.
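As a rough illustration of the download step, the sketch below copies every object under a Cloud Storage prefix into a local directory. It assumes the client object behaves like `google.cloud.storage.Client` (using its `list_blobs` and each blob's `download_to_filename`). The helper names `split_gcs_uri` and `download_artifacts` are hypothetical and not part of the original post.

```python
import os


def split_gcs_uri(gcs_uri):
    """Split 'gs://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage URI: {gcs_uri}")
    bucket, _, prefix = gcs_uri[len("gs://"):].partition("/")
    return bucket, prefix.strip("/")


def download_artifacts(client, gcs_uri, dest_dir):
    """Download every object under gcs_uri into dest_dir (flattened).

    `client` is assumed to behave like google.cloud.storage.Client, e.g.
    created with `from google.cloud import storage; storage.Client()`.
    """
    bucket, prefix = split_gcs_uri(gcs_uri)
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    for blob in client.list_blobs(bucket, prefix=prefix):
        if blob.name.endswith("/"):  # skip "directory" placeholder objects
            continue
        # Sub-folders are flattened; files sharing a name would overwrite.
        target = os.path.join(dest_dir, os.path.basename(blob.name))
        blob.download_to_filename(target)
        downloaded.append(target)
    return downloaded
```

The artifact location passed as `gcs_uri` would be the output directory reported for the training job.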

2. Create a custom model handler to handle prediction requests

TorchServe uses a base handler module to pre-process the input before it is fed to the model and to post-process the model output before the prediction response is sent. TorchServe provides default handlers for common use cases such as image classification, object detection, segmentation and text classification. For the sentiment analysis task, we will create a custom handler because the input text needs to be tokenized using the same tokenizer used at training time to avoid training-serving skew.

The custom handler presented here does the following:

Pre-process the input text before sending it to the model for inference using the same Hugging Face Transformers Tokenizer class used during training
Invoke the model for inference
Post-process output from the model before sending back a response
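The three stages above can be sketched in plain Python. This is not the handler from the accompanying notebook: a real handler would typically subclass TorchServe's `BaseHandler` and load a Hugging Face tokenizer and model, whereas here the tokenizer and model are injected as ordinary callables (an assumption made so the sketch runs without PyTorch). The class name and the label mapping are hypothetical.

```python
import math


class SentimentHandlerSketch:
    """Mirrors the pre-process / inference / post-process stages.

    Stand-in only: a real TorchServe handler would usually extend
    ts.torch_handler.base_handler.BaseHandler.
    """

    def __init__(self, tokenizer, model, index_to_name):
        self.tokenizer = tokenizer          # same tokenizer as at training time
        self.model = model                  # returns a list of logits per input
        self.index_to_name = index_to_name  # e.g. {"0": "Negative", "1": "Positive"}

    def preprocess(self, requests):
        # Each request is assumed to carry its text under "data" or "body".
        batch = []
        for req in requests:
            text = req.get("data") or req.get("body")
            if isinstance(text, (bytes, bytearray)):
                text = text.decode("utf-8")
            batch.append(self.tokenizer(text))
        return batch

    def inference(self, batch):
        return [self.model(tokens) for tokens in batch]

    def postprocess(self, outputs):
        results = []
        for logits in outputs:
            # Softmax over the logits, then map the top index to a label.
            peak = max(logits)
            exps = [math.exp(x - peak) for x in logits]
            total = sum(exps)
            probs = [e / total for e in exps]
            best = max(range(len(probs)), key=probs.__getitem__)
            results.append({self.index_to_name[str(best)]: round(probs[best], 4)})
        return results

    def handle(self, requests):
        return self.postprocess(self.inference(self.preprocess(requests)))
```

Keeping tokenization inside the handler is what ties serving to the training-time tokenizer, which is the reason the post gives for writing a custom handler.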

3. Create custom container image with TorchServe to serve predictions

When deploying a PyTorch model on the Vertex Prediction service, you must use a custom container image that runs an HTTP server, such as TorchServe in this case. The custom container image must meet the requirements to be compatible with the Vertex Prediction service. We create a Dockerfile with TorchServe as the base image that meets custom container image requirements and performs the following steps:

Install dependencies required for the custom handler to process the model inference requests, for example the transformers package in this use case.
Copy trained model artifacts to /home/model-server/ directory of the container image. We assume model artifacts are available when the image is built. In the notebook, we download the trained model artifacts from the Cloud Storage bucket saved as part of hyperparameter tuning trials.
Add the custom handler script to /home/model-server/ directory of the container image.
Create /home/model-server/config.properties to define the serving configuration such as health check and prediction listener ports
Run the Torch Model Archiver tool to create a model archive file from the files copied into the image at /home/model-server/. The model archive is saved in /home/model-server/model-store/ as <model-name>.mar
Launch the TorchServe HTTP server to serve the model, referencing the configuration properties and the model archive file

Let’s understand the functionality of TorchServe and Torch Model Archiver tools in these steps.

Torch Model Archiver

TorchServe provides a model archive utility to package a PyTorch model for deployment, and the resulting model archive file is used by TorchServe at serving time. Following is the torch-model-archiver command added in the Dockerfile to generate a model archive file for the text classification model:

Model Binary (--serialized-file parameter): The model binary is the serialized PyTorch model, which can be either the saved model binary (.bin) file or a traced model (.pth) file generated using TorchScript, the Torch Just In Time (JIT) compiler. In this example we will use the saved model binary generated in the previous post by fine-tuning a pre-trained Hugging Face Transformer model.

NOTE: JIT compiler trace may have some device-dependent operations in the output. So it is often a good practice to generate the trace in the same environment where the model will be deployed.

Model Handler (--handler parameter): The model handler can be one of TorchServe’s default handlers or the path to a Python file that implements custom TorchServe inference logic to pre-process model inputs or post-process model outputs. We defined a custom handler script in the previous section of this post.

Extra files (--extra-files parameter): Extra files allow you to package additional files referenced by the model handler. For example, a few of the files referred to in the command are:

index_to_name.json: In the custom handler defined earlier, the post-processing step uses an index-to-name JSON file to map prediction target indexes to human-readable labels

config.json: Required for AutoModelForSequenceClassification.from_pretrained method to load the model

vocab.txt: vocab files used by the tokenizer
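To make the parameters above concrete, here is a hedged Python sketch that assembles a torch-model-archiver invocation as an argument list. The flags are real archiver options, but the model name and file names (`sentiment`, `pytorch_model.bin`, `custom_handler.py`) are illustrative assumptions rather than values from the original Dockerfile.

```python
def archiver_command(model_name, serialized_file, handler, extra_files,
                     export_path, version="1.0"):
    """Return the torch-model-archiver invocation as an argument list."""
    return [
        "torch-model-archiver",
        "--force",  # overwrite an existing archive with the same name
        "--model-name", model_name,
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
        "--extra-files", ",".join(extra_files),  # comma-separated list
        "--export-path", export_path,
    ]


cmd = archiver_command(
    model_name="sentiment",  # hypothetical model name
    serialized_file="/home/model-server/pytorch_model.bin",  # assumed file name
    handler="/home/model-server/custom_handler.py",          # assumed file name
    extra_files=[
        "/home/model-server/config.json",
        "/home/model-server/vocab.txt",
        "/home/model-server/index_to_name.json",
    ],
    export_path="/home/model-server/model-store",
)
print(" ".join(cmd))
```

Where the tool is installed, the list could be passed to `subprocess.run(cmd, check=True)`; in a Dockerfile the same arguments would appear in a RUN instruction.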

TorchServe

TorchServe wraps PyTorch models into a set of REST APIs served by an HTTP web server. Adding the torchserve command to the CMD or ENTRYPOINT of the custom container launches this server. In this article we will only explore prediction and health check APIs. The Explainable AI API for PyTorch models on Vertex endpoints is currently supported only for tabular data.

TorchServe Config (--ts-config parameter): The TorchServe config allows you to customize the inference address and management ports. We also set the service_envelope field to json to indicate the expected input format for TorchServe. Refer to the TorchServe documentation to configure other parameters. We create a config.properties file and pass it as the TorchServe config.

Model Store (--model-store parameter): The model store location from which local or default models can be loaded.

Model Archive (--models parameter): Models to be loaded by TorchServe, using the [model_name=]model_location format. The model location is the model archive file in the model store.
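A minimal sketch of how these three parameters fit together: one helper renders a config.properties file and another assembles the torchserve launch arguments. The port numbers (7080 and 7081) and the model name are assumptions for illustration. `inference_address`, `management_address` and `service_envelope` are the property names used in TorchServe's configuration documentation.

```python
def render_config(inference_port=7080, management_port=7081):
    """Render a minimal config.properties for TorchServe (ports are assumed)."""
    lines = [
        f"inference_address=http://0.0.0.0:{inference_port}",
        f"management_address=http://0.0.0.0:{management_port}",
        "service_envelope=json",  # expected input format for requests
    ]
    return "\n".join(lines) + "\n"


def torchserve_command(config_path, model_store, model_name):
    """Return the torchserve launch command as an argument list."""
    return [
        "torchserve", "--start",
        "--ts-config", config_path,
        "--model-store", model_store,
        # [model_name=]model_location, where the location is the .mar file
        "--models", f"{model_name}={model_name}.mar",
    ]


print(render_config())
print(" ".join(torchserve_command(
    "/home/model-server/config.properties",
    "/home/model-server/model-store",
    "sentiment",  # hypothetical model name
)))
```

In the container image, the rendered text would be written to /home/model-server/config.properties and the argument list would become the CMD or ENTRYPOINT.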

4. Build and push the custom container image

Run the following command to build the container image based on the Dockerfile and tag it with a name compatible with your Container Registry repository:

Before pushing the image to the Container Registry, you can test the Docker image locally by sending input requests to a local TorchServe deployment running inside Docker.

To run the container image as a container locally, run the following command:

To send the container’s server a health check, run the following command:

This request uses a test sentence. If successful, the server returns the prediction in the following format:

A verified response confirms that the custom handler, model packaging, and TorchServe config are working as expected. You can stop the local TorchServe server by stopping the container.
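The local health check and prediction request can also be sketched in Python with only the standard library. This assumes the container publishes the TorchServe inference address on localhost port 7080 and that the handler expects a base64-encoded string under data.b64. Both are assumptions here, and the model name is hypothetical.

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:7080"  # assumed inference address


def health_request(base_url=BASE_URL):
    """Build a GET request for TorchServe's health check route."""
    return urllib.request.Request(f"{base_url}/ping", method="GET")


def prediction_request(text, model_name, base_url=BASE_URL):
    """Build a POST request that wraps base64-encoded text in a JSON envelope."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    body = json.dumps({"instances": [{"data": {"b64": encoded}}]})
    return urllib.request.Request(
        f"{base_url}/predictions/{model_name}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a request to the running container and parse the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the requests needs no server; send() needs the container running.
ping = health_request()
predict = prediction_request("A test sentence.", "sentiment")
```

With the container running, send(ping) followed by send(predict) exercises both routes.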

Now push the custom container image to the Container Registry. The image will be deployed to a Vertex Endpoint in the next step.

NOTE: You can also build and push the custom container image to the Artifact Registry repository instead of the Container Registry repository.

5. Deploying the serving container to Vertex Endpoint

We have packaged the model and built the serving container image. The next step is to deploy it to a Vertex Endpoint. A model must be deployed to an endpoint before it can be used to serve online predictions. Deploying a model associates physical resources with the model so it can serve online predictions with low latency. We use the Vertex SDK for Python to upload the model and deploy it to an endpoint. The following steps apply to any model, whether it was trained on the Vertex Training service or elsewhere, such as on-premises.

Upload model

We upload the model artifacts to Vertex AI and create a Model resource for the deployment. In this example the artifact is the serving container image URI. Notice that the predict and health routes (mandatory routes) and container port(s) are also specified at this step.
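The upload step might look like the following sketch, which uses the Vertex SDK for Python (google-cloud-aiplatform). The helper names, the image URI, the model name, the port, and the route values are assumptions for illustration. The routes shown follow TorchServe's own prediction and ping paths.

```python
def upload_kwargs(display_name, image_uri, model_name, port=7080):
    """Collect the arguments that describe the serving container."""
    return {
        "display_name": display_name,
        "serving_container_image_uri": image_uri,
        # Mandatory routes: where Vertex AI sends prediction and health requests.
        "serving_container_predict_route": f"/predictions/{model_name}",
        "serving_container_health_route": "/ping",
        "serving_container_ports": [port],
    }


def upload_model(project, location, **kwargs):
    """Create a Vertex AI Model resource from the serving container image."""
    # Imported lazily so the sketch loads without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    return aiplatform.Model.upload(**kwargs)


# Hypothetical names; replace with your own project values.
kwargs = upload_kwargs(
    display_name="pytorch-sentiment",
    image_uri="gcr.io/my-project/pytorch_predict_sentiment",
    model_name="sentiment",
)
# model = upload_model("my-project", "us-central1", **kwargs)
```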

After the model is uploaded, you can view the model in the Models page on the Google Cloud Console under the Vertex AI section.

Figure 2. Models page on Google Cloud console under the Vertex AI section

Create endpoint

Create a service endpoint to deploy one or more models. An endpoint provides a service URL where the prediction requests are sent. You can skip this step if you are deploying the model to an existing endpoint.

After the endpoint is created, you can view the endpoint in the Endpoints page on the Google Cloud Console under the Vertex AI section.

Figure 3. Endpoints page on Google Cloud console under the Vertex AI section

Deploy the model to endpoint

The final step is deploying the model to an endpoint. The deploy method lets you specify the endpoint where the model is deployed along with compute parameters, including the machine type, the minimum and maximum replica counts used for scaling, and the traffic split.
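A minimal sketch of endpoint creation and deployment with the Vertex SDK for Python follows. The helper names, display names, machine type, and replica counts are assumptions, not values from the notebook.

```python
def deploy_kwargs(deployed_name, machine_type="n1-standard-4",
                  min_replicas=1, max_replicas=1, traffic_percentage=100):
    """Validate and collect the compute parameters for Model.deploy."""
    if not 1 <= min_replicas <= max_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")
    if not 0 <= traffic_percentage <= 100:
        raise ValueError("traffic_percentage must be between 0 and 100")
    return {
        "deployed_model_display_name": deployed_name,
        "machine_type": machine_type,
        "min_replica_count": min_replicas,
        "max_replica_count": max_replicas,
        "traffic_percentage": traffic_percentage,
    }


def create_and_deploy(model, endpoint_display_name, **kwargs):
    """Create an endpoint, then deploy an uploaded model to it."""
    # Imported lazily so the sketch loads without the SDK installed.
    from google.cloud import aiplatform

    endpoint = aiplatform.Endpoint.create(display_name=endpoint_display_name)
    model.deploy(endpoint=endpoint, **kwargs)
    return endpoint


settings = deploy_kwargs("pytorch-sentiment-v1")  # hypothetical name
# endpoint = create_and_deploy(model, "pytorch-sentiment-endpoint", **settings)
```

When deploying to an existing endpoint, skip Endpoint.create and pass the existing Endpoint object to model.deploy instead.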

After deploying the model to the endpoint, you can manage and monitor the deployed models from the Endpoints page on the Google Cloud Console under the Vertex AI section.

Figure 4. Manage and monitor models deployed on Endpoint from Google Cloud console under the Vertex AI section

Test the deployment

Now that the model is deployed, we can use the endpoint.predict() method to send base64-encoded text in the prediction request and get the predicted sentiment in response.
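As a sketch, the online prediction call could be wrapped as shown below. The instance layout (a base64 string under data.b64) is an assumption about what the custom handler expects, and the helper names are hypothetical.

```python
import base64


def build_instances(texts):
    """Encode each input text as base64 inside the assumed instance layout."""
    return [
        {"data": {"b64": base64.b64encode(text.encode("utf-8")).decode("ascii")}}
        for text in texts
    ]


def predict_sentiment(endpoint, texts):
    """Send texts to a deployed endpoint and return its predictions.

    `endpoint` is expected to be an aiplatform.Endpoint, but any object
    with a compatible predict(instances=...) method works.
    """
    response = endpoint.predict(instances=build_instances(texts))
    return response.predictions


instances = build_instances(["A test sentence."])
# predictions = predict_sentiment(endpoint, ["A test sentence."])
```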

Alternatively, you can call the Vertex Endpoint to make predictions using the gcloud beta ai endpoints predict command. Refer to the Jupyter Notebook for the complete code.

Cleaning up the environment

After you are done experimenting, you can either stop or delete the Notebooks instance. Delete the instance to prevent any further charges. If you want to save your work, stop the instance instead.

To clean up all Google Cloud resources created in this post and the previous post, you can delete the individual resources created:

Training Jobs

Model

Endpoint

Cloud Storage Bucket

Container Images

Follow the Cleaning Up section in the Jupyter Notebook to delete the individual resources.
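For the Model and Endpoint resources, the cleanup order matters: models are undeployed before the endpoint is deleted, and the model is deleted last. The sketch below shows only that ordering, using methods available on the Vertex SDK's Endpoint and Model objects. The helper name is hypothetical, and training jobs, the Cloud Storage bucket, and container images are not covered here.

```python
def clean_up(endpoint, model):
    """Delete an endpoint and a model in an order Vertex AI accepts."""
    # An endpoint cannot be deleted while models are still deployed to it.
    endpoint.undeploy_all()
    endpoint.delete()
    # The model can be deleted once it is no longer deployed anywhere.
    model.delete()


# Example (requires real aiplatform.Endpoint and aiplatform.Model objects):
# clean_up(endpoint, model)
```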

What’s next?

Continuing from the training and hyperparameter tuning of the PyTorch-based text classification model on Vertex AI, we showed how to deploy the PyTorch model on the Vertex Prediction service. We deployed a custom container running a TorchServe model server on the Vertex Prediction service to serve predictions from the trained model artifacts. As a next step, you can work through this example on Vertex AI or deploy one of your own PyTorch models.

References

Deploying models on Vertex Prediction service

Custom container requirements for prediction | Vertex AI

GitHub repository with code and accompanying notebook

In the next article of this series, we will show how you can orchestrate a machine learning workflow using Vertex Pipelines to tie together the individual steps which we have seen so far, i.e. training, hyperparameter tuning and deployment of a PyTorch model. This will lay the foundation for CI/CD (Continuous Integration / Continuous Delivery) for machine learning models on the Google Cloud platform.

Stay tuned. Thank you for reading! Have a question or want to chat? Find authors here – Rajesh [Twitter | LinkedIn] and Vaibhav [LinkedIn].

Thanks to Karl Weinmeister and Jordan Totten for helping and reviewing the post.

Read More for the details.

2021 09 15

AWS – Amazon Timestream is now in scope for AWS SOC Reports

AWS, Cloud AWS

You can now use Amazon Timestream in applications that are subject to System and Organization Control (SOC) compliance. Amazon Timestream is a fast, scalable, secure, and purpose-built time series database for application monitoring, IoT, and real-time analytics workloads that can scale to process trillions of time series events per day.

Read More for the details.

2021 09 15

Azure – Azure Database for PostgreSQL – Hyperscale (Citus) support for Citus 10.1 is generally available

Azure, Cloud Azure

Support for Citus 10.1, with columnar storage and more, is now included in Azure Database for PostgreSQL – Hyperscale (Citus), a managed service running the open source Postgres database on Azure.

Read More for the details.
