Azure – Generally available: vRealize Log Insight Cloud with Azure VMware Solution integration
vRealize Log Insight Cloud with Azure VMware Solution support is now generally available.
Read More for the details.
The Public IP capability on Azure VMware Solution provides internet connectivity for Azure VMware Solution workloads. You can now use NSX-T Manager as your security termination point.
Read More for the details.
Explore four new features in the no-code editor in Azure Event Hubs. This editor lets you develop a Stream Analytics job without writing a single line of code.
Read More for the details.
The Google Cloud Storage team is very happy to launch powerful new object lifecycle rules for GCS to help our customers control their storage costs in new ways. Two new features are included in this launch.
Before this launch, customers already had several lifecycle conditions to choose from, such as the age of an object, its version history, a custom timestamp, and more. Now, customers can add conditions on the names of objects; specifically, matching a prefix or a suffix.
Prefix and suffix conditions are helpful for a number of cases. Here are two:
Managing common object prefixes separately. It’s quite common to group objects using a common prefix, such as in a dataset. Now, lifecycle conditions can act on those groups using a MatchesPrefix rule.
Managing categories of objects separately. It’s also common to use “extensions” on object keys to denote the format of the data: .mp4, .zip, .csv, and so on. Customers often have a mix of these within a bucket and would like to use the extension to manage them separately. A great example is when customers have large binary objects that are seldom read, but metadata about them with a different extension that is read often (think .mov and .xml). Now, it is easy to keep the metadata “hot” while saving costs by keeping the bulk of your data “cold” using a MatchesSuffix rule.
In 2021, we launched the Multipart Upload protocol in our XML API. This provided our customers with an even smoother migration path by greatly improving our S3 compatibility. One common pain point with multipart uploads is that they sometimes get abandoned. Part storage isn’t free, so pruning abandoned uploads is critical housekeeping. Now, customers can “set it and forget it” with GCS Lifecycle support for incomplete multipart uploads.
These new lifecycle features are now generally available to all of our customers. They are supported through the GCS API, GCS GUI, gsutil, gcloud storage, and the client libraries.
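For illustration, here is a minimal sketch of what these rules could look like with the Python client library (one of the client libraries mentioned above). The bucket name is a placeholder, and it assumes a recent google-cloud-storage version that exposes the matches_prefix/matches_suffix conditions and accepts raw rule mappings.

```python
# Sketch: prefix/suffix lifecycle rules plus cleanup of abandoned multipart
# uploads, using the google-cloud-storage Python client.
# "my-example-bucket" is a placeholder; requires a recent client library version.
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("my-example-bucket")

# MatchesPrefix: delete objects under the "staging/" prefix after 30 days.
bucket.add_lifecycle_delete_rule(age=30, matches_prefix=["staging/"])

# MatchesSuffix: move bulky media to a colder class while small .xml
# metadata objects stay "hot" in the default class.
bucket.add_lifecycle_set_storage_class_rule(
    "COLDLINE", age=30, matches_suffix=[".mov", ".mp4"]
)

# Abort (and stop paying for) multipart uploads left incomplete for 7 days.
rules = list(bucket.lifecycle_rules)
rules.append(
    {"action": {"type": "AbortIncompleteMultipartUpload"}, "condition": {"age": 7}}
)
bucket.lifecycle_rules = rules

bucket.patch()  # push the updated lifecycle configuration to the bucket
```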
Read More for the details.
JetStream DR enables cost-effective disaster recovery by consuming minimal resources at the DR site and using cost-effective cloud storage. JetStream DR now also automates recovery to Azure NetApp Files datastores.
Read More for the details.
Online games have grown in popularity lately, and with the influx of players in multiplayer games, meeting the increased demand is a known issue within the gaming industry. We developed Open Match, an open source multiplayer matchmaking framework, to solve this issue by providing core services commonly found in most matchmaking services and hosting them on Kubernetes. Open Match enables developers to focus more on match logic and less on running at scale.
Open Match uses a Bitnami open source Redis image but allows for using any key-value store. In the architecture diagram below you can see how Open Match interacts with a state store:
Open Match Architecture
While Open Match is designed to scale with player activity, by default, all reads and writes are performed on the primary Redis node, which can potentially approach capacity limits. Moving all read operations to replicas can reduce the load on the primary, but it also introduces concerns about how well this approach scales and the potential for Open Match to return outdated matchmaking results.
Earlier this year, Cloud Memorystore for Redis read replicas launched, and Open Match was the perfect candidate to test running at scale. Cloud Memorystore for Redis read replicas expose a read endpoint for load balancing read queries across up to five replicas over multiple zones. This allows for high availability and minimal application downtime. Memorystore also supports Redis 6, which introduced multithreaded I/O on M3 or higher configurations. At scale, this performance could support up to a million players looking for (and finding!) matches.
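As a rough sketch (not Open Match's actual code), here is how an application could split traffic between the two endpoints, sending writes to the primary and reads to the read endpoint; the addresses below are placeholders.

```python
# Sketch: writes go to the Memorystore primary endpoint; reads go to the
# read endpoint, which load-balances across up to five replicas.
# The IPs are placeholders, and this is illustrative rather than Open Match code.
import redis

primary = redis.Redis(host="10.0.0.4", port=6379)    # primary endpoint
replicas = redis.Redis(host="10.0.0.5", port=6379)   # read endpoint

# Write path: record a player's matchmaking status.
primary.hset("ticket:12345", mapping={"status": "searching"})

# Read path: match logic checks the status from a replica.
status = replicas.hget("ticket:12345", "status")
print(status)
```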
The example configuration provided in the Open Match repository uses Redis Sentinel with three replicas for high availability, with all reads and writes performed on the primary node. With a quick swap in configuration, we confirmed that a Memorystore for Redis Standard Tier instance with read replicas could be used as a drop-in replacement with no code changes. Although Memorystore supports an even higher number of read replicas, to get a close comparison to the Sentinel example configuration, three replicas were used with Memorystore.
The Open Match repo contains scale tests for benchmarking. To test the compatibility of read replicas and really give the system a workout, we ran the benchmark with 15,000 simulated players requesting a match per second. These numbers represent a surge of players entering matchmaking as a game becomes a viral hit. Here are some metrics, gathered over several days, confirming that the application was stable and performed well.
Open Match tracks the player’s progress through the matchmaking system as a status field in Redis, and out-of-sync Memorystore replicas could lead to a variety of problems. Some examples of players who should no longer be considered for matchmaking include:
Players who have already been matched
Players who have chosen to leave the queue
To prevent these issues, the replicas must have the latest state. To track this, we confirmed sub-second data replication latency from the primary to the replicas:
The chart shows a measured latency of under one second behind the primary node. With such low latency, replicas are up to date, matchmaking requests are completed, and players are off to enjoy their games.
As players enter the queue and are waiting to be matched, Open Match stores them in Redis. If matchmaking slows down due to a performance bottleneck, the number of players waiting can start to increase. An important consideration is making sure the number of players waiting doesn’t outstrip the available memory for Redis.
Open Match Memorystore Keys
As the chart shows, the number of keys slowly increased to 2k during our tests. There it remained steady and the application stabilized, with players being matched at approximately the rate they enter the matchmaking queue. From this metric, we can see that replacing the Redis Sentinel configuration with our Cloud Memorystore for Redis configuration using read replicas works as expected and has no trouble getting all players into matches.
In this test, we demonstrated how three Cloud Memorystore for Redis read replicas can serve as a drop-in replacement for Redis Sentinel while giving simulated players a similar matchmaking experience. Cloud Memorystore for Redis supports additional read replicas, which could improve Open Match performance even further. This capability will help an online game smoothly and rapidly scale to handle matchmaking requests in the event of a player surge.
You can learn more about Cloud Memorystore for Redis read replicas in the documentation.
To learn more about Open Match, check out the GitHub repo, explore the documentation for tutorials and samples, and join the Slack channel to connect with developers and contributors.
Follow Jon on Twitter for more Open Match and gaming content.
Read More for the details.
Amazon Chime SDK now supports connecting an open source signaling client implementation in C++. The Amazon Chime SDK lets developers add intelligent real-time audio, video, and screen share to their web and mobile applications. WebRTC is a common library for video conferencing applications, and the Amazon Chime SDK provides WebRTC based media clients in JavaScript, iOS and Android. However, some customers choose to bring their own custom WebRTC library and still want to connect server side to the Amazon Chime SDK.
Read More for the details.
Starting today, Amazon EC2 C6gd instances are available in the South America (Sao Paulo) region. C6gd instances are ideal for compute-intensive workloads such as high performance computing (HPC), batch processing, ad serving, video encoding, gaming, scientific modelling, distributed analytics, and CPU-based machine learning inference. The local SSD storage provided on these instances benefits applications that need access to high-speed, low-latency storage, as well as temporary storage of data for batch and log processing, high-speed caches, and scratch files.
Read More for the details.
If you’re a college student like me and are gearing up to enter the “big kids” job market, as I like to call it, then you’ve probably been wondering (or worrying) about how to get ahead of the curve and stand out amongst your peers.
When I think about which high-value fields to target for improving my technical skills and qualifications, the one area I keep coming back to is data analysis.
Data in all forms is becoming increasingly valuable in our technology-driven society, and that is because of the insights that it brings! The amount of data being generated around us is growing exponentially in all fields. This is great news for students, because now you can benefit from learning data analysis to complement your existing skills, whether you’re majoring in computer science, marketing, or even music! Having the skills necessary to manipulate, process, analyze, and display data in a meaningful way will push you ahead regardless of your background.
Learning new tech skills and tools on top of coursework, jobs, and internships can seem daunting. Trust me, I know your pain. That’s why it’s so important for us as students to be strategic and efficient in how we discern the best resources to use for learning.
Whenever I want to learn a new software or skill, there are a few factors I take into account that I’m sure are important to you too:
How much is this going to cost me?
How much time is this going to take?
How applicable is this to my job prospects?
Cost
I’m not even going to pretend this isn’t one of the first things I think about. Knowing how to allocate your financial resources is a vital skill – especially when it comes to career-oriented self-improvement.
Time
And time? Well that is a cost as well. Time is valuable for us students, sometimes even more valuable than money itself. We’re juggling coursework and studying, commute time, extracurriculars, career development, and sometimes even a job to help pay for it all. We’re looking for skills that are relatively straightforward to learn and that we can learn on our own time, on a flexible, self-paced schedule.
Applicability
Lastly, I want to be able to learn a skill or tool that is applicable to my job search, something that I can list directly on my resume that will be attractive to the type of companies that I will be applying to. After all, furthering your career is one of the main drivers for this kind of self-study. That is why I always look for opportunities to learn directly using industry-standard software and services.
During my internship here at Google, I’ve been given ample opportunities to build my data analysis skills using Google Cloud services. In this blog post, I focus specifically on two of those services: BigQuery and Data Studio.
What is BigQuery?
BigQuery is a cloud data warehouse that companies use for running analytics on large datasets. It also happens to be a great place for learning and practicing SQL (the language for analyzing data). The “getting-started” experience with BigQuery is smooth and saves students tons of time. Instead of downloading and installing database software, sourcing data, and loading it into tables, you can log in to the BigQuery sandbox and immediately start writing SQL queries (or copying sample ones) to analyze data provided as part of the Google Cloud public datasets program (which you will see for yourself soon!).
What is Data Studio?
Data Studio is an online business intelligence tool (integrated with BigQuery) for visualizing data in customizable and informative tables, dashboards, and reports! You can use it to visualize the results of your SQL queries; it’s also great for analyzing data without SQL, and for sharing insights with non-technical users.
Because Data Studio is already part of Google Cloud, there’s no need to export queried, processed data to an external tool. Data visualization can be completed through direct connections to the BigQuery environment, which saves you lots of time and headaches from having to worry about things like data file compatibility, size, and so on.
You can go from the BigQuery Console to visualizing your query results in Data Studio in one click.
Both BigQuery and Data Studio can be used for little to no cost, within the free tier of Google Cloud. This tier allocates users a starting amount of data storage (if you want to upload your own) and allows a certain number of bytes processed for your queries each month. You can even create a BigQuery “sandbox” environment that stays within this free tier and doesn’t require any credit card to set up (I’ll give you instructions on how to set one up later 👍).
So, you can get started quickly with BigQuery and Data Studio for free; let’s talk about applicability. Both BigQuery and Data Studio are used across many industries in production workloads today. Just search BigQuery or Data Studio on LinkedIn, and you’ll see what I mean!
Now let’s get to the action. I want to show you just how simple it is to get started with both of these tools, so here’s a quick tutorial on using BigQuery and Data Studio with a real public dataset!
Let’s dive into an example scenario that BigQuery can help solve:
Congrats! You’re a new intern who recently got hired by Pistach.io. Pistach.io is adamant that for the first couple of weeks, new hires come into the office for training programs. So, you must make sure that you show up on time. Pistach.io is in New York City, and the office does not have accessible parking nearby. You know that New York City has reimplemented its public bike program so you’ve decided to use bike sharing to get to work.
Because you must be at work on time, you need answers to a few key questions:
Which nearby stations have bikes you can use in the morning?
Where is the drop-off location that is closest to the office?
What are the busiest stations that you should avoid?
It would be great to answer these questions using a public dataset! Luckily for you, BigQuery has tons of datasets available for you to use for no cost. The data that you’ll be analyzing for this example is in the New York Citi Bike public dataset.
First, create a BigQuery sandbox, which is essentially an environment for you to work in. Follow these steps to set one up: https://cloud.google.com/bigquery/docs/sandbox.
In the Google Cloud console, go to the BigQuery page (documentation).
In the Explorer pane, click +Add Data > Pin a project > Enter project name.
Type “bigquery-public-data” and click Pin. This project contains all the datasets available in the public datasets program.
To see underlying datasets, expand the bigquery-public-data project in the Explorer pane and scroll to find “new_york_citibike”.
Click to highlight the dataset or expand to see the citibike_stations and citibike_trips tables. You can then highlight the tables themselves to see more details like the schema and a preview of the data.
Ok, on to the analysis! Let’s figure out which stations are closest to home. For this tutorial, you will be using the Port Authority Bus Terminal in NYC as your “home”.
This query calculates the distance between each Citi Bike station and your “home”, and then returns the result with the closest station listed first. The ST_DISTANCE function calculates the shortest distance between two points. So more like a bird flying than taking a bike to work, but it’ll work for this use case!
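The query itself isn’t reproduced here, so here is a sketch of what it might look like using the Python BigQuery client, assuming the citibike_stations table’s name, latitude, and longitude columns and approximate coordinates for the Port Authority Bus Terminal.

```python
# Sketch: stations closest to "home" (Port Authority Bus Terminal,
# roughly latitude 40.7570, longitude -73.9903). Column names assumed from
# the bigquery-public-data.new_york_citibike.citibike_stations table.
from google.cloud import bigquery

client = bigquery.Client()  # runs in your sandbox project by default

home_sql = """
SELECT
  name,
  ST_DISTANCE(
    ST_GEOGPOINT(longitude, latitude),
    ST_GEOGPOINT(-73.9903, 40.7570)
  ) AS distance_in_meters
FROM `bigquery-public-data.new_york_citibike.citibike_stations`
ORDER BY distance_in_meters
LIMIT 10
"""

for row in client.query(home_sql):
    print(f"{row.name}: {row.distance_in_meters:.0f} m")
```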
Next, let’s find the stations closest to the office. Let’s use the coordinates for the Google NYC Chelsea Market office since that is where I worked this summer. You can use essentially the same query as the last:
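A sketch of that variant, with approximate coordinates for the Chelsea Market office swapped in:

```python
# Sketch: same query shape, but measuring distance to the office
# (Chelsea Market, roughly latitude 40.7420, longitude -74.0048).
from google.cloud import bigquery

client = bigquery.Client()

office_sql = """
SELECT
  name,
  ST_DISTANCE(
    ST_GEOGPOINT(longitude, latitude),
    ST_GEOGPOINT(-74.0048, 40.7420)
  ) AS distance_in_meters
FROM `bigquery-public-data.new_york_citibike.citibike_stations`
ORDER BY distance_in_meters
LIMIT 10
"""

for row in client.query(office_sql):
    print(f"{row.name}: {row.distance_in_meters:.0f} m")
```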
Finally, let’s identify the most popular Citi Bike stations around the office so that we can avoid them!
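One way to approximate “busiest” is to count how many trips start at each station in the citibike_trips table; here is a hedged sketch, with column names assumed from the public dataset’s schema:

```python
# Sketch: rank stations by how many trips start there, as a rough proxy
# for "busiest". Column names assumed from the citibike_trips table.
from google.cloud import bigquery

client = bigquery.Client()

busiest_sql = """
SELECT
  start_station_name,
  COUNT(*) AS num_trips
FROM `bigquery-public-data.new_york_citibike.citibike_trips`
GROUP BY start_station_name
ORDER BY num_trips DESC
LIMIT 10
"""

for row in client.query(busiest_sql):
    print(f"{row.start_station_name}: {row.num_trips} trips")
```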
One of the great things about BigQuery is that you can visualize your results easily with Data Studio (just press the Explore Data button in the query results page!). This will give you a better idea of what exactly you queried.
If you want to try out Data Studio for yourself, I recommend following this tutorial. (It’s also about bike share trips, but this time in Austin, Texas!)
It’s really that simple! Google Cloud is easy to learn and use, so you spend less time “getting started” and more time analyzing data and designing visualizations. You can see the potential in using something like this in your personal and professional tech development, and there are so many ways to boost your skills and early career in data science with Google Cloud tools such as BigQuery.
You can also supplement what you’ve learned in this post by completing the From Data to Insights with Google Cloud specialization on Coursera.
That’s all I have to share for now. If you found this blog post helpful, be sure to share! You can find more helpful content on the Google Cloud Platform blog.
Read More for the details.
Amazon MemoryDB for Redis is now a HIPAA (Health Insurance Portability and Accountability Act) eligible service. You can now use MemoryDB to store, process, and access protected health information (PHI) and power secure healthcare and life sciences applications. MemoryDB is a fully managed, Redis-compatible, in-memory database that provides low latency, high throughput, and durability at any scale.
Read More for the details.
Today, AWS Site-to-Site VPN enables publishing VPN connection logs to CloudWatch, providing you with deeper visibility into your VPN setup to help you quickly troubleshoot and resolve VPN connectivity issues.
Read More for the details.
Smithy Interface Definition Language (IDL) 2.0 is now generally available. Smithy is Amazon’s next-generation API modeling language that is based on our experience building tens of thousands of APIs and generating SDKs. Using IDL 2.0, developers can now author Smithy models and generate code from Smithy models in a simpler, more intuitive way.
Read More for the details.
AWS Control Tower now provides the ability to customize the retention policy for Amazon Simple Storage Service (Amazon S3) buckets that store your AWS Control Tower CloudTrail logs. AWS Control Tower uses AWS CloudTrail to record logs of actions made by users, roles, or AWS services. These logs are stored in Amazon S3 buckets, which are created in the core logging account. The Amazon S3 CloudTrail buckets have a retention policy that specifies the number of days after which the logs are deleted. The default settings of 1 year for standard account logging and 10 years for access logging are applied if a user chooses not to customize their log retention policy.
Read More for the details.
Many security leaders head into the cloud armed mostly with the tools, practices, skills, and ultimately the mental models for how security works that were developed on-premises. This leads to cost and efficiency problems that can be solved by mapping their existing mental models to those of the cloud.
When it comes to understanding the differences between on-premises cybersecurity mental models and their cloud cybersecurity counterparts, a helpful place to start is by looking at the kinds of threats each one is attempting to block, detect, or investigate.
Traditional on-premises threats focused on stealing data from databases, file storage, and other corporate resources. The most common defenses of these resources rely on layers of network, endpoint, and sometimes application security controls. The proverbial “crown jewels” of corporate data were not made accessible with an API to the outside world or stored in publicly accessible storage buckets. Other threats aimed to disrupt operations or deploy malware for various purposes, ranging from outright data theft to holding data for ransom.
There are some threats that are specifically aimed at the cloud. Bad actors are always trying to take advantage of the ubiquitous nature of the cloud. One common cloud-centered attack vector that they pursue is constantly scanning IP address space for open storage buckets or internet-exposed compute resources.
As Gartner points out, securing the cloud requires significant changes in strategy from the approach we take to protect on-prem data centers. Processes, tools, and architectures need to be designed using cloud-native approaches to protect critical cloud deployments. And when you are in the early stages of cloud adoption, it’s critical for you to be aware of the division of security responsibilities between your cloud service provider and your organization to make sure you are less vulnerable to attacks targeting cloud resources.
Successful cloud security transformations can help better prepare CISOs for threats today, tomorrow, and beyond, but they require more than just a blueprint and a set of projects. CISOs and cybersecurity team leaders need to envision a new set of mental models for thinking about security, one that will require you to map your current security knowledge to cloud realities.
As a way to set the groundwork for this discussion, the cloud security transformation can start with a meaningful definition of what “cloud native” means. Cloud native is really an architecture that takes full advantage of the distributed, scalable, and flexible nature of the public cloud. (To be fair, the term implies that you need to be born in the cloud to be a native, but we’re not trying to be elitist about it. Perhaps a better term would be “cloud-focused” or doing security “the cloudy way.”)
However we define it, adopting cloud is a way to maximize your focus on writing code, creating business value, and keeping your customers happy while taking advantage of cloud-native inherent properties—including security. One sure way to import legacy mistakes, some predating cloud by decades, into the future would be to merely lift-and-shift your current security tools and practices into the public cloud environment.
Going cloud-native means abstracting away many layers of infrastructure, whether it’s network servers, security appliances, or operating systems. It’s about using modern tools built for the cloud and built in the cloud. Another way to think about it: You’ll worry less about all these things because you’re going to build code on top of that to help you move more quickly. Abandoning legacy security hardware maintenance requirements is part of the win here. To put it another way, security will follow in the footsteps of IT, which has been transformed by the SRE and DevOps revolution.
You can extend this thinking to cloud native security, where some of your familiar tools combine with solutions provided by cloud service providers to take advantage of cloud native architecture to secure what’s built and launched in the cloud. While we’ve talked about the differences between threats targeting on-prem infrastructure and threats targeting cloud infrastructure, here are other vital areas to re-evaluate in terms of a cloud security mental model.
Some organizations practice network security in the cloud as if it were a rented data center. However, many traditional practices that worked reasonably well on-premises for decades, along with many traditional network architectures, are either not applicable in the cloud or not optimal for cloud computing.
However, concepts like a demilitarized zone (DMZ) can be adapted to today’s cloud environments. For example, a more modern approach to DMZ would use microsegmentation and govern access by identity in context. Making sure that the right identity, in the right context, has access to the correct resource gives you strong control. Even if you get it wrong, microsegmentation can limit a breach blast radius.
Becoming cloud native also drives the adoption of new approaches to enterprise network security, such as BeyondProd. It also benefits organizations because it gets them away from traditional network perimeter security to focus on who and what can access your services—rather than where requests for access originated.
Although network security changes driven by cloud adoption can be enormous and transformational, not all areas shift in the same way.
In the cloud, the concept of a security endpoint changes. Think of it this way: A virtual server is a server. But what about a container? What about microservices and SaaS? With the software-as-a-service cloud model, there’s no real endpoint there. All along your cloud security path, users only need to know what happens where.
Here is a helpful mental model translation: an API can be seen as a sort of endpoint. Some of the security thinking developed for endpoints applies to cloud APIs as well. Thinking about securing access, permissions, and privileged access can be carried over, but the concept of endpoint operating system maintenance does not.
Even with automation of service agents on virtual machines in the cloud, insecure agents may increase risks because they are operating at scale in the cloud. Case in point: This major Microsoft Azure cross-tenant vulnerability highlighted a new type of risk that wasn’t even on the radar of many of its customers.
In light of this, across the spectrum of endpoint security approaches, some disappear (such as patching operating systems for SaaS and PaaS), some survive (such as the need to secure privileged access), and yet others are transformed.
With a move to the cloud comes changes to the threats you’ll face, and changes to how you detect and respond to them. This means that using on-prem detection technology and approaches as a foundation for future development may not work well. Copying all your on-premises detection tools and their threat detection content won’t reduce risks in the way that most cloud-first organizations will need.
Moving to the cloud provides the opportunity to rethink how you can achieve your security goals of confidentiality, integrity, availability, and reliability with the new opportunities created by cloud process and technology.
Cloud is distributed, often immutable, API-driven, automatically scalable, and centered on the identity layer and often contains ephemeral workloads created for a particular task. All these things combine to affect how you handle threat detection for the cloud environment and necessitate new detection methods and mechanisms.
There are six key domains where threats in the cloud can best be detected: identity, API, managed services, network, compute, and Kubernetes. These provide the coverage needed for network, identity, compute, and container infrastructure. They also provide specific detection mechanisms for API access logs and network traffic captures.
As with endpoint security, some approaches become less important (such as network IDS on encrypted links), others grow in importance (such as detecting access anomalies), while others transform (such as detecting threats from the provider backplane).
The cloud is changing data security in significant ways, and that includes new ways of looking at data loss prevention, data encryption, data governance, and data access.
Cloud adoption sets you on a path to what we at Google call “autonomic data security.” Autonomic data security means security has been integrated throughout the data lifecycle and is improving over time. At the same time, it makes things easier on users, freeing them from having to define and redefine myriad rules about who can do what, when, and with which data. It lets you keep pace with constantly evolving cyberthreats and business changes, so you can keep your IT assets more secure and make your business decisions faster.
Similar to other categories, some data security approaches wane in importance or disappear (such as manual data classification at cloud scale), some retain their importance from on-prem to cloud unchanged, while others transform (such as pervasive encryption with effective and secure key management).
The context for identity and access management (IAM) in the cloud is obviously different from your on-premises data center. In the cloud, every person and service has its own identity, and you want to be able to control access.
Within the cloud, IAM gives you fine-grained access control and visibility for centrally managing cloud resources. Your administrators authorize who can act on specific resources, giving you full control and visibility to manage cloud resources centrally. What’s more, if you have complex organizational structures, hundreds of workgroups, and a multitude of projects, IAM gives you a unified view into security policy across your entire organization.
With identity and access management tools, you’re able to grant access to cloud resources at fine-grained levels, well beyond project-level access. You can create more granular access control policies to resources based on attributes like device security status, IP address, resource type, and date and time. These policies help ensure that the appropriate security controls are in place when granting access to cloud resources.
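As a small illustration (a sketch, not a complete policy), an IAM binding can carry a condition written in Common Expression Language; the role, member, and expiry below are placeholders.

```python
# Sketch: an IAM policy binding with a CEL condition that grants a role only
# until a cutoff time. Role, member, and timestamp are illustrative
# placeholders; apply it through your normal setIamPolicy workflow.
conditional_binding = {
    "role": "roles/storage.objectViewer",
    "members": ["user:new-analyst@example.com"],
    "condition": {
        "title": "temporary-access",
        "description": "Access expires at the start of 2026",
        "expression": 'request.time < timestamp("2026-01-01T00:00:00Z")',
    },
}
```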
The concept of Zero Trust is strongly in play here. It’s the idea that implicit trust in any single component of a complex, interconnected system can create significant security risks. Instead, trust needs to be established via multiple mechanisms and continuously verified. To protect a cloud-native environment, a zero trust security framework requires all users to be authenticated, authorized, and validated for security configuration and posture before being granted or keeping access to cloud-based applications and data.
This means that IAM mental models from on-premises security mostly survive, but a lot of the underlying technology changes dramatically, and the importance of IAM in security grows significantly as well.
Clearly, cloud is much more than “someone else’s computer.” That’s why trust is such a critical component of your relationship with your chosen cloud service providers. Many cloud service providers acknowledge shared responsibility, meaning that they supply the underlying infrastructure but leave you responsible for many seemingly inscrutable security tasks.
With Google Cloud, we operate in a shared fate model for risk management in conjunction with our customers. We believe that it’s our responsibility to be active partners as our customers deploy securely on our platform, not delineators of where our responsibility ends. We stand with you from day one, helping you implement best practices for safely migrating to and operating in a trusted cloud.
We offer you several great resources to help you prepare for cloud migration, and guide you as you review your current security approaches for signs of on-prem thinking.
Listen to our podcast series where Phil Venables, Vice President, CISO at Google Cloud, and Nick Godfrey, Director, Financial Services Security & Compliance and member of Office of the CISO at Google Cloud, join me in a discussion on preparing for cloud migration (Podcast 1, Podcast 2). You can deepen your cloud native skills by earning a Professional Cloud Security Engineer certification from Google.
Read More for the details.
Amazon DynamoDB now makes it easier for you to migrate and load data into new DynamoDB tables by supporting bulk data imports from Amazon S3. Now, you can import data directly into new tables to help you migrate data from other systems, load test data to help you build new applications, facilitate data sharing between tables and accounts, and simplify your disaster recovery and business continuity plans.
Read More for the details.
Why are so many companies moving to the cloud? One reason we hear quite often is cost reduction. The elasticity of cloud services, or its ability to scale up or down as needed, means paying only for what you use. Right up there on the list of reasons is security. This is because developing in the cloud enables greater visibility and governance over your deployment’s resources and data. Customers also migrate to the cloud to increase reliability and performance (through cloud vendor-provided backups, disaster recovery, and SLAs). And finally, reduced maintenance and better manageability in the cloud eases the burden on IT operations teams, freeing them up for more strategic projects.
However, on this journey to realizing the cost savings, scalability, increased security, performance, reliability, and manageability of the cloud, it can sure feel like you’re sitting in the cockpit of a 747.
That’s because the cloud is complicated! The vastness of the cloud (hundreds of products within Google Cloud and counting!) can make it difficult to take full advantage of the wide range of opportunities and optimizations it brings. Constantly tuning your deployment can quickly become tedious work due to the sheer magnitude of options.
That’s why we want to make sure everyone using Google Cloud knows about Active Assist.
Active Assist brings together information from your workloads’ usage, logs, and resource configuration, and then uses machine learning and business logic to help proactively optimize deployments in exactly those areas that draw us to the cloud: cost, security, performance, reliability, manageability, and even sustainability.
Let’s start by taking a look at a few recommender tools for cost optimization that are part of the broad portfolio of Active Assist solutions.
The cloud makes it easy to spin up virtual machines and pay only for the time that resources are running. However, there can be cases when a quick prototype or experiment leaves machines running that are not actively in use or that may require less virtual CPUs and memory than allocated. Active Assist can help with both situations, proactively bringing visibility to cost optimization opportunities and minimizing the need for manual audits, a tedious task, especially when dealing with a multitude of projects.
Idle VM Recommender
Idle VM recommender identifies virtual machines that haven’t been used in the last 14 days and notifies you so you can shut them down or remove them from your project. Active Assist uses system metrics to classify the virtual machines as idle when they meet the following criteria:
the VM has had CPU utilization less than 0.03 for 97% of the time during the observation window
the VM has received less than 2600 bytes per second for 95% of the VM runtime, and
the VM has sent less than 1000 bytes per second 95% of the time.
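If you prefer to pull these findings programmatically, here is a hedged sketch using the Recommender API’s Python client; the project ID and zone are placeholders, and the recommender ID shown is the one used for idle Compute Engine instances (check the recommender list in the docs to confirm).

```python
# Sketch: list idle-VM recommendations for a project with the Recommender API.
# "my-project" and the zone are placeholders; verify the recommender ID
# against the current documentation before relying on it.
from google.cloud import recommender_v1

client = recommender_v1.RecommenderClient()
parent = (
    "projects/my-project/locations/us-central1-a/"
    "recommenders/google.compute.instance.IdleResourceRecommender"
)

for rec in client.list_recommendations(parent=parent):
    print(rec.name, rec.description)
```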
Active Assist can also help you identify other idle resources, including idle Cloud SQL instances, and idle resources associated with virtual machines, such as IPs, persistent disks, and custom images.
VM Machine Type Recommender
VM Machine Type recommender can help you optimize the resource utilization of your virtual machine instances by suggesting a machine type configuration that is more efficient for the workloads running on it. For example, if it identifies an application running on the virtual machine that has had a prolonged period of low memory usage, it will recommend switching to a machine type with less memory than currently allocated. If you decide to apply the recommendation, you can lower the costs associated with your virtual machine. The recommendations are generated using CPU and memory utilization metrics over the last eight days.
Predictive Autoscaler
Another solution, the predictive autoscaler, goes beyond recommendations and takes an active role in your deployment!
Predictive autoscaling uses machine learning capabilities to not just respond to capacity needs, but to forecast them. It creates VMs ahead of growing demand, allowing enough time for your application to initialize. Its forecasting model continuously learns and adapts to weekly and daily patterns for your deployment using your instance group’s CPU history. For example, if your app usually needs less capacity on the weekend, the forecast will capture that.
Unattended Project Recommender
We can’t wrap up this section without highlighting Unattended project recommender, which provides recommendations that help you discover, reclaim, and remove unattended projects. This helps optimize cost, security, and sustainability in one go! And what’s even more interesting is that it provides the carbon emission reduction impact for each “unattended project”, showing the emissions you will save if you delete the project and release all its resources.
Be sure to check other cost optimization recommenders in the portfolio:
BigQuery Slot recommender – helps you optimize BigQuery spend with slot reservations
Cloud SQL overprovisioned instance recommender – helps you resize overprovisioned SQL instances
Managed instance group machine type recommender – helps you rightsize the machine types of managed instance groups
Committed Use Discounts – helps you reduce costs through commitments
These cost-focused highlights are just a few of the solutions that make up the Active Assist portfolio. For the full list of solutions that address not just cost but also security, performance, reliability, manageability, and sustainability, see the list of all recommenders and the list of all insight types.
Active Assist surfaces its insights and recommendations in several ways:
Cloud Console UI through the Recommendations Hub and in-context (within specific services pages)
Exporting recommendations to BigQuery to get all your recommendations as a BigQuery dataset for trend analysis or building dashboards
API and CLI with the Recommender API (check out these best practices for more information)
Built-in solutions (such as predictive autoscaling or Quick Access)
Active Assist generates most insights and recommendations for free; however, check the pricing page to see if your support plan offers the BigQuery Export capability and to review API quotas.
To dip your toes in and try out some of the Active Assist solutions in your own deployment, we suggest checking out these recommenders first, since they are among the easiest to understand and apply to your deployments:
Idle persistent disk recommender
Cloud SQL idle instance recommender
And again, take a few minutes to explore the full list of all recommenders and of all insight types that make up Active Assist to find solutions that will help you on your unique cloud journey.
Read More for the details.
AWS App Mesh is now available in Asia Pacific (Osaka) and Asia Pacific (Jakarta) AWS Regions. AWS App Mesh is a service mesh that provides application-level networking to make it easier for your services to communicate with each other across multiple types of compute infrastructure. AWS App Mesh standardizes how your services communicate, giving you end-to-end visibility and options to tune for high-availability of your applications.
Read More for the details.
Amazon Chime SDK announces the release of concatenation, which makes it easy to create a single file by concatenating media chunks including audio, video, content, and transcriptions created by Amazon Chime SDK media capture. Amazon Chime SDK enables multi-party video sessions by letting developers add real-time voice and video to their web and mobile apps. With concatenation, customers can create a single file from a multi-party video session that includes audio, individual or composited video, content, and transcriptions.
Read More for the details.
Today, Amazon Personalize is excited to announce the Trending-Now recommender for the Video-on-Demand domain to highlight catalogue items that are gaining popularity at the fastest pace. Amazon Personalize is a fully managed machine learning service that makes it easy for customers to deliver personalized experiences to their users. Recommenders help reduce the time needed for you to deliver and manage these personalized experiences, and help ensure that recommendations are relevant to your users. User interests can change based on a variety of factors, such as external events or the interests of other users. It is critical to tailor recommendations to these changing interests to improve user engagement. With Trending-Now, you can surface items from your catalogue that are rising in popularity faster than other items, such as a newly released movie or show. Amazon Personalize looks for items that are rising in popularity at a faster rate than other catalogue items and highlights them to users to provide an engaging experience. Amazon Personalize automatically identifies trending items every 2 hours based on the most recent interactions data from your users.
Read More for the details.