Azure – Azure Support Plan Offer extended to December 31, 2023
The existing Azure Support offer is being extended. The promotion will run from July 1, 2023 to December 31, 2023.
Read More for the details.
Increase resiliency, availability, and flexibility by using more than 2 Standard Load Balancers per VM.
Read More for the details.
After decades of managing and securing identities in data centers, security and IT operations teams face new challenges when detecting identity compromise in their public cloud environments. Protecting cloud service accounts against leaked keys, privilege escalation in complex authorization systems, and insider threats are vital tasks when considering the threat landscape.
Security Command Center Premium, our built-in security and risk management solution for Google Cloud, has released new capabilities to help detect compromised identities and protect against risks from external attackers and malicious insiders.
In Google Cloud, there are three types of principals used to manage identities:
Google accounts: Human end users of Google Cloud
Service accounts: Used by applications and workloads to access resources
Groups: Collections of Google accounts or service accounts
Once a principal has been defined, the IT team needs to assign it the correct permissions to access Google Cloud resources. Permissions are assigned based on roles, which govern what resources may be accessed. In some cloud environments, it makes sense for one principal to legitimately act with the permissions of another principal. This is referred to as service account impersonation.
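Concretely, impersonation is enabled by granting the impersonating principal permission on the target service account, most commonly via the roles/iam.serviceAccountTokenCreator role. The policy binding below is a hypothetical illustration (the member identity is invented); only the role name is the standard one:

```json
{
  "bindings": [
    {
      "role": "roles/iam.serviceAccountTokenCreator",
      "members": ["user:alice@example.com"]
    }
  ]
}
```

With this binding on a service account, the user can mint short-lived tokens for it and act with its permissions, which is exactly the behavior the detections below monitor.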
Using Groups can make managing identities even more complex. For example, Groups can consist of users that are either inside or outside the organization. Further, Groups can contain other Groups, including those from outside the organization.
While this framework offers organizations flexibility and efficiency in managing cloud identities and resources, the potential complexity, especially at scale, can be a source of risk.
Identity and Access Management (IAM) policies govern how principals can access data, create new compute instances, and modify security settings in Google Cloud projects, folders, and organizations. Security Command Center Premium can detect risky IAM policy changes and behavior by principals that may indicate possible account takeover. Detection happens over the full attack chain, from initial credential access and discovery, through privilege escalation, and finally attacker persistence.
Security Command Center Premium is able to provide these differentiated detection capabilities because it is engineered into the Google Cloud infrastructure, and has first-party access to core platform services such as Google Groups. It operates within carefully reviewed security and privacy controls to keep Google Cloud customer data private.
Security Command Center Premium includes new detections for:
Excessive failed attempts: This detector analyzes the logs created when a principal attempts to access a resource and is denied per the policy. While some number of denied attempts is normal, Security Command Center looks for cases that are anomalously high. These anomalies potentially can indicate an adversary attempting to enumerate their privileges or explore a privileged environment.
Anomalous service account impersonation: Service account impersonation allows one principal to act with the permissions of another. While this is a normal approach to permission management for some organizations, this new Security Command Center detector is designed to identify anomalously long impersonation chains, which are often a sign of an adversary engaging in privilege escalation.
Dormant service account activity: Managing the sprawl of service accounts is difficult in any cloud environment, but Google Cloud helps with our policy intelligence service. In addition to making proactive recommendations about which service accounts are no longer in regular use, Security Command Center now alerts users about activity taken by a service account that has been dormant for a meaningful period of time.
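To make the idea behind the excessive-failed-attempts detector concrete, here is a deliberately simplified sketch. This is not Security Command Center's actual algorithm (which is proprietary and log-based); it merely flags principals whose permission-denied counts are far above the population median:

```python
from statistics import median

def anomalous_principals(denied_counts, multiplier=10):
    """Flag principals whose denied-access count far exceeds the median.

    denied_counts maps principal name -> number of permission-denied events
    in some time window. A production detector would baseline each principal
    against its own history; this sketch compares against the population.
    """
    if not denied_counts:
        return []
    med = median(denied_counts.values())
    floor = multiplier * max(med, 1)
    return [p for p, n in denied_counts.items() if n > floor]
```

A compromised service account enumerating its privileges might rack up hundreds of denials in a window where its peers generate a handful, which is the kind of outlier this shape of check surfaces.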
These new detectors augment Security Command Center’s existing defenses against common identity threats:
UEBA (User and Entity Behavior Analytics) new geography, user agent, and client library: Security Command Center can identify when a principal makes a change in Google Cloud configurations from a new geographical location, or with a new client library. While these are not conclusive proof of account compromise, they are signals that help an analyst understand the state of an account in the context of other findings pertaining to the same principal.
UEBA new API method: This Security Command Center feature is designed to detect over-permissioned accounts that deviate from their established pattern of behavior. It fires when a service account uses an API, or method on an API, that it had not used in its first seven days after creation.
Discovery: Service account self investigation: It is normal for a human user to look up their own IAM policy before embarking on a project so they can ensure they are equipped with necessary permissions. It is far less common for a service account to resolve its own IAM policy. When a service account resolves its own IAM policy, it is often a sign that an adversary has compromised a service account key and is engaging in discovery tactics to see their permissions.
Persistence: Grants and Groups: Adversaries can attempt to establish backup or hidden administrative accounts in Groups to gain persistence in a cloud environment. Security Command Center automatically inspects group membership and alerts if there are external members inside of groups that have received sensitive privileges.
Defense evasion: Access from anonymizing proxy: Security Command Center detects when principals make changes to cloud environments from anonymizing proxies. While it is not unusual to browse the web from anonymizing proxies, such as Tor exit nodes, there are few legitimate purposes to using these proxies when managing cloud environments. Security Command Center uses a list of anonymizing proxies that is regularly updated by our Google Cloud Threat Intelligence team.
Securing a cloud environment requires the ability to detect identity-based threats. Security Command Center continues to augment detection and remediation capabilities for Google Cloud customers. Go to your Google Cloud console to get started today with these new capabilities in Security Command Center.
Read More for the details.
Welcome to FinOps from the Field — a new blog series from experienced FinOps practitioners at Google Cloud designed to help answer the most common questions we get here about how you can apply FinOps to your Google Cloud spend. This content will be a combination of stories and experiences designed specifically to help customers on Google Cloud, but for more information about FinOps, check out the overview here or, for a cloud agnostic view, see www.finops.org.
In a world where cloud services have become increasingly complex, how do you take advantage of their features without nasty bill shock at the end?
The State of FinOps 2023 survey cites “Organizational Adoption of FinOps” as the second-biggest challenge businesses face in their FinOps practice. Given the projected billions of dollars in cloud waste, it’s no surprise that Flexera’s 2023 report shows managing cloud spend was the primary challenge for 82% of respondents.
Here at Google Cloud, Google Professional Services works with both large and small organizations to build FinOps capability through a FinOps Assessment Workshop. Having worked with over 100 customers in these workshops since 2020, we have iterated on the workshop based on customer feedback. By establishing a concrete plan (referred to as the FinOps Roadmap), key stakeholders within a customer’s organization can answer the difficult question: “Where do we start?”
The FinOps Roadmap is a definitive document of where various business units are assessed to be right now in seven key capabilities, a prioritized list of what the future state should be, and next steps inside each specific capability.
Today’s blog post will go through the steps that Google PSO takes customers through, focusing on our initial discovery and recommendation workshop. By the end of it, you’ll be able to use the information and tools below to build your own FinOps roadmap.
Before the workshop happens, the process starts with asking the customer to identify specific stakeholders who have or will have cloud spend. These folks can come from any business area. We recommend two to three individuals from the following areas as a good starting point:
FinOps/CCoE (if it exists!)
Engineering
Platform
Business
Finance
Usually 10-15 stakeholders is the right number — this ensures that there’s adequate representation from each of the key business units, helps break down silos, and keeps the flow of the workshop on track.
The goal of the workshop is to establish how mature the current FinOps capability is right now, and where it needs to be to support effective scaling of cloud usage, with the FinOps roadmap as the agreed output.
What is FinOps, really? Hint: It’s not just cost optimization!
The workshop begins by ensuring all workshop participants have a shared understanding of what FinOps actually is. There are many misconceptions and definitions out there, but having one definition helps ensure everyone present knows why this is important for their organization.
FinOps is an operational framework and cultural shift that brings technology, finance, and business together to drive financial accountability and accelerate business value realization through cloud transformation.
It’s a model that helps you understand your cloud spend, maximize the sustainable spend that generates value, and gives you a robust structure to reduce or eliminate waste.
Running on the cloud is an investment for an organization’s business, and shifting the mindset from how much it costs to how much value it provides allows them to create a roadmap with more than cost-cutting in mind. Reducing waste is important, but not the only purpose of a successful FinOps strategy.
Following this, the group undergoes an assessment which is designed to establish their current maturity state in seven capabilities:
Cost allocation
Reporting
Pricing efficiency
Architectural efficiency
Training and enablement
Incentivization
Accountability
We consider these seven capabilities as cornerstones for a minimal viable FinOps practice, and the Crawl, Walk, Run maturity model outlines the next stage an organization can aim for.
These can be adapted and personalized for your own organization, and we define each of these (always starting from a shared understanding) and provide a clear example of what would be considered “Crawl, Walk, Run.”
Let’s review each capability:
1) Cost allocation: The process by which cloud costs are assigned an owner, using labels or tags to make spend identifiable either for reporting (“showback”) or charging (“chargeback”).
Cost allocation is often seen as one of the hardest problems for organizations to solve, and many organizations in the workshops rate themselves as Crawl due to the complexity and pressure of getting it right. We often recommend that customers start small, build confidence in their tooling and education, and iterate on their allocation strategy often.
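As a toy illustration of showback (the field names below are invented for the example, not the real billing-export schema), allocation by label essentially reduces to a group-by over cost rows:

```python
from collections import defaultdict

def showback(cost_rows):
    """Sum costs per owning team, keyed by a 'cost-center' label.

    Rows without the label fall into an 'unallocated' bucket -- a useful
    metric in itself: shrinking it is a common first FinOps roadmap item.
    """
    totals = defaultdict(float)
    for row in cost_rows:
        owner = row.get("labels", {}).get("cost-center", "unallocated")
        totals[owner] += row["cost"]
    return dict(totals)
```

Starting small, as recommended above, might mean labeling only a handful of projects and watching the "unallocated" bucket shrink over successive iterations.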
2) Reporting: The process by which allocated cloud spend data is made visible and available, so users can see their own and other departments’ spend and answer common questions (e.g., “Which application spent the most last month?”).
Our finding across multiple customers is that many are unaware of the Google Cloud reporting tools available in the console, or do not grant their teams access to them. More often than not this is accidental, and education and a review of the IAM permission model help them securely provide access. The built-in tooling can be a great starting point for teams to understand what data is useful and where they need to build customized views.
3) Pricing efficiency: A group of levers to pull to change the end cost of the cloud based on market conditions or special offers (discounts, committed usage, etc.).
Pricing efficiency is an interesting topic for most organizations and the most likely place for “shadow FinOps” as many teams have started purchasing discounts on their own. Many organizations find this a place for “low-hanging fruit” and add a high priority item on their roadmap to find and consolidate discounts across the organization for the best deal.
4) Architectural efficiency: The extent to which applications are architected for the cloud and make use of the benefits of Cloud pricing.
An interesting finding in workshops is that architectural efficiency is usually forgotten as a FinOps capability because customers report that the move from on-prem to cloud has a larger perceived ROI than any individual application modernization. However, once the initial migration is complete, organizations start to prioritize architectural efficiency and most consider their capability as Crawl.
5) Training and enablement: The extent to which individuals in the organization are upskilled on FinOps.
Training and enablement and incentivization (see below) tend to correlate strongly in maturity with customers. A key roadmap item usually includes a step to consider investment in both capabilities at once to encourage rapid up-skilling.
6) Incentivization: The extent to which employees are motivated to make cost-conscious choices.
Incentivization is a challenge for many organizations as reported both in PSO workshops and in the wider FinOps community. Here, an outcome to review any budgets for success is often captured as a key next step to ensure continuing funding for incentives as teams up-skill.
7) Accountability: The extent to which a strong culture of ownership regarding cloud spend exists.
There are many other capabilities not covered above (e.g., creating unit metrics, accurate forecasting, internal and external benchmarking, etc.), but these can be established later once foundations are present.
Once all the capabilities have been defined, and any questions clarified about the maturity states, participants are then asked to rate in each of these areas:
Where is the organization right now? Crawl, Walk or Run?
What’s the 12 month target? What’s realistic? Crawl, Walk or Run?
What are the critical success factors to be able to get there?
(A brainstorm of ideas that are then inputs to the roadmap.)
After posing these questions, a lively discussion likely takes place, and the teams walk away with a better understanding of how each capability is seen within different areas of the business, where it is thriving, and where it may need more guidance or investment.
Customers often use this part of the discussion to prioritize concrete next steps, culminating in the Roadmap output: Which capabilities are the most urgent, what do individual stakeholders need, and how can the group work in sync toward the defined maturity level of each capability?
If you have a small cloud footprint, and a small set of cloud users, you may find the answers to the above questions don’t require much discovery, discussion or effort to answer.
However, the larger your organization is, the more valuable this exercise is, allowing you to identify different pockets of maturity across your organization. For example, your data and analytics team may have a strong understanding of their costs and how to optimize them, whilst your product teams might have no way of viewing their spend at all!
The key benefit of assessing your capabilities as a group is that you’ll end up with:
A common understanding across your organization of where you stand against key FinOps capabilities
An aligned view across many teams of what should be the next steps
For example, an organization may be currently in the ‘Crawl’ stage of Allocation. Cross-functional stakeholders identify this as a priority for the next 12 months, with an aim to reach ‘Walk’ and to build a robust tagging strategy to cover 50% of their cloud estate.
In this scenario, their FinOps roadmap includes the following steps usually undertaken by a FinOps team:
Discuss with finance what the current chargeback model looks like and where there are gaps
Discuss what tags would be meaningful for each business unit and how this relates to the model identified
Discuss with platform teams where tags should be applied (project/resource)
Review the technology that allows the platform team to automate this
Provide education and training to engineers to bring tags into their workflow
It’s always difficult to predict the exact output of this workshop, but visibility of spend (via cost allocation and reporting) and defining who is formally responsible for cloud FinOps, are the most common areas of initial focus. A large telco company undertook the FinOps assessment workshop and found some great insights. You can read more about their journey here.
A FinOps roadmap allows the organization to work in an aligned manner, step-by-step to the next stage of maturity.
Now that you’ve taken the pressure off of getting everything right the first time and built your roadmap, what’s next?
Take a deep breath!
Review the steps that each team needs to take in that capability to reach the intended maturity state.
Use the individual stakeholders from the meeting as ‘FinOps champions’ in specific teams that need to implement the agreed actions.
Provide feedback on the new features of the capability internally — celebrate the success!
Review, and repeat!
One of the best things about FinOps is its iterative nature — no organization could or should be at the “Run” maturity state when they start out, or even have all capabilities as ‘Run’ as an end goal.
Think of it as a circle: once you’re done with your first cycle, you’re right back at your starting point. This time you hold more knowledge, you’ll develop and further expand your FinOps strategy and have the ability to innovate and thrive.
Don’t panic if this all sounds like a lot, as picking a single area to focus on will bring you further insights into your cloud spend.
In six months, try reviewing your capabilities again. Are you on track to hit the 12-month targets you set as a group during the discussion?
Over time, you’ll build up confidence in your FinOps strategy, but we hope this article has given you some pointers on where to start. Join us next time to further your FinOps journey on Google Cloud.
Attributions:
State of FinOps by FinOps Foundation
Adopting FinOps by FinOps Foundation
Flexera 2023 State of the Cloud Report
Read More for the details.
There are countless use cases for capturing, storing, and classifying unstructured image data — think social media analytics to find missing people, image analytics for tracking road traffic, or media analytics for e-commerce recommendations, to name a few. Most organizations are not able to be fully data-driven because the majority of data generated these days is unstructured, and when it comes to large-scale analytics across data types and formats, enterprise applications face several limiting factors: 1) data storage and management, 2) infrastructure management, and 3) availability of data science resources. With BigQuery’s new unstructured data analysis feature, you can now store, process, analyze, model, and predict with unstructured data, and combine it with structured data in queries. Best of all, you can do all of this in no-code, SQL-only steps.
In this blog, we will discuss the use case of storing and analyzing images of yoga poses in BigQuery, and then implement a classification model with BigQuery ML to label the poses using only SQL constructs.
BigQuery is a serverless, multi-cloud data warehouse that can scale from bytes to petabytes with zero operational overhead. This makes it a great choice for storing ML training data. In addition, the built-in BigQuery Machine Learning (BQML) and analytics capabilities allow you to create no-code predictions using just SQL queries. And you can access data from external sources with federated queries, eliminating the need for complicated ETL pipelines. You can read more about everything BigQuery has to offer on the BigQuery page.
So far, we have known BigQuery as a fully managed cloud data warehouse that helps users analyze structured and semi-structured data. But now:
BigQuery can perform analytics and ML on unstructured data as well
We can use SQL queries to run analysis and ML on images, videos, audio, etc. at scale, without writing additional code
We can combine structured and unstructured data as if they all existed together in a table
We will discuss these in our Yoga Pose Classification use case covered in the next section.
With first-of-its-kind support for “structured” queries over image data, we can now predict results using machine learning classification models in BigQuery ML. For easy understanding, I have narrowed the process down to five steps.
For our use case of image detection of five Yoga poses, I have used a publicly available dataset and you can access the dataset from this repo. The Yoga poses we are identifying are limited to Downdog, Goddess, Plank, Tree and Warrior2.
Before you begin with the BigQuery Dataset creation, make sure to select or create a Google Cloud Project and check if billing is enabled on the project. Enable BigQuery API and BigQuery Connection API.
a. Create the dataset “yoga_set” using the steps below
b. A BigLake connection lets us connect to an external data source — in our case, Cloud Storage holding the image data — while retaining fine-grained BigQuery access control and security. We will use this connection to read objects from Cloud Storage. Follow the steps below to create the BigLake connection
Click ADD DATA on the Explorer pane of the BigQuery page:
Click Connections to external data sources and select BigLake and Remote functions option:
Provide a connection name and create the connection. Remember to take note of the service account ID that is created in this process.
We are going to use a Cloud Storage bucket to contain the image files of Yoga poses that we want to create the model on.
a. Go to Cloud Storage Buckets page and click CREATE
b. On the Create a bucket page, enter your bucket information, making sure the bucket is in the same region as the dataset and connection discussed in the steps above, then create it
c. Once the bucket is created, upload your images (through the console, Cloud Shell commands, or programmatically) and grant the connection’s service account (which we saved earlier) the necessary permissions to access them
> export sa="yourServiceAccountId@email.address"
> gsutil iam ch serviceAccount:$sa:objectViewer "gs://<<bucket>>"
Create an external object table from BigQuery to access the unstructured data in the bucket using the connection we created. Run the below CREATE SQL from BigQuery editor:
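A sketch of such a statement follows — the dataset, connection, and bucket names are placeholders, so consult the codelab for the exact query to run:

```sql
-- Object table over the images in Cloud Storage, read via the BigLake connection
CREATE OR REPLACE EXTERNAL TABLE `yoga_set.yoga_poses`
WITH CONNECTION `us.yoga-pose-conn`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://<<bucket>>/*.jpg']
);
```

The `object_metadata = 'SIMPLE'` option is what makes this an object table over unstructured files rather than a regular external table over rows.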
External Table is created as shown below:
Let’s quickly query a pose from the newly created table:
As you can see in the screenshot below, you can create and operate on unstructured images as if they are structured data:
Now let’s export the query result from above into a small Python snippet to visualize the result:
Now that we have created the external table and accessed images from Cloud Storage using only SQL queries, let’s move on to the next section: creating the classification model.
For this implementation, we are going to use the pre-trained ResNet 50 Model to run inference on the object table we just created. The ResNet 50 model analyzes image files and outputs a batch of vectors representing the likelihood that an image belongs to the corresponding class (logits).
Before moving on to this step, make sure you have all the necessary permissions in place. Then follow the below steps:
a. Download the model from this location and save it locally
b. It should unpack into saved_model.pb and a variables folder
c. Upload both (the file and the folder) into the bucket we created in the previous section
Once this step is completed, your model related files should be present in the same bucket as your images as seen in the image above.
In this step, we are going to load the model into the same BigQuery dataset as the external table we created earlier and apply it to the images we have stored in Cloud Storage.
a. From BigQuery Editor, run the following SQL statement
Once the execution is completed, you will see the model listed in your Dataset section in BigQuery.
b. Inspect the model to see its input and output fields. Expand the dataset and click on the model we just created “yoga_poses_resnet”. Click the Schema tab:
In the Labels section, you see the “activation_49” field that represents the output field. In the Features section, you can see “input_1” that represents the field that is expected to be input to the model. You will reference “input_1” in your inference query (or prediction query) as the field you are passing in for your “test” data.
c. Infer your Yoga Pose!
Let’s use the model we just created to classify our test image data. Make sure you have some test images (Yoga poses) identified from your Cloud Storage bucket that made it into the External Table when we created it. We are going to selectively query for those test images in BigQuery to perform the inference using the BQML model we just created. Use the below query to trigger the test.
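The inference query looks roughly like the sketch below (the dataset, table, and model names are the placeholders used earlier in this walkthrough):

```sql
SELECT *
FROM ML.PREDICT(
  MODEL `yoga_set.yoga_poses_resnet`,
  (SELECT uri, ML.DECODE_IMAGE(data) AS input_1
   FROM `yoga_set.yoga_poses`
   WHERE uri LIKE '%00000097.jpg')
);
```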
In the above query, we select one test image that is identified to contain a specific URI value (00000097.jpg) in the external table. Also, the SELECT part uses the ML.DECODE_IMAGE construct as field “input_1” in order for the ML.PREDICT function to work.
Once execution is completed, you will see the result as shown below:
For those who know the ResNet model in depth, this output should be enough to understand the classification. For the rest of us, let’s code a small snippet to understand it visually.
d. Flattening the result
One way of visualizing the above output is to flatten the activation_49 field values using BigQuery SQL’s UNNEST construct. Please refer to the query below for flattening the result from the earlier step. If you want to further textually label the resulting class, you can introduce the logic in place of the placeholder <<LABEL_LOGIC>> in the query (uncomment when using).
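A sketch of that flattening query is below (same placeholder names as before; `<<LABEL_LOGIC>>` stays a placeholder for your own class-to-label mapping):

```sql
SELECT
  uri,
  index AS class_index,
  score
  -- , <<LABEL_LOGIC>> AS class_label  -- uncomment and supply your own mapping
FROM ML.PREDICT(
  MODEL `yoga_set.yoga_poses_resnet`,
  (SELECT uri, ML.DECODE_IMAGE(data) AS input_1
   FROM `yoga_set.yoga_poses`
   WHERE uri LIKE '%00000097.jpg')),
UNNEST(activation_49) AS score WITH OFFSET index
ORDER BY score DESC;
```

`WITH OFFSET` keeps each logit paired with its class index, so the top-scoring rows identify the predicted classes.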
Without the class labeling logic, below is the output to the query:
You can read further about the model and apply the logic that works best with your data and the model output.
e. Visualizing the inference
Finally, a quick Python snippet to visualize the result from the classification! Export the above query result to a CSV file and reference it in the Python code.
The above image output refers to the Yoga Pose “Downward Dog” which is exactly the same test input we passed into the ML.PREDICT query for classification using BQML!
Lastly, my favorite part of this implementation is unifying fields from my structured relational table with this unstructured image data. I created a structured BigQuery table in the same dataset as the external table to hold the pose and its health-related data.
The image above represents the schema of the structured data table named “yoga_health” and the fields are pose, focus, health_benefit and breath. The query below joins both Structured and Unstructured data:
Below is the result:
Note: All of the queries we have covered in this blog can be run directly from your Python Notebook using the BigQuery Magic commands.
That’s it! We have successfully stored and queried unstructured data in BigQuery, created a Classification Model using BQML and predicted test yoga poses with the model. If you would like to implement this, get started with your Google Cloud project and follow the codelab. Also, if you would like to learn more about databases or other end to end application implementations in Google Cloud, please head to my blogs. For feedback and questions, you can stay in touch with me here.
Read More for the details.
Unstructured data such as images, speech and textual data can be notoriously difficult to manage, and even harder to analyze. The analysis of unstructured data includes use cases such as extracting text from images using OCR, sentiment analysis on customer reviews and simplifying translation for analytics. All of this data needs to be stored, managed and made available for machine learning.
The new BigQuery ML inference engine empowers practitioners to run inferences on unstructured data using pre-trained AI models. The results of these inferences can be analyzed to extract insights and improve decision making. This can all be done in BigQuery, using just a few lines of SQL.
In this blog, we’ll explore how the new BigQuery ML inference engine can be used to run inferences against unstructured data in BigQuery. We’ll demonstrate how to detect and translate text from movie poster images, and run sentiment analysis against movie reviews.
Google Cloud is home to a suite of pre-trained AI models and APIs. The BigQuery ML inference engine can call these APIs and manage the responses on your behalf. All you have to do is define the model you want to use and run inferences against your data. All of this is done in BigQuery using SQL. The inference results are returned in JSON format and stored in BigQuery for analysis.
Traditionally, working with AI models to run inferences required expertise in programming languages like Python. The ability to run inferences in BigQuery using just SQL can make generating insights from your data using AI simple and accessible. BigQuery is also serverless, so you can focus on analyzing your data without worrying about scalability and infrastructure.
The inference results are stored in BigQuery, which allows you to analyze your unstructured data immediately, without the need to move or copy your data. A key advantage here is that this analysis can also be joined with structured data stored in BigQuery, giving you the opportunity to deepen your insights. This can simplify data management and minimize the amount of data movement and duplication required.
For now, the BigQuery ML inference engine can be used with these pre-trained Vertex AI models:
Vision AI API: This model can be used to extract features from images managed by BigQuery Object Tables and stored on Cloud Storage. For example, Vision AI can detect and classify objects, or read handwritten text.
Translation AI API: This model can be used to translate text in BigQuery tables into over one hundred languages.
Natural Language Processing API: This model can be used to derive meaning from textual data stored in BigQuery tables. For example, features like sentiment analysis can be used to determine whether the emotional tone of text is positive or negative.
So, how does this work in practice? Let’s look at an example using images of movie posters.
We will define our pre-trained models for Vision AI, Translation AI and NLP AI in BigQuery ML.
We’ll then use Vision AI to detect the text from some classic movie poster images.
Next, we’ll use Translation AI to detect any foreign posters and translate them to a language of our choosing – English in this case.
Finally, we’ll combine our unstructured data with structured data in BigQuery.
We’ll use the extracted movie titles from our movie posters to look up the viewer reviews from the BigQuery IMDB public dataset. We can then run sentiment analysis against these reviews using NLP AI.
Note: The BigQuery ML inference engine is currently in Preview. You will need to complete this enrollment form to have your project allowlisted for use with the inference engine.
We’ll give examples of the BigQuery SQL needed to define your models and run your inferences. You’ll want to check out our notebook for a detailed guide on how to get this up and running in your Google Cloud project.
You will need to enable the APIs listed below, and also create a Cloud resource connection to enable BigQuery to interact with these services.
API: Model name
Vision AI API: cloud_ai_vision_v1
Translation AI API: cloud_ai_translate_v3
Natural Language API: cloud_ai_natural_language_v1
You can then run the CREATE MODEL query for each AI service to create your pretrained models, replacing the model_name as required.
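As a sketch of that statement (the project, dataset, and connection names below are placeholders, not values from the original notebook), a remote model for the Vision AI service can be defined like this:

```sql
-- Placeholder project/dataset/connection names; adjust to your environment.
CREATE OR REPLACE MODEL `my-project.my_dataset.vision_model`
REMOTE WITH CONNECTION `my-project.us.my-connection`
OPTIONS (REMOTE_SERVICE_TYPE = 'cloud_ai_vision_v1');
```

Repeating the statement with `cloud_ai_translate_v3` and `cloud_ai_natural_language_v1` creates the translation and NLP models.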
You will need to create an object table for your images in Cloud Storage. This read-only object table provides metadata for images stored in Cloud Storage:
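A minimal sketch of the object table definition, assuming a placeholder bucket `my-bucket` and the same placeholder connection:

```sql
-- Read-only object table over images stored in Cloud Storage.
CREATE EXTERNAL TABLE `my-project.my_dataset.movie_posters`
WITH CONNECTION `my-project.us.my-connection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://my-bucket/movie-posters/*']
);
```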
To detect the text from our posters, you can then use ML.ANNOTATE_IMAGE and specify the text_detection feature.
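A sketch of that call, assuming a remote Vision model `my-project.my_dataset.vision_model` and an object table `my-project.my_dataset.movie_posters` (placeholder names):

```sql
SELECT *
FROM ML.ANNOTATE_IMAGE(
  MODEL `my-project.my_dataset.vision_model`,
  TABLE `my-project.my_dataset.movie_posters`,
  STRUCT(['text_detection'] AS vision_features)
);
```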
A JSON response will be returned to BigQuery that includes the text content and language code of the detected text. You can parse the JSON to a scalar result using dot notation.
ML.TRANSLATE can now be used to translate the foreign titles we’ve extracted from our images into English. You just need to specify the target language and the table of the movie posters for translation:
Note: The table column with the text you want to translate must be named text_content:
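A sketch of the translation call, assuming a remote model `my-project.my_dataset.translate_model` and a hypothetical intermediate table `poster_titles` holding the text extracted in the previous step (both names are placeholders):

```sql
SELECT *
FROM ML.TRANSLATE(
  MODEL `my-project.my_dataset.translate_model`,
  -- The text to translate must be exposed in a column named text_content.
  (SELECT title AS text_content, uri
   FROM `my-project.my_dataset.poster_titles`),
  STRUCT('translate_text' AS translate_mode, 'en' AS target_language_code)
);
```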
The table of results includes JSON that can be parsed to extract both the original language and the translated text. In this case, the model has detected that the title text is in French and has translated it to English.
You can easily join inference results from your unstructured data with other BigQuery datasets to bolster your analysis. For example, we can now join the movie titles we extracted from our posters with thousands of movie reviews stored in BigQuery’s IMDB public dataset `bigquery-public-data.imdb.reviews`.
You can use ML.UNDERSTAND_TEXT with the analyze_sentiment feature to run sentiment analysis against some of these reviews to determine whether they are positive or negative:
Note: The table column with the text you want to analyze must be named text_content:
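A sketch of the sentiment call, assuming a remote model `my-project.my_dataset.nlp_model` (a placeholder name) and that the public dataset exposes each review in a column named `review`:

```sql
SELECT *
FROM ML.UNDERSTAND_TEXT(
  MODEL `my-project.my_dataset.nlp_model`,
  -- The text to analyze must be exposed in a column named text_content.
  (SELECT review AS text_content
   FROM `bigquery-public-data.imdb.reviews`
   LIMIT 100),
  STRUCT('analyze_sentiment' AS nlu_option)
);
```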
The JSON response will include a score and magnitude. The score indicates the overall emotion of the text, while the magnitude indicates how much emotional content is present.
To wrap up, we’ll compare the average review score of the 1925 Lost World movie to other movies released that year to see which was more popular. This can be done using familiar SQL analysis:
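A sketch of that comparison, assuming a hypothetical `sentiment_results` table built from the ML.UNDERSTAND_TEXT output, holding one row per review of a 1925 release with the extracted `title` and a parsed sentiment `score`:

```sql
SELECT
  title,
  AVG(score) AS avg_sentiment,
  COUNT(*)   AS review_count
FROM `my-project.my_dataset.sentiment_results`
GROUP BY title
ORDER BY avg_sentiment DESC;
```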
It looks like The Lost World narrowly missed out on the top spot to Sally of the Sawdust!
Check out our notebook for a step by step guide on using the BQML inference engine for unstructured data in Google Cloud. You can also check out our Cloud AI service table-valued functions overview page for more details. Curious about pricing? The BQML Pricing page gives a breakdown of how costs are applied across these services.
Read More for the details.
AWS Application Migration Service (AWS MGN) is now available in AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions. You can now use Application Migration Service to migrate and modernize your applications in these AWS Regions.
Read More for the details.
Starting today, AWS Pricing Calculator lets you re-access a previously saved estimate with pricing as of the date you created it, while giving you the option to update the estimate to reflect the latest pricing. The new experience makes it possible to track changes to your cost estimates over time. When updating a previously saved estimate to reflect the latest pricing, you may have to enter inputs for specific services to account for changes made to service estimation logic or the underlying pricing model since the estimate was created. To help you stay informed about the factors behind these changes, we are also launching a Service updates page that records significant updates to the computation logic of services within the calculator.
Read More for the details.
AWS Service Catalog now supports granting portfolio access to IAM principal (user, group or role) names with wildcards, such as ‘*’ or ‘?’. This enables flexible and efficient sharing of infrastructure-as-code templates for customers using wildcard patterns to cover multiple IAM principal names at a time. Previously, customers had to use the exact IAM principal names to share a portfolio. Customers using AWS IAM Identity Center (successor to AWS Single Sign-On) can now quickly grant their workforce users access to Service Catalog portfolio products.
Read More for the details.
Today, AWS announced the launch of a new customization feature for the Amazon VPC IP Address Manager (VPC IPAM) dashboard that allows you to adapt the dashboard to your specific needs. With this feature, you can rearrange and resize widgets to your preferences. For example, if you want to see the overlapping CIDRs widget first, you can move it to the top of the dashboard and resize it as needed. The dashboard also includes new insights into your top VPCs and subnets by allocated IP count, so you can take action before any resource runs out of IP addresses. New, intuitive graphs also make it easy to spot CIDRs that overlap or that are noncompliant with your allocation rules.
Read More for the details.
Starting today, you can enable a new Managed Domain List on Amazon Route 53 Resolver DNS Firewall, to block domains identified as low-reputation or that are known or suspected to be malicious by Amazon GuardDuty’s threat intelligence. This means that customers using GuardDuty can now block domains using the same GuardDuty threat intelligence used to monitor and alert you on potential DNS threats for your AWS accounts today.
Read More for the details.
AWS Elemental MediaTailor now recognizes overlay ad-break signals as valid ad break types in over-the-top (OTT) streams. Non-linear overlay ads can be rendered in-line with the program content on the client device using metadata returned by MediaTailor’s existing interface to an ad decision server (ADS) and the VAST standard.
Read More for the details.
AWS announces the general availability of a new AWS Snowball Edge Storage Optimized device with higher storage capacity. The new Snowball Edge Storage Optimized device increases storage capacity on Snow devices from 80TB to 210TB with high performance NVMe storage, enabling customers to simplify multi-petabyte data migrations from on-premises locations to AWS.
Read More for the details.
Azure Files has increased the root directory handle limit per share from 2,000 to 10,000 for standard and premium file shares.
Read More for the details.
Zone Redundant Storage (ZRS) for Azure Disks is now available on Azure Premium SSD and Standard SSD in Brazil South, UK South, East US, East US 2, and South-Central US.
Read More for the details.
Resource-centric log query support is available in the latest stable release of the Azure Monitor Query client libraries. Query logs by Azure resource ID using .NET, Go, Java, JavaScript, or Python.
Read More for the details.
Today, AWS announces the Amazon EventBridge open-source connector for Apache Kafka Connect. This connector allows you to integrate EventBridge into Kafka Connect environments to deliver events from Kafka topics to EventBridge Event Buses. It also includes useful features, such as schema registry support for Avro, Protobuf, and JSON, consuming from multiple Kafka topics, and IAM role-based authentication.
Read More for the details.
Welcome to the second Cloud CISO Perspectives for May 2023. I hope you all enjoyed our previous newsletter from my Office of the CISO colleague MK Palmore, on Google’s new cybersecurity certification and how it can help prepare aspiring cybersecurity experts for their next career steps.
Before I jump into my column today, I’d like to encourage everyone to sign up for our annual Security Summit, coming in just a few weeks on June 13-14. This year, we’ll explore the latest technologies and strategies from Google Cloud, Mandiant, and our partners to help protect your business, your customers, and your cloud transformation. You can register for the broadcast in your choice of two time zones here. We hope to see you there.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
Today, I’d like to talk about one of the more complex and important topics in our current cloud discourse: digital sovereignty. Simply put, digital sovereignty is an organization’s intention to retain control over their data and how that data is stored, processed and managed when using third-party services — including cloud providers.
Organizations should feel that they have control over their data. When those controls have been designed well, they should encourage even more organizations to use the cloud and benefit from all that the cloud offers.
Digital sovereignty is a subject that we feel strongly about, and over the past few years, Google Cloud has worked extensively with customers, partners, policy makers, and governments to understand their evolving sovereignty requirements.
We take an expansive view of sovereignty requirements, encompassing data, operations, and software. We also see control of encryption as vital to addressing these requirements and have engineered leading encryption solutions. Along with offering these solutions in our cloud, we have led the industry in establishing partnerships with trusted local partners to address concerns about working with foreign providers.
Google Cloud has been leading dialogue and developing digital sovereignty solutions since 2019. Our ongoing discussions in the market have taught us that designing a digital sovereignty strategy that balances control and innovation is challenging for four main reasons:
Foundational concepts are not always well-understood, including regulatory requirements, legal safeguards, and risk management.
Many organizations struggle to articulate their specific requirements, particularly when it comes to how sovereign strategies enable digital transformation.
Choosing the best technologies and solutions to meet those requirements can be difficult.
A shortage of advisory capacity and expertise in the market can make these challenges even harder to overcome.
While digital sovereignty challenges cross boundaries and oceans, we’ve focused many of our initial efforts in Europe. This has resulted in our “Cloud. On Europe’s Terms” initiative and a broad portfolio of Sovereign Solutions we have already brought to market to help support customers’ current and emerging needs as they bring more workloads to the cloud.
We’ve also developed the Digital Sovereignty Explorer, a tool designed to help you make progress on your understanding of key concepts and potential solutions, which we introduced in March. Initially focused on the needs of European organizations, the Explorer is an online, interactive tool that takes individuals through a guided series of questions about their organizations’ digital sovereignty requirements.
One benefit of our early digital sovereignty investments is that they have helped strengthen other areas we’re focused on. Confidential Computing, for example, has proven to be a helpful additional control for organizations implementing digital sovereignty strategies, providing an encryption capability and protection for data-in-use, where encryption keys are not accessible to the cloud provider.
Innovating to address digital sovereignty requirements is important to advance digital transformation and technological creativity, and to join in the benefits of the cloud. We’re going to continue to engage with customers, our partners, governments, and regulators to deliver novel solutions that meet local requirements.
Here are the latest updates, products, services, and resources from our security teams so far this month:
Get ready for Google Cloud Next: Discounted early-bird registration for Google Cloud Next ‘23 is open now. This year’s Next comes at an exciting time, with the emergence of generative AI, breakthroughs in cybersecurity, and more. It’s clear that there has never been a better time to work in the cloud industry. Register now.
Partnering with Health-ISAC to strengthen the European healthcare system: We’re growing our relationship with Health-ISAC to include CISOs and security leaders in Europe, the Middle East, and Africa (EMEA), starting with a joint 17-city tour across the region, as part of its European Healthcare Threat Landscape Tour. Read more.
4 ways to improve cybersecurity from the boardroom: Here are four ways that boards and cybersecurity teams can keep their organizations more secure and reduce risk. Read more.
How does Google protect the physical-to-logical space in a data center? Each Google data center is a large and diverse environment of machines, networking devices, and control systems. In these complex environments, the security of your data is our top priority. Learn how we keep it secure. Read more.
Introducing reCAPTCHA Enterprise Fraud Prevention: We are pleased to announce the general availability of reCAPTCHA Enterprise Fraud Prevention, a new product that uses Google’s own fraud models, machine learning, and intelligence from protecting more than 6 million websites to help stop payment fraud. Read more.
How Apigee can help government agencies adopt Zero Trust: Securely sharing data is critical to building an effective government application ecosystem. Rather than building new applications, APIs can enable government leaders to gather data-driven insights within their existing technical environments, which Google Cloud’s Apigee can help achieve. Here’s how.
New OT malware possibly related to Russian emergency response exercises: Mandiant identified COSMICENERGY, a novel operational technology (OT) and industrial control system (ICS)-oriented malware possibly related to Russian emergency response exercises, which has demonstrated a cyber impact to physical systems. Read more.
Don’t @ me: URL obfuscation through schema abuse: Mandiant has found attackers distributing multiple malware families by obfuscating the end destination of a URL by abusing the URL schema. This technique can increase the likelihood of a successful phishing attack. Read more.
A requirements-driven approach to cyber threat intelligence: Mandiant’s latest report on applying threat intelligence outlines what it means to be requirements-driven in practice, offering actionable advice on how intelligence functions can implement and optimize such an approach within their organizations. Read more.
Cloudy with a chance of bad logs: As organizations increasingly move to cloud and security teams struggle to keep up, Mandiant provides a hypothetical scenario of a cloud platform compromise with multiple components that would require investigation. Read more.
We launched a weekly podcast focusing on Cloud Security in February 2021. Hosts Anton Chuvakin and Timothy Peacock chat with cybersecurity experts about the most important and challenging topics facing the industry today. Earlier this month, they discussed:
The good, the bad, and the epic possibilities of threat detection at scale: Good detection is hard to build, whether defined for a rule or a piece of detection content, or for a program at a company. Reliably producing good detection content at scale is even trickier, so we chatted with Jack Naglieri, founder and CEO of Panther Labs. Listen here.
Firewalls in the cloud: Nevermind the difference between firewalls and firewalling (although we discuss that, too) — does the cloud even need firewalls? Our own senior cloud security advocate, Michele Chubirka, gets us grounded on all things cloud firewall. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back at the end of the month with more security-related updates.
Read More for the details.
Today many companies manage their infrastructure and configure environments using multiple tools that are either standalone or part of a larger CI/CD pipeline solution. Tools such as Cloud Build in Google Cloud, HashiCorp’s Terraform, and AWS CloudFormation allow developers to use purpose-built languages, such as HashiCorp’s HCL or the Cloud Build configuration language, to define their environment’s infrastructure and automate its provisioning.
One of the popular tools, Terraform, is a widely used Infrastructure as Code (IaC) software tool to provision infrastructure on Google Cloud and other cloud platforms. Google is actively supporting Terraform by contributing to the Google Cloud Provider for Terraform and developing the Cloud Foundation Toolkit which includes many useful Terraform modules.
We would like to evaluate another solution available on Google Cloud, Config Connector (also known as KCC), and show how cloud users can improve their operational processes with it compared with other available tools. Google first announced it in 2020. Config Connector is a Kubernetes operator that allows you to manage Google Cloud resources. It utilizes the Kubernetes Resource Model to enforce a contract between the configuration a developer has defined and the actual infrastructure. This is often referred to as Configuration as Data; you can read more about it in this blog post. Compared to Terraform, Config Connector applies a reconciliation strategy to keep cloud infrastructure as close to the declared configuration as possible, in real time.
Config Connector can provide a developer with a number of advantages:
Native integration with GKE and Anthos Configuration Management simplifies provisioning of both Google Cloud resources and application workloads across multiple environments.
Automated reconciliation observes the infrastructure state and repairs any discrepancies between the desired and observed states without need for additional monitoring or manual intervention.
Centralized configuration management lets you manage workload and infrastructure configurations for all environments in one place and in one format.
As a managed solution, Config Connector reduces operational and maintenance overhead for DevOps teams, saving time and helping to speed up onboarding of new team members.
You can reference the following decision tree when deciding which tool to use when provisioning Google Cloud infrastructure:
Using Config Connector also lets developers benefit from extensive observability capabilities. Leveraging integration with GKE and Cloud Operations suite, you can audit Config Connector operations and the reconciliation state of the configuration. Additionally, you can automate incident handling by defining alert policies to be triggered when there are problems with configuration, provisioning or reconciliation. For example, the following set of log filters can be used to query problems with configuration references (e.g., a resource references a Kubernetes Secret that cannot be found):
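An illustrative sketch of such a filter (the exact event reason and label values depend on your Config Connector version and cluster setup); it surfaces Kubernetes events that Config Connector emits when a referenced resource cannot be resolved:

```
resource.type="k8s_cluster"
logName:"logs/events"
jsonPayload.reason="DependencyNotFound"
```

A log-based alert policy on this filter can notify you as soon as a configuration reference breaks, rather than waiting for a reconciliation failure to surface elsewhere.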
See the Config Connector documentation about monitoring and troubleshooting for more information.
Getting started with Config Connector is simple: all you need is a GKE cluster, on which you can enable the Config Connector add-on to have Config Connector installed automatically. There are several options for installing Config Connector; the following paragraphs summarize the pros and cons of each.
Config Controller is a great choice if you are looking to minimize maintenance cost and add support for GitOps components. To use it, you would have to enable Anthos in your projects which may introduce management and cluster fees. If you already use Anthos Config Management (ACM), Config Controller is already available for you. ACM hosts Config Connector and automatically upgrades it to the latest stable version.
Manual installation is useful when you need a high level of customization and control over Config Connector. Using this method you install a Kubernetes operator and additional CRDs on your GKE cluster. This also enables you to install Config Connector on other Kubernetes distributions. It comes at higher operational costs since you will own the hosting and configuration of Config Connector.
The GKE Config Connector add-on is a good jump-start solution. It can be installed on any new or existing GKE Standard cluster (starting with version 1.15) using a single configuration setting. However, we discourage using it in production because it can lag significantly behind the latest Config Connector version. It also comes with the operational cost of provisioning and maintaining the hosting GKE cluster.
Once Config Connector is installed, you can provision Google Cloud resources like you do your Kubernetes workloads. For example, the following code snippet will create a BigQuery dataset:
(This example uses the user-specified resource ID to identify the BigQuery dataset)
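A sketch of such a manifest (the project ID and names are placeholders), using the `resourceID` field to set a user-specified ID for the dataset:

```yaml
apiVersion: bigquery.cnrm.cloud.google.com/v1beta1
kind: BigQueryDataset
metadata:
  name: bigquerydataset-sample
  annotations:
    cnrm.cloud.google.com/project-id: my-project  # placeholder project ID
spec:
  resourceID: my_dataset  # user-specified resource ID for the dataset
  location: US
  friendlyName: Sample dataset
```

Applying this manifest with `kubectl apply` causes Config Connector to create the dataset and then continuously reconcile it against this declared state.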
In many scenarios Config Connector can replace multiple other tools while minimizing the time it takes to reconcile configured and actual states. The managed nature of Config Connector, together with broad coverage of Google Cloud resources and services and integration with Anthos Config Management, makes it a universal Swiss Army knife for DevOps pipelines on Google Cloud. You can familiarize yourself with Config Connector by reading the documentation. Give it a try!
Read More for the details.
Enterprise Strategy Group (ESG) recently published a 15-page report on the Economic Advantages of Google Cloud’s Advanced Networking Services, detailing how these advantages can help customers realize up to a 28% total cost savings for their cloud networking. In this blog, we explore the six key areas customers should consider when evaluating public cloud providers, and some of the advantages gained by building on Google’s globally scaled cloud network.
Google has an advanced, globally scaled, fiber-optic software-defined network. This global network supports billions of users accessing various Google services like YouTube, Search, and Maps, as well as Google Cloud. You can capitalize on this global network for your Google Cloud workloads with full confidence in its scalability and robustness. You can find the current count of regions, zones, and edge locations on the Cloud Locations page.
The report goes into quantitative and qualitative details as to areas that enterprises should look at when choosing between cloud provider networks. Let’s explore these areas and how Google Cloud Networking services support each of them.
# 1 – There are differences between cloud network architectures.
Robust cloud networks should be software-defined, scalable, simple, and automatable. A well-documented architecture, built upon consistent innovation and transparent about network performance, can greatly reduce administrative overhead. Google Cloud’s network meets all of these requirements and more. Google has documented its Andromeda stack for software-defined networking, contributes significantly to Internet Engineering Task Force (IETF) RFC standards, and provides a public global view of inter-region network latency and throughput. Unlike other cloud providers’, Google Cloud’s network capabilities are globally scoped, which helps customers reduce architectural complexity and simplify network management.
# 2 – A cloud network should provide simple and flexible hybrid cloud connectivity.
Google Cloud hybrid connectivity options offer flexibility for customers’ business requirements. Cloud VPN provides easy-to-set-up high-availability connectivity. For organizations that need more stable and higher bandwidth connections, Cloud Interconnect options like Dedicated and Partner Interconnect provide connectivity. Private Service Connect, meanwhile, allows secure and private access to Google Cloud, third-party, or customer-managed cloud services.
In Google Cloud, a Virtual Private Cloud (VPC) is a global construct that is different from other cloud providers’. You can create a single VPC and provision different isolated subnets in any region within the same VPC. This allows customers to create and manage fewer VPCs, thus lowering operational overhead.
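As a sketch of this global model (the network name, regions, and CIDR ranges below are illustrative), a single custom-mode VPC can host subnets in multiple regions:

```
# Create one custom-mode VPC (a global resource)
gcloud compute networks create demo-vpc --subnet-mode=custom

# Add regional subnets to the same VPC
gcloud compute networks subnets create subnet-us \
  --network=demo-vpc --region=us-central1 --range=10.10.0.0/24
gcloud compute networks subnets create subnet-eu \
  --network=demo-vpc --region=europe-west1 --range=10.20.0.0/24
```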
# 3 – A cloud network should make it easy to scale and accelerate workloads.
Google Cloud Managed Instance Groups and cluster autoscaling help you scale your resources automatically. Google Cloud also offers a robust set of load balancers that support various traffic options to meet both global and regional requirements. Load balancers distribute traffic across your backend targets, which can exist within and outside of Google Cloud.
With Cloud CDN for static content and Media CDN for streaming content, customers benefit from the same global footprint and network performance as other Google services such as YouTube. Customers can verify CDN performance as measured by Cedexis and made publicly available here.
# 4 – A cloud network should ensure secure hybrid cloud operations.
Google Cloud services like Cloud NAT, Cloud Firewall and Cloud Armor can be utilized at different points in the environment to provide a layered defense-in-depth approach. Cloud Armor was in the news last year when it helped a customer mitigate a Layer 7 DDoS attack at 46 million requests per second. In addition to native capabilities, Google Cloud also supports third-party appliances including many partner solutions that are directly available in the Google Cloud Marketplace.
# 5 – A cloud network should provide visibility and control.
Google Cloud Network Intelligence Center provides real-time observability for your network. It does this through individual modules (Network Topology, Connectivity test, Performance Dashboard, Firewall Insights, Network Analyzer) which provide targeted visibility into aspects of your network infrastructure including latency, throughput, connectivity between specific resources, and resource configuration. This resource analysis highlights configuration issues that can cause network failures, lead to resource exhaustion, or otherwise result in sub-optimal performance — often proactively before an issue can be observed by the end user.
# 6 – Initiatives must support modernization efforts.
Google Cloud supports application modernization with many services that help customers both migrate and build applications on the platform. Google Kubernetes Engine (GKE) provides high-scale Kubernetes deployments with up to 15,000 nodes per cluster while introducing next-generation features such as the GKE Gateway Controller, a production implementation of the new Kubernetes Gateway API. Google also continues to develop and/or contribute to many popular open standards and open-source projects such as HTTP/3, QUIC, gRPC, eBPF, Envoy, Istio, and, of course, Kubernetes itself.
The full report goes into much greater detail with comparisons, percentages, charts, and diagrams to evaluate the economic benefits that customers can, and should, expect from their cloud provider. To read more, download your copy of the ESG Report: The Economic Advantage of Google Cloud’s Advanced Networking Services today.
For more on cloud networking visit https://cloud.google.com/products/networking.
Read More for the details.