GCP – How Looker helps startups uncover data-driven insights
Read More for the details.
The ability to customize Google markers ranks among our most requested features. In May, we made Advanced Markers generally available for the Maps JavaScript API and today we are announcing support for Android and iOS, enabling developers to build experiences that are consistent across platforms.
Halloween-themed markers to highlight and navigate between Points of Interest on mobile
With Advanced Markers, you can create highly customized, better-performing markers to drive a richer user experience. You can customize the Google Maps pin using SVGs, PNGs, or native views (for Android) and UIViews (for iOS) to create custom markers and control collision behavior—all with improved load times over traditional markers. You can also save developer time and resources with the ability to customize the Google Maps red pin directly in code, without a PNG for each variation.
Google pins customized into Halloween-themed markers
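The post's original snippet isn't reproduced in this extract, so here is a minimal sketch of the kind of customization it demonstrated, using the Maps JavaScript API from TypeScript. It assumes a map created with a mapId (which Advanced Markers require); the coordinates, colors, and image URL are placeholders.

async function addCustomMarkers(map: google.maps.Map) {
  // Load the marker library, which provides AdvancedMarkerElement and PinElement.
  const { AdvancedMarkerElement, PinElement } =
    (await google.maps.importLibrary("marker")) as google.maps.MarkerLibrary;

  // Customize the familiar Google Maps pin directly in code: no PNG per variation.
  const pumpkinPin = new PinElement({
    background: "#F57C00",   // pin fill
    borderColor: "#4A148C",  // pin outline
    glyphColor: "#FFFFFF",   // glyph (dot) color
    scale: 1.2,
  });
  new AdvancedMarkerElement({
    map,
    position: { lat: 37.422, lng: -122.084 }, // placeholder location
    content: pumpkinPin.element,
    title: "Pumpkin patch",
  });

  // Or replace the pin entirely with a custom image, and control collision behavior.
  const ghost = document.createElement("img");
  ghost.src = "https://example.com/ghost.png"; // placeholder asset
  new AdvancedMarkerElement({
    map,
    position: { lat: 37.423, lng: -122.085 },
    content: ghost,
    title: "Haunted house",
    collisionBehavior: google.maps.CollisionBehavior.OPTIONAL_AND_HIDES_LOWER_PRIORITY,
  });
}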
Rendering of the code snippet above showing custom pin colors and a custom image
To learn more about Advanced Markers, please check out our website and documentation for Android and iOS support. For Android developers, marker clustering of Advanced Markers is also supported in the Maps SDK for Android Utility Library, and the Maps Compose Library provides composables for Advanced Markers as well as Jetpack Compose-compatible marker clustering of Advanced Markers. If you’re ready to get started, head to the Cloud console.
We have also released an Advanced Markers Utility Library for JavaScript developers. This library simplifies common patterns for using Advanced Markers by combining all features from the google.maps.marker.AdvancedMarkerElement and google.maps.marker.PinElement classes into a single interface and supporting dynamic properties. It also provides some useful features like automatic color selection, handling for icon-fonts, and automatic handling of small to medium datasets.
For more information on Google Maps Platform, visit our website.
Read More for the details.
Welcome to the second Cloud CISO Perspectives for October 2023. This month, David Stone and Anton Chuvakin, colleagues from our Office of the CISO, are talking about what security and business leaders need to know about securing our multicloud present and future.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
By David Stone, solutions consultant, and Anton Chuvakin, security advisor, Office of the CISO
One of the inevitabilities of the modern cloud ecosystem is that multicloud — when you use services from more than one public cloud provider — is happening across your infrastructure.
Organizations should have an executable strategy in place to better manage the security risks associated with multicloud systems based on this foundation: Securing multicloud doesn’t mean you need to multiply the number of engineers you have, their skills, or even the size of your security team by the number of clouds you have. Instead, it’s about securing the clouds you use and the connections between them.
An overwhelming majority of organizations are using or plan to use at least two cloud infrastructure providers, and nearly one-third are using four or more, according to an Oracle and 451 Research report published earlier this year. Yet securing clouds is not a one-size-fits-all proposition, especially in highly-regulated environments such as those that financial services organizations operate in.
A multicloud environment can combine cloud environments that are private, public, or both. The primary goal of a multicloud strategy is to give you the flexibility to operate with the best computing environment for each workload.
We believe that it’s best to approach securing multiple clouds with a beginner’s mindset, by relearning cloud capabilities and security control mechanisms. An effective approach to securing those workloads is to run multiple cloud services as a unified multicloud infrastructure-as-a-service. Based on our conversations with financial services organizations, here are five steps and three bonus tips on how to better secure multicloud from an organizational and operational perspective.
1) How to make the most of multicloud
Cloud service providers, including Google, talk a lot about and implement security by default — but there is no “multicloud secure by default.” Each cloud has its own set of defaults and guiding principles that may not be harmonized internally or with each other, so you need to baseline each cloud, identify the gaps relative to your ideal, and augment each cloud’s defaults as needed.
One way to improve secure-by-default outcomes in multicloud environments is to develop inside the cloud so you lay a more secure foundation. Building individual cloud tools from the ground up that are secure by default and secure by design leads to less add-on work after the fact.
When it comes to integration, most cloud providers have APIs and all the other things you need to “glue it all together” from a security perspective.
Be aware of complexities that can evolve around identity and access management, and data governance. Maintaining data governance and secure access across multiple cloud environments can be challenging and play out in different ways. For instance, some organizations rely on their on-premises Active Directory for all of their identity management, including identity in all of their clouds. That makes their modern cloud environments critically reliant on a 1990s piece of technology.
Secure each cloud using the best available tools and approaches, and prepare to build additional safeguards to reduce the risks your multicloud system might face.
2) Integrate CI/CD pipelines with common security controls
Security as code for multicloud can use one pipeline that integrates common security checks. This applies even when you have adopted an agnostic configuration language or technology, such as Open Policy Agent (OPA): the agnostic control statements still need to be mapped to your particular cloud realities.
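As an illustration only (not from the original newsletter), a pipeline step that evaluates a cloud-agnostic control might look something like the sketch below, which posts a planned resource change to an OPA server's Data API. The endpoint, policy package path, and input shape are assumptions you would replace with your own.

// Fail the pipeline if a planned change violates a common, cloud-agnostic control.
async function passesCommonControls(plannedResource: unknown): Promise<boolean> {
  const res = await fetch("http://localhost:8181/v1/data/multicloud/storage/allow", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ input: plannedResource }),
  });
  const { result } = await res.json();
  return result === true;
}

// The same control statement ("no publicly readable buckets") can be evaluated for
// resources from different clouds, mapped into one shared input shape beforehand.
passesCommonControls({ cloud: "gcp", type: "storage_bucket", publicAccess: "enforced-private" })
  .then((ok) => { if (!ok) process.exit(1); });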
Google launched a risk and compliance as code (RCaC) solution to allow organizations to enable security and continuous compliance through code. The key building blocks of the solution are tools and best practices that allow you to strengthen your capabilities for preventative controls, detections, and drift remediation.
3) Leverage cloud-born tools
Adopting an Information Security Management System for multicloud (or any cloud) infrastructure is a must for maintaining and operating a secure environment. For example, a bank operating in a traditional legacy environment that is moving now to the cloud might conclude that there’s no need to take a new approach to their security information and event management (SIEM). But the truth is, you really do need a cloud-born SIEM. Otherwise, your teams often will keep using existing on-premises tools, which creates problems and ultimately increases your technology debt.
While it’s true that on-premises tools are usually cloud-agnostic, and so can be used with any cloud, they often lack awareness of modern cloud technologies and practices and are therefore rarely the best option for cloud security.
This all-encompassing cloud security thinking applies to every tool within your multicloud environment, even your browser choice. Google Chrome Enterprise is a prime example of a cloud-born tool that combines the business capabilities of a modern web browser (such as Chrome) and secure OS (like ChromeOS) to power your cloud workforce and enable them to work safely and securely in any and all cloud environments.
4) Choosing the right security tools and technology
When it comes to the tools and technology for monitoring cloud security, you need to choose multicloud solutions designed to monitor multiple environments. It’s ideal to align with the common frameworks from NIST and the Cloud Security Alliance (CSA) to provide a gauge for different standards. Start with your cloud service provider’s tools, and then use third-party cross-cloud tools only if you need them.
Many cloud service management tools used by cloud service providers focus on keeping an eye on the production environment. But more tools are starting to focus on how those workloads get deployed to production based on the pipelines — and the protections that are needed there. Essentially, a shift left for these controls. This means that tools are increasingly also being aimed at the preventive side of controls, rather than just the detective side, to help pinpoint when something that went into production isn’t right.
5) Finding and training the right multicloud team
It’s no surprise that the search for talent with the right cloud skill sets can be frustrating. What you’re really dealing with is a division of teams and talent — with a need for, say, Google Cloud experts, AWS experts, and Microsoft Azure experts. It’s equally vital for your management team to have the right skills to assess the operational model and make corrections where necessary. Finding leaders and experts who understand multiple cloud platforms is a daunting task.
Of course, training is an important piece of this. But questions arise: What’s the optimum strategy? Do you get your current staff trained in the new technology? Do you bring in experts to jump-start that process? Do you rely on vendors? All are valid strategies for upgrading cloud security tools and technology training.
As your need to support multicloud infrastructure-as-a-service increases, you need to think through how to train individuals on multiple clouds. Training experts in all cloud services is, perhaps, unrealistic. Few organizations can afford security experts who also know more than one cloud really well.
A more realistic approach is to train teams on relevant operational components, and then rely on those experts for key functions. For example, to detect threats across Google and another cloud, you need to hire detection experts who can then consult with cloud-specific experts for support. Within your own organization, offering specialized training and certifications is a good incentive.
Three multicloud bonus tips
Be prepared: Plan for how your security team can start on Day 1 with a multicloud strategy.
Be deliberate: Set a clear strategy around when to use which cloud providers for the best capabilities and the best business outcomes.
Be ready for change: Expect changes to cloud platform security (and best practices) for each cloud, and prepare to adapt rapidly to cover emerging threats.
Next steps
Your multicloud path may differ from your competitors’ or your partners’. Some start off as multicloud enterprises. Others come to it from mergers and acquisitions, distributed decision making, or even independent (and possibly uncoordinated) internal purchasing.
Now is the right time to consider the operational processes, technology tools, and people that best meet the multicloud security needs of your infrastructure. We’re here to offer you our insights and recommendations to help you address your multicloud security requirements – and ways to maximize security across your multicloud environment.
Here are the latest updates, products, services, and resources from our security teams so far this month:
Google Cloud and E-ISAC team up to advance security in the electricity industry: Google Cloud partners with E-ISAC as the first major cloud provider in the Vendor Affiliate Program. Read more.
Cloud and consequences: Internet censorship data enters the transformation age: Censored Planet Observatory is transforming the way we analyze censorship data to be more informative. Here’s why. Read more.
Shining a light in the dark: Measuring global internet shutdowns: Data ingestion from multiple sources helps the researchers see where governments are blocking access to websites. Read more.
From turnkey to custom: Tailor your AI risk governance to help build confidence: Business and security leaders have questions about how generative AI models affect their risk-management strategies. Here’s a primer. Read more.
Empowering all to be safer with AI: As part of Cybersecurity Awareness Month, we’re sharing more on how AI has the potential to vastly improve how we identify, address, and reduce cybersecurity risks. Read more.
Building core strength: New technical papers on infrastructure security: Based on principles laid out in Building Secure and Reliable Systems, we are excited to announce a new series of technical whitepapers on infrastructure security. Read more.
New learning lab can help address security talent gap: To help address the chronic shortage of security talent, Google Cloud has introduced a new virtual, lab-based training for Security Command Center that can be completed in just six hours. Read more.
What’s new with Cloud Firewall Standard: We are excited to announce the general availability of the fully qualified domain name (FQDN) feature for Cloud Firewall. Read more.
Introducing Actions and Alerts in Advanced API Security: Shift your security approach to proactively identify and act on security threats with security actions and alerts. Read more.
How we’ll build sustainable, scalable, secure infrastructure for an AI-driven future: Google products have always had a strong AI component, and we’ve spent the past year supercharging our core products with the power of generative AI — including security. Read more.
Improve Kubernetes cost and reliability with the new Policy Controller policy bundle: Our new GKE Policy Controller Cost and Reliability policy bundle automatically identifies potential workload improvements so you can achieve greater reliability and cost efficiency. Read more.
Remediation for Citrix NetScaler ADC and Gateway vulnerability: Mandiant is providing additional steps for remediating and reducing risk related to a Citrix NetScaler ADC and Gateway vulnerability, which we have observed being exploited at professional services, technology, and government organizations. Read more.
Mandiant Threat Intelligence product updates for October 2023: Mandiant Threat Intelligence has added a number of new and updated features and capabilities that can help you save time and gain more insight into the threats targeting you. Read more.
Weighing the benefits and risks of LLM for security: Securing foundation models is a complex process that requires more holistic thinking. Hosts Anton Chuvakin and Tim Peacock talk about the challenges and nuances of foundation model security with Kathryn Shih, group product manager and LLM lead, Google Cloud Security. Listen here.
How to cure one of cloud security’s biggest headaches: Why is cloud security remediation such a headache for so many organizations? Whether the remediation problem stems from process failures, internal team friction, or technology snafus, Anton and Tim talk with Tomer Schwartz, CTO, Dazz, about how security pros can evaluate solutions for prioritizing, triaging, and fixing issues. Listen here.
Threat Trends: DHS Secretary Alejandro Mayorkas in conversation with Kevin Mandia: DHS Secretary Alejandro Mayorkas and Mandiant CEO Kevin Mandia discuss collaboration between the private sector and government, improving the talent gap in cyber, and ongoing DHS initiatives to foster greater cybersecurity. Listen here.
Threat Trends: Addressing risk in the cloud with Wiz: Host Luke McNamara is joined by Amitai Cohen, Wiz’s attack vector intel lead, to discuss trends in cloud security, managing risk, and more. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in two weeks with more security-related updates from Google Cloud.
Read More for the details.
Editor’s note: This post is part of a series showcasing partner solutions that are Built with BigQuery.
Since Built with BigQuery was first announced at the Data Cloud Summit in April 2022, we’ve gained some incredible momentum. Now, over 1,000 ISVs and data providers are using BigQuery and Google Data and AI Cloud to power their applications.
At Google Cloud Next this year, we were lucky enough to be joined by some of these partners in person, where we talked about driving growth through data monetization.
The four companies we spoke with, Exabeam, Dun & Bradstreet, Optimizely, and LiveRamp, are all leaders in their fields, so we wanted to share some of the key ways they are using BigQuery and Google Data and AI Cloud to drive growth through data monetization.
Organizations across industries have seen an explosion in the data they generate over the last decade and it’s often said to be a company’s most valuable resource. However, the reality is that the raw data generated is often little more than a byproduct of other processes, and can be costly to store, protect and govern. To have value, and to be monetizable, raw data must be transformed into a data product. So what do we mean by a data product? At a minimum, data products must:
Have a development lifecycle, roadmap, and distribution
Provide value to customers by satisfying a need or want
Provide value to data providers through direct (i.e., net new revenue via upsell and cross-sell) and/or indirect (e.g., increased differentiation, usage, and stickiness) monetization
When working with ISVs and data providers across the ecosystem, we see four key recurring patterns that use the BigQuery suite to power data products.
Foundational to all of these is the need to unify datasets into a common data and analytics platform.
In the stories that follow, you’ll see examples of how companies are leveraging these patterns to monetize their data assets and deliver value for their customers.
Exabeam, Google Cloud Tech Partner of the year for Security Analytics and a global cybersecurity leader delivering AI-driven security operations, provides security operations teams with end-to-end Threat Detection, Investigation, and Response (TDIR).
Andy Skrei, VP of Product Management, explained that when Exabeam adopted BigQuery two years ago, their key goal was to deliver a platform-level data store that could collect and normalize all data once, and then use it in different ways through different applications and data products from search through detection and response. One of the primary drivers was security analytics, and the company needed a data store that could scale to drive the detection capabilities that its customers expect.
Exabeam also wanted to give customers access to the data store to query it during security investigations. For this, they needed customers to have the ability to ingest all of the data they care about; to be able to retain that data for up to 10 years based on compliance and investigation needs; and be able to access that data at scale quickly. They also needed to make sure that customers had the ability to craft complex queries, access the right insights, and aggregate that data.
The key to all of these needs is making sure that the solution can scale based on demand. When you’re in an incident or a breach, you may have 10x the number of queries, and people using that service need it to be highly available.
Their platform approach has also enabled them to quickly layer on additional capabilities like advanced AI-based behavior analytics that normalizes and learns the behavior of every user and device to identify anomalies and threats. They have been using AI for years to detect and defend against threats, and they’re now looking at generative AI to derive new insights and accelerate the ability to understand problems. This includes using generative AI to tell multiple sides of a security event. For example, the information and level of detail that a CISO will need is different than that of an analyst. Generative AI can be used to tell and enhance each of these stories, even though they’re reporting on the same incident.
“We’ve been using AI from the very beginning to help detect threats and behaviors but generative AI is a really interesting space for us now. Google is one of the only companies that offers a commercially available LLM focused on cybersecurity and having that security expertise built-in is important so we can start to leverage and productize generative AI for our customers.” – Andy Skrei, VP Product Management, Exabeam
Dun & Bradstreet is a leader in data and business information serving over 525 million companies across the globe.
Leigh Luxenburg, Senior Director of Product Management explained that Dun & Bradstreet needed a single location where all of its data could work together, efficiently — and this was a big driver in the company’s decision to move not just its data, but all of its platforms, to Google Cloud.
In doing so, Dun & Bradstreet is able to fuel its products, data, SaaS offerings and models for its customers that use the data much more efficiently and consistently, and without latency issues. After choosing BigQuery for its efficiency and consistency, Dun & Bradstreet’s customers are now noting how fast and smooth its processes are.
Data sharing at scale is at the heart of what Dun & Bradstreet does and BigQuery Analytics Hub has enabled them to open new sales and marketing avenues by creating a catalog of data offerings that can be easily subscribed to by BigQuery customers.
We were thrilled that Leigh announced Google Cloud’s new generative AI partnership, with Vertex AI already being a foundational technology in their AI lab. The new partnership will allow customers to develop applications and models using Dun & Bradstreet data, so they can understand things like who they should be targeting from a sales and marketing perspective, as well as who they should do business with from a risk perspective.
They also plan to use AI to deliver new data products that make information more understandable and readable for customers, for example, automated chat agents that can help provide information across a number of data sets easily and conversationally, without needing teams of people behind the scenes to provide that information.
“With 180 years of data and business information for over 525 million companies across the globe, having one place where all that information can work together efficiently was a big driver in our decision to move not just our data but all of our platforms to Google Cloud.” — Leigh Luxenburg, Senior Director of Product Management
Optimizely is a leader in personalized and optimized digital experiences, delivering 2 million experiments across 10,000 customers.
According to Spencer Pingry, VP Software Architecture, a range of products at Optimizely are powered by Google Data and AI Cloud, but their partnership also extends into the wider Google products suite. For example, using BigQuery, they have integrated with Google Analytics to establish common reporting so customers can view Optimizely experiment results within Google Analytics. Soon, they will be able to take audiences defined in Google Analytics and activate them directly through Optimizely Experimentation – so the audiences that they define can be delivered all the way through to the experience while measuring outputs.
Building on the data sharing theme, Spencer explained how their customers take Optimizely’s rich data and integrate it into their own BigQuery datasets to make it available across their teams.
Google’s partnership has provided Optimizely with access to the tools and training needed to start using generative AI. This enables product teams and engineers to reduce friction and move quickly — starting with enhancing their content generation solutions, but also more broadly through building a range of data products that can leverage the customer context across Optimizely’s entire suite of solutions.
“We’re really interested in how to think about data products internally and then map them across the suite of our products so that we can apply these AI models, not just in a specific point use case, but really with the context of our clients across all of our products. Google’s been a great partner in helping us enable those things.” – Spencer Pingry, Optimizely
LiveRamp, Google Cloud Global Partner of the year for Industry Solution Technology, enables marketers and data teams to accurately and safely enrich, match and activate audiences for marketing.
Erin Boelkens, VP Product Management, explained that LiveRamp runs on Google Data and AI Cloud and the company has deep integrations with Google Marketing platform, making customer transitioning and onboarding easier and faster. With full visibility across the data organization suite, the company is able to optimize budgets and resources. Being hosted on Google Cloud and integrated with Google products provides the enormous benefit of everything being in the same place. The more consolidated the data, the easier it is to transact and have full visibility across a customer’s entire data product suite. This unification of data breaks down silos and offers a true 360 view across customer data so businesses can understand how their customers are interacting with them, no matter what channel it’s through.
Data clean rooms are a powerful way of enabling identity resolution for customers in a safe and effective manner. LiveRamp’s solution, the LiveRamp Data Collaboration Platform, is a great example of a DWaaS that enables them to provide a trusted environment to customers with a ready-to-go first-party data strategy.
More recently, LiveRamp has been an early design partner for Google’s Data Clean Room solution, based on Analytics Hub, which was announced at Google Cloud Next ‘23, in private preview.
At the heart of LiveRamp is its AI-powered identity graph. They have used several AI methodologies over the years, but one they continue to leverage is the transformer architecture, invented at Google, which they combine with their own algorithms to continually drive accuracy and match rates in every LiveRamp product.
“The better we can connect the data in the ecosystem, the better we can help companies understand every touch point in the customer journey, and the more successful they’ll be for their business outcomes. Google’s AI, and our own algorithms, are helping to pave the way.” – Erin Boelkens, LiveRamp
The Built with BigQuery advantage for ISVs and data providers
To learn more about these customer stories you can watch the full session on demand here. Don’t forget to register for our upcoming webcast on Oct 24th to learn how ISVs can accelerate gen AI adoption with BigQuery.
Built with BigQuery helps ISVs and Data Providers build innovative applications with Google Data and AI Cloud. Participating companies can:
Accelerate product design and architecture through access to designated experts who can provide insight into key use cases, architectural patterns, and best practices
Amplify success with joint marketing programs to drive awareness, generate demand, and increase adoption
BigQuery gives ISVs the advantage of a powerful, highly scalable unified AI lakehouse that’s integrated with Google Cloud’s open, secure, sustainable platform. Click here to learn more about Built with BigQuery.
Read More for the details.
Want to make ultra-fast ML predictions without writing and maintaining siloed applications, or having to move your data? How about doing it all with just SQL queries, and without worrying about scaling, security, or performance? Google Cloud Spanner, a globally distributed and strongly consistent database, has revolutionized the way organizations store and manage data at any scale transparently and securely. Vertex AI has transformed how we use data, gain meaningful insights, and make informed decisions with machine learning, artificial intelligence, and generative AI. Spanner, with its Vertex AI integration, makes it easier and faster than before to perform predictions on your transactional data using models deployed in Vertex AI.
This means you can eliminate the need to access Cloud Spanner data and the Vertex AI endpoint separately. Traditionally, we would retrieve data from the database, pass it to the model via different modules of the application or an external service/function, then serve the result to the end user or write it back to the database. The win here is that you don’t have to use the application layer to combine these results anymore; it’s all rolled into one step in the Spanner database, right where your data lives, using familiar SQL.
The benefits of Spanner with Vertex AI integration include:
Improved performance and better latency: Spanner talking directly to the Vertex AI service eliminates additional round-trips between the Spanner client and the Vertex AI service.
Better throughput / parallelism: Spanner Vertex AI integration runs on top of Cloud Spanner’s distributed query processing infrastructure, which supports highly parallelizable query execution.
Simple user experience: Ability to use a single, simple, coherent, and familiar SQL interface to facilitate both data transformation and ML serving scenarios on Cloud Spanner scale lowers the ML entry barrier and allows for a much smoother user experience.
Reduced costs: Spanner Vertex AI integration uses Cloud Spanner compute capacity to merge the results of ML computations and SQL query execution, which eliminates the need to provision an additional compute.
In this blog, we will explore the steps to perform ML predictions from Spanner using a model deployed in Vertex AI. This is possible by registering the ML model in Spanner that has already been deployed to a Vertex AI endpoint.
For implementing this feature, we will take my good old movie viewer score prediction use case, for which we already have models created with both Vertex AI and BQML. The movie score model predicts the success score of a movie (in other words, an IMDB rating) on a scale of 1 to 10 depending on various factors including runtime, genre, production company, director, cast, cost, etc. The steps to create this model are available as codelabs, and the links are included in the next section of this blog.
Google Cloud project: Before you begin, make sure you have a Google Cloud project created, billing enabled, Spanner access control set up to reach the provisioned Vertex AI endpoint, and the necessary APIs (BigQuery API, Vertex AI API, BigQuery Connection API) enabled.
ML model: Familiarize yourself with the movie prediction model creation process using Vertex AI AutoML or BigQuery ML by referring to the codelabs linked here or here.
Deployed ML model: Have the movie score prediction model created and deployed to a Vertex AI endpoint. You can refer to this blog for guidance on creating and deploying the model.
There will be costs associated with implementing this use case. If you want to do this for free, use the BQML model deployed in Vertex AI and choose the Spanner free-tier instance for the data.
We will predict the movie success score for data in Cloud Spanner using the model created with Vertex AI Auto ML and deployed in Vertex AI.
Below are the steps:
Step 1: Create a Spanner Instance
Go to Google Cloud console, search for Spanner and choose the Spanner product.
You can create an instance or choose the free instance.
Provide instance name, ID, region and standard configuration details. Let’s select the region to be the same as your Vertex / BigQuery datasets. In my case, this is us-central1. Let’s call this instance spanner-vertex.
Step 2: Create a database
From the instance page, you should be able to create a database by providing the database name and database dialect (select Google Standard SQL). You could also optionally create a table there, but let’s reserve that for the next step. Go ahead and click CREATE.
Step 3: Create a table
After the database is created, navigate to the database overview page.
Click the CREATE TABLE button, paste the statement below in the DDL TEMPLATES section, and execute it:
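-- The columns below mirror the input schema of the movie score model deployed in Vertex AI (see Step 5).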
CREATE TABLE movies (
id INT64,
name STRING(100),
rating STRING(50),
genre STRING(50),
year FLOAT64,
released STRING(50),
score FLOAT64,
director STRING(100),
writer STRING(100),
star STRING(100),
country STRING(50),
budget FLOAT64,
company STRING(100),
runtime FLOAT64,
data_cat STRING(10)
) PRIMARY KEY (id);
Step 4: Insert test data
Now that we have created the table, let’s insert a few records to use as test data.
Go to the database overview page, and from the list of tables available, click movies.
On the TABLE page, click Spanner Studio in the left pane; this should open the studio on the right side.
Open the editor tab and RUN the statement below to insert a record:
INSERT INTO movies (id, name, rating, genre, year, released, score, director, writer, star, country, budget, company, runtime, data_cat)
VALUES (
7637, "India's Most Wanted", 'Not Rated', 'Action', 2019, '5/24/2019', null, 'Raj Kumar Gupta', 'Raj Kumar Gupta', 'Arjun Kapoor', 'India', 57531050, 'Fox STAR Studios', 123, 'TEST'
);
Step 5: Register the Vertex AI AutoML model
You should have already created the classification / regression model and deployed it to a Vertex AI endpoint (as mentioned in the prerequisites section).
If you haven’t already created the classification model for predicting the movie success rating, refer to the codelab in the prerequisites section to create the model.
Now that the model is created, let’s register it in Spanner so you can make it available to your applications. Run the following statement from the editor tab:
CREATE OR REPLACE MODEL movies_score_model
REMOTE OPTIONS (endpoint = 'https://us-central1-aiplatform.googleapis.com/v1/projects/<your_project_id>/locations/us-central1/endpoints/<your_model_endpoint_id>');
Replace <your_project_id> and <your_model_endpoint_id> with values from your deployed Vertex AI model endpoint. You should see your model creation step completed in a few seconds.
Note that the schema structure of your table and the model should match. In this case, I have already taken care of this while creating the table by making sure the fields and types in the DDL match the schema of the model dataset in Vertex AI. Take a look at the “Step 3: Create a table” section in this blog for the DDL. You can verify this against the schema structure of the model dataset in Vertex AI by navigating to the Datasets section of the Vertex AI console and clicking the ANALYZE tab for the dataset behind the model you are using.
Step 6: Make Predictions
All that is left to do now is to use the model we just registered in Spanner to predict the movie user score for the test movie data we inserted into the movies table.
Run the below query:
SELECT * FROM ML.PREDICT(MODEL movies_score_model, (SELECT * FROM movies WHERE id = 7637));
ML.PREDICT is the function Spanner uses to predict the target value for the input data. It takes two arguments: the model name and the input data for the prediction.
The subquery (SELECT * FROM movies WHERE id = 7637) fetches the test record we inserted into the movies table; you could equally filter on data_cat = 'TEST' to select all test rows.
That’s it. You should see the prediction result in the value field for your test data.
Step 7: Update prediction results to Spanner table
You can choose to write the prediction result directly to the table; this is particularly useful when your application requires real-time updates of the target value. Run the query below to update the result while predicting:
UPDATE movies SET score = (SELECT value FROM ML.PREDICT (MODEL movies_score_model, (SELECT * FROM movies WHERE id = 7637))) WHERE id = 7637;
Once this is executed, you should see the predicted score updated in your table.
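If your application needs these predictions outside of Spanner Studio, the same SQL can be issued through any Spanner client library, with no separate call to the Vertex AI endpoint. Below is a minimal, hypothetical sketch using the @google-cloud/spanner Node.js client; the project, instance, and database names are the ones assumed earlier in this walkthrough.

import { Spanner } from "@google-cloud/spanner";

async function predictMovieScore(): Promise<void> {
  const spanner = new Spanner({ projectId: "your-project-id" }); // placeholder project
  const database = spanner.instance("spanner-vertex").database("your-database"); // names from Steps 1-2

  // One SQL statement does both the data lookup and the ML prediction.
  const [rows] = await database.run({
    sql: `SELECT * FROM ML.PREDICT(MODEL movies_score_model,
                 (SELECT * FROM movies WHERE id = 7637))`,
  });
  for (const row of rows) {
    console.log(row.toJSON()); // includes the predicted value column
  }

  await database.close();
  spanner.close();
}

predictMovieScore().catch(console.error);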
Making ML predictions from Google Cloud Spanner is a powerful way to integrate predictive analytics with your database. By leveraging the capabilities of Vertex AI and the flexibility of Spanner, you can enhance decision-making in real-time applications. This example with movie user score prediction illustrates the potential of combining cloud services to drive data-driven insights. Learn more about the machine learning capabilities of Spanner here.
Read More for the details.
Today’s post is the first in a series that explains why administrators and architects should build batch processing platforms on Google Kubernetes Engine (GKE).
Kubernetes has emerged as a leading container orchestration platform for deploying and managing containerized applications, speeding up the pace of innovation. This platform is not just limited to running microservices, but also provides a powerful framework for orchestrating batch workloads such as data processing jobs, training machine learning models, running scientific simulations, and other compute-intensive tasks. GKE is a managed Kubernetes solution that abstracts underlying infrastructure and accelerates time-to-value for users in a cost effective way.
A batch platform processes batch workloads, defined as Kubernetes Jobs, in the order they are received. The batch platform might incorporate a queue to apply your business case logic.
There are two key Kubernetes resources in a batch platform: Jobs and Pods. You can manage Jobs through the Kubernetes Job API. A Job creates one or more Pods and continues to retry execution of the Pods until a specified number of them successfully terminate. As Pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the Job, or the task, is considered complete.
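As a simple illustration (not from the original post), the Job parameters described above map onto a manifest like the following, shown here as a TypeScript object literal that mirrors the batch/v1 Job API; serialized to YAML, it could be applied with kubectl. The name, image, command, and counts are placeholders.

// completions: Pods that must terminate successfully for the Job to complete.
// parallelism: Pods allowed to run at the same time.
// backoffLimit: retries allowed before the Job is marked as failed.
const batchJob = {
  apiVersion: "batch/v1",
  kind: "Job",
  metadata: { name: "transcode-batch" },
  spec: {
    completions: 10,
    parallelism: 3,
    backoffLimit: 4,
    template: {
      spec: {
        restartPolicy: "Never",
        containers: [
          {
            name: "worker",
            image: "us-docker.pkg.dev/your-project/your-repo/worker:latest",
            command: ["process-chunk", "--source", "gs://your-bucket/input"],
            resources: { requests: { cpu: "2", memory: "4Gi" } },
          },
        ],
      },
    },
  },
};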
GKE is a managed Kubernetes solution that simplifies the complexity of infrastructure and workload orchestration. Let’s take a deeper look at GKE capabilities that make it a compelling place to operate batch platforms.
A GKE environment consists of nodes, which are Compute Engine virtual machines (VMs), that are grouped together to form a cluster. GKE in Autopilot mode automatically manages your cluster configuration, including your nodes, scaling, security, and other preconfigured settings so you can focus on your workload. Autopilot clusters are highly-available by default.
Google Cloud is continuously introducing new workload optimized VM series and shapes; please visit the machine families resource and comparison guide to review the options that are currently available for your workload.
GKE can host the largest Kubernetes clusters of any managed provider today: 15,000 nodes per cluster (compared to 5,000 nodes for open-source Kubernetes), a scale that is key to many batch use cases.
Customers like Bayer Crop Science take advantage of GKE scale to process around 15 billion genotypes per hour, and GKE was used by PGS to build the equivalent of one of the world’s largest supercomputers. GKE can create multiple node pools, each with distinct types/shapes and series of VMs, to run workloads with various needs.
GKE cluster multi-tenancy is an alternative to managing many single-tenant clusters and helps you manage access control. In this model, a multi-tenant cluster is shared by multiple users and/or workloads, which are referred to as “tenants”. GKE allows tenant isolation on a namespace basis; you can separate each tenant and their Kubernetes resources into their own namespaces. You can then use policies to enforce tenant isolation, restrict API access, set quotas to constrain resource usage, and restrict what containers are allowed to do.
As an administrator, you can implement policies and configuration to share the underlying cluster resources in a fair way between tenants submitting workloads. You can decide what is fair in this context; you might want to assign resource quota limits per tenant proportional to their minimum workload requirements, to ensure there is room for all tenants’ workloads on the platform. Or you could queue incoming jobs when there are no available resources and process them in the order they were received.
Kueue is a Kubernetes-native job queueing system for batch, high performance computing, machine learning, and similar applications in a Kubernetes cluster. To help with fair sharing of cluster resources between its tenants, Kueue manages quotas and how jobs consume them. Kueue decides when a job should wait, when a job should be admitted to start (as in, its Pods can be created), and when a job should be preempted (as in, its active Pods should be deleted). The overall flow runs from a user submitting a Job, to Kueue admitting it with the appropriate nodeAffinity, to the other parts of Kubernetes activating to process the workload.
To help administrators configure batch processing behavior, Kueue introduces concepts like the ClusterQueue (a cluster-scoped object that governs a pool of resources such as CPU, memory, and hardware accelerators). ClusterQueues can be grouped in cohorts; ClusterQueues that belong to the same cohort can borrow unused quota from each other, and this borrowing can be controlled using the ClusterQueue’s BorrowingLimit.
ResourceFlavor is an object in Kueue that represents resource variations and allows you to associate them with cluster nodes through labels and taints. For example, in your cluster you might have nodes with different CPU architectures (e.g. x86, Arm) or different brands and models of accelerators (e.g. Nvidia A100, Nvidia T4, etc). ResourceFlavors can be used to represent these resources and can be referenced in ClusterQueues with quotas to control resource limits. See Kueue concepts for more details.
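To make these concepts concrete, here is an illustrative pair of Kueue objects (not from the original post), written as TypeScript object literals that mirror the kueue.x-k8s.io/v1beta1 API; the flavor name, node label, cohort, and quota values are placeholders.

// A ResourceFlavor tying a flavor name to nodes with a particular accelerator label.
const t4Flavor = {
  apiVersion: "kueue.x-k8s.io/v1beta1",
  kind: "ResourceFlavor",
  metadata: { name: "nvidia-t4" },
  spec: {
    nodeLabels: { "cloud.google.com/gke-accelerator": "nvidia-tesla-t4" },
  },
};

// A ClusterQueue that governs CPU, memory, and GPU quota for one tenant and
// belongs to a cohort, so it can borrow unused quota from peer queues.
const teamAQueue = {
  apiVersion: "kueue.x-k8s.io/v1beta1",
  kind: "ClusterQueue",
  metadata: { name: "team-a" },
  spec: {
    cohort: "research",
    namespaceSelector: {}, // admit workloads from any namespace; tighten per tenant
    resourceGroups: [
      {
        coveredResources: ["cpu", "memory", "nvidia.com/gpu"],
        flavors: [
          {
            name: "nvidia-t4",
            resources: [
              { name: "cpu", nominalQuota: 100 },
              { name: "memory", nominalQuota: "400Gi" },
              { name: "nvidia.com/gpu", nominalQuota: 8, borrowingLimit: 8 },
            ],
          },
        ],
      },
    ],
  },
};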
Want to learn how to implement a Job queueing system and configure workload resource and quota sharing between different namespaces with GKE? See this tutorial.
The efficient use of compute and storage platform resources is important. To help you reduce costs and right-size your compute instances for your batch processing needs without sacrificing performance, your GKE batch workloads can take advantage of Compute Engine Persistent Disks through PersistentVolumes, which provide durable storage for your workloads. To learn more about ways to optimize your workloads, from storage to networking configurations, see https://cloud.google.com/kubernetes-engine/docs/best-practices/batch-platform-on-gke#storage_performance_and_cost_efficiency
GKE is integrated with Google Cloud’s operations suite; you can control which logs and which metrics, if any, are sent from your GKE cluster to Cloud Logging and Cloud Monitoring. For batch users, this means being able to access detailed and up to date logs from their workload Pods and the flexibility to write their own workload specific metrics using monitoring systems like Prometheus. GKE also supports Managed Service for Prometheus for zero-hassle metrics collections backed by Google’s planet-scale Monarch time-series database. We recommend that you use managed collection; using it eliminates the complexity of setting up and maintaining Prometheus servers. To learn more, see batch platform monitoring.
GKE provides a scalable, resilient, secure, and cost-effective platform for running containerized batch workloads, with both a hands-off Autopilot experience and a highly customizable Standard mode that can be tailored to the needs of your organization. With deep integration with Google Cloud services, access to all Compute Engine VM families, CPU and accelerator architectures, and the Kubernetes-native Job queueing capabilities of Kueue, GKE is ready to be the new home for your batch workloads. It gives you the ability to operate large computational platforms, simplifying multitenancy, scale, and performance with cost optimization.
For more detailed information about how to design and implement a batch platform, see Best practices for running batch workloads on GKE.
Read More for the details.
As cloud development has evolved, teams have benefitted from several innovations that significantly increased productivity, such as advanced debuggers, modern IDEs and Notebooks, online communities, and cloud computing services. Despite this, organizations continue to struggle with a chronic shortage of developers with desired skills. Moreover, developers often face numerous challenges, some of which are particularly relevant to cloud development, including:
Disruptive context switching and friction when evaluating, learning, or integrating new tools or frameworks
Excessive time spent on repetitive tasks
Difficulty understanding new code bases or complex APIs
Achieving a high level of automated test coverage without sacrificing velocity
However, the recent growth of generative AI has introduced new opportunities for businesses and developers alike with large language models (LLMs) — AI models trained on massive sets of textual data to produce relevant responses to requests in natural language. This novel technology opens up fresh use cases for developers by enhancing their software development process and, more significantly, driving increased productivity.
With the explosion of tools available in the market, a common question we hear from customers is: How can I evaluate which solution best fits my organization’s needs?
Due to their ability to synthesize and generalize patterns from massive training data, LLMs can enhance every part of the software development lifecycle, improving productivity with capabilities that can complete, transform, and explain code, generate tests in IDEs, and even drive agentic workflows.
Here are some of the benefits of incorporating generative AI into software development:
Speed: Generative AI tools can reduce development time and effort. For example, developers can reduce time spent on repetitive tasks by leveraging in-line code suggestions.
Quality: LLMs can detect more sophisticated and nuanced patterns than traditional code analysis. As a result, developers can fine-tune code according to an organization’s best practices, adhere to naming conventions and deliver more consistent code quality.
Scale: While there are over 1 billion knowledge workers in the world, there are only about 25 million programmers. Teams can bridge this gap with AI-enhanced tools that help developers perform above their skill level, enabling them to scale their impact and productivity when used under the right circumstances.
Onboarding and upskilling: LLMs can reduce the time and effort spent understanding code and documentation by explaining code and making it easier for developers to become familiar with a new codebase, language, or framework.
As enterprises adopt these transformational technologies, it’s important to put measures in place to assess and quantify their impact. Measuring productivity gains is a nuanced process, and it’s critical to resist the urge to only run narrow, task-specific experiments. They tend to give overly optimistic results that do not generalize when looking at productivity gains for typical day-to-day tasks.
While there are some obvious leading impact indicators — including adoption, suggestion acceptance rate, and tool retention — lagging indicators provide a more accurate picture. Measuring indicators, such as the reduction in coding iteration time and the amount of new code generated from AI tools, may offer a better understanding of their productivity impact.
While generative AI has strong potential, it’s still in the early stages of widespread adoption, with many organizations just starting to evaluate use cases and launch pilots. Here are some key considerations and questions that can help guide you:
Understand your business needs and use cases. Before implementing generative AI, you should know why you need it and how it can help you achieve your development and business goals. Some questions to ask include:
Which roles will benefit most from the capabilities described above?
Which of these use cases will yield high business value and near-term impact?
Protect your IP and customer privacy. Protecting your intellectual property and sensitive data when working with generative AI is vital to maintain your unique competitive advantage and prevent any misuse of your work. Equally important is customer privacy, as safeguarding sensitive data fosters customer trust and complies with regulatory requirements. Questions your organization needs to think about include:
What centralized admin controls are needed?
How can you use generated code in line with your licensing policies?
What is the cost of hallucinations (i.e., incorrect code generated) to your business? How can this be mitigated?
What security validation steps can be built on top of the code output?
How can I ensure my code and data remain private, so they aren’t used for training shared models?
Find the right solution. To make sure the solution you choose can work with your existing technology stack and meet the specific needs of your organization, there are many factors you should consider, such as:
Are there specific programming languages and frameworks to care about?
Will it be beneficial to customize the models on a proprietary codebase? Do you and your teams have the right data?
How can you leverage the organizational knowledge graph to maximize impact?
How can you keep the model up to date as your internal knowledge evolves?
How does the solution interact with the existing tech stack and platform?
Does this fit into your broader AI adoption plans and vendor choices?
How good is the tool being used, across dimensions such as accuracy, domain relevance, style, and safety?
Can this solution be deployed at scale?
Consider organizational culture and processes. AI-powered developer assistance is one more tool in your productivity toolset, and having the right culture and processes can maximize its impact. Be sure to ask questions like:
Do my teams have the right skills and processes to use these tools effectively?
What training is needed in areas like prompt engineering to ensure effective use of these tools?
What can you do to find the right balance between automation and humans in the loop?
How can your organization make sure the tools accelerate the growth of developers as opposed to making them overly reliant?
As you go through the journey with generative AI for coding, remember that this is not a one-time endeavor. The models and the products built on these technologies are evolving rapidly, and an increasingly large choice of tools means that a more holistic approach is needed to ensure organizations identify and adopt the best tools available for their development teams.
By harnessing the power of AI-driven developer assistance, such as Duet AI assisted development, businesses can unlock unprecedented levels of productivity and efficiency in software development, paving the way for a new era of innovation and growth. Sign up here to join Duet AI’s Private Preview.
Read More for the details.
Google’s infrastructure security teams continue to advance the state of the art in securing distributed systems. As the scale, capabilities, and geographical locations of our data centers and compute platforms grow, we continue to evolve the systems, controls, and technology used to secure them against external threats and insider risk.
Building on the principles laid out in Building Secure and Reliable Systems, we are excited to announce a new series of technical whitepapers on infrastructure security. The series begins with two papers:
Protecting the physical-to-logical space in a data center
Enforcing boot integrity on production machines
These papers are technical, but we designed them to be readable and accessible to non-experts. We hope they give you insight into the exciting work our teams are doing to keep our customers safe, and that the papers can be a valuable resource as you work to protect your own infrastructure from attacks.
Thomas Koh is the author of “Protecting the physical-to-logical space in a data center,” which explores Google’s security controls that help protect the vital physical-to-logical space.
We define the physical-to-logical space in a data center as “arms-length from a machine in a rack to the machine’s runtime environment.” This space sits between physical controls (such as building access controls) and logical controls (such as secure service deployment). Physical-to-logical controls are designed to defend against attackers that have legitimate access to the data center floor.
To protect the physical-to-logical space, Google implements a number of security controls, including:
Hardware hardening: Reduce each machine’s physical access paths, known as the attack surface.
Task-based access control: Provide access to secure rack enclosures only to personnel who have a valid, time-bound business justification.
Anomalous event detection: Generate alerts when physical-to-logical controls detect anomalous events.
System self-defense: Recognize an unexpected change in the physical environment and respond to threats with defensive actions.
You can read the full paper now: How Google protects the physical-to-logical space in a data center.
Jeff Andersen goes deep into boot integrity security on production machines in the “Enforcing boot integrity on production machines” whitepaper. The security posture of a data center machine is established at boot time, which means that the machine’s hardware must be configured, and the operating system initialized, all while keeping the machine safe to run in Google’s production environment.
In this paper, we step through our boot process and demonstrate how our controls ensure attested machine boot integrity at each step in the boot flow.
The paper dives into the following:
Hardware roots of trust and cryptographic sealing using Google’s custom Titan chip
Credential sealing in the boot process
Maintaining the integrity of the kernel, boot firmware, and root of trust firmware
Ensuring root of trust authenticity
You can read the full paper now: How Google enforces boot integrity on production machines.
We plan to publish more papers like these. As more become available, we’ll publish them on our infrastructure security whitepapers page. We hope they’re exciting, illuminating, and most of all, useful.
Read More for the details.
Internal research by Google, and by Google’s DevOps Research and Assessment (DORA) organization, shows that teams that encourage a culture of trust — one that allows for questioning, risk-taking and mistakes — perform better. The way an organization responds to opportunity is a big part of its culture. And for software delivery and overall team effectiveness, equally important is how an organization responds to failure.
By adopting specific behaviors and ways of working that encourage resilience, we can increase our teams’ effectiveness and achieve better organizational performance.
At Google, we not only make a lot of technology, we also study how technology gets made.
DORA is an academically and statistically rigorous research program that seeks to answer the questions: “How does technology help organizations succeed, and how do we get better at software delivery and operations?”
Internal research projects across hundreds of Google teams, such as Project Aristotle, have also allowed us to study the drivers of highly effective teams.
In this blog series, we’ve taken years of this Google research and are distilling down the findings into five dimensions that you can apply to drive success within your own organization:
Resilience (the focus for this blog)
Communication
Collaboration
Innovation
Empowerment
Let’s jump in, and consider what resilience is, how it improves performance, and how your team can get more of it.
We define resilience as the ability of teams to take smart risks, share failures openly, and learn from mistakes; teams that exhibit resilience are demonstrably more successful than teams that don’t. This idea that a culture with resilient characteristics can drive desirable organizational outcomes isn’t new. Sociologist Dr. Ron Westrum’s study of how culture influences team behavior when things go wrong typified three distinct organizational cultures, and cultures in which failure led to inquiry, rather than justice or scapegoating, were found to be more performance-oriented. Westrum referred to these as “generative” cultures.
This research has been reinforced by our DORA findings since the first State of DevOps Report was published in 2014. Our 2023 Accelerate State of DevOps Report demonstrates that the presence of a generative culture continues to predict higher software delivery and organizational performance. We believe this is because, at its core, DevOps is fundamentally about people and the ways those people work. And people drive culture.
Source: DORA 2023 Accelerate State of DevOps Report
Take, for example, security development practices. Our research found that organizations with high-trust, resilient cultures are 1.6x more likely to have above-average adoption of emerging security practices than those without. We believe these generative traits, including aspects of resilience, may lead to a more desirable security posture due to their influence on teams' ways of working. For example, generative organizations may be more likely to actively minimize the inconvenience or risk associated with reporting security issues by fostering an atmosphere of "blamelessness," among other things. The bottom line is, if you want to improve your organization's security posture (and beyond), consider evaluating your team's culture first.
We can further break resilience down into two additional mindsets:
- Launching and iterating: getting started, gathering feedback, and continuously improving
- Psychological safety: a shared belief that a team is safe for interpersonal risk-taking
Would you be comfortable sharing an idea with your leadership if it were only 20% formulated?
Part of resilience is gathering input and continuously improving. Our research shows that teams who adopt a mindset of continuous improvement perform better. This includes starting quickly, adapting to changing circumstances, and experimenting.
For example, in the context of software delivery, DORA research supports the philosophy of continuous delivery so that software is always in a releasable state. Maintaining this “golden” state requires creating mechanisms for fast feedback and rapidly recovering from failures. We’ve found that teams that prioritize these feedback mechanisms have better software delivery performance. Our research has also found that working in small batches improves the way teams receive and use such feedback, as well as the ability to recover from failure, among other things.
Launching and iterating is not only about improving the software that you ship. It's also about a team's more general ability to self-assess, pivot, and adopt new ways of working when it makes sense based on the data. Inevitably, this experimentation will include both successes and failures. In each case, teams stand to learn valuable lessons.
Would you be comfortable openly failing on your team?
Extensive research inside Google found that psychological safety provides a critical foundation for highly effective teams. In general, our research demonstrates that who is on a team matters less than how team members interact when it comes to predicting team effectiveness.
In order of importance, Google researchers found these five variables were what mattered most when it came to team effectiveness. Source: Google re:Work Guide: Understand team effectiveness
Project Aristotle examined hundreds of Google teams to answer the question "what makes a team effective?" Statistical analysis of the resulting data revealed that the most important team dynamic is psychological safety: an environment where taking smart risks is encouraged, and where members trust they will not be embarrassed or punished for ideas, questions, or mistakes. Further DORA analysis found that these practices also benefit teams outside of Google, uncovering that a culture of psychological safety is broadly predictive of better software delivery performance, organizational performance, and productivity.
It’s important to remember that culture flows downstream from leadership. DORA research shows that effective leadership has a measurable, significant impact on software delivery outcomes. If we want to foster a blameless, psychologically safe environment, leaders must provide their teams with the necessary trust, voice, and opportunities to experiment and fail.
Adopting a mindset of continuous improvement can help you achieve better organizational performance. Likewise, embracing psychological safety within your organization may help your teams work more effectively. This is what we mean when we say using resilience to drive success through culture.
So, what does resilience look like when it is applied practically in our behaviors and reinforced through our daily work?
We can continuously improve by launching early, defining success metrics, gathering input (including through crowdsourcing), and taking what we learn to heart, both to improve our products and the way we work. This ability can be underpinned by technical practices such as continuous integration, automated testing, continuous delivery and monitoring, to name a few. These practices provide the foundation and guardrails that allow for safe, rapid iteration and reliability.
We can also normalize failure by conducting both “premortems” (anticipating the myriad ways an idea may fail), and “blameless postmortems” — candid conversations about times when things haven’t gone according to plan and what could be done to improve, without assigning blame. For example, we’ve found that teams who leverage reliability practices, including blameless postmortems, report higher productivity and job satisfaction, and lower levels of burnout, than their counterparts who use more traditional operations approaches. We suspect this is because, among other things, a sustained fear of making mistakes can lead to poor well-being.
Blameless postmortems help prevent issues from recurring, help avoid multiplying complexity, and allow you to learn from your own mistakes and those of others.
These ways of working are exemplified by our latest Google Cloud DevOps Award winners. These organizations have demonstrated how they are implementing these and other practices to drive organizational success and elite performance. For example, consider how one company leveraged cross-functional teams to remove bottlenecks, address blockers, and improve communication — the focus of our next blog in this series.
In the meantime, be prepared for failure as you experiment with new ways of working, including new approaches to software delivery, operations and beyond. And ask yourself, how will you react next time something goes wrong? To learn more, take the DevOps Quick Check and read the latest State of DevOps Report, both at dora.dev.
Read More for the details.
In today’s rapidly evolving business landscape, secure and reliable disaster recovery solutions are paramount. For CELSA Group, a leader in steel and steel product manufacturing, ensuring uninterrupted access to critical systems is not just a priority — it’s a necessity. That’s why CELSA Group made a strategic decision to migrate SAP S/4HANA for disaster recovery to Google Cloud, a move that has yielded peace of mind and a robust layer of redundancy.
CELSA Group’s previous architecture could not keep pace with their security and reliability standards. Recognizing the potential risks associated with any downtime, Manuel Parra López, CTO of CELSA Group, shares, “This pivotal decision based on Google Cloud’s leadership in security and reliability marked the beginning of a trusted partnership between CELSA Group and Google Cloud.” Google Cloud allows CELSA Group to work without disruption and strengthen its overall security and resilience.
Whether planned or unplanned, the ability to seamlessly switch between the production and disaster recovery environments with virtually no data loss allows CELSA Group to continue its operations uninterrupted. Having production and disaster recovery environments hosted by two separate cloud providers adds a crucial layer of redundancy to CELSA Group’s infrastructure, ensuring near-24×7 access to critical systems. Any downtime from issues like system patching challenges or cyber threats can result in substantial financial losses and operational delays. Thanks to CELSA Group’s resilience strategy to protect against this, they can fully transition business applications in under an hour, allowing CELSA Group to spend their time where it matters.
CELSA Group’s technical strategy not only prioritizes redundancy but also delivers improved performance and security. Regularly stress tested for real-life scenarios, Celsa maintains the same level of service while improving security by going from a single cloud provider to multi-cloud model for disaster recovery and can fully restore to the secondary system in under an hour.
“Security and system performance must be optimal at all times, and this migration clearly helped us achieve higher standards in both of these areas,” says López.
The value realized from better reliability and security has been immeasurable, particularly considering that cost controls and planning ensured higher costs were not incurred. The peace of mind that comes from knowing their critical systems are safeguarded against any disruptions is priceless to CELSA Group. This successful migration to Google Cloud has further deepened their commitment to the partnership.
CELSA Group’s strategic migration of its SAP disaster recovery environment to Google Cloud has proven to be a difference maker. It has not only strengthened business continuity but has also enhanced performance, security, and overall resilience. In today’s digital landscape, where downtime can paralyze operations, CELSA Group has confidently positioned themselves with a reliable and secure infrastructure with Google Cloud.
Read More for the details.
For retailers, having a powerful search function for their website and mobile applications is crucial. However, most current search experiences are somewhat basic and limited, primarily returning a list of results ranked based on keyword relevance. And typically, customers only have a vague idea of what they want, rather than a specific item in mind. For instance, they might be searching for a nice dining room table but aren’t sure about the style, size, shape, or other details. Not surprisingly, many customers leave a retail website after unsuccessful search attempts. This not only deprives them of a potentially satisfying shopping experience but also represents a lost opportunity for retailers.
Today, we are excited to present a new search experience for retailers using generative AI with Vertex AI and Elasticsearch. This enhanced interface offers users an interactive, conversational experience that summarizes pertinent data and tailors responses based on each customer's unique needs, all while drawing upon the retailer's public and internal knowledge. That can include, for example, public knowledge from the retailer's domain expertise:
- Guidelines and common scenarios for solving your specific need
- A summary of the reviews for similar products matching your request
- Public references on how to build, combine, or use the products
- etc.
Or, the knowledge can be private to the retailer:
- A user's general location and local regulations that may apply to them
- A real-time updated product catalog, nearby stock, and prices
- Private documentation and knowledge
- etc.
Instead of merely generating a simple list of relevant items, this advanced search experience functions much like a concierge sales representative with in-depth domain knowledge. It assists and guides customers throughout their entire purchase journey, creating a more engaging and efficient shopping experience.
The diagram below illustrates the integration of Google Cloud's generative AI services and Elastic's search capabilities, which forms the foundation for this new user experience.
In this architecture, a search query happens in two broad phases.
1. Initial query processing and enrichment
A user initiates the process by submitting a query or question on the search page.
This query, along with other relevant metadata like general geolocation, session information, or any other data deemed significant by the retailer, is then forwarded to Elastic Cloud.
Elastic Cloud uses this query and metadata to perform search on the domain-specific customer data, gathering rich context information in the process:
- Relevant real-time data from multiple internal company data sources (ERP, CRM, WMS, BigQuery, GCS, etc.) is continuously indexed into Elastic through its set of integrations, and related embeddings are created via your transformer model of choice or the built-in ELSER. Inference can be applied automatically to every data stream through ingest pipelines.
- Embeddings are then stored as dense vectors in Elastic's vector database.
- Using the search API, Elasticsearch executes a search across its indices with both text search and vector search within a single call. Meanwhile, it generates embeddings for the user's query text as it receives it.
- The generated query embeddings are compared via vector search to the dense vectors from previously ingested data using kNN (the k-nearest neighbors algorithm).
An output is produced that combines both semantic and text-search results, using Reciprocal Rank Fusion (RRF) hybrid ranking.
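As a rough illustration of this hybrid query, here is a minimal sketch using the Elasticsearch Python client. It assumes an Elasticsearch 8.x deployment with a hypothetical `products` index and a dense-vector field named `description_embedding`; the index name, field names, credentials, and embedding function are placeholders, not part of the reference architecture above.

```python
from elasticsearch import Elasticsearch

# Placeholder: connect to your Elastic Cloud deployment (cloud ID and API key are assumptions).
es = Elasticsearch(cloud_id="YOUR_CLOUD_ID", api_key="YOUR_API_KEY")

def embed(text: str) -> list[float]:
    """Placeholder for your transformer model of choice (or ELSER via an ingest pipeline)."""
    raise NotImplementedError

user_query = "round oak dining table for six"
query_vector = embed(user_query)

# One call combining classic text search with kNN vector search.
# Recent Elasticsearch releases can fuse the two result sets with RRF via the `rank` option;
# otherwise the default hybrid scoring combines the scores.
response = es.search(
    index="products",                      # hypothetical catalog index
    query={"match": {"description": user_query}},   # lexical text search
    knn={
        "field": "description_embedding",  # dense-vector field populated at ingest time
        "query_vector": query_vector,
        "k": 10,
        "num_candidates": 100,
    },
    size=10,
)

for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```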
2. Produce results using gen AI
The original query and the newly obtained rich context information are forwarded to Vertex AI Conversation. Vertex AI Conversation is a collection of conversational AI tools, solutions, and APIs for both designers and developers. In this design, we use Dialogflow CX in Vertex AI Conversation for the conversational layer and integrate with its API.
When using the Dialogflow CX API, your system needs to:
- Build an agent.
- Provide a user interface for end users.
- Call the Dialogflow API for each conversational turn to send end-user input to the API.
- Unless your agent responses are purely static (uncommon), host a webhook service to handle webhook-enabled fulfillment.
For more details on using the API, please refer to Dialogflow CX API Quickstart.
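As a rough illustration, the sketch below shows one conversational turn using the Dialogflow CX Python client. The project, location, agent, and session identifiers are placeholders, and a production integration would add error handling and session management.

```python
import uuid

from google.cloud import dialogflowcx_v3 as df

# Placeholder identifiers for your agent (assumptions, not values from this article).
PROJECT_ID = "my-project"
LOCATION = "global"
AGENT_ID = "my-agent-id"

def detect_intent(user_text: str) -> list[str]:
    """Send one end-user utterance to the agent and return its text responses."""
    session_id = uuid.uuid4().hex
    # For regional agents, pass client_options with a regional api_endpoint.
    client = df.SessionsClient()
    session = client.session_path(PROJECT_ID, LOCATION, AGENT_ID, session_id)

    request = df.DetectIntentRequest(
        session=session,
        query_input=df.QueryInput(
            text=df.TextInput(text=user_text),
            language_code="en",
        ),
    )
    response = client.detect_intent(request=request)

    # Collect the plain-text replies produced by the agent (including webhook fulfillment).
    replies = []
    for message in response.query_result.response_messages:
        if message.text.text:  # skip non-text messages such as custom payloads
            replies.append(" ".join(message.text.text))
    return replies

if __name__ == "__main__":
    for reply in detect_intent("I'm looking for a dining table for six people"):
        print(reply)
```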
The Conversational AI module reaches the endpoints of the LLM deployed in your Vertex AI tenant to generate the complete response in natural language, merging model knowledge with Elastic-provided private data. This is achieved by the following steps (a minimal sketch of the endpoint call follows the list):
- Selecting your favorite model from Model Garden
- If needed, fine-tuning it on your domain tasks
- Deploying the model to an endpoint in your Google Cloud project
- Consuming the endpoint from the Dialogflow workflows
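For the last step, a webhook could call the deployed endpoint with the Vertex AI SDK. The sketch below is illustrative only: the project, region, endpoint ID, and instance schema are placeholders and depend on the model you deployed.

```python
from google.cloud import aiplatform

# Placeholder project, region, and endpoint values (assumptions for illustration).
aiplatform.init(project="my-project", location="us-central1")
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)

# The instance format depends on the deployed model; a text model from Model Garden
# typically accepts a prompt field similar to this.
prompt = (
    "Customer question: I'm looking for a dining table for six.\n"
    "Relevant catalog and policy context retrieved from Elasticsearch:\n"
    "...\n"
    "Answer as a helpful retail assistant."
)
response = endpoint.predict(instances=[{"prompt": prompt}])
print(response.predictions[0])
```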
To manage data access control and ensure privacy, Vertex AI employs IAM for resource access management. You have the flexibility to regulate access at either the project or resource level. For more details, please refer to the Google documentation. Please refer to the section below for more details on this step.
Dialogflow makes the chatbot experience actionable with conversational responses, providing relevant actions to the user depending on the context (for instance, placing an order or navigating to content).
The response is relayed back to the user.
When using generative AI, context windows help to pass additional, user-prompted, real-time, private data to the model at query time, alongside the question you're submitting. This enables users to receive better answers as output, based not only on the public knowledge that the LLM is trained on, but also on the specific domain you provided. Gen AI's effectiveness depends heavily on prompt engineering, and good context substantially improves the quality of results.
Once the user submits their question via your website search box, Elasticsearch digs into your internal knowledge base, searches for related content, and returns it for further processing by the awaiting generative model. Searching for information inside your business, across multiple diverse data sources, is what Elasticsearch is designed for.
But the context window size is limited: common models only allow processing of a few thousand tokens. It's therefore impossible to pass the whole enterprise dataset with each query. Moreover, the bigger the query the user submits, the more performance and cost are impacted. You need to filter and find relevant context even before sending the query to generative AI.
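As a simple illustration of this filtering step (not part of the reference architecture itself), the retrieved passages can be trimmed to a token budget before the prompt is assembled. The 4-characters-per-token estimate below is a rough heuristic; a real system would use the model's tokenizer.

```python
def approx_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token for English text)."""
    return max(1, len(text) // 4)

def build_prompt(question: str, passages: list[str], max_context_tokens: int = 3000) -> str:
    """Keep the highest-ranked passages that fit the context budget, then assemble the prompt."""
    selected, used = [], 0
    for passage in passages:  # passages are assumed to arrive ranked by relevance (e.g., RRF order)
        cost = approx_tokens(passage)
        if used + cost > max_context_tokens:
            break
        selected.append(passage)
        used += cost
    context = "\n---\n".join(selected)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context above."

# Example usage with ranked search results from the previous step.
prompt = build_prompt(
    "Which table fits a 3x4 m dining room?",
    ["passage one ...", "passage two ..."],
)
```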
The Elasticsearch Relevance Engine (ESRE) is a package built on years of R&D and expertise from Elastic in the search and AI space, helping customers leverage the power of Elasticsearch with high-end machine learning features. ESRE enables fast retrieval of relevant results from huge, heterogeneous datasets using advanced vector search on text, images, audio, and video, all while combining with classical text search on Elasticsearch indices. With ESRE, every result is not only relevant but also secured, thanks to document- and field-level security applicable to all content that Elastic handles.
With Elastic, you have the freedom of choosing and running your own Machine Learning (ML) models to create embeddings or for feature extraction, thus allowing full control and customization depending on your specific domain, language and data type. Vertex AI can then help you create your transformer models to import into the Elastic platform. No worries if you want to get started quickly instead, with no internal ML skills: the out-of-the-box ELSER retrieval model lets you be up and running with just a few clicks.
Prompt design strategies and context windows might not be sufficient for tailoring model behavior.
Model Garden: Vertex AI Model Garden is a rich repository of pre-constructed ML models that cater to a diverse set of use cases. These enterprise-ready models have been meticulously tested and optimized for enterprise applications’ performance and precision needs. They offer user-friendly access through APIs, notebooks, and web services. Designed with scalability in mind, these models can handle large data volumes and serve numerous users. Furthermore, they are freely accessible and regularly updated with fresh models.
Model customization: Retailers may prefer to interact with foundational models to adapt them to their unique needs. Fine-tuning is an efficient and economical approach to enhance a model’s task-specific performance or comply with distinct output requirements when instructions aren’t adequate. But developing and training new foundational models from the ground up can be costly. Techniques such as efficient parameters tuning reduce overhead while implementing adapter layers atop existing LLMs. These layers supplement the models with your specific missing knowledge, maintaining their already established skills.
Security and privacy: While consumer AI assistants are typically robust, controlling their features and security isn't always in your hands. Having complete insight into your data's usage and how it is stored is vital to compliance, privacy, and maintaining a competitive edge. Google's Secure AI Framework provides clear industry standards for building and deploying AI technology in a responsible manner. This is why adopting generative AI for enterprises via Google's Vertex AI capabilities is so smart.
Enterprise-grade solution: Vertex AI Generative AI Studio provides access to cutting-edge models and allows for navigation through various releases and modifications. Once customized, you can test, deploy, and integrate them with other applications and platforms hosted on Google Cloud. This can be done via the Vertex AI SDK or directly through the Google Cloud console, making it user friendly for almost anyone, even without specialized data-science expertise.
Responsible AI: With great power comes great responsibility. LLMs have the capacity to translate languages, condense text, and generate creative writing, code, and images, among many other tasks. However, as an emerging technology, its evolving abilities and applications may lead to potential misuse, misapplication, and unexpected consequences. These models can sometimes generate outputs that are unpredictable, including text that might be offensive, insensitive, or factually incorrect. Google is dedicated to promoting responsible use of this technology, providing safety features and filters to protect users and ensure compliance with our AI principles and commitment to responsible AI.
Getting started with building your own gen AI-powered customer facing retail app has never been so easy: take a look at Elastic’s official GitHub repo that helps you create and run, with guided steps, an e-commerce search bar with ESRE and Vertex AI.
If you're interested in exploring further through code examples, Google provides a comprehensive lab entitled Dialogflow CX: Build a retail virtual agent. For additional examples of Google gen AI designs, we encourage you to visit our 'Generative AI examples' page.
Concerned about the usage of your Vertex AI models and services? Explore how Elastic provides an end-to-end observability platform that leverages all your logs, metrics and traces directly from the Google Cloud operations suite thanks to native integrations.
Don’t miss out on this groundbreaking innovation. We encourage you to explore and adopt this solution, and start transforming your customers’ shopping experiences. The best way to start is to create a free 14-day trial cluster on Elastic Cloud using your Google Cloud account, or easily subscribe to Elastic Cloud through Google Cloud Marketplace.
Read More for the details.
Climate change and its effects have significantly increased the number of allergy sufferers. According to the World Allergy Organization, over 400 million people now suffer from a pollen allergy globally, creating a very large market for allergy remedies. People who suffer from allergies are looking for relevant information they can trust to manage their symptoms. In particular, they're looking for personal insights that adjust to their individual needs, sensitivities, and locations. The Pollen API, now generally available, can be helpful to businesses in many industries, including the pharmaceutical industry.
Providing personalized experiences is very important for individuals looking to proactively manage their symptoms. Pharmaceutical companies can help by providing helpful info and offering a more holistic, data-based approach to self-managed allergy treatment. By leveraging our now generally available Pollen API, pharmaceutical companies can cultivate personalized experiences that are incredibly valuable for their allergy medication customers. By delivering hyperlocal pollen levels, detailed plant information, and heatmap visualizations of pollen levels, they can tailor treatment options and services to meet individual exposure and pollen sensitivity.
One way allergy medication brands create meaningful relationships with their customers is via companion apps – digital tools strategically designed to complement pharmaceutical products or services.
These apps can help users identify the pollen plant/type and index level which triggers their allergy symptoms over time. This allows for more informed allergy management and increases the usefulness of the product or service being offered, while nurturing engagement with the brand. Additionally, brand trust is built by allowing people to distinguish between grass, weed, and tree pollen to identify their specific triggers, showing a commitment to their well-being and a preventative approach to reduce symptoms. This type of personalization leads to earlier intervention, precise diagnoses, and more efficient treatment options. By using pollen data and insights to create valuable user experiences, pharmaceutical companies enhance their reputation as leaders in the allergy treatment industry.
Personalized pollen alerts
To be most effective, many allergy medications need to be taken before symptoms start. Pharmaceutical companion apps can ensure individuals take their medication at the right time by providing alerts triggered by forecasted pollen counts. With live and forecast pollen reporting at a spatial resolution of 1 km available through the Pollen API, users can take a proactive approach that significantly enhances allergy symptom management.
In addition, personalized pollen index level alerts can help allergy sufferers know when to take precautions, like staying indoors or turning on their air purifier. The reporting method utilized by our Pollen API leverages multiple data sources including land cover maps, satellite imagery, weather information, and monitoring station data, so users can come to rely on these alerts wherever they are planning their day or week. For example, they might see a message that provides a short summary of location-based pollen information and recommendations when they open the app, “Good news! Very low pollen levels. Today is a good day for outdoor activities.”
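As a rough sketch of how such an alert might be driven, a companion app backend could fetch the forecast for a user's general location and decide whether to notify them. The endpoint path, parameters, and response fields below reflect the public Pollen API documentation as we understand it, but they should be treated as assumptions and verified against the current docs.

```python
import requests

API_KEY = "YOUR_MAPS_PLATFORM_API_KEY"  # placeholder

def pollen_forecast(lat: float, lng: float, days: int = 1) -> dict:
    """Look up the pollen forecast for a location (field names assumed from the Pollen API docs)."""
    resp = requests.get(
        "https://pollen.googleapis.com/v1/forecast:lookup",
        params={
            "key": API_KEY,
            "location.latitude": lat,
            "location.longitude": lng,
            "days": days,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

def should_alert(forecast: dict, threshold: int = 3) -> bool:
    """Alert when any pollen type reaches the threshold index (assumed 0-5 scale)."""
    for day in forecast.get("dailyInfo", []):
        for pollen_type in day.get("pollenTypeInfo", []):
            index = pollen_type.get("indexInfo", {}).get("value", 0)
            if index >= threshold:
                return True
    return False

forecast = pollen_forecast(48.8566, 2.3522)  # example coordinates (Paris)
if should_alert(forecast):
    print("High pollen expected today; consider taking your medication before heading out.")
```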
Using our Pollen API, companies can create a variety of experiences to help their customers and build loyalty to their allergy medication brand, including the development of a comprehensive educational center.
With the detailed pollen allergen information the Pollen API provides, pharmaceutical companies can build out education centers allowing allergy sufferers to learn more about pollen allergens including their shape, color, and cross reactions (other allergens they could also be allergic to). They can use color-coded map overlays to avoid triggers during outdoor activities and get tips on how to cope with different allergens. They can even explore pollen season calendars by allergen type that show what a typical allergy season might look like at the user’s location. These experiences promote active engagement and a greater likelihood of long-term commitment.
Hiking route recommendations based on pollen levels
Pollen data can be a valuable asset for both individuals and pharmaceutical companies. The API provides access to actionable, real-time information that empowers allergy sufferers to manage their symptoms and take control of their well-being. Using our Pollen API data, heatmap, and insights in companion apps can help allergy treatment brands build strong, lasting relationships with the people who use them.
The now generally available Pollen API is part of Google Maps Platform’s new suite of environment products which aims to support the development of sustainability experiences that help people adapt to and mitigate the impacts of climate change.
To learn more, check out our Pollen API webpage; to get started, visit the documentation.
The Pollen API offers real-time local pollen data and detailed forecasts, broken down by specific pollen types and species, to enterprise customers. It doesn't process health data or personalize for customers' end users.
Read More for the details.
It’s hard to imagine (or for some of us, remember) life without the internet. From work, to family, to leisure, the internet has become interwoven in the fabric of our routines. But what if all of that got cut off, suddenly and without warning?
For many people around the world, that’s a daily reality. In 2022, 35 countries cut off internet access, across at least 187 instances, with each outage lasting hours, days, or weeks.
Censored Planet Observatory, a team of researchers at the University of Michigan, has been working since 2010 to shine a spotlight on this problem. They measure and track how governments block content on the internet, and then make that data publicly accessible to analyze and explore from a dashboard developed in collaboration with Google’s Jigsaw. To help restore unfiltered access to the internet in the face of censorship, Jigsaw also builds open source circumvention tools like Outline.
Fighting internet blackouts around the world requires a variety of scalable, distributed tools to better understand the problem. Jigsaw and Censored Planet turned to the Google Cloud team to help create a data pipeline and dashboards to highlight the global impact of censorship campaigns.
When the Google teams started working with the Michigan team in 2020, the main data outputs of their daily censorship measurements were large, flat files, some around 5 GB each. Loading all this data (around 10 TB total) required over 100 on-premises high-memory computers to achieve real-time querying capability. Just getting to this stage took heroic efforts: The project gathers censorship measurement data from over 120 countries every few days, and the records go back to 2018, so we’re talking about many files, from many sources, across many formats.
It was no small feat to build this consolidated dataset, and even harder to develop it so that researchers could query and analyze its contents. Vast troves of data in hand, the teams at Censored Planet and Google focused on how to make this tool more helpful to the researchers tracking internet censorship.
While the Censored Planet data was open and freely shared, you needed specific technical expertise to manipulate or query it: it wasn't all in one place, and it wasn't set up for SQL-like analysis. The team and its partners needed a better way.
Sarah Laplante, lead engineer for censorship measurement at Jigsaw, wondered if there was a quick and easy way to load this big dataset into BigQuery, where it could be made easily accessible and queryable.
“Building the dashboard would not have been possible without the cloud tech,” said Laplante. “The pipeline needs to reprocess the entire dataset in 24 hours. Otherwise, there’s suspect data scattered throughout.”
She figured out a sample workflow that led to the first minimum viable product:
- Load the data into Dataprep, a cloud data service to visually explore, clean, and prepare data for analysis and machine learning
- Use Dataprep to remove duplicates, fix errors, and fill in missing values
- Export the results to BigQuery
This workflow made analysis much easier, but there was a catch. Every day, the sources tracking censorship created new files, but those JSON files required domain knowledge, and parsing with code, in order to be used in BigQuery. This “minimum viable product” could not be scaled. Different kinds of filtering, restrictions, and network controls led to different outputs.
It was a problem in desperate need of a solution that included automation and standardization. The teams needed more and specific tools.
With new data files being created every day, the team needed to develop a process to consolidate, process, and export these from JSON to BigQuery in under 24 hours. That way, researchers would be able to query and report on the day’s censorship data along with all historical data.
This is where Apache Beam came in.
Designed to handle a mix of batch and stream data, Apache Beam gave the Censored Planet folks a way to process the dataset each night, making sure the latest data is ready in the morning. On Google Cloud, the team used Dataflow to make managing the Beam pipelines easier.
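A minimal sketch of such a nightly batch pipeline is shown below. The bucket path, table name, and schema are placeholders, not the actual Censored Planet pipeline, which is considerably more involved.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Placeholder locations; the real pipeline reads many measurement files in several formats.
INPUT_PATTERN = "gs://my-bucket/measurements/2023-10-*/*.json"
OUTPUT_TABLE = "my-project:censorship.measurements"

def parse_measurement(line: str) -> dict:
    """Parse one JSON record and keep only the fields the dashboards need."""
    record = json.loads(line)
    return {
        "date": record.get("date"),
        "source": record.get("source"),
        "country": record.get("country"),
        "network": record.get("network"),
        "domain": record.get("domain"),
        "blocked": record.get("blocked"),
    }

def run() -> None:
    options = PipelineOptions(
        runner="DataflowRunner",       # managed Beam execution on Google Cloud
        project="my-project",
        region="us-central1",
        temp_location="gs://my-bucket/tmp",
    )
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFiles" >> beam.io.ReadFromText(INPUT_PATTERN)
            | "ParseJson" >> beam.Map(parse_measurement)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                OUTPUT_TABLE,
                schema="date:DATE,source:STRING,country:STRING,network:STRING,domain:STRING,blocked:BOOLEAN",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )

if __name__ == "__main__":
    run()
```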
Censored Planet dashboard shows commonly-blocked websites.
There were some snags at first. For example, some data files were so large they slowed down the processing pipeline. Others included metadata that wasn’t needed for the visualizations. Sarah and team put in place queries to shrink them down for faster processing and lower overhead, only parsing the data that would be useful for the tables and graphs they were generating. Today, the job can go out to thousands of workers at once, and finish quickly. One day’s worth of data can be processed in just a few hours overnight.
They solved the problem of how to process the dataset, but to make that dataset useful required good reports and dashboards. To get started quickly the team began with rapid prototyping, testing out options and configurations with Looker Studio and iterating quickly. Using an easy-to-use tool let them answer practical, immediate questions.
Those early versions helped inform what the eventual final dashboard would look like. Reaching a final design involved UX studies with researchers, in which the Censored Planet team watched them use the dashboard to answer their questions, then adjusted it to improve usability and functionality.
Researchers using the Censored Planet data wanted to see which governments were censoring the internet and what tools they were using in as close to real-time as possible. To make the dashboards load and render quickly, the team began clustering and partitioning data tables. By cutting out data that they didn’t need to display, they also cut down on Looker Studio costs.
The data pipeline, from original measurements to dashboards.
Within BigQuery, the team partitioned the data by date, so it was easy to exclude historical data that was not needed for most reports. Then they partitioned by data source, country, and network. Since tracking and response often focused on one country at a time, this made queries and loading dashboards smaller, which made them much faster.
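A minimal sketch of this kind of table layout, using the BigQuery Python client, might look like the following; the dataset, table, and column names are placeholders rather than the actual Censored Planet schema.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

schema = [
    bigquery.SchemaField("date", "DATE"),
    bigquery.SchemaField("source", "STRING"),
    bigquery.SchemaField("country", "STRING"),
    bigquery.SchemaField("network", "STRING"),
    bigquery.SchemaField("domain", "STRING"),
    bigquery.SchemaField("blocked", "BOOLEAN"),
]

table = bigquery.Table("my-project.censorship.measurements", schema=schema)

# Partition by date so dashboards can cheaply exclude old history...
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="date",
)
# ...and cluster by the columns that country- or network-focused reports filter on.
table.clustering_fields = ["source", "country", "network"]

client.create_table(table, exists_ok=True)
```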
The goal was for all these queries to end up in a Looker Studio dashboard, with filters that let viewers select the data source they want to track. To make this work, the team merged the data sources into one table, then split that table out so that it was easier to filter and view.
There was more to this exercise than indulging internet censorship researchers’ need for speed.
Adding the ability to quickly reprocess the data, and then explore it through a speedy dashboard, meant the team could much more rapidly find and close gaps in their understanding of how the censors operated. They were able to notice where the analysis methodology missed out on certain measurements or data points, and then deploy, test, and validate fixes quickly. On top of creating a new dashboard and new data pipeline, Censored Planet also created a better analysis process. You can dive much deeper into that methodology in their paper published in Free and Open Communications on the Internet.
Building the dashboards in Looker Studio brought operational benefits, too. Because Looker Studio is a Google Cloud hosted offering, the team minimized creation and maintenance overhead and were able to quickly spin up new dashboards. That gave them more time to focus on gathering data, and delivering valuable reports for key use cases.
Looker Studio also lets them iterate quickly on the user experience for researchers, engineers, non-technical stakeholders, and partners. It was also easy to edit, so they could update or modify the dashboard quickly, and even give end users the opportunity to export it, or remix it to make the visualizations more helpful.
Shifting to a cloud-based analysis pipeline has made sharing and filtering all this data much more efficient for the more than 60 global organizations that rely on Censored Planet to monitor internet freedom and advocate for increased internet access. The team used Google Cloud tools to quickly experiment, iterate, and shift their data pipeline, prototyping new tools and services.
Google’s data analysis toolkit also helped to keep costs down for the University of Michigan sponsors. To isolate inefficient queries, they exported all the billing logs into BigQuery and figured out which Looker Studio reports were pulling too much data, so they could filter and streamline.
Censored Planet is working on more data sources to add to the dashboard, including DNS censorship data. The organization encourages everybody interested in researching internet censorship to use their data, share screenshots, and publish their analyses. To build a data pipeline with Google Cloud similar to Censored Planet’s, you can start with these three guides:
- Run pipelines with Dataflow and Python
- Run pipelines with Dataflow and Java
- Analyze billing data with BigQuery
Read More for the details.
As we enter the home stretch of 2023, it’s almost impossible to imagine a world before the generative AI explosion we’ve experienced over the last year. Today, it’s no longer just the thing everyone is talking about, it’s also the thing everyone is experimenting with hands-on, or at least trying to figure out the best way to get started.
If that sounds familiar, you certainly aren't alone, which is why we recently published "The executive's guide to generative AI" report from Google Cloud, designed to help you kickstart your own internal efforts with gen AI, including a step-by-step guide.
Today, we’re taking things one step further on your Gen AI journey to help you create and measure real value as you try to chart the best course and drive adoption across your organization. I’d like to introduce you to the new personalized Gen AI Navigator from Google Cloud, a practical tool designed to help businesses of any size or scale realize the true potential of Gen AI in the year ahead.
This interactive assessment dives into your AI maturity, industry challenges and current business operations to provide actionable steps that can help maximize the impact of your Gen AI strategy and investment. After you complete the quick assessment, you’ll receive a detailed report based on your organization’s specific environment, needs and opportunities.
Sounds interesting? Let’s take a closer look at what the new tool can do.
The Gen AI Navigator is designed to help you identify your organization's fastest path to adoption and impact through three sections, each covering key aspects of an effective plan:
Output: Once you complete these three sections, you’ll receive a full, personalized report with tips to get started unique to your organization. The report is highly detailed with step-by-step guides tailored to your industry, goals, and current-state AI maturity of your overall business.
Are you ready to put Gen AI to work? Get started now with Google Cloud Gen AI Navigator.
Read More for the details.
2023 so far has been a year unlike any in the recent past. Machine learning (ML), specifically generative AI and the possibilities that it enables, has taken many industries by storm, and developers and data practitioners are exploring ways to bring the benefits of gen AI to their users. At the same time, most businesses are going through a period of consolidation in the post-COVID era and are looking to do more with less. Given that backdrop, it is not surprising that more and more data leaders are seizing the opportunity to transform their business by leveraging the power of streaming data to drive productivity and save costs.
To enable and support you in that journey, the Dataflow team here at Google has been heads down working on a number of product capabilities. Considering that many of you are likely drawing your plans for the rest of the year and beyond, we want to take a moment to give you an overview of Dataflow’s key new capabilities.
Autotuning is at the foundation of Dataflow. It is one of the key reasons why users choose it. Operating large-scale distributed systems is challenging, particularly if that system is handling data in motion. That is why autotuning has been a key focus for us. We've implemented the following autotuning features:
- Asymmetric autoscaling to improve performance by independently scaling user workers and backend resources. For example, a shuffle-heavy job with asymmetric autoscaling benefits from fewer user workers, less frequent latency spikes, and more stable autoscaling decisions. This feature is GA.
- In-flight job option updates to allow customers to update autoscaling parameters (min/max number of workers) for long-running streaming jobs without downtime. This is helpful for customers who may want to adjust worker limits to save costs, or adjust downscaling limits in anticipation of a traffic spike to ensure low latency, and who can't tolerate the downtime and latency associated with updating a running pipeline. This feature is GA.
- Fast worker handover: Typically, during autoscaling, new workers have to load pipeline state from the persistent store. This is a relatively slow operation that results in increased backlogs and higher latency. To address this, Dataflow now transfers state directly from existing workers, reducing latency.
- Intelligent autoscaling that takes key parallelism into account (aka key-based throttling) to improve utilization, along with a number of other advanced autotuning techniques.
- Intelligent downscale dampening: One of the common challenges with streaming pipelines is that aggressive downscaling causes subsequent upscaling (yo-yoing). To address this problem, Dataflow now tracks scaling frequencies and intelligently slows downscaling when yo-yoing is detected.
- Autosharding for the BigQuery Storage Write API: Autosharding dynamically adjusts the number of shards for BigQuery writes so that throughput keeps up with the input rate. Previously, autosharding was only available for BigQuery Streaming Inserts. With this launch, the BigQuery Storage Write API also has an autosharding option. It uses throughput and backlog to determine the optimal number of shards per table, reducing resource waste.
Every user that we talk to wants to do more with less. The following efficiency- and performance-focused features help you maximize the value you get from underlying resources.
In Dataflow environments, CPU resources are often not fully utilized. Tuning the number of threads can increase utilization. But hand-tuning is error-prone and leads to a lot of operational toil. Staying true to our serverless and no-ops vision, we built a system that automatically and dynamically adjusts the number of threads to deliver better efficiency. We recently launched this for batch jobs; support for streaming will follow.
Vertical autoscaling for batch jobs
Earlier this year we launched vertical autoscaling for batch jobs. Vertical autoscaling reduces operational toil for developers rightsizing their infrastructure. Vertical autoscaling works hand in hand with horizontal autoscaling to automatically adjust the amount of resources (specifically memory) to prevent job failures because of out of memory errors.
ARM processors in Dataflow
We’ve added support for T2A VMs in Dataflow. Powered by Ampere® Altra® Arm-based processors, T2A VMs deliver exceptional single-threaded performance. We plan to expand ARM support in Dataflow further in the future.
SDK (Apache Beam) performance improvements
Over the course of this year we made a number of enhancements to Apache Beam that have resulted in substantial performance improvements. These include:
- More efficient grouping keys in the pre-combiner table
- Improved performance for TextSource by reducing how many byte[]s are copied
- Grouping table now includes sample window size
- No more wrapping lightweight functions with Contextful
- An optimized PGBK table
- Optimization to use a cached output receiver
Machine learning relies on having good data and the ability to process that data. Many of you use Dataflow today to process data, extract features, validate models, and make predictions. For example, Spotify uses Dataflow for large-scale generation of ML podcast previews. In particular, generating real-time insights with ML is one of the most promising ways of driving new user and business value, and we continue to focus on making ML easier and simpler out of the box with Dataflow. To that end, earlier this year we launched RunInference, a Beam transform for doing ML predictions. It replaces complex, error-prone, and often repetitive code with a simple built-in transform. RunInference lets you focus on your pipeline code and abstracts away the complexity of using a model, allowing the service to optimize the backend for your data applications. We are now adding a number of new features and enhancements to RunInference.
Dataflow ML RunInference now has the ability to update ML models (Automatic Model Refresh) without stopping your Dataflow streaming jobs. This means it is now easier to update your production pipelines with the latest and greatest models your data science teams are producing. We have also added support for the following (a minimal RunInference sketch follows the list):
- Dead-letter queue – Ensures bad data does not cause operational issues, by allowing you to easily move it to a safe location to be dealt with out of the production path.
- Pre/post-processing operations – Lets you encapsulate the typical mapping functions that are done on data before a call to the model within one RunInference transform. This removes the potential for training or serving skew.
- Support for remote inference with Vertex AI endpoints in VertexAIModelHandler.
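To make the transform concrete, here is a minimal RunInference sketch using the Beam Python SDK with a scikit-learn model handler. The model path and feature shape are placeholders, and a streaming pipeline would swap the bounded source for an unbounded one such as Pub/Sub.

```python
import numpy as np

import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import ModelFileType, SklearnModelHandlerNumpy

# Placeholder: a pickled scikit-learn model stored in Cloud Storage.
model_handler = SklearnModelHandlerNumpy(
    model_uri="gs://my-bucket/models/model.pkl",
    model_file_type=ModelFileType.PICKLE,
)

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreateExamples" >> beam.Create([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
        | "RunInference" >> RunInference(model_handler)
        # Each output element is a PredictionResult pairing the example with its inference.
        | "PrintPredictions" >> beam.Map(print)
    )
```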
Many ML use cases benefit from GPUs, both for cost effectiveness and for the performance of tasks such as ML predictions. We have launched a number of enhancements to GPU support, including Multi-Process Service (MPS), a feature that improves the efficiency and utilization of GPU resources.
Choice is, of course, always a key factor, and this is true for GPUs, especially when you want to match the right GPU to your specific performance and efficiency needs. To this end, we are also adding support for the NVIDIA A100 80GB and the NVIDIA L4 GPU, which is rolling out this quarter.
Building and operating streaming pipelines is different from batch pipelines. Users need the right tooling to build and operate pipelines that run 24x7 and offer high SLAs to downstream customers. The following new capabilities are designed with developers and data engineers in mind:
- Data sampling – One common user challenge is knowing that there is a problem, but having no way of associating the problem with a particular set of data that was processed when it occurred. Data sampling is designed to address this. It lets you observe actual data at each step of a pipeline, so you can debug problems with your pipeline.
- Straggler detection – Stragglers are work items that take significantly longer to complete than other work items in the same stage. They reduce parallelism and block new work from starting. Dataflow can now detect stragglers and can also try to determine their cause. You can view detected stragglers right in the UI, by stage or by worker, for more flexibility.
- Cost monitoring – Dataflow users have asked for the ability to easily look up estimated costs for their jobs. Billing data, including billing exports, remains the reliable and authoritative source. Now, to support easy lookup of billing estimates for individual users who may not have billing data access, we have built cost estimation right into the Dataflow UI.
- Autoscaling observability – Understanding how autoscaling impacts a job is a frequently requested feature. You can now view autoscaling monitoring charts for streaming jobs within the Dataflow monitoring interface. These charts display metrics over the duration of a pipeline job and include information such as the number of worker instances used by your job at any point in time, autoscaling logs, estimated backlog over time, and average CPU utilization over time.
None of these new capabilities are useful if you can't get to the data that you need. That's why we continue to expand the products and services Dataflow works with. For instance:
- You can now implement streaming analytics on ads data with the new Google Ads to BigQuery Dataflow template, which gives you the power to incorporate minutes-old, fine-grained data into your marketing strategy.
- New MySQL to BigQuery, PostgreSQL to BigQuery, SQL Server to BigQuery, and BigQuery to Bigtable templates connect your operational and analytical systems, and templates for Splunk, Datadog, Redis, and AstraDB integrate data from and into other commonly used SaaS products.
To reduce the time it takes for you to architect and deploy solutions to your business problems, we partnered with Google's Industry Solutions team to launch Manufacturing Data Engine, an end-to-end solution that uses Dataflow and other Google Cloud services to enable "connected factory" use cases, from data acquisition to ingestion, transformation, contextualization, storage, and use case integration, accelerating time to value and enabling faster ROI. We plan to launch more solutions like Manufacturing Data Engine.
It’s easy to get started with Dataflow. Templates provide a turn-key mechanism to deploy pipelines, and Apache Beam is intuitive enough to write production-grade pipelines in a matter of hours. At the same time, you get more benefit out of the platform if you have a deep understanding of the SDK and the benefit of best practices. Here are some new developer learning and development assets:
- Dataflow Cookbook: This comprehensive collection of code samples will help you learn, develop, and solve sophisticated problems.
- We partnered with the Apache Beam community to launch Tour of Beam (an online, always-on, no-setup notebook environment that lets developers interactively learn Beam code), Beam Playground (a free online tool to interactively try out Apache Beam examples and transforms), and Beam Starter Projects (easy-to-clone starter projects for all major Apache Beam languages to make getting started easier). Beam Quest certification complements these learning and development efforts, giving users recognition that helps them build accreditation.
- We hosted the Beam Summit at our New York Pier 53 campus on June 13-15. All of the content (30 sessions spread across two days, plus six workshops) from the Beam Summit is now available online.
- Lastly, we are working with the Apache Beam community to host the annual Beam College. This virtual event helps developers with various levels of Beam expertise learn and master new concepts through lectures, hands-on workshops, and expert Q&A sessions. Register here for the October 23-27 event.
Thanks for reading this far. We are excited to get these capabilities to you, and we look forward to seeing all the ways you use the product to solve your hardest challenges.
Read More for the details.
Google Cloud offers a variety of load balancing solutions which simplify the management of networking infrastructure and support different types of backend services. Now available, the cross-region internal Application Load Balancer expands this coverage by providing integrations to load balance, geo-route, and automatically fail over to backends in multiple regions. In this walkthrough we'll focus on workloads in Google Kubernetes Engine (GKE). The load balancer enables the following capabilities:
- Global load balancing: Support for backends in multiple regions
- Improved performance, reliability, and high availability: Distributing traffic across multiple regions, and automatic failover to services in other regions
- Geo routing: Cloud DNS policy manager to route traffic to the nearest healthy backends
- Managed certificates support: Google-managed or self-managed certificates
Initial project setup and enable required APIs
Components: VPC network with 2 subnets in us-central1 and us-east1, for two GKE clusters
Deploy sample GKE backends:
Create Proxy-only subnets
Querying the load balancer from a VM in each region shows that Cloud DNS routes the requests to the closest backends.
To test, scale down the deployment in one region. Running the same command as above, we can see that requests now fail over to us-central1, since there are no longer any healthy endpoints in us-east1.
To learn more on this topic, please check out the links below.
- Documentation: Set up a cross-region internal Application Load Balancer with hybrid connectivity
- Documentation: Internal Application Load Balancer overview
- Documentation: DNS policies overview | Google Cloud
Read More for the details.
Scaling down your workloads by 4x certainly sounds appealing, doesn’t it? The State of Kubernetes Cost Optimization report discovered that elite performers can scale down four times more than their low-performing counterparts. This advantage stems from elite performers tapping into existing autoscaling capabilities more than any other group.
The elite performer group understands that demand-based downscaling relies heavily on workload autoscaling, which requires proper configuration of resources, monitoring, and decision making. According to the report, elite performers leverage Cluster Autoscaler (CA) 1.4x, Horizontal Pod Autoscaler (HPA) 2.3x, and Vertical Pod Autoscaler (VPA) 18x more than low performers.
But enabling one of these is not enough. CA alone cannot make a cluster scale down during off-peak hours. To autoscale a cluster, you need to configure workload autoscaling properly, complete with HPA and VPA. This blog explains the significance of workload autoscaling and outlines steps that you, as a developer or platform admin, can follow to harness its benefits.
Establishing resource requests and limits is critical to ensuring reliability and optimizing workloads for scaling. This was also emphasized in the blog post, Setting resource requests: the key to Kubernetes cost optimization.
To allocate resources appropriately, reconsider how you determine resource requests and limits. Requests should reflect the typical needs of your workload, not just the bare minimum your application can function with. Conversely, view the limit as the necessary resources to sustain your workload during scaling events.
To identify workloads that lack resource requests, consult the GKE Workloads at Risk dashboard. It is essential to set resource requests as they are required for the HPA and CA to provide elasticity in your cluster.
Having determined your resource requests and limits, the next step involves setting up monitoring dashboards to observe metrics for your workloads. The Google Kubernetes Engine (GKE) UI offers observability charts for both cluster and individual workloads. Additionally, Cloud Monitoring lets you view usage metrics and in-built VPA recommendations without deploying VPA objects, a point we’ll cover in Step 3. Whether you’re using predefined dashboards or customizing your own, Cloud Monitoring allows for comprehensive dashboard and alert creation.
To downscale efficiently, you need to rightsize your clusters properly. By utilizing the monitoring dashboards that were set up earlier, you can accurately determine the optimal resource request values and ensure maximum efficiency for your workloads. For a more in-depth exploration of workload rightsizing, please refer to our previous article, Maximizing reliability, minimizing costs: Right-sizing Kubernetes workloads.
Before choosing a workload scaling strategy, it’s essential to understand their purposes. HPA is best when optimizing the number of replicas for the pods’ performance, while VPA works great in optimizing resource utilization. HPA ensures enough resources are available to handle the demand even during peak times, while VPA ensures you’re not overprovisioning valuable resources to run your application.
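As a rough illustration of the HPA side (a sketch under assumed names, not guidance from the report), a CPU-based HorizontalPodAutoscaler for a hypothetical web Deployment could be created with the Kubernetes Python client like this; most teams would express the same object as YAML.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa", namespace="default"),  # hypothetical names
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    # Targets 70% of the CPU *request*, which is why setting requests (Step 1) matters.
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)

client.AutoscalingV2Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```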
Though it’s feasible to combine both HPA and VPA, the general guideline is to avoid it. Such a mix can induce erratic scaling behaviors, potentially leading to application downtime and performance and reliability complications.
The more workloads you manage to scale down during non-peak periods, the more adeptly CA can eliminate nodes.
Autoscaling profiles determine when to eliminate a node, balancing between optimizing utilization and ensuring resource availability. While removing underused nodes can enhance cluster efficiency, upcoming workloads may need to pause until resources are reallocated.
You can choose an autoscaling profile that best fits this decision-making process. The available profiles are:
- Balanced: This is the default profile and is a better option when you want your key nodes spread between zones.
- Optimize-utilization: This profile leans towards maximizing utilization at the cost of having fewer spare resources. If you choose this, the cluster autoscaler acts more assertively, removing nodes at a quicker pace and in greater numbers.
For more insights into using CA for cost efficiency, check out Best practices for running cost-optimized Kubernetes applications on GKE.
If your aim is to streamline the management of demand-based downscaling, consider transitioning your workloads to GKE Autopilot. Autopilot eases operations by handling the management of the cluster infrastructure, control plane, and nodes. Whether you choose GKE Standard or GKE Autopilot, you’ll still be responsible for completing Steps 1 – 4, beginning with setting resource requests. However, with Autopilot managing the node pools on your behalf, you can skip Step 5, thereby simplifying the process.
This article highlights the value of workload autoscaling to facilitate efficient demand-based downscaling. You can undertake various measures, from resource request settings to activating Cluster Autoscaler, to guarantee optimal resource distribution. The steps laid out in this piece serve as a foundation for application developers, budget managers, and platform administrators to maximize workload scaling, diminish expenses, and bolster performance.
Remember, before attempting to scale down your cluster, it’s essential to set appropriate resource requests to avoid compromising your user experience.
Download the State of Kubernetes Cost Optimization report, review the key findings, and stay tuned for our next blog post!
Also, be sure to check out our other blogs based on the State of Kubernetes Cost Optimization key findings, as well as other resources mentioned in this blog:
Setting resource requests: the key to Kubernetes cost optimization
Maximizing reliability, minimizing costs: Right-sizing Kubernetes workloads
Best practices for running cost-optimized Kubernetes applications on GKE
The Right-sizing workloads at scale solution guide
The simple kube-requests-checker tool
An interactive tutorial to get set up in GKE with a set of sample workloads
Read More for the details.
At Reckitt, we exist to protect, heal and nurture in the relentless pursuit of a cleaner and healthier world. We work tirelessly to get our products into the hands of those who need them across the world because we believe that access to high-quality health and hygiene is a right and not a privilege. The markets in which we operate can be very distinct, so we need to understand consumers in each region and run campaigns adapted to them, which in turn means capturing and using relevant consumer data. To ensure that everyone, everywhere, has access to the health and hygiene products they need, our regional marketing and ecommerce teams require relevant data insights for each market in which Reckitt operates.
Before we created our consumer data ecosystem with Google Cloud, achieving this level of regional insight and data reporting was challenging. We had good insights into certain markets, brands, or aspects of our business, but because our data was fragmented across consumer databases, it was impossible to connect data points across the business for comprehensive views of customers, campaigns, and markets. Our activation data was also stored separately from our sales data, making it difficult for us to understand the efficacy of our marketing campaigns.
We needed a more unified approach to leverage consumer data effectively. Working with Google Cloud partner Artefact, we built what we call Audience Engine, which is designed to help us with audience activation. The Audience Engine uses Ads Data Hub to consolidate the consumer data from various sources such as websites in BigQuery, allowing us to analyze the path of consumers through our sites in far more detail than before. With the help of Vertex AI, our audience engine then builds models to show which users are in the market for which product, enabling us to build lookalike audiences and provide the right message on the relevant channels to more consumers. The more data that goes into the engine, the more accurate the modeling becomes, allowing us to channel our marketing resources more effectively. As a result, we have seen an average incremental increase in ROI of between 20% and 40%, depending on the campaign.
Once we realized just how powerful a tool our Audience Engine was, it was obvious that we should migrate all our consumer data into this newly created consumer data ecosystem on Google Cloud, which would form the backbone of our consumer marketing at Reckitt while staying true to our Responsible Consumer Data Principles.
The migration began at the end of 2022, so our data transformation is still very much in its infancy, but we have already started to build some highly effective tools that are helping us to make our marketing operations more efficient.
A good example is our marketing ROI modeling tool, which allows us to predict how effective a marketing campaign will be before it goes live. With our historical marketing data unified in BigQuery, we are able to model potential results of specific planned campaigns to give our marketing teams the insights they need to adjust their campaign before it goes live. This helps us deliver a better ROI when scaling up those campaigns. Again, with our data previously fragmented across databases, such insights would have been impossible, making it harder to target our media spends effectively.
Having all our data in BigQuery also enables far more effective reporting on our marketing performance. We can now deliver an analytics tool that enables our media departments to analyze the performance of everything from cost-per-click to ROI.
With these insights, our teams are not only able to optimize media spend and enforce compliance, but can also see a campaign’s performance in almost real time. This allows them to respond quickly, adjusting a campaign within hours instead of days.
As we migrate more and more consumer data into Google Cloud, we have noticed a change in the way we work. We are now able to try out new things more quickly: we have been able to test new approaches with our audience engine in less than a month. We build innovative tools and products in a more agile manner, and are more creative in how we use Google Cloud solutions to achieve our aims. As a result, we feel more empowered to experiment and learn at speed.
For example, we are currently building advanced predictive and measurement marketing solutions with Google Cloud, helping us to master these capabilities internally.
Unifying all our consumer data in Google Cloud has given Reckitt the foundations we wanted. While we still have some way to go towards full data democratization, this backbone will eventually empower all our departments, and particularly our regional marketing and ecommerce teams, to access, understand, and make use of relevant data whenever they need it, to make timely, informed business decisions. That means empowering local teams to make decisions based on the data specific to local markets, helping us to get more health and hygiene products into the hands of those who need them, wherever they are in the world.
Read More for the details.
Post Content
Read More for the details.
A new Places API is now available to give developers the opportunity to share new types of information from Google Maps with their end users. New features in Text Search, Place Details and Photos, and Nearby Search–which now includes EV charging–make it easier for developers to surface helpful information about the world in the products they build.
Find updated accessibility features in Place Details, Nearby Search, and Text Search
Real-time electric vehicle (EV) charging station data
As EVs grow in popularity, so does the demand for finding charging stations. To optimize travel plans, your users need information about station locations, availability, and compatibility with their car. Real-time data for EV charging stations is now available through the new Places API. This means your users can optimize their driving based on real-time charging station information, such as which stations are available and the maximum charging speed at each station.
Since charging connectors differ, your users will also be able to view the type of connector a charging station accommodates. This means businesses like automakers, real estate companies, and more can create even richer and more dynamic user experiences. Charging stations will be returned by the API wherever they’re shown in Google Maps.
Real-time data for EV charging stations is now available through the new Places API
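As a rough sketch of what such a request could look like, here is a Nearby Search call made with Python’s requests library; the API key, coordinates, and the evChargeOptions field name are assumptions to verify against the official documentation:

```python
# Sketch: Nearby Search (new Places API) for EV charging stations.
# Assumes the `requests` library; the API key, coordinates, and the
# evChargeOptions field name are placeholders/assumptions to confirm
# against the official documentation.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

response = requests.post(
    "https://places.googleapis.com/v1/places:searchNearby",
    headers={
        "Content-Type": "application/json",
        "X-Goog-Api-Key": API_KEY,
        # The field mask limits the response (and billing) to the fields you need.
        "X-Goog-FieldMask": "places.displayName,places.evChargeOptions",
    },
    json={
        "includedTypes": ["electric_vehicle_charging_station"],
        "maxResultCount": 10,
        "locationRestriction": {
            "circle": {
                "center": {"latitude": 37.422, "longitude": -122.084},
                "radius": 5000.0,
            }
        },
    },
    timeout=10,
)
for place in response.json().get("places", []):
    print(place["displayName"]["text"], place.get("evChargeOptions"))
```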
New place types
The new Places API provides even more details and place types, which means your users will be able to find detailed information about places of interest by type, such as the best sushi spots or nearby hiking areas. You can now include or exclude nearby places from place searches by place type. With the new Places API, we’ve nearly doubled the number of supported place types since the previous version; there are now nearly 200 place types, including coffee shops, playgrounds, and more.
Wheelchair-accessible places
With the new Places API, developers will also be able to provide updated accessibility information for places, including wheelchair-accessible seating, restrooms, and parking. This builds on our wheelchair-accessible entrance field, which was introduced to the Places API in 2022. These accessibility features will be available for places wherever they’re available in Google Maps. You can now find these accessibility features available in Place Details, Nearby Search, and Text Search, exclusively on the new Places API.
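For illustration, a Place Details request for those fields might look like the sketch below, assuming the accessibilityOptions field name and using an example place ID:

```python
# Sketch: Place Details (new Places API) requesting accessibility fields.
# Assumes the `requests` library; the place ID is an example and the
# accessibilityOptions field name should be confirmed in the docs.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
PLACE_ID = "ChIJN1t_tDeuEmsRUsoyG83frY4"  # example place ID

response = requests.get(
    f"https://places.googleapis.com/v1/places/{PLACE_ID}",
    headers={
        "X-Goog-Api-Key": API_KEY,
        "X-Goog-FieldMask": "displayName,accessibilityOptions",
    },
    timeout=10,
)
details = response.json()
print(details["displayName"]["text"], details.get("accessibilityOptions"))
```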
It’s easy to surface the best places nearby with the help of new ranking capabilities, which prioritize places that are popular in Nearby Search. For improved performance, search now also supports field masking and can return the place information you need without a follow-up call to Place Details.
Previously, your users had to do a text search for a keyword such as “coffee shop,” which wouldn’t always guarantee the right results. Sometimes the query might even surface something similar but not quite right, such as a coffee roaster. Now, you can filter Text Search by the more granular set of place types, such as adding “coffee shop” as a filter when your users are searching for a cafe.
With the new Places API, you will also be able to display a “primary type” for each place, which tells you which type is the most important. For example, a primary type returned for a restaurant could be “Japanese restaurant.” That restaurant might also have a bar as well, but “bar” wouldn’t be considered the “primary type.”
The new Text Search also supports improved location restrictions and filters for minimum ratings and varying price levels, which can deliver better matches for your users’ queries and support new use cases.
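Putting those filters together, here is a hedged sketch of such a Text Search request; the parameter names mirror the features described above and should be checked against the reference documentation:

```python
# Sketch: Text Search (new Places API) filtered by place type, rating, and price.
# Assumes the `requests` library; parameter names are assumptions to confirm
# against the official reference, and the query is illustrative.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

response = requests.post(
    "https://places.googleapis.com/v1/places:searchText",
    headers={
        "Content-Type": "application/json",
        "X-Goog-Api-Key": API_KEY,
        "X-Goog-FieldMask": "places.displayName,places.primaryType,places.rating",
    },
    json={
        "textQuery": "coffee near downtown Seattle",
        "includedType": "coffee_shop",  # filter by the granular place type
        "minRating": 4.0,               # only well-rated places
        "priceLevels": ["PRICE_LEVEL_INEXPENSIVE", "PRICE_LEVEL_MODERATE"],
    },
    timeout=10,
)
for place in response.json().get("places", []):
    print(place["displayName"]["text"], place.get("primaryType"), place.get("rating"))
```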
Based on developer feedback across numerous industries, we’ve made improvements to simplify pricing for the new Places API. With field masking now available for Place Details, Text Search, and Nearby Search, developers can control costs by requesting only the information their apps need. For example, if you only request fields included in the Basic data tier in your Nearby Search, only the Nearby Search (Basic) SKU will be billed. If you request reviews or other detailed place data like whether a restaurant serves dessert, the Nearby Search (Preferred) SKU will be billed.
The new Places API also offers modern security settings with OAuth-based authentication and is built on Google Cloud’s service infrastructure to give you even more peace of mind as you build with us. This provides an alternative to API key-based authentication of requests to the new Places API, which is still supported.
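As a rough sketch of that OAuth-based alternative, assuming the google-auth library and that the billing project is passed via the X-Goog-User-Project header (confirm both in the documentation), a request might be authorized like this:

```python
# Sketch: calling the new Places API with OAuth credentials instead of an API key.
# Assumes the google-auth and `requests` libraries; the scope and the
# X-Goog-User-Project header are assumptions to verify in the docs.
import google.auth
import google.auth.transport.requests
import requests

credentials, project_id = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

response = requests.post(
    "https://places.googleapis.com/v1/places:searchText",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {credentials.token}",
        "X-Goog-User-Project": project_id,  # billing project (assumption)
        "X-Goog-FieldMask": "places.displayName",
    },
    json={"textQuery": "ev charging station in Mountain View"},
    timeout=10,
)
print(response.status_code, response.json())
```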
Together with our new fields and search features, you have a new API, modernized to help you focus on building and launching great products. To learn more about the new Places API, check out our website and documentation.
Read More for the details.