GCP – Formula E chooses generative AI to inform drivers and engage fans
In today’s fast-paced business landscape, data-driven decision-making has become paramount for organizations aiming to stay ahead of the competition. With the emergence of generative AI (gen AI) technology, it’s now possible to provide a business user with a conversational interface to data that enables new engagement styles. With this approach, enterprises can build on traditional BI capabilities with technology that can inherently understand the data model below it, and respond generally to requests for insights held within it. This simplifies data exploration and empowers users to access real-time insights with ease. A conversational approach also allows for a more general exploration of all the data compared to dashboards, which can only provide insights on certain pre-selected areas.
Google Cloud, Formula E, and McKinsey QuantumBlack worked together to create such an experience. By combining race data and car telemetry data with gen AI from Google Cloud, Formula E is able to provide a conversational interface for their drivers to answer specific questions ranging across a very large possible question-space. For example, questions such as “What was the exit speed from turn 1?” can be directly asked and answered through a simple text-based interface, rather than requiring manual analysis of the data corpus to identify important features, followed by manually creating and maintaining a BI dashboard to deliver this information.
Modern race cars create huge amounts of telemetry data from their many sensors. Understanding exactly how a vehicle behaves at certain points on a race track is an important early step for optimally tuning the vehicle and achieving faster lap times. The closing week of the Formula E 2023 championship included a successful attempt at setting the indoor land speed Guinness World Record with the next-generation GENBETA race car, as well as the final two championship races of the season.
During this week of events, Formula E wanted to provide information for two very different personas:
Drivers who wanted to understand how their vehicles had behaved in the land speed record attempt to help them set a faster time
Fans who wanted to learn more about Formula E, current and previous championship results and race data, and the next-generation GENBETA race car
Powered by text- and chat-optimized PaLM large language models (LLMs) and other AI services in Google Cloud’s Vertex AI portfolio, we built a gen AI chat interface that catered to both of these use cases and personas and that was available to both drivers and fans during the events.
Design overview
At a high level, the system consists of a custom backend service that acts as an agent for handling the questions, and a frontend UI built in collaboration with McKinsey QuantumBlack. These two services are containerized and deployed as Kubernetes Deployments in Google Kubernetes Engine via a CI/CD pipeline built with Terraform, Cloud Build and Anthos Configuration Management (Config Sync). The frontend is secured with authorization via Identity-Aware Proxy and also uses a CDN built using Cloud CDN and Cloud Storage.
The backend service is where the gen AI logic resides. It uses Google’s PaLM 2 API for chat available in Vertex AI to provide a general-purpose conversational framework as a base-layer. Above this, additional context is provided to enable more tailored and context-rich responses:
Structured data such as car telemetry data and historical championship results
Unstructured text corpus including general information on Formula E and information to differentiate Formula E from other motorsports
Adding context from structured data using a combination of BigQuery, Vertex AI and LangChain
Formula E has a lot of structured data that is very relevant to both drivers and fans, so the team wanted this additional data to be part of responses. This includes data from previous and current championships such as race results and qualifying times, as well as very granular telemetry data coming from sensors all over the race cars. Batch uploading and streaming this data to BigQuery was the first step in allowing the backend service to access this information and augment its responses with insights found in the data.
The next step was for the backend to dynamically query these datasets, which sit outside the foundational model’s training set, to retrieve the additional information it needs to answer the user’s questions more precisely. LangChain is an open-source framework for language models that includes a way to dynamically generate SQL from an input question, and it only requires coaching through prompt tuning rather than any fine-tuning on the data itself. By using LangChain with Google’s PaLM 2 API for text available in Vertex AI, the backend service can first identify the sorts of questions that would require additional data from the structured datasets. A SQL query is then dynamically generated and run against the relevant BigQuery tables, and the data needed to answer the question is returned to the frontend.
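The prompt-driven SQL generation described above can be illustrated with a minimal sketch. It does not use LangChain or the PaLM API: `SCHEMA`, `PROMPT_TEMPLATE`, `generate_sql` and `fake_llm` are all hypothetical names chosen for illustration, and the table layout is an assumption, since the real BigQuery schema is not published here. The point is only that the model is coached with a schema and a question in the prompt, and that the generated text is checked before it is run.

```python
from typing import Callable

# Hypothetical schema description; the real BigQuery tables are not given in the article.
SCHEMA = """\
Table race_results(season INTEGER, city TEXT, position INTEGER, driver TEXT)
"""

PROMPT_TEMPLATE = """\
You are a SQL assistant. Using only the tables below, write one SQL SELECT
statement that answers the question. Return only the SQL.

{schema}
Question: {question}
SQL:"""


def build_sql_prompt(question: str, schema: str = SCHEMA) -> str:
    """Fill the prompt template with the schema and the user's question."""
    return PROMPT_TEMPLATE.format(schema=schema, question=question.strip())


def generate_sql(question: str, llm: Callable[[str], str]) -> str:
    """Ask the model for SQL and reject anything that is not a single SELECT."""
    sql = llm(build_sql_prompt(question)).strip().rstrip(";").strip()
    if not sql.lower().startswith("select") or ";" in sql:
        raise ValueError(f"Refusing to run non-SELECT or multi-statement SQL: {sql!r}")
    return sql


def fake_llm(prompt: str) -> str:
    """Stand-in for a text model call (assumption); always returns a canned query."""
    return "SELECT driver FROM race_results WHERE season = 2018 AND city = 'Rome' AND position = 1;"


print(generate_sql("Who won the race in Rome in the 2018 season?", fake_llm))
```

In a real deployment `fake_llm` would be replaced by a call to a hosted text model, and the read-only check would likely be complemented by database permissions that only allow queries.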
For example, a question such as “Who won the race in Rome in the 2018 season?” would be identified as a question sitting outside the foundational model’s training set and would be better answered by querying the structured data that exists for race results. A query similar to the below example would be dynamically constructed and run against the relevant BigQuery table, returning the requested name of the driver, which can then be sent back to the frontend:
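The original query is not reproduced in this text, so what follows is only a hedged illustration of the kind of query that might be generated for this question. The table name `race_results`, its columns, and the rows are all hypothetical placeholders (not real results), and an in-memory SQLite database stands in for BigQuery so that the sketch is runnable on its own.

```python
import sqlite3

# Hypothetical table and column names; placeholder rows, not real race results.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE race_results (season INTEGER, city TEXT, position INTEGER, driver TEXT)"
)
conn.executemany(
    "INSERT INTO race_results VALUES (?, ?, ?, ?)",
    [
        (2018, "Rome", 1, "Driver A"),
        (2018, "Rome", 2, "Driver B"),
        (2018, "Paris", 1, "Driver C"),
    ],
)

# The kind of query a text-to-SQL step might produce for
# "Who won the race in Rome in the 2018 season?"
QUERY = """
SELECT driver
FROM race_results
WHERE season = 2018
  AND city = 'Rome'
  AND position = 1
"""

winner = conn.execute(QUERY).fetchone()[0]
print(winner)  # prints the placeholder "Driver A"
```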
Adding context from unstructured data with Vertex AI Search
Vertex AI Search is a gen AI offering from Google Cloud that helps enable a Google Search-style experience across a specified corpus of structured, unstructured or website-based data sources.
Formula E added a wide range of unstructured information to Vertex AI Search, covering what Formula E is, its commitment to sustainability, information on previous, current and future generation race cars, and many other topics. When a question is received by the backend service, it also determines whether it should first pass this question to Vertex AI Search, asking it to return the most relevant information from the unstructured data corpus.
The context-rich information retrieved from Vertex AI Search is then sent to the foundational model together with the original question. This provides the general-purpose PaLM 2 model with additional Formula E context to build a more specific and more relevant response before returning the answer to the frontend.
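The retrieve-then-answer flow for unstructured data can be sketched as follows. This is an assumption-laden illustration rather than the team's implementation: `build_grounded_prompt`, `answer`, `fake_search` and `echo_llm` are hypothetical names, the search and model calls are stubbed out, and the prompt wording is invented for the example.

```python
from typing import Callable, List


def build_grounded_prompt(question: str, snippets: List[str], max_snippets: int = 3) -> str:
    """Prepend the top retrieved snippets to the question as numbered context."""
    context = "\n".join(
        f"[{i}] {s.strip()}" for i, s in enumerate(snippets[:max_snippets], start=1)
    )
    return (
        "Answer the question using the context below where it is relevant.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}"
    )


def answer(question: str, search: Callable[[str], List[str]], llm: Callable[[str], str]) -> str:
    """Retrieve context, then pass context plus the original question to the model."""
    snippets = search(question)
    # With no retrieved context, fall back to the bare question.
    prompt = build_grounded_prompt(question, snippets) if snippets else question
    return llm(prompt)


# Hypothetical stand-ins for the search service and the chat model.
def fake_search(question: str) -> List[str]:
    return ["Formula E is an all-electric motorsport championship."]


def echo_llm(prompt: str) -> str:
    return prompt  # echoes the prompt so the assembled model input is visible


print(answer("What is Formula E?", fake_search, echo_llm))
```

Keeping the search and model calls behind plain callables makes the prompt assembly easy to test in isolation before wiring in the real services.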
The following diagram represents the complete backend behavior visually:
Results and discussion
The system was built by a small team in just under two weeks. Over the final race weekend of the 2023 Championship season, over 700 questions were answered, covering a wide range of topics on Formula E and beyond. The system catered to newcomers to the sport asking entry-level questions, to veterans looking for specific and niche information such as decisions on build changes between race car generations, and to drivers themselves wanting to understand how the race cars behaved when in operation.
Typically a user began with simpler questions and asked more complex questions over time. This required answers drawn from the whole range of available information: across the model’s foundational training set, as well as the private corpus of structured and unstructured data provided by Formula E.
“A generative AI [natural language processing] interface will revolutionize the way we consume and interpret data”, says Eric Ernst, CTO at Formula E. “A general-purpose NLP interface that can take any question, examine that question with all available data from root sources and bring back the most relevant result, will give drivers access to insights in a flexible way that they did not have access to before, and provide fans with an engagement portal into the world of Formula E, regardless of their familiarity with the sport.”
LLMs and gen AI technology are unlocking new ways to access information. The system described here provides a text-based output, but as multi-modal models are brought into mainstream use, these responses will have the potential to become much richer, using additional types of dynamically generated media such as images and graphs and possibly even video.