GCP – How Virgin Media O2 simplified internal data sharing with BigQuery’s Analytics Hub
The ability to share data easily has become critical to driving informed decisions across any business. Still, many organizations struggle with the complexities of sharing data in a way that’s effective and compliant. Data teams often navigate challenges like unclear governance, version control issues, data silos, access restrictions, and a lack of data management skills across the broader business.
Virgin Media O2 is a media and telecommunications company that relies on data sharing across the organization to optimize operations, drive strategy, and empower decision makers. From finance to marketing, all departments are supported by the data team for accurate, timely information.
As director of engineering, I oversaw data platform teams that kept running into hurdles when sharing and documenting data. Communication overhead was high: 20 to 30 hours a week went to training, support, pipeline deployment issues, and data-related tickets, with even more time and resources spent in meetings to resolve access issues. Internal teams were also sharing and editing different versions of non-live data, which led to data duplication, latency, and slow time-to-value.
These obstacles and inefficient processes often slowed the flow of information and hindered decision-making across the business. Virgin Media O2 needed a solution that would make it easier to provide data access, as well as governance, across business divisions. Without it, we’d be stuck with no centralized data-sharing process, and unable to obtain org-level visibility and efficiency.
Navigating internal data-sharing challenges
Because teams were already working on projects based in Google Cloud, strong version control was essential to keep the data accurate, consistent, and up to date, but maintaining it often further delayed the time it took to produce insights. Virgin Media O2 already held its enterprise data in BigQuery to support its enterprise and AI needs, so Analytics Hub, BigQuery’s data exchange capability, was one potential solution that would build on the existing infrastructure.
The data platform team decided to pilot Analytics Hub after learning about its scalability, self-service capabilities, and simple governance model for data tags and quality; this last feature in particular aligned with improving the implementation of privacy by design.
After a successful pilot, Virgin Media O2 established a clear onboarding and training process, appointing two dedicated owners for each new data exchange and two owners on the subscriber side to make it easier to track any actions in BigQuery. Over the next nine months, this process was scaled to 25 teams and more than 50 exchanges, 100 listings, and 500 tables, with around 300 daily users.
One of the biggest benefits the team found was that Analytics Hub doesn’t duplicate data, which avoids the network and storage costs associated with copying data. Instead, it creates a real-time pointer to the shared source dataset, known as a linked dataset. So when someone edits the original data source, the change is immediately available to all subscribers and can easily be audited, tracked, and restored. This approach also means a disaster recovery safety net is built in.
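To make the difference between a pointer and a copy concrete, here is a minimal conceptual sketch in plain Python. It does not use the Analytics Hub API; `SourceDataset`, `LinkedDataset`, and `copy_snapshot` are hypothetical names invented for illustration, and the in-memory dictionaries merely stand in for BigQuery tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceDataset:
    """Publisher-owned dataset; the only place rows are stored (toy model)."""
    name: str
    tables: Dict[str, List[dict]] = field(default_factory=dict)


@dataclass(frozen=True)
class LinkedDataset:
    """Hypothetical stand-in for a subscriber's linked dataset.

    It holds a reference to the source, not a copy of its rows.
    """
    source: SourceDataset

    def read(self, table: str) -> List[dict]:
        # Reads go through to the source, so they always see current rows.
        return list(self.source.tables[table])


def copy_snapshot(source: SourceDataset) -> Dict[str, List[dict]]:
    """The copy-based approach: a point-in-time duplicate that can go stale."""
    return {name: [dict(row) for row in rows]
            for name, rows in source.tables.items()}


source = SourceDataset("billing", {"invoices": [{"id": 1, "amount": 40}]})
linked = LinkedDataset(source)
snapshot = copy_snapshot(source)

# The publisher appends a row after both consumers were set up.
source.tables["invoices"].append({"id": 2, "amount": 55})

linked_rows = linked.read("invoices")
snapshot_rows = snapshot["invoices"]
print(len(linked_rows), len(snapshot_rows))
```

In this toy model the linked reader sees both rows while the snapshot still holds one, which is the staleness and duplication problem that copy-based sharing creates.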
Analytics Hub also removes much of the complexity involved in creating views. In particular, accessing data through authorized views often results in lost metadata from the original table. With direct access to the source dataset, however, all table descriptions and columns are retained for the subscriber to see.
Because Virgin Media O2 now links data directly from data publisher to data subscriber, the team has been able to save time, reduce latency, and improve management and usability for both publishers and subscribers, while the platform’s enhanced governance provides a centralized location for managing data access and quality.
Analytics Hub streamlines the process of sharing data between teams and business departments, reducing manual effort and errors. The platform has been particularly valuable to analytics engineers, data analysts, data engineers, data scientists, and software engineers within Virgin Media O2 — ensuring that everyone has real-time access to the data they need for their various roles.
Saving time with data that’s more accessible than ever
After rolling out Analytics Hub to around 25 squads, the team saved up to 30 hours a week that had previously gone to training, support, pipeline deployment issues, and communication overhead for squads using the old solution. Now, the time spent each week across all teams is as low as 30 minutes, because there are almost zero issues. The team estimates this as an effort saving of around 95%. Data is no longer kept in silos, making it widely available to the different departments that need it.
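As a quick sanity check on that estimate, the arithmetic can be sketched as follows. The inputs are the article's own figures (a baseline of 20 to 30 hours a week, and a best case of 30 minutes now); the function name `effort_saving` is just an illustrative helper.

```python
def effort_saving(before_hours: float, after_hours: float) -> float:
    """Return the fraction of weekly effort saved."""
    return (before_hours - after_hours) / before_hours


after = 0.5  # 30 minutes, the best case quoted in the article
for before in (20, 30):  # the quoted weekly baseline range
    saving = effort_saving(before, after)
    print(f"{before} h -> {after} h: {saving:.1%} saved")
```

With the best-case 30 minutes, the saving works out to roughly 97.5% to 98.3%. Since "as low as 30 minutes" describes the best weeks, a typical week would land somewhat lower, which is compatible with the team's rounder estimate of around 95%.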
By building a dashboard, the team democratized access to data for subscribers and their broader teams, who no longer need to access Analytics Hub itself. Letting users subscribe to datasets enables self-service, and eliminating the middleman streamlines and speeds up this repetitive process, all while maintaining a robust governance model.
Key benefits realized from secure data sharing
1. Data Integrity and Security
Secure, zero-copy sharing ensures consistent data integrity across departments, preventing metadata loss and unauthorized access.
2. Cost Efficiency and Simplified Management
By eliminating data movement, the platform reduces long-term costs and operational overhead, while a small team can manage data oversight effectively.
3. Centralized Monitoring and Governance
A unified dashboard provides real-time control over data sharing activities, allowing quick identification of issues and enforcing strict access and permission policies.
Looking ahead, the team is focusing on streamlining four key areas: data ownership, data catalog, data quality metrics, and more effective tagging of sensitive data. The goal is to automate the entire data operation so that, once data is certified, these four areas are clearly defined and enforced as policy through Analytics Hub.
The process above, known as “data certification,” offers two key benefits:
Fast identification of uncertified data assets and the tracing of data quality issues within minutes by leveraging data quality metrics (at the column level) and data lineage.
Real-time identification of sensitive data and the monitoring of where it’s being consumed, enabling proactive management of data privacy risks through live audit logs.
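One way to picture how uncertified assets might be flagged is the sketch below. It is an illustration under stated assumptions, not the team's implementation: the `DataAsset` record, its field names, and the rule that an asset counts as certified only when all four areas are defined are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataAsset:
    """Hypothetical record describing one shared table."""
    name: str
    owner: Optional[str] = None            # data ownership
    catalog_entry: Optional[str] = None    # data catalog description
    # Column name -> quality score in [0, 1] (assumed metric shape).
    column_quality: Dict[str, float] = field(default_factory=dict)
    # None means "not yet reviewed"; an empty list means "reviewed, none found".
    sensitive_columns: Optional[List[str]] = None


def certification_gaps(asset: DataAsset) -> List[str]:
    """List which of the four areas are still undefined for an asset."""
    gaps = []
    if not asset.owner:
        gaps.append("data ownership")
    if not asset.catalog_entry:
        gaps.append("data catalog")
    if not asset.column_quality:
        gaps.append("data quality metrics")
    if asset.sensitive_columns is None:
        gaps.append("sensitive data tagging")
    return gaps


def uncertified(assets: List[DataAsset]) -> Dict[str, List[str]]:
    """Map each uncertified asset name to the areas it is missing."""
    report = {}
    for asset in assets:
        gaps = certification_gaps(asset)
        if gaps:
            report[asset.name] = gaps
    return report


assets = [
    DataAsset("orders", owner="team-a", catalog_entry="Customer orders",
              column_quality={"order_id": 1.0}, sensitive_columns=[]),
    DataAsset("contacts", owner="team-b"),
]
report = uncertified(assets)
print(report)
```

Here `orders` passes all four checks, while `contacts` is reported as missing a catalog entry, quality metrics, and a sensitivity review. A real deployment would draw these fields from catalog metadata and audit logs rather than hand-built records.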
How you can get started
For customers new to Google Cloud, BigQuery is where to start your journey. For those already using BigQuery, Analytics Hub is the next step.