GCP – Uber’s modern edge: A new approach to network performance and efficiency
Picture this: You’re ordering an Uber in Lisbon, but your request takes a scenic tour through Madrid, London, and Virginia before confirming your ride. That was the case for millions of users until Uber and Google Cloud set out on an even bigger journey: redesigning how global edge networks should work.
Operating across six continents, Uber connects millions of riders and drivers while handling more than 100,000 concurrent trips and more than a million HTTP requests per second.
At this scale, every millisecond matters. When Uber found that its existing edge architecture was producing sub-optimal routing paths, the company partnered with Google Cloud to redesign its global network approach. The results: substantial latency improvements and millions of dollars in cost savings.
The challenge: Sub-optimal routing and an architecture with high operational overhead
Uber’s previous Google Cloud edge used open-source Envoy proxy instances running on virtual machines across 16 regions. While this architecture was designed to reduce latency by bringing services closer to users, it often created sub-optimal routing paths, with traffic taking multiple unnecessary hops through different regions before reaching Uber’s data centers. The additional network transit increased latency and degraded the user experience that Uber’s customers expect.
Legacy Uber Edge GCP Traffic Flow
This setup presented several challenges:
- Operational complexity: Managing and orchestrating a large fleet of virtual machines (VMs) was cumbersome and deviated from Uber’s internal standards.
- Diminishing returns on latency: Contrary to initial assumptions, running Envoy in numerous global regions did not consistently improve latency for all users. In fact, for some, it introduced unnecessary network hops.
- High operational costs: Maintaining a large, globally distributed infrastructure incurred significant costs.
The solution: Direct routing with Hybrid NEGs
The goal was straightforward: create the most direct path from the user to Uber’s backend services across on-premises and various cloud environments. The approach involved moving away from the distributed Envoy VMs and using Google Cloud’s Hybrid Network Endpoint Groups (NEGs) instead.
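In practice, a hybrid connectivity NEG is a zonal resource populated with the private IP:port endpoints of services running outside Google Cloud. A minimal gcloud sketch of that setup follows; the NEG name, zone, VPC network, and endpoint address are hypothetical placeholders, not Uber’s actual configuration:

```
# Create a hybrid connectivity NEG (endpoint type NON_GCP_PRIVATE_IP_PORT);
# its endpoints live outside Google Cloud and are reached over hybrid connectivity.
gcloud compute network-endpoint-groups create onprem-neg \
    --network-endpoint-type=NON_GCP_PRIVATE_IP_PORT \
    --zone=us-east4-a \
    --network=edge-vpc

# Register an on-premises endpoint (hypothetical IP and port) with the NEG.
gcloud compute network-endpoint-groups update onprem-neg \
    --zone=us-east4-a \
    --add-endpoint="ip=10.0.0.10,port=443"
```

Because the NEG holds raw IP:port endpoints rather than VMs, there is no proxy fleet to manage; the load balancer routes directly to the on-premises targets.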
Simplified/Modern Uber Edge GCP Traffic Flow
This new architecture, developed through a 10-month collaboration between Uber and Google engineers, directs traffic from Google’s Global External Application Load Balancer — fronted by Google Cloud Armor for DDoS protection and Cloud CDN for caching — directly to Uber’s on-premises infrastructure via Cloud Interconnect.
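For illustration, here is a hedged sketch of how such a NEG might be wired into a Global External Application Load Balancer with gcloud, continuing the hypothetical names from above (the health check onprem-https-hc and Cloud Armor policy edge-ddos-policy are assumed to already exist):

```
# Backend service for the Global External Application Load Balancer.
gcloud compute backend-services create edge-backend \
    --global \
    --load-balancing-scheme=EXTERNAL_MANAGED \
    --protocol=HTTPS \
    --health-checks=onprem-https-hc

# Attach the hybrid NEG as a backend; hybrid backends use RATE balancing mode.
gcloud compute backend-services add-backend edge-backend \
    --global \
    --network-endpoint-group=onprem-neg \
    --network-endpoint-group-zone=us-east4-a \
    --balancing-mode=RATE \
    --max-rate-per-endpoint=1000

# Add Cloud Armor DDoS protection and enable Cloud CDN on the same backend.
gcloud compute backend-services update edge-backend \
    --global \
    --security-policy=edge-ddos-policy \
    --enable-cdn
```

One operational note: health-check probes must be able to reach the on-premises endpoints, so routes and firewall rules over the Cloud Interconnect attachments need to admit Google’s health-check ranges.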
The results of migrating to Hybrid NEG-based load balancers were immediate. Removing all edge VMs made the traffic path significantly more efficient, allowing Google’s global network to handle the long-haul transit over optimized channels. This shift delivered a 2.6% latency improvement at the 50th percentile and 10% at the 99th percentile, directly improving service responsiveness.
The results: Impactful improvements
The migration delivered substantial improvements across three key areas. After validating the design and shifting 99% of edge traffic, the project achieved:
- Significant cost reduction: Removing the entire fleet of edge Envoy VMs eliminated the associated infrastructure spend, contributing millions of dollars in savings.
- Improved performance and user experience: The streamlined traffic flow improved latency for Uber’s mobile app users by 2.6% at p50 and 10% at p99.
- Simplified operations: Decommissioning the edge VMs reduced operational overhead and improved reliability through more standardized tooling.
“At Uber, every millisecond defines the user experience for millions of people. By re-architecting our global edge with Google Cloud and Hybrid NEGs, we’ve created a more direct, lower-latency path for our services. This not only enhances today’s user experience but also provides the high-performance foundation necessary for our next generation of AI applications, all while significantly reducing operational overhead for our engineering teams.” – Harry Liu, Director of Networking, Uber.
Key takeaways for enterprise teams
Uber’s edge architecture transformation demonstrates what focused technical collaboration can achieve. By replacing a distributed Envoy VM fleet with a streamlined architecture using Google’s global network and Hybrid NEGs, Uber achieved significant improvements in performance, cost, and reliability.
This migration succeeded in under a year through close collaboration between Uber and Google engineers. Key success factors included:
- Architectural validation: Google’s insights into its load balancer architecture helped validate that fewer proxy locations would improve performance and reduce operational overhead.
- Performance modeling: Google engineers modeled production-scale results from Uber’s initial tests, saving benchmarking time and providing the confidence to proceed.
- Simplified design: Hybrid NEGs eliminated the need for Envoy proxy VMs at Google’s edge.