April 30, 2026

How Ansible® Automation Powers the Uber Corporate Network at a Global Scale

Claudia Baz Alvarez

Network Engineer

Almaz Rakhimbekov

Staff Infrastructure Engineer

Introduction

Enterprise networks must ensure seamless uptime while handling daily operational updates. At Uber, we maintain connectivity for all our offices spread across six continents (Asia, Africa, North America, South America, Europe, and Australia/Oceania), with thousands of engineers accessing the network daily. Managing a network of this scale requires orchestrating cloud and on-premises infrastructure, continuous deployments, and multi-vendor hardware to ensure redundancy, scalability, and minimal downtime.

In 2026, infrastructure automation is no longer optional. Any sufficiently large distributed system requires automation to manage repetitive tasks such as patching, creating and deleting VMs, or configuring VLANs across switches throughout the campus. Nowadays, several IaC (Infrastructure as Code) tools are popular and stable enough to orchestrate production environments. Puppet®, Terraform®, and Ansible® are common options that provide similar functionality.

Provisioning infrastructure is different from configuration management. At Uber's network offices, we orchestrate our cloud instances with Terraform, our primary tool for cloud provisioning. For configuration management, such as installing version updates, configuring users, and managing files on an already running machine, we use Puppet. Ansible is similar to Puppet in this regard: it provides a framework to push configuration to already running instances or devices.

This blog focuses on configuration management for on-premises network hardware. It explains how and why Uber Corpnet (Corporate Network) uses Ansible to enforce a stable network configuration across its global footprint. This includes everything from setting regional DHCP server preferences to IP routes and trunk interface configurations. The explanation assumes familiarity with the Ansible inventory and playbooks, but doesn't require deep programmability knowledge.

Ansible: Development of an Automation Standard

Driven by the frustration of manually managing multiple web servers simultaneously, Michael DeHaan created a GitHub® project in 2012 called Ansible. Within three years, it had become one of GitHub's top five projects by number of contributors and came under the stewardship of Red Hat®. Later releases broke the monolithic project into Ansible Collections: Ansible spread across multiple repositories where contributors could develop their features and ship them through a package manager. Today, Ansible remains an open-source project that is stable, modular, and continually enhanced by the community.

By the time Ansible was released, Puppet and Chef were already established tools. In comparison, Ansible was easier to set up: all it required was a node running the Ansible controller with SSH access to the clients, while the others called for extra configuration. In other words, Ansible is agentless: it doesn't need a local daemon on the clients to manage them. Unlike Chef and Puppet, it also didn't require knowledge of Ruby or a DSL (Domain-Specific Language), which made it simpler for newcomers already familiar with Python and YAML.

This simplicity soon caught the eye of network and DevOps engineers. What started as configuration management for Linux servers evolved into orchestrating network gear. Over time, Ansible's modular design incorporated roles and collections for vendors such as Juniper®, Cisco®, and Arista®, certified by Red Hat and built by the community.

Overall, Ansible is a popular tool: modular, easy to set up, Python-based, and open source, backed by a strong community. It also includes network modules to manage everything from Cisco switches to Junos® routers. This combination of community and project growth, together with Ansible collections that manage network gear, made it the tool Uber chose to manage the configuration of its offices.

Scaling Up: Configuring Uber’s Global Office Network

Once a network reaches a certain size, daily failures and downtime become common across links and even devices. In large networks, daily operations and maintenance inevitably include manual device changes to mitigate live problems. Tracking both physical and configuration updates in a database is vital, not just to gather network statistics and avoid downtime, but also to restore the devices to the desired configuration and clean up temporary fixes. Ansible helps us track the device inventory, categorize devices, and return them to the desired configuration using playbooks.

Operating across five global regions, our infrastructure encompasses around 5,000 devices. We use Ansible automation to manage critical segments and standardize configuration management for everything from generic switches to external gateways and firewalls from different vendors. This scale introduces significant hardware diversity: an office in India might use a type of switch not available in Mexico, and bigger offices such as Amsterdam or San Francisco might need certain routers that a smaller one doesn't.

It's also worth mentioning data center hardware, which differs completely from office gear. While office gear is optimized for user access and local connectivity, data center infrastructure is built for higher throughput and specialized fabric topologies like leaf-spine, requiring a different approach, hardware, and management.

The Corpnet hosts are registered in the Ansible inventory and grouped into sets by region, functionality, and operating system. Each group is defined by a YAML file that also holds its configuration variables. These mirror the desired state of the network, describing the ideal configuration scenario in which no manual changes are introduced. A device registered in several Ansible groups inherits their variables, which are then translated into the operating system's syntax. This is how Ansible simplifies our network management: by decoupling configuration logic from vendor-specific syntax.

Diagram showing a network inventory database structure with categories such as regions (EMEA, APAC, AMER WEST, AMER EAST, LATAM), device types (Firewall, VPN Concentrator, Core Gateway, Datacenter Switch), network components (Access Point, WAN Gateway, Console Server, Audio Visual Switch, Wireless Controller, Access Switch, Distribution Gateway), gateways (Provider Gateway, Edge Gateway, Hub Gateway, Cross-connect Gateway), and additional attributes (Vendor/Operating System, Uber Office Location, Network Function). Device configuration files are depicted as being associated with these inventory items.

Figure 1: The Ansible inventory describes the desired Corpnet configuration state with both vendor-specific and OS-syntax-agnostic variables.
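To make the grouping concrete, here is a minimal sketch of what such an inventory can look like. The group names, hostnames, and file layout are illustrative assumptions, not Uber's actual inventory:

```yaml
# inventory/corpnet.yml -- hypothetical slice of the inventory.
# A device belongs to a regional group, a functional group, and an
# operating-system group, and inherits the variables of each of them.
all:
  children:
    emea:                     # regional group
      hosts:
        ams-access-sw-01:
        ams-dist-gw-01:
    access_switch:            # functional group
      hosts:
        ams-access-sw-01:
    junos:                    # operating-system group
      hosts:
        ams-dist-gw-01:
    ios:
      hosts:
        ams-access-sw-01:
```

Variables attached to groups like emea, access_switch, or junos (typically in group_vars/ files) flow down to every member, so a single regional or functional change reaches every device that belongs to that group.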


Any manual change performed on the network devices that isn't mirrored in the Ansible inventory is overwritten overnight. These untracked manual adjustments are treated as technical debt. By cleaning them up, we guarantee stability, provide a versioned record of the network state based on inventory variables, and keep the network operational and ready for when the offices open their doors in the morning.

Backup, Generate, and Push

We call this process Daily Nightly Enforcement. It's a sequence of Ansible playbooks that form a workflow to render the Corpnet configurations, push them to the devices, and trigger alerts in case of failure. We divide it into three steps (a simplified sketch follows the list):

  1. Backup: Before performing any changes, we run a full backup of the region's configuration. This process is managed by a playbook that accesses all the devices, saves the running configuration in a backup file, and pushes it to a private repository in GitHub. GitHub allows us to track deltas in the configuration and to keep a solid history of changes for each network host.
  2. Config generation: Another Ansible playbook gathers facts from each device and renders the desired vendor configuration based on the host data. The inventory groups and variables are injected into Jinja templates, which render a final file based on the device's operating system. Each vendor has its own Jinja template to fit the syntax for that type of device. These generated files are called golden configurations, and they are regenerated every 24 hours. Finally, they are processed by a custom Config Ansible Role that masks their sensitive data, applies formatting, and prepares them to be stored in GitHub.
  3. Nightly push: When the golden configurations are ready, the last playbook pushes them to the hosts. Any error in the backup or the golden config generation stops the enforcement for that device and alerts an external system of the failure. The same Config Role then reverses the process, preparing the golden configuration to be enforced on the devices.
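The sketch below shows what this sequence can look like. The play structure mirrors the three steps in Figure 2, but the group name and role names are illustrative assumptions rather than Uber's actual code, and the three playbooks are collapsed into one file for brevity:

```yaml
# daily_nightly_enforcement.yml -- simplified sketch; the real workflow runs
# config_backup, config_gen, and config_push as separate playbooks (Figure 2).
- name: Step 1 - backup running configurations
  hosts: corpnet                 # hypothetical top-level inventory group
  gather_facts: false
  roles:
    - config_backup              # saves the running config and pushes it to the backup repo

- name: Step 2 - generate golden configurations
  hosts: corpnet
  gather_facts: true             # device facts feed the per-vendor Jinja templates
  roles:
    - config_gen                 # injects group variables into the vendor template

- name: Step 3 - push golden configurations
  hosts: corpnet
  gather_facts: false
  roles:
    - config_push                # enforces the golden config; failures raise an alert
```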

Flowchart illustrating an Ansible Controller workflow for network configuration management. Corpnet connects to Ansible Inventory, which interacts with three playbooks: config_backup, config_gen, and config_push. Config_backup playbook outputs to a Config Role, storing Backup Config. Config_gen playbook uses Jinja Templates, then outputs to a Config Role, storing Golden Config. Config_push playbook also uses the Config Role. Manual Trigger, Scheduler, and Alert System can initiate or interact with the process.

Figure 2: The Daily Nightly Enforcement is a sequence of Ansible playbooks that gather network host information, perform backups, and enforce dynamic network configuration.


These alerts feed into our broader visibility strategy. While Ansible ensures our network reaches its desired state, we rely on a cloud-native observability platform to monitor its health in real time. You can read more about how we evolved this system in the blog From Monitoring to Observability: Our Ultra-Marathon to a Cloud-Native Platform.

Finally, as explained, we store all our configuration files in GitHub. The Ansible playbooks, the inventory, and the network configuration files reside in different repositories. This allows us to track updates to backups and golden configurations when multiple engineers are working at the same time. GitHub also lets us use workflows to check configuration standards and formatting on each commit of both generated files and our own Ansible code.
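As a rough illustration of such a per-commit check (the workflow below is a minimal sketch with assumed repository paths and off-the-shelf linters like yamllint and ansible-lint, not Uber's actual pipeline), a GitHub workflow can lint every push:

```yaml
# .github/workflows/config-checks.yml -- hypothetical per-commit checks
name: config-checks
on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install linters
        run: pip install ansible-lint yamllint
      - name: Check inventory and playbook formatting
        run: |
          yamllint inventory/ group_vars/
          ansible-lint playbooks/
```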

From DHCP to BGP, and VLANs

The Daily Nightly Enforcement includes several core configurations. The regional groups allow us to establish preference variables based on proximity to decrease latency in some operations. Examples include BGP communities, ASNs, and TACACS, RADIUS, and DHCP server lists. These are resolved on all our devices every 24 hours, or earlier when triggered manually. In addition, Ansible's flexibility also allows us to enforce specific use cases, such as a local DHCP server for redundancy in certain offices or a local path preference on all the routers of a location.
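For instance, a regional group-variables file can hold those server lists and preferences. The sketch below uses hypothetical variable names and addresses, not Uber's actual schema:

```yaml
# group_vars/apac.yml -- hypothetical regional preferences.
# Devices in the apac group inherit these values during config generation.
dhcp_servers:
  - 10.64.1.10        # nearest regional server listed first
  - 10.32.1.10        # cross-region fallback
radius_servers:
  - 10.64.2.20
tacacs_servers:
  - 10.64.3.30
bgp_asn: 65010        # example of a regional routing value
```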

We also use the Ansible inventory to standardize configurations like VLAN management by device type. This setup allows for seamless updates: a single change to a YAML file can be deployed to all Corpnet switches globally, or targeted to a specific region or office.
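A sketch of that idea, again with hypothetical names: the VLAN standard lives in the device-type group, so adding an entry here reaches every Corpnet access switch on the next nightly run.

```yaml
# group_vars/access_switch.yml -- hypothetical global VLAN standard
vlans:
  - { id: 10, name: corp-users }
  - { id: 20, name: voip }
  - { id: 30, name: guest }    # one new line here propagates to every access switch
```

Scoping the same run to one region or office is then a matter of limiting the play's targets (for example, ansible-playbook --limit emea) rather than editing per-device files.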

In addition, we leverage automation to build helper workflows that don't fall under the Daily Nightly Enforcement. A tedious manual task such as password rotation can now be fully automated with Ansible playbooks. Here the monitoring system also comes into play, raising the right alerts in case of failure and stopping the enforcement if needed. We also use automation to add new devices to the network: a configured DHCP server, combined with the config generation, ensures smooth ZTP (Zero Touch Provisioning). All these cases are covered by our Ansible automation, helping us, as network engineers, handle not only daily tasks but also other frequent, time-consuming work.

The strength of network programmability goes beyond the automation of simple tasks. It allows us to take network snapshots and centralize configurations. In the ideal scenario, the network configuration becomes vendor-agnostic. For example, an access list defined in a single YAML file can now be enforced across different vendors' equipment.
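As a sketch of that idea (the variable names, the template, and the rendered lines are simplified illustrations, not Uber's actual schema or exact vendor syntax), the access list is described once in group variables and a per-vendor Jinja template turns it into device configuration:

```yaml
# group_vars/all/acls.yml -- hypothetical vendor-agnostic access list
acls:
  - name: MGMT_ONLY
    rules:
      - { action: permit, protocol: tcp, source: 10.0.0.0/8, port: 22 }
      - { action: deny,   protocol: ip,  source: any }
```

```jinja
{# templates/ios_acl.j2 -- simplified rendering toward one vendor's style #}
{% for acl in acls %}
ip access-list extended {{ acl.name }}
{% for rule in acl.rules %}
 {{ rule.action }} {{ rule.protocol }} {{ rule.source }} any{% if rule.port is defined %} eq {{ rule.port }}{% endif %}
{% endfor %}
{% endfor %}
```

A second template with the same input data would render the equivalent policy for another vendor, which is exactly the decoupling the inventory is designed to enable.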

Managing 2,000-Line Configuration Files

The risk of automation is that any error might have an unpredictable blast radius. Deploying new features requires extra care so we don’t lose access to devices in other parts of the world. Processes such as password rotation, a change in access lists, or even a simple VLAN configuration affect hundreds or thousands of physical devices across the globe.

At the same time, dealing with multiple vendors is a hard task. While the Ansible inventory is vendor-agnostic, it requires caution not to end up with a sea of duplicated variables that serve only a single vendor syntax. A planned software architecture is vital to structuring the Ansible code so it leverages a single source of variables. As an example, the same access list can be rendered for Juniper or Cisco devices if the config generation process is prepared to adapt to multiple Jinja templates.

On top of that, dealing with stored network gear configuration files can be a pain. They can easily reach 2,000 lines, and any wrong character breaks the parser. A formatter that removes blank spaces and undesired tabs, working alongside the Jinja templates, helps us avoid empty commits and spending too much time verifying changes. Searching for the cause of a misbehavior in a repository covering so many devices, with thousands of lines of backups and generated configurations, is time-consuming. Daily jobs that remove legacy hosts and stale files from these big repositories are vital to keeping them manageable.

Finally, sometimes public Ansible collections don't fit our needs. We might need to implement a very specific procedure, or the public module is too complex or nested for what we want. Jinja templates also need extra logic to render a specific vendor syntax based on our inventory. In these cases, we develop our own Ansible and Python modules. This requires resources: engineering time, devices to test the features on, and Ansible code that must be maintained in the long run. Furthermore, not all vendors can be managed by Ansible, which means orchestrating both Ansible code and other network portals to keep their configurations synchronized.

Conclusion

Despite these challenges, Ansible automation makes our global scale possible. This network programmability is vital because it allows us to handle many different network vendors across different regions with confidence. By automating the entire workflow, from daily backups and golden config generation to nightly enforcement, we've turned a massive, diverse system into a manageable environment. While tools like Terraform handle our cloud provisioning and other portals manage specific hardware, Ansible remains the heart of our on-premises configuration. Moving away from manual work ensures that the Uber corporate network remains stable, updated, and ready for our engineers every single day. With this powerful automation toolkit as our foundation, we're excited to keep pushing the boundaries of what a modern, reliable network can do for our teams worldwide.

Acknowledgments

Cover Photo Attribution: “photo of outer space” by NASA is covered by the Unsplash License.

Ansible® is a registered trademark of Red Hat, Inc. in the United States and other countries.

Arista® is a trademark of Arista Networks, Inc. and its subsidiaries in the U.S. and/or other countries.

Chef is a trademark or registered trademark of Progress Software Corporation in the U.S. and other countries.

This material is not sponsored by, endorsed by, or affiliated with Cisco Systems, Inc. Cisco, Cisco Systems, and the Cisco Systems logo are registered trademarks or trademarks of Cisco Systems, Inc. and/or its affiliates in the United States and certain other countries.

GitHub is a trademark of GitHub, Inc.

Juniper® and Junos® are registered trademarks of Juniper Networks, Inc. in the United States and other countries. 

Puppet® is a trademark or registered trademark of Puppet, Inc.

Python® and the Python logos are trademarks or registered trademarks of the Python Software Foundation.

Red Hat® is a trademark of Red Hat, LLC, registered in the United States and other countries. Uber has no affiliation with Red Hat, LLC.

Terraform® is a trademark of HashiCorp, Inc.

Written by

Claudia Baz Alvarez

Network Engineer

Claudia Baz is a Network Engineer based in Amsterdam. As a telecommunications engineer, she specializes in network automation and software-defined systems.

Almaz Rakhimbekov

Staff Infrastructure Engineer

Almaz Rakhimbekov is a Staff Infrastructure Engineer based in Amsterdam. He specializes in network automation, scalable infrastructure, and system resilience.