Engineering, Data / ML

Containerizing the Beast – Hadoop NameNodes in Uber’s Infrastructure

26 January 2023 / Global
Figure 1: Container Deployment on HDFS NameNode Host
Figure 2: Developer Friendly CLI
Figure 3: DNS-based Discovery performed by different HDFS components
Figure 4: DNS records used for discovery among HDFS components
Figure 5: Load Testing Setup
Figure 6: Monitoring RpcQueueTimeNumOps during load tests
Figure 7: Monitoring RpcQueueTimeAvgTime during load tests
Figure 8: Migration Process
Figure 9: DataNodes being decommissioned by automation during the 2022 year-end holidays
Figure 10: RpcQueueTimeAvgTime prior to migration
Figure 11: RpcQueueTimeAvgTime post migration
Mithun (Matt) Mathew

Mithun (Matt) Mathew is a Sr. Staff Engineer on the Data team at Uber. He currently works on various projects in the security domain. Previously, he led the initiative to containerize and automate Data infrastructure at Uber.

Prabhat Jha

Prabhat Jha is a Software Engineer II on the Data (Hadoop) team at Uber. He currently works on deployment and automation of Data infrastructure at Uber, and worked on the containerization of the NameNode.

Jing Zhao

Jing Zhao is a Principal Engineer on the Data team at Uber. He is a committer and PMC member of Apache Hadoop and Apache Ratis.

Yuru Liu

Yuru Liu is a Senior Software Engineer on the Data (Hadoop) team at Uber. He currently works on the containerization of the HDFS NameNode, the HDFS client, and the observability of Data infrastructure at Uber.

Nishith Shetty

Nishith Shetty is a Software Engineer II on the Data Infrastructure team at Uber. He currently works on the containerization and automation of HDFS NameNodes.

Fengnan Li

Fengnan Li is an Engineering Manager with the Data Infrastructure team at Uber. He is an Apache Hadoop Committer.

Posted by Mithun (Matt) Mathew, Prabhat Jha, Jing Zhao, Yuru Liu, Nishith Shetty, Fengnan Li