Engineering, Data / ML

Optimizing HDFS with DataNode Local Cache for High-Density HDD Adoption

May 24, 2023 / Global
Figure 1
| Host | Total reads | Total writes | Blocks stored on the host | Blocks being read | Average block size | Capacity usage | Read traffic on top 10k blocks (1-hour window) |
|---|---|---|---|---|---|---|---|
| Host1 | 13.5M | 3.3K | 1,074,622 | 84,769 | 380 MB | 77% | 89% |
| Host2 | 12.8M | 4.7K | 633,923 | 59,376 | 330 MB | 79.89% | 94% |
| Host3 | 11.3M | 275K | 479,544 | 49,317 | 300 MB | 79.86% | 91% |
| Host4 | 8.5M | 4.6K | 247,206 | 31,048 | 300 MB | 82.85% | 99% |
| Host5 | 14.3M | 45K | 463,819 | 79,958 | 160 MB | 81.62% | 99% |
Figure 2
-rw-r--r--  1 275M Nov 17 09:49 blk_1234567890
-rw-r--r--  1 2.2M Nov 17 09:49 blk_1234567890_1122334455.meta
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Chen Liang

Chen Liang is a Senior Software Engineer on Uber's Interactive Analytics team, working on Presto and Alluxio integration. Before joining Uber, Chen was a Staff Software Engineer on LinkedIn's Big Data Platform team. Chen is also a committer and PMC member of Apache Hadoop. Chen holds master's degrees from Duke University and Brown University.

Jing Zhao

Jing Zhao is a Principal Engineer on the Data team at Uber. He is a committer and PMC member of Apache Hadoop and Apache Ratis.

Yangjun Zhang

Yangjun is a Staff Software Engineer with the Data Storage team at Uber. He has been working on improving the reliability, efficiency, and modernization of the HDFS data plane.

Junyan Guo

Junyan is a Senior Software Engineer with the Data Security team at Uber.

Fengnan Li

Fengnan Li is an Engineering Manager with the Data Infrastructure team at Uber. He is an Apache Hadoop committer.

Posted by Chen Liang, Jing Zhao, Yangjun Zhang, Junyan Guo, Fengnan Li