Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery

July 1, 2017 / Global

Abstract
Optical Character Recognition (OCR) approaches have advanced considerably in recent years thanks to the resurgence of deep learning. State-of-the-art models, however, are mainly trained on datasets of constrained scenes, and detecting and recognizing text in real-world images remains a technical challenge. In this paper, we introduce Uber-Text, a large-scale OCR dataset that contains (1) streetside images with their text region polygons and the corresponding transcriptions, (2) 9 text categories, such as business name, street name, and street number, (3) over 110k images, and (4) 4.84 text instances per image on average. We demonstrate the difficulty of the task and the dataset by evaluating prevalent methods; the results underline the significance of the dataset and motivate future work in this field.
Authors
Ying Zhang, Lionel Gueguen, Ilya Zharkov, Peter Zhang, Keith Seifert, Ben Kadlec
Conference
CVPR 2017
Full Paper
‘Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery’ (PDF)
Uber ATG