Justification for Identity Verification
Latin America is a rich cultural region, known for its world-renowned gastronomy, its abundant biodiversity, and its welcoming population. However, socio-economic inequality has been a challenge for the region, and is generally considered a major contributing factor to high levels of violence.
The platform is not immune to the environment in which it operates. Therefore, to stay one step ahead of opportunistic users, Uber created the Rider Identity Team with 3 main goals in mind: reduce rider anonymity by verifying riders, reduce the rate of conflicts caused by riders, and improve drivers' safety sentiment regarding new riders.
Uber Rider Identity Product Evolution
Rider verification on our platform has been live since 2016. By mid-2017, we had implemented social verification through Facebook for all new cash users in Spanish-speaking LatAm, and CPF (ID) number verification in Brazil. Our portfolio now includes rider selfie verification as well. But drivers have been asking for more stringent rider screening; in a 2018 survey conducted by Uber, we found that a government-backed verification method was considered the most reliable and effective by drivers.
Thus, ID verification for anonymous users on our platform was born. An ID creates a unique identifier that helps us hold users accountable for their actions on the platform, and is a form of identification that can be provided to law enforcement, subject to proper legal process.
Real-Time Document Check Criteria
From the outset, we knew that the Real-Time Document Check product needed to meet 4 non-negotiable criteria:
- Data privacy: Adherence to best practices for handling personal data, taking into account local laws, regulations, and norms in all countries where the product is available. This involves implementing the recommended privacy mitigation steps related to retention, purpose limitation, access controls, deletion, transparency, and documentation.
- Fraud detection: High confidence in the validity of an ID is the key to accountability – it enables us to block repeat offenders from returning to our platform. At the same time, providing a real, valid ID deters bad actions and signals good faith from our riders.
- Real-time verification: First-time users may be standing at the corner of the street when they request a ride and are asked to photograph their ID. As such, the latency of the end-to-end document check is critical to rider safety.
- Global coverage: Uber operates in more than 60 countries, and non-digital payments can be used everywhere. To add complexity, every country has more than one version of the national ID circulating at a given time, to which we must also add support for driver’s licenses and passports.
With the above 4 criteria in mind, we designed and built a technical solution that can be deployed around the globe.
In order to support a variety of documents in multiple countries by the end of 2020, we evaluated the industry's leading document verification vendors and conducted multiple rounds of testing and performance benchmarking before choosing our vendor.
In the first row of the above diagram, the Document Image Collection module is the interface that interacts with users in different countries. The UI walks users step by step through capturing a photo of the document before the app uploads it.
In the Document Image Processing module, a series of operations (including document classification, transcription, and fraud detection) is applied to the uploaded document images via different technologies (e.g., a 3rd-party vendor, Uber's in-house technology, and human review).
In the Result Evaluation and Output modules, a list of configurable verification checks is conducted based on the information obtained from the Document Image Processing module, and the final verification results are stored in Uber's system and pushed to end users via Uber's server-side push system.
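A configurable check pipeline like the one described above can be sketched as follows. This is a minimal illustration, not Uber's actual implementation: the check names, field names, and rules (e.g., the CPF length rule) are hypothetical examples of per-region configuration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A check reads the fields produced by the Document Image Processing module
# and returns pass/fail.
Check = Callable[[Dict[str, str]], bool]

@dataclass
class CheckResult:
    name: str
    passed: bool

def run_checks(extracted: Dict[str, str], checks: Dict[str, Check]) -> List[CheckResult]:
    """Run every configured check and collect per-check results."""
    return [CheckResult(name, check(extracted)) for name, check in checks.items()]

# Example per-region configuration (names and rules are illustrative only).
brazil_checks: Dict[str, Check] = {
    "has_name": lambda d: bool(d.get("name")),
    "cpf_length": lambda d: len(d.get("cpf", "").replace(".", "").replace("-", "")) == 11,
}

results = run_checks({"name": "Ana Silva", "cpf": "123.456.789-09"}, brazil_checks)
# The final verdict passes only if every individual check passes.
verified = all(r.passed for r in results)
```

Keeping checks as configuration rather than code is what makes the same pipeline deployable per region with different rules.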
The Verification Results database stores all data related to the verification performed on the document: the input data, the output of each check, and the final identity verification result. PII, when it is permitted to be stored, is always saved encrypted. Access to this database is restricted to a permission group that is used only when an investigation needs to be conducted. Finally, data retention is region-based, so applicable verification data can be deleted from storage through a dedicated routine.
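A region-based retention routine can be sketched roughly as below. The retention windows shown are hypothetical placeholders, not Uber's actual policy, and the record shape is invented for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-region retention windows (illustrative values only).
RETENTION_DAYS = {"BR": 365, "MX": 180}

def is_expired(created_at: datetime, region: str, now: datetime) -> bool:
    """A record is eligible for deletion once it is older than its
    region's configured retention window."""
    return now - created_at > timedelta(days=RETENTION_DAYS[region])

def purge(records, now):
    """records: [(record_id, region, created_at)] -> ids to delete.
    A periodic job would run this and delete the returned records."""
    return [rid for rid, region, created_at in records
            if is_expired(created_at, region, now)]
```

The point of driving deletion from per-region configuration is that a new country's legal requirements can be met by adding a config entry rather than new code.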
From Prototype to Global Scale
The pilot phase was crucial to pressure-test the technology, and it involved rapid prototyping and iteration to address the following issues:
- Low-quality images
- Challenging user experience
- ID variations
The sections below dive deeper into each of the above issues, and outline our solutions.
Image analysis from Chile, Mexico, and Argentina showed us that almost half of the verification failures were caused by poor photo capture: blur, glare, overexposure, incomplete images, etc. To address low-quality images, we developed a mobile, client-side machine-learning model that checks image quality prior to photo capture.
The UI was designed to give users real-time feedback so that they could address these issues, and once an acceptable frame is observed, we auto-capture the photo.
The client-side ML model flags images with the following issues:
- The image does not contain an ID
- The ID is truncated
- The ID is blurry
- The ID has glare
We use a single deep-learning model, with a shared feature extractor, to address the above issues with multi-task learning. Finally, we quantized the model to run it more effectively on mobile devices.
Our tests showed that our model was accurate in the vast majority of cases, and results in production showed a decrease in low-quality images.
- On Android, to capture the image frames on which we perform image analysis, we use the ImageAnalysis use case provided by CameraX
- On iOS, we use the pixel buffers delivered by the AVFoundation camera
Perform Inference Using TFLite Model
We perform inferences using the TFLite model on image frames delivered by the camera, and provide the output to document scanning.
If the model is not available, or we have a problem loading it or configuring TensorFlow, we switch the document scan mode to manual.
Analysis of Model’s Output
- Auto Capture
For each frame, the model provides scores for glare, blur, ID presence, and ID location. If we get a frame whose scores are within the acceptable ranges, we use it as the document image to be sent to our servers. No user interaction is required in this capture mode.
- Manual Capture
If we are unable to auto-capture the document image, due to a hardware failure or an auto-capture timeout, we present the user with the option to take the picture manually (via a normal camera interface).
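The per-frame decision behind Auto Capture and its manual fallback can be sketched as below. The thresholds are illustrative placeholders, not the production values, and the score dictionary shape is an assumption for the example.

```python
# Illustrative thresholds (not the production values).
MAX_GLARE = 0.2
MAX_BLUR = 0.3
MIN_ID_CONFIDENCE = 0.9

def frame_acceptable(scores):
    """A frame qualifies when an ID is confidently present and both
    glare and blur are low enough."""
    return (
        scores["id"] >= MIN_ID_CONFIDENCE
        and scores["glare"] <= MAX_GLARE
        and scores["blur"] <= MAX_BLUR
    )

def capture(frames, max_frames=100):
    """Return ("auto", frame) for the first acceptable frame, or
    ("manual", None) once max_frames frames pass without success —
    the timeout that triggers the manual camera interface."""
    for i, (frame, scores) in enumerate(frames):
        if i >= max_frames:
            break
        if frame_acceptable(scores):
            return ("auto", frame)
    return ("manual", None)
```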
Through analysis of mobile data on user behavior, in addition to surveys, we recognized that we needed to refine the rider experience to reduce the number of steps and make the process more intuitive.
We began by showing specific failure reasons. Data inspection revealed that users who failed multiple times did so for the same reason (e.g., an unsupported ID). To reduce confusion and present localized guidance, the backend provides customized feedback, which results in a personalized error message (e.g., telling the user to try a different ID type).
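Backend-driven, localized failure feedback amounts to a lookup keyed by failure reason and locale, roughly as sketched below. The failure codes and message strings are invented for illustration; they are not Uber's actual error taxonomy.

```python
# Hypothetical (failure_code, locale) -> guidance mapping.
MESSAGES = {
    ("unsupported_id", "pt-BR"): "Este documento não é aceito. Tente com sua CNH ou RG.",
    ("unsupported_id", "es-MX"): "Este documento no es válido. Intenta con tu INE.",
    ("blurry_image", "en-US"): "The photo was blurry. Hold the camera steady and try again.",
}

def feedback(failure_code: str, locale: str) -> str:
    # Fall back to a generic message when no localized guidance exists,
    # so the client never shows an empty error state.
    return MESSAGES.get((failure_code, locale), "Verification failed. Please try again.")
```

Keeping the mapping on the backend lets the guidance be updated per region without shipping a new mobile release.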
Additionally, analytic events showed that there was drop-off at every screen, which we took to mean that the fewer the screens, the more users would get through. We simplified our flow by cutting non-critical screens and including animations instead of instruction pages. By doing so, we not only minimized complexity on the mobile side, but also iterated quickly to fit regional requirements.
So far, we’ve been through 4 UI iterations, and we expect to keep improving by responding to our customers’ needs.
Not only does every country have a different ID, but even within countries there are different versions of the same ID. In some countries, IDs are not issued at the national level, but rather at the regional or local level. Mexico is an example of a country with locally issued driver's licenses. Supporting licenses in Mexico entails supporting more than 30 different licenses.
Also, some IDs are harder to tackle than others – for example, in Brazil the RG card and driver's license are printed on paper, the fonts differ per state, and some IDs are handwritten. To add more complexity, users usually carry their IDs in a plastic cover, which they often do not remove when taking the photo, introducing glare.
To tackle challenges like the ones mentioned above, we came up with the following solutions.
In-House Document Processing
We built an in-house automatic document transcription solution for relevant Brazilian documents, focusing on extracting key information. The system uses OCR to extract character-level text from the document, along with its corresponding location, and uses a state-of-the-art object detection model to determine the location of the key fields in the document. Based on the OCR results and the object detection model's outputs, the system fuses the results together and applies post-processing (removing punctuation, etc.) before outputting the final values of the key fields.
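The fusion step can be sketched as assigning each OCR word to the detected key-field box that contains its center, then assembling and cleaning the value. This is a simplified illustration of the idea, not the in-house system; field names and boxes are made up for the example.

```python
def center(box):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def inside(point, box):
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2

def fuse(ocr_words, field_boxes):
    """ocr_words: [(text, box)]; field_boxes: {field_name: box}.
    Assign each OCR word to the field box containing its center, then
    join the words and apply simple punctuation-stripping post-processing."""
    fields = {name: [] for name in field_boxes}
    for text, box in ocr_words:
        for name, fbox in field_boxes.items():
            if inside(center(box), fbox):
                fields[name].append(text)
    return {name: " ".join(words).strip(".,;") for name, words in fields.items()}
```

A production system would also handle words straddling two field boxes and reading order, but the center-in-box rule captures the core of the fusion.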
There are a few scenarios where riders can’t produce good quality images, even with the help of the on-device ML model image quality check. For example:
- The device camera may not meet the minimum requirements for a good photo
- A non-tech-savvy user may be unable to take a good photo
- The environment may not be conducive to adequate lighting
As the verification success rate depends heavily on image quality, we introduced an optional human review step into the Document Image Processing module for low-confidence images or suspicious results generated by the automatic solutions. For users, this step is transparent: the human reviews are completed quickly, generally in under 90 seconds.
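The routing of automatic results to human review can be sketched as a confidence gate. The threshold and result shape below are hypothetical, for illustration only.

```python
# Hypothetical confidence gate for routing automatic verification results.
CONFIDENCE_THRESHOLD = 0.85  # illustrative value

def route(result):
    """result: {"decision": "pass" | "fail", "confidence": float}.
    High-confidence automatic results pass straight through; anything
    low-confidence or suspicious is queued for a human reviewer."""
    if result["confidence"] >= CONFIDENCE_THRESHOLD:
        return "auto_approve" if result["decision"] == "pass" else "auto_reject"
    return "human_review"
```

Because only the uncertain tail of traffic reaches reviewers, the queue stays small enough for the sub-90-second turnaround described above.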
Looking into the Future
Real World Impact
As of May 2022, Real-Time ID Document Check is live in Brazil, Mexico, Chile, Costa Rica, Colombia, Guatemala, El Salvador, the Dominican Republic, Argentina, the US, and Canada. Since the launch of the initial experiment at the end of 2019, we have verified more than a million IDs. Our experiment has shown a significant safety benefit, which is why we plan to expand this product more widely.
We’ve found this technology to be useful for other use cases as well. For example, we have been able to offer alcohol delivery in the US, Canada, Australia, and New Zealand by verifying Eaters’ eligibility through their ID. Similarly, we launched ID verification technology for mopeds in different parts of Europe to ensure that users have a valid driver’s license. Document verification has also expanded our KYC capabilities, and enabled our car rental business in the US.
While our experiment has been successful, we continuously iterate and refine our product to ensure the best user experience. Some of the challenges we are addressing include:
Latency: Although the document scanning pass rate is high in the rider scenario, the alcohol delivery experience is less tolerant of latency, as courier-partners are kept waiting for the scan to complete before they can deliver the restricted item. Adding new alternatives to scanning, like barcode scanning, may reduce latency in some use cases and improve the user experience.
Scalability: Given the unique challenges that every country faces, scaling our product is not as easy as flipping a switch. Across the globe there are cultural differences around privacy, and while some countries like Colombia have a national driver’s license, others like Mexico have state-issued licenses. Adding new countries to the mix requires operational and engineering work.
Rider Identity Verification was developed by Uber to protect drivers from bad actors on the platform. Since 2017, the verification service has evolved from a simple solution to a scalable and configurable one that is available in different regions. Today, Real-Time Document Check is the platform's flagship product, given the trustworthiness that comes from being built on certified vendors. Even though many improvements have already been made, the Uber Rider Identity Team is constantly working to enable more countries and new use cases on the platform, meeting its data privacy and latency requirements while providing a best-in-class user experience.
Our product would not have been built without a cross-functional team that includes designers, data scientists, and engineers. The following people were instrumental in building Uber’s Real-Time Document Check: Brian Zhang, Sudheer Agrawal, Felipe Figueiredo, Matheus Candido, Mateus Batista, Joao Enomoto, Diogo Costa, Martin Norris, Luiz Vieira, Gustavo Daud, Flavia Rangel, Nengjun Zhao, Jingchen Liu, Xuewen Zhang, Eunice Yuh-Jie Chen, Himaanshu Gupta, Xiaoyu Ji, Trinh Bui, Daniel Kolta, Shimul Sachdeva, and Aarti Daryanani.