In the right hands, data can be a powerful tool, a key to understanding some of the most complex problems we face. Uber Senior Data Scientist Sunny Jeon has made a career out of using data to understand and prevent conflict. At Uber, he studies the data around safety risks and builds models that help identify risk factors that could predict why and where conflicts or incidents might occur, so that we can develop policies and products designed to enhance the safety of our driver-partners and riders.
Born in South Korea, Sunny grew up in a country in a perpetual state of war, giving him a strong motivation to study the nature of conflict. Sunny learned how he could apply statistics and modeling techniques to conflict studies while earning his Ph.D. in Political Science from Stanford, and eventually moved into machine learning, further advancing the use of data.
With Uber CEO Dara Khosrowshahi writing that we are “putting safety at the core of everything we do,” Sunny’s own work has gained prominence. We sat down with him to discuss his background, how data science can be used to promote safety, and how he applies his knowledge at Uber.
Tell us about your background.
I studied social science, which I think is a little different from the average data scientist, who usually comes from a statistics or computer science background. My undergraduate and graduate degrees were in political science, and most of my work in academia focused on issues related to security and human violence. I studied why people fight, why people cooperate with each other, and what strategies we can then implement to promote cooperation based on what we know about human behavior. I did my dissertation on violence in Africa, but towards the end of my Ph.D. I realized that academia was not the right place for me.
I liked doing research and trying to answer tough questions with real-world applications, but academia moved too slowly for me. Working with another graduate student who also focused on issues related to security, I created a company that provided analytics for political risks. We focused on helping security agencies understand regional risks where they worked and insurance companies understand what kind of financial risks they were taking on.
What interested you about studying conflict?
I come from South Korea, where we’ve been at war since I was born. Conflict has just been something that I’ve always been interested in. And in grad school, there were a few professors who specialized in violence and were some of the top names in the field. It was a great opportunity to not just learn more about something I cared about, but also actually do something about it.
Given that social science relies heavily on statistics, was that your path into programming and technology?
Yes, as an undergrad I was taking game theory courses from a political science professor, and he got me really interested in research. He said, “If you want to go to grad school in this, you’re going to have to pick up math, you’re going to have to pick up statistics.” I started from square one and took all my calculus, algebra, and statistics courses, and, through our research project, picked up R, the statistical programming language that I still use today. At that time, around 2003, R was very new and not as widely used as other languages. Now, it’s one of the hottest languages in data science.
Were you getting more interested in technology when you formed your own company?
Yes, that’s when I began looking more closely into machine learning and artificial intelligence. We touched upon those areas in some of the more quantitative classes in my grad program, but we were mostly interested in hypothesis testing as opposed to prediction. It was really when I went into the realm of prediction that I started looking into data science methodologies and realized that “Hey, for prediction, machine learning might be a better tool.”
How did you keep up with the latest advances in data science when you entered the private sector?
One of the nice things about a Ph.D. program is that it teaches you how to pick up new methodologies, because it is as much about the scientific process, discovery and testing, as it is about the subject matter. Once you get that training, it’s fairly easy to pick up new tools and understand new fields. For example, with machine learning, once I had the math and the statistics background, it was quite easy to read through the algorithms and understand what they were trying to do.
Why did you choose to work at Uber?
The company my friend and I tried to create failed miserably. We were academics and not businessmen, so we found it very difficult to sell.
When I realized it wasn’t going to work out, I began looking for jobs in tech because I wanted to build products that had an impact. Fortunately, I found an opening on the Uber Safety Team that was very similar to what I was doing at my company. The job was attractive because I would be working with a lot of really interesting, rich datasets, and the description emphasized real-world impact, how it would help improve safety. That really got me. I was the first data scientist hired to evaluate safety. At the time, we had the Algorithmic Insights team, with data scientists embedded into a lot of different programs, such as growth and fraud.
What surprised you most when you started working at Uber?
There were definitely a lot of surprises because this was really my first job out of academia. I was in the startup world for a few years, and I was doing a lot of design work during my graduate studies, so I was familiar with and liked the whole process of product development, going from ideation, to testing, to building and scaling. But when I came here, I realized it was a lot more complicated than just product development. At Uber, it feels like a hundred different startups within a startup, so you’re not only promoting your own products but you also have to think about how they work with those from other teams.
And sometimes objectives may push in different directions in the absence of careful coordination. As an example, on safety we work hard to remove potentially unsafe drivers and riders from our platform. Meanwhile, other teams work on products and initiatives to retain riders and drivers and attract new ones. We need to ensure that the methodologies we use do not counteract each other and prevent these initiatives from succeeding. Navigating all these different interests, even within the same company, was a surprise to me. In academia, each area of study had a singular focus, so in social studies we didn’t have to take into account what the students in the business school were doing.
Tell us about your current team at Uber.
I work as a Senior Data Scientist on the Safety and Insurance team. We are cross-functional, consisting of product managers, engineers, data scientists, and designers. We also have a huge operations team embedded around the world that helps us figure out the risks in different markets, as Chicago will have different challenges than Buenos Aires. On the insurance side, we also have actuaries, lawyers, and claims advocates who handle complaints. It’s quite a big team.
How much do the local operations teams contribute to what you know about safety?
For new markets that I don’t understand very well, the safety operations teams inform me of their issues. For example, in South Africa we have a great partner on the operations teams who contacted me to say, “Hey, taxi violence is an issue here.” He provided data and asked if I could help figure out how to model it and identify possible patterns and risks. We have a very close relationship with the operations teams and their insight is absolutely critical for understanding regional risk. When we launch anything into a new market, it’s in collaboration with the local teams.
Safety is a key priority for Uber. Given your team’s important role in this initiative, what kind of support are you seeing for your team from around the company?
With Dara providing so much leadership on safety and making it the top company priority, we’ve had incredible opportunities to build and launch our products, and it’s been a lot easier to get alignment with other teams. We’ve been very productive. We really owe it to Dara and the executive team for making it number one.
Does the kind of predictive modeling you developed in social studies as an academic directly apply to what you are doing now?
Absolutely. The types of models that I was building for the conflict forecasting business are the same models that I built here at Uber for predicting conflicts between our users. For instance, in Latin America, we have seen some riders try to use the Uber platform to commit crimes like theft against drivers. We built sophisticated models to better anticipate these types of malicious requests so that we could take specific actions to block them. Both the models and the hypotheses that we have about human behavior are all drawn from my academic research.
Can you describe the models you use?
Our models leverage a variety of trip, user, and environmental characteristics to assess safety risks. Some of the most actionable risk signals include feedback that drivers and riders may have received from other users. Risk can also be associated with how drivers and riders use the platform.
We look at geographical factors as well, such as how many incidents happened in a specific area. Bar areas can be associated with interpersonal conflict, so we have data on the number of bars and nightclubs in neighborhoods.
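To make the idea of combining these signals concrete, here is a minimal sketch of a logistic risk score. It is an illustration only, not Uber's production model: the feature names, the weights, and the bias are all hypothetical, and a real model would learn its parameters from data rather than use hand-picked values.

```python
import math
from dataclasses import dataclass


@dataclass
class TripRequest:
    """Hypothetical features, loosely mirroring the signal types described above."""
    negative_feedback_rate: float   # user signal: share of past trips with negative feedback
    prior_incidents_in_area: int    # geographic signal: past incidents near the pickup point
    bars_in_neighborhood: int       # environmental signal: bars and nightclubs nearby


# Illustrative, hand-picked parameters (assumptions, not fitted to any real data).
BIAS = -4.0
WEIGHTS = {
    "negative_feedback_rate": 3.0,
    "prior_incidents_in_area": 0.2,
    "bars_in_neighborhood": 0.05,
}


def risk_score(trip: TripRequest) -> float:
    """Map a trip request to a score in (0, 1) using a logistic function."""
    z = (
        BIAS
        + WEIGHTS["negative_feedback_rate"] * trip.negative_feedback_rate
        + WEIGHTS["prior_incidents_in_area"] * trip.prior_incidents_in_area
        + WEIGHTS["bars_in_neighborhood"] * trip.bars_in_neighborhood
    )
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    quiet = TripRequest(negative_feedback_rate=0.0, prior_incidents_in_area=0, bars_in_neighborhood=0)
    busy = TripRequest(negative_feedback_rate=0.5, prior_incidents_in_area=5, bars_in_neighborhood=10)
    print(f"quiet request: {risk_score(quiet):.3f}")
    print(f"busy request:  {risk_score(busy):.3f}")
```

Because every weight in this sketch is positive, raising any one feature raises the score. A fitted model could of course assign different signs and magnitudes.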
How is your team working to raise the bar on safety?
Safety is incredibly complex, especially at Uber scale. We’re not immune from the world’s safety challenges, whether it’s a physical altercation or a theft. Connecting people together in cars is a big responsibility and it’s on us to constantly work to raise the bar. For example, we have a suite of products that use phone sensor data to infer vehicle movements and give drivers tools that help them drive more safely. We also developed features that enable riders to share their trip location with loved ones in real time, and an emergency button to facilitate rapid response by the authorities when needed. And as I mentioned before, we have machine learning-based safety interventions that identify and block high-risk trip requests proactively, similar to how fraud systems reject high-risk credit card transactions.
We continue to prioritize safety by leveraging our technology and data. We can’t take something off the shelf and apply it; we have to innovate and get creative. It’s complex and it takes time, but with good processes, creative product development, and a dedicated team, we’ll continue our work to raise the bar on safety.
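The proactive blocking described above, like fraud screening, can be thought of as comparing a model's risk score against a threshold. The sketch below is a hypothetical illustration of that trade-off, not a description of Uber's system: the function names, the threshold values, and the scored history are all made up. It shows how a lower threshold blocks more incident trips but also more safe ones, which is the kind of tension between safety and retention goals mentioned earlier in the interview.

```python
from typing import List, Tuple


def should_block(score: float, threshold: float) -> bool:
    """Block a request when its risk score reaches the threshold."""
    return score >= threshold


def evaluate_threshold(
    history: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (share of incident trips blocked, share of safe trips blocked).

    `history` holds (risk_score, had_incident) pairs and must contain
    at least one incident trip and one safe trip.
    """
    incident_scores = [s for s, had_incident in history if had_incident]
    safe_scores = [s for s, had_incident in history if not had_incident]
    caught = sum(should_block(s, threshold) for s in incident_scores) / len(incident_scores)
    wrongly_blocked = sum(should_block(s, threshold) for s in safe_scores) / len(safe_scores)
    return caught, wrongly_blocked


# Made-up scored history, for illustration only.
HISTORY = [
    (0.95, True), (0.80, True), (0.40, True),
    (0.85, False), (0.30, False), (0.10, False), (0.05, False),
]

if __name__ == "__main__":
    for threshold in (0.9, 0.5):
        caught, wrongly_blocked = evaluate_threshold(HISTORY, threshold)
        print(f"threshold={threshold}: caught={caught:.2f}, wrongly blocked={wrongly_blocked:.2f}")
```

In practice the threshold would be chosen with input from the teams affected by false blocks, rather than fixed in code.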
Interested in applying your data science skills to improving safety and other challenges in the realm of transportation? Apply for a role on our team!