In this paper we propose a novel deep neural network that is able to jointly reason about 3D detection, tracking and motion forecasting given data captured by a 3D sensor. By jointly reasoning about these tasks, our holistic approach is more robust to occlusion as well as sparse data at range. Our approach performs 3D convolutions across space and time over a bird’s eye view representation of the 3D world, which is very efficient in terms of both memory and computation. Our experiments on a new very large scale dataset captured in several North American cities, show that we can outperform the state-of-the-art by a large margin. Importantly, by sharing computation we can perform all tasks in as little as 30 ms.
Wenjie Luo, Bin Yang, Raquel Urtasun
Risk Entity Watch – Using Anomaly Detection to Fight Fraud
September 28 / Global
The Transformative Power of Generative AI in Software Development: Lessons from Uber’s Tech-Wide Hackathon
August 3 / Global
Innovative Recommendation Applications Using Two Tower Embeddings at Uber
July 26 / Global