Characterizing how Visual Question Answering models scale with the world

December 1, 2017 / Global

Abstract

Detecting differences in generalization ability between models for visual question answering tasks has proven to be surprisingly difficult. We propose a new statistic, asymptotic sample complexity, for model comparison, and construct a synthetic data distribution to compare a strong baseline CNN-LSTM model to a structured neural network with powerful inductive biases. Our metric identifies a clear improvement in the structured model’s generalization ability relative to the baseline despite their similarity under existing metrics.

Authors

Eli Bingham, Piero Molino, Paul Szerlip, Fritz Obermeyer, Noah D. Goodman

Conference

ViGIL @ NeurIPS 2017

Full Paper

Characterizing how Visual Question Answering models scale with the world (PDF)

Uber AI