Photo-based question answering

Title: Photo-based question answering
Publication Type: Conference Papers
Year of Publication: 2008
Authors: Yeh T, Lee JJ, Darrell T
Conference Name: Proceedings of the 16th ACM international conference on Multimedia
Date Published: 2008
Conference Location: New York, NY, USA
ISBN Number: 978-1-60558-303-7
Keywords: Computer vision, Information retrieval, Question answering

Photo-based question answering is a useful way of finding information about physical objects. Current question answering (QA) systems are text-based and can be difficult to use when a question involves an object with distinct visual features. A photo-based QA system allows direct use of a photo to refer to the object. We develop a three-layer system architecture for photo-based QA that brings together recent technical achievements in question answering and image matching. The first layer, template-based QA, matches a query photo to online images and extracts structured data from multimedia databases to answer questions about the photo. To simplify image matching, it exploits the question text to filter images based on categories and keywords. The second layer, information retrieval QA, searches an internal repository of resolved photo-based questions to retrieve relevant answers. The third layer, human-computation QA, leverages community experts to handle the most difficult cases. A series of experiments performed on a pilot dataset of 30,000 images of books, movie DVD covers, grocery items, and landmarks demonstrates the technical feasibility of this architecture. We present three prototypes to show how photo-based QA can be built into an online album, a text-based QA system, and a mobile application.
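
The three-layer architecture described in the abstract can be read as a fallback cascade: each layer either answers or passes the question on. The Python sketch below illustrates that reading only; it is not the authors' implementation. All names (PhotoQuestion, filter_candidates, CATALOG, RESOLVED, the layer functions) and the toy data are hypothetical, and an exact string comparison stands in for real image matching.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class PhotoQuestion:
    photo_id: str  # hypothetical stand-in for the actual image data
    text: str      # the question asked about the photo


# A layer returns an answer string, or None to defer to the next layer.
Layer = Callable[[PhotoQuestion], Optional[str]]

# Toy stand-ins (assumptions) for the online databases and the internal
# repository of resolved questions mentioned in the abstract.
CATALOG = [
    {"image_key": "cover-17", "keywords": {"book"}, "author": "Jane Doe"},
    {"image_key": "cover-42", "keywords": {"dvd", "movie"}, "author": "n/a"},
]
RESOLVED = {"photo-99": "previously resolved answer"}


def filter_candidates(question_text: str, catalog: List[dict]) -> List[dict]:
    """Use the question text to narrow the images that need matching."""
    words = set(re.findall(r"[a-z0-9]+", question_text.lower()))
    return [item for item in catalog if words & item["keywords"]]


def template_layer(q: PhotoQuestion) -> Optional[str]:
    for item in filter_candidates(q.text, CATALOG):
        if item["image_key"] == q.photo_id:  # placeholder for image matching
            return item["author"]
    return None


def retrieval_layer(q: PhotoQuestion) -> Optional[str]:
    return RESOLVED.get(q.photo_id)


def human_layer(q: PhotoQuestion) -> Optional[str]:
    return "queued for community experts"


LAYERS: List[Tuple[str, Layer]] = [
    ("template", template_layer),
    ("retrieval", retrieval_layer),
    ("human", human_layer),
]


def answer(q: PhotoQuestion) -> Tuple[str, Optional[str]]:
    """Try each layer in order and report which one produced the answer."""
    for name, layer in LAYERS:
        result = layer(q)
        if result is not None:
            return name, result
    return "unanswered", None
```

Calling answer(PhotoQuestion("cover-17", "Who wrote this book?")) is handled by the first layer in this toy setup, because the word "book" narrows the catalog before the placeholder match runs. A photo the first two layers cannot resolve falls through to the human-computation layer.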