Learning to Recognize Objects in Egocentric Activities
   Alireza Fathi, Xiaofeng Ren, James Rehg (CVPR 2011)



Abstract

This paper addresses the problem of learning object models from egocentric video of household activities, using extremely weak supervision. For each activity sequence, we know only the names of the objects which are present within it, and have no other knowledge regarding the appearance or location of objects. The key to our approach is a robust, unsupervised bottom-up segmentation method, which exploits the structure of the egocentric domain to partition each frame into hand, object, and background categories. By using Multiple Instance Learning to match object instances across sequences, we discover and localize object occurrences. Object representations are refined through transduction and object-level classifiers are trained. We demonstrate encouraging results in detecting novel object instances using models produced by weakly-supervised learning.
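To make the weak-supervision setting concrete, the sketch below illustrates a generic Multiple Instance Learning formulation consistent with the abstract: each activity sequence contributes a bag of candidate foreground-segment descriptors, and the only label is the set of object names known to appear in that sequence. This is a minimal MI-SVM-style alternation, not the authors' implementation; the names (Bag, train_object_model), the descriptor format, and the use of a linear SVM are assumptions for illustration.

    # Minimal sketch of weakly supervised instance discovery (assumed setup,
    # not the paper's actual pipeline).
    import numpy as np
    from dataclasses import dataclass
    from sklearn.svm import LinearSVC

    @dataclass
    class Bag:
        segments: np.ndarray      # (num_segments, feature_dim) appearance descriptors
        object_names: set         # weak label: objects present somewhere in the sequence

    def train_object_model(bags, obj, iters=5):
        """Learn a linear classifier for `obj` from sequence-level labels only."""
        pos_bags = [b for b in bags if obj in b.object_names]
        neg_bags = [b for b in bags if obj not in b.object_names]
        neg_X = np.vstack([b.segments for b in neg_bags])

        # Initialize by treating every segment in a positive bag as a (noisy) positive.
        pos_X = np.vstack([b.segments for b in pos_bags])
        clf = LinearSVC(C=1.0)
        for _ in range(iters):
            X = np.vstack([pos_X, neg_X])
            y = np.concatenate([np.ones(len(pos_X)), -np.ones(len(neg_X))])
            clf.fit(X, y)
            # Re-select one representative instance per positive bag:
            # the segment the current model scores highest.
            pos_X = np.vstack([
                b.segments[np.argmax(clf.decision_function(b.segments))]
                for b in pos_bags
            ])
        return clf

The selected top-scoring segments play the role of the discovered, localized object occurrences described in the abstract; the subsequent transductive refinement and object-level classifiers are not shown here.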