Fine-Grained Kitchen Activity Recognition using RGB-D
   Jinna Lei, Xiaofeng Ren, and Dieter Fox, at Ubicomp 2012



Abstract

We present a first study of using RGB-D (Kinect-style) cameras for fine-grained recognition of kitchen activities. Our prototype system combines depth (shape) and color (appearance) to solve a number of perception problems crucial for smart space applications: locating hands, identifying objects and their functionalities, recognizing actions, and tracking object state changes through actions. Our proof-of-concept results demonstrate the great potential of RGB-D perception: without any need for instrumentation, our system can robustly track and accurately recognize detailed steps through cooking activities, for instance how many spoons of sugar are in a cake mix, or how long it has been mixing. A robust RGB-D-based solution to fine-grained activity recognition in real-world conditions will bring the intelligence of pervasive and interactive systems to the next level.