Robotics: Science and Systems 2017

Workshop on Articulated Model Tracking

July 16, 2017

Room 36-156



In recent years, the robotics and vision communities have provided many techniques for estimating and tracking the pose of articulated objects such as robot manipulators, doors, tools, human hands, and human bodies. Model-based, learning-based, and hybrid techniques are all showing exciting progress towards solving this challenging problem, each with its own advantages and limitations. There are still many open challenges, however, especially when the scene of interest contains multiple interacting articulated objects. In these scenarios, occlusions, partial observability, and high-dimensional state spaces make it very difficult to maintain a real-time state estimate that is sufficiently accurate for safe planning and manipulation.

The goal of this workshop is to bring the robotics and vision communities together to discuss recent successes in articulated object tracking, its use in robotic manipulation, and the remaining limitations. The ultimate goal is to identify directions for further improvement of articulated object tracking systems that can be exploited in robotics to allow robots and humans to work together in a shared space safely, robustly, and efficiently.

Call for Contributions

We are soliciting extended abstracts in the RSS format (4 pages plus references). Each accepted paper will be presented in a short spotlight talk and in the poster session. Suitable subjects include new, exciting, or preliminary results on articulated model tracking, as well as evaluations of the abilities and limitations of existing approaches.

We are also soliciting live demos to be given during the poster session. A live demo may accompany a submitted paper, but this is not required.

Please e-mail by May 30 to submit a paper and/or demo.


Program

9:30 - 9:50 Invited Talk: Antonis Argyros, "Tracking Human Hands and Hand-object Interactions"
9:50 - 9:57 Aaron Walsman, Tanner Schmidt, and Dieter Fox, "Articulated Tracking with a Dynamic High-Resolution Surface Model"
9:57 - 10:04 Gregory Izatt and Russ Tedrake, "Globally Optimal Object Pose Estimation in Point Clouds with Mixed-Integer Programming"
10:04 - 10:11 Alexander Lambert, Amirreza Shaban, Zhen Liu, and Byron Boots, "Deep Forward and Inverse Perceptual Models for Tracking and Prediction"
10:11 - 10:18 Mo Shan and Nikolay Atanasov, "A Spatiotemporal Model with Visual Attention for Video Classification"
10:18 - 11:00 Poster Session and Demos
11:00 - 11:20 Invited Talk: Jonathan Tompson, "Human Person Detection and Pose Estimation"
11:20 - 11:40 Invited Talk: Javier Romero, "From Expensive to Cheap Digital Selves"
11:40 - 12:00 Discussion

Invited Talks

Antonis Argyros

University of Crete

Tracking human hands and hand-object interactions

In this talk, we provide a brief overview of our work on computational methods for tracking the activities of human hands based on unobtrusive computer vision techniques that rely on the processing and analysis of markerless visual data. We focus on a computational framework for tracking the 3D position, orientation, and full articulation of human hand(s), and we show how this is employed to solve problems of varying complexity, ranging from 3D tracking of a single hand to 3D tracking of two hands interacting with several objects. We also show how this framework can be used to deal with other, related problems such as human body tracking. Finally, we show how our work can support intuitive HCI and HRI as well as the development of interactive exhibits in the context of smart environments.

Jonathan Tompson


Human Person Detection and Pose Estimation

Recent advances in Convolutional Network architectures, training methodologies and dataset scale have greatly improved the real-world performance of human pose estimation. In this talk I will discuss some of the recent work at Google (and elsewhere!) tackling this difficult articulated tracking problem, whose generalization performance is now "good enough" for deployment in Google's production systems. I will also touch on some of the open challenges in the domain and discuss some of Google's future work.

Javier Romero

Body Labs Inc., NYC

From expensive to cheap digital selves

While we spend a large part of our time in the digital world, we currently lack an embodied presence in it. This results in ineffective online shopping experiences, impersonal entertainment and disconnected physical and digital environments. While there is a long history of human tracking and body scanning in the film industry, the creation of compelling animated selves remains inaccessible for the average user. In this talk, our latest developments in the use of both professional and personal devices for capturing human shape and motion will be reviewed.

Organizers


Max Planck Institute


Jeannette Bohg

Jeannette Bohg is a Senior Research Scientist at the Autonomous Motion Department. Her research focuses on perception for autonomous robotic manipulation and grasping. She is specifically interested in developing methods that are goal-directed, real-time, and multi-modal such that they can provide meaningful feedback for execution and learning.

Before joining the Autonomous Motion lab in January 2012, she was a PhD student at the Computer Vision and Active Perception lab (CVAP) at KTH in Stockholm. Her thesis on Multi-modal scene understanding for Robotic Grasping was performed under the supervision of Prof. Danica Kragic. She studied at Chalmers in Gothenburg and at the Technical University in Dresden where she received her Master in Art and Technology and her Diploma in Computer Science, respectively.

Dieter Fox

Dieter Fox is a Professor in the Department of Computer Science & Engineering at the University of Washington. He grew up in Bonn, Germany, and received his Ph.D. in 1998 from the Computer Science Department at the University of Bonn. He joined the UW faculty in the fall of 2000.

His research interests are in robotics, artificial intelligence, and state estimation. He is the head of the UW Robotics and State Estimation Lab and he currently serves as the academic PI of the Intel Science and Technology Center for Pervasive Computing. He's a Fellow of the AAAI and IEEE, and he currently serves as an editor of the IEEE Transactions on Robotics.

Roberto Martín-Martín

Roberto Martín-Martín is a research assistant and PhD candidate at the Technische Universität Berlin, in the Robotics and Biology department (RBO) under the supervision of Oliver Brock. He is interested in the intersection between robotics and artificial perception, focusing on leveraging robot interactions to enable robust and online estimation and exploitation of the relevant information for manipulation. He received his first degree from the UPM (Madrid, Spain) in 2010 and his Master's degree in Electrical and Computing Engineering at the Technische Universität Berlin in 2011.

Tanner Schmidt

Tanner Schmidt is a graduate student in Computer Science and Engineering at the University of Washington, working with Dieter Fox in the Robotics and State Estimation Lab. His primary interests are robotics, computer vision, and artificial intelligence. He received his bachelor's degree in Electrical and Computer Engineering and Computer Science from Duke University in 2012, and began at UW in the fall of 2012.