Maya Cakmak
Paul G. Allen School of Computer Science & Engineering, University of Washington

RSS Early Career Spotlight talk that gives an overview of my research.

The long-term goal of my research is to make personal robot assistants in the home a reality. Such robots can bring independence to persons with motor impairments, enable older adults to age in place, and improve the quality of people's lives. Within the last decade, developments in robotics, such as common hardware platforms and open-source software, have fueled a great deal of progress towards this vision. We already see research videos of general-purpose robots carrying out useful tasks such as folding laundry, preparing a meal, or emptying the dishwasher. However, these capabilities are far from ready for the real world for two important reasons: (1) they are tailored to the particular environment in which the robot operates and cannot be easily adapted to different settings, and (2) they require long development cycles by skilled programmers capable of using specialized robotics software.

Mainstream robotics research targets the first problem. This involves developing universal robotic capabilities that will work robustly in every possible deployment scenario. The success of this approach has been limited by the difficulty of anticipating all possible use-cases and operation environments of robots. Instead, my research targets the second problem. I exploit the fact that no individual robot will encounter all possible scenarios. Rather than aiming for universal capabilities, I embrace tailoring to the robot's particular environment. Hence, the main goal of my research is to develop general-purpose robots that can be programmed by their end-users after they are deployed.

This approach has the potential to greatly expand the possible use-cases of general-purpose robots by empowering users to decide what their robots will do for them. It also allows users to customize their robot's behavior to meet their particular needs and preferences. These benefits come with one key challenge: enabling end-users to do what is currently done by skilled programmers. My research tackles this challenge.

The key research question at the core of my work is: what are the right representations, abstractions, and interfaces for programming robots? The answer requires balancing the tradeoff between expressivity and intuitiveness, given the level at which the robot needs to be programmed and the backgrounds of the end-users, which can vary from professional software developers to older adults with no technical background. Tackling this question requires both building new ways of programming robots and evaluating them with the people who will use them, as described in the sections below.

I deeply care about the relevance and usefulness of my research. To that end, I evaluate the systems I develop with a realistic and diverse set of tasks; I put these systems in front of real potential users with diverse backgrounds and abilities; and I take every opportunity to demonstrate and deploy them in the real world.

New ways of programming robots

Most of my work thus far has focused on programming mobile manipulators to perform useful everyday tasks. Currently, end-to-end programming of an application for such robots requires advanced knowledge of frameworks and tools like the Robot Operating System (ROS) and web programming. Making the process accessible to end-users requires a suite of tools for programming the robot at multiple levels. My research has contributed tools at all levels, spanning (1) sensing & world modeling, (2) motion specification, and (3) flow of control, as categorized according to Lozano-Perez's requirements of programming systems. Combinations of these tools allow new applications to be created in a fraction of the time that ROS would require even from expert programmers. For example, some of the manipulation strategies that took about a month to implement with ROS for the Amazon Picking Challenge took less than three hours with Code3, a suite of our tools presented at HRI 2017. A few of these systems are described below.
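
To make the multi-level idea concrete, the following is a minimal sketch of how a single task program might span the three levels; the step representation and names are illustrative assumptions, not the actual Code3 interfaces.

    # Hypothetical sketch: a robot program represented as an ordered list of
    # steps spanning perception, motion, and flow of control.
    from dataclasses import dataclass, field

    @dataclass
    class Step:
        kind: str                      # "detect", "action", or "repeat"
        name: str
        params: dict = field(default_factory=dict)

    program = [
        Step("detect", "pill_bottle"),                        # (1) sensing & world modeling
        Step("action", "grasp", {"target": "pill_bottle"}),   # (2) motion specification
        Step("action", "place", {"target": "drop_bin"}),
        Step("repeat", "until_table_clear",                   # (3) flow of control
             {"body": ["grasp", "place"]}),
    ]

    for step in program:
        print(f"{step.kind:7s} {step.name:18s} {step.params}")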

Programming by demonstration

A large body of work in robotics has developed techniques for learning new behaviors on a robot from demonstrations of those behaviors. However, most of these techniques were only evaluated with demonstrations collected from the developers of the method. My earlier work revealed that novice users did not satisfy many of the assumptions made by these methods. My thesis work proposed new methods and interactions to address this issue. Two methods from my thesis work that have become popular for learning from novices are (1) the use of keyframes instead of dense trajectory recordings in kinesthetic teaching and (2) making robots ask questions. In more recent work, published at RSS 2014, we extended keyframe-based programming with interactive visualizations to enable one-shot programming of bimanual object manipulation tasks. We also developed a simple dialog system and, in a study published at HRI 2014, demonstrated that novice users could learn to program the robot (to manipulate a pill bottle and fold a towel) without any instructions from the experimenter.
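
As a rough illustration of the keyframe idea (a toy sketch, not the published implementation): instead of recording a dense trajectory while the user guides the arm, the user saves a handful of keyframe poses and the robot interpolates between them at execution time. The joint values below are made up.

    import numpy as np

    # Keyframes a user might save while physically guiding the arm
    # (7 joint angles in radians; values are made up for illustration).
    keyframes = np.array([
        [0.0, 0.5, 0.0, -1.2, 0.0, 1.0, 0.0],   # pre-grasp
        [0.1, 0.8, 0.0, -1.5, 0.0, 1.2, 0.0],   # grasp
        [0.1, 0.4, 0.0, -1.0, 0.0, 0.8, 0.0],   # lift
    ])

    def interpolate(keyframes, steps_per_segment=50):
        """Recover a dense trajectory by linearly interpolating between keyframes."""
        segments = []
        for start, end in zip(keyframes[:-1], keyframes[1:]):
            alphas = np.linspace(0.0, 1.0, steps_per_segment, endpoint=False)
            segments.append(start + alphas[:, None] * (end - start))
        segments.append(keyframes[-1:])
        return np.vstack(segments)

    trajectory = interpolate(keyframes)
    print(trajectory.shape)   # (101, 7): three saved keyframes expand to a full trajectory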

Programming with situated and embodied natural language

Another channel my work has explored for programming robots is natural language. Our ICRA 2015 paper established a taxonomy of movement for grounding prepositions (relative/absolute, target/directional) and presented a system combining language processing and scene understanding to teach the robot new manipulation skills through situated spatial language commands. Later work, published at RSS 2016, combined language processing with scene understanding and deictic gesture recognition to reference surfaces and objects in a room.
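
A toy sketch of the underlying idea (not the ICRA 2015 system itself): a spatial preposition is grounded to a goal position relative to a detected landmark. The command parsing, offsets, and object names below are illustrative assumptions.

    import numpy as np

    # Simple groundings: each preposition maps to an offset from a landmark (meters).
    PREPOSITIONS = {
        "above":    np.array([0.0, 0.0, 0.15]),
        "behind":   np.array([-0.15, 0.0, 0.0]),
        "left of":  np.array([0.0, 0.15, 0.0]),
        "right of": np.array([0.0, -0.15, 0.0]),
    }

    landmarks = {"bowl": np.array([0.60, 0.10, 0.02])}   # detected object positions

    def ground(command):
        """Resolve a command like 'put it above the bowl' to a goal position."""
        for prep, offset in PREPOSITIONS.items():
            if prep in command:
                obj = command.split(prep)[-1].replace("the", "").strip()
                return landmarks[obj] + offset
        raise ValueError(f"No known preposition in: {command!r}")

    print(ground("put it above the bowl"))     # bowl position + [0, 0, 0.15]
    print(ground("put it left of the bowl"))   # bowl position + [0, 0.15, 0]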

Situated tangible programming

One of the more challenging tasks in programming robots is to unambiguously reference the robot's environment. This often requires expertise with the robot's perception system and an understanding of coordinate systems and transformations. Situated natural language interactions, even with referential gestures, suffer from uncertainty and have so far been limited to single-object references. To address this challenge, we proposed situated tangible programming, which involves programming the robot by placing tangible markers in the robot's workspace. The markers are designed to be unambiguously identifiable by the robot, and they enable the user to reference objects, locations, and regions in the robot's workspace. Other markers allow attaching actions to those entities and specifying the order of instructions. The robot autonomously perceives the workspace with the markers and compiles the arrangement into an executable robot program. This work defined a new category of interactions for robot programming and was nominated for a best design paper award at HRI 2017. We later extended this work with situated feedback from the robot through a projector mounted on its head.
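
The compilation step can be illustrated with a small sketch; the marker types, labels, and nearest-entity rule below are simplifying assumptions rather than the actual system. Detected action markers are attached to the closest object or region marker and ordered into an executable sequence.

    from dataclasses import dataclass

    @dataclass
    class Marker:
        kind: str        # "object", "region", or "action"
        label: str
        order: int       # ordering number on an action marker (0 if not an action)
        x: float
        y: float

    detected = [                                   # markers perceived in the workspace
        Marker("object", "part_A", 0, 0.40, 0.10),
        Marker("action", "pick",   1, 0.42, 0.12),
        Marker("region", "bin",    0, 0.70, -0.20),
        Marker("action", "place",  2, 0.68, -0.18),
    ]

    def nearest_entity(action, markers):
        """Attach an action marker to the closest object or region marker."""
        entities = [m for m in markers if m.kind in ("object", "region")]
        return min(entities, key=lambda m: (m.x - action.x) ** 2 + (m.y - action.y) ** 2)

    def compile_program(markers):
        actions = sorted((m for m in markers if m.kind == "action"), key=lambda m: m.order)
        return [(a.label, nearest_entity(a, markers).label) for a in actions]

    print(compile_program(detected))   # [('pick', 'part_A'), ('place', 'bin')]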

Visual programming

My group also developed a number of visual and textual robot programming languages that made programming useful tasks on robots simpler and faster than with current software engineering practices. RoboFlow, which was presented at ICRA 2015, is a flow-based visual programming language that enables initialization of a program from a single demonstration and direct editing of that program through a graphical interface to add looping, branching, and error handling. CustomPrograms (HRI 2016) is a block-based robot programming tool that allowed the Savioke Relay robot to be reprogrammed for new applications by non-technical users in a fraction of the time it takes the company's professional engineers. CodeIt! (HRI 2017) enabled the integration of actions created through programming by demonstration with perceptual detectors created through another of our tools, CustomLandmarks (IROS 2017).
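
As a rough sketch of what a flow-based program looks like (a toy interpreter, not RoboFlow's actual representation): each node runs a procedure, and its success or failure selects the next node, which is enough to express branching, looping, and error handling in one graph.

    import random

    def detect_object(): return random.random() > 0.2   # stand-ins for real robot procedures
    def grasp():         return random.random() > 0.3
    def place():         return True
    def ask_for_help():  return True

    # node -> (procedure, next node on success, next node on failure)
    flow = {
        "detect": (detect_object, "grasp",  "ask"),
        "grasp":  (grasp,         "place",  "detect"),   # failure loops back to detection
        "place":  (place,         None,     "ask"),      # None ends the program
        "ask":    (ask_for_help,  "detect", None),
    }

    node, steps = "detect", 0
    while node is not None and steps < 20:               # step cap keeps the demo finite
        procedure, on_success, on_failure = flow[node]
        print("running", node)
        node = on_success if procedure() else on_failure
        steps += 1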

Evaluating tools and understanding programmers

In my research, I have adopted evaluation protocols from software engineering and end-user programming research, some of which are new to robotics. We evaluate each new tool or interface in terms of two key properties.

Through these evaluations, my work has contributed an empirical understanding of the process of programming robots and of people's preferences. Some of these observations recur across different systems, such as people's preference for interpretable and predictable behavior over higher autonomy or intelligence, differences between expert and novice programmers, differences across people with different backgrounds, and differences between our tools and traditional robot programming tools. Our evaluations also demonstrate that some of our methods break new ground in terms of how long it takes to learn to program a robot. For instance, situated tangible programming (HRI 2017) allowed participants to correctly interpret and program common manufacturing handling tasks with minimal (5-minute) instruction or none at all, whereas user manuals for today's manufacturing robots run anywhere from 150 to 500 pages. Some of our user studies go deeper into understanding the flaws in people's mental models of a programming language that lead to errors or inefficiencies, as in our work on trigger-action programming, presented at UbiComp 2015.
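
For readers unfamiliar with the model, trigger-action programming pairs a condition with an action that fires whenever the condition holds; the toy rules and home state below are made up for illustration and are not from the UbiComp 2015 study.

    # Each rule pairs a trigger (a predicate over the home state) with an action.
    rules = [
        {"trigger": lambda s: s["motion_in_kitchen"] and s["hour"] >= 22,
         "action":  "turn on the kitchen night light"},
        {"trigger": lambda s: s["front_door_open"] and not s["someone_home"],
         "action":  "send a security alert"},
    ]

    state = {"motion_in_kitchen": True, "hour": 23,
             "front_door_open": False, "someone_home": True}

    for rule in rules:                 # fire every rule whose trigger currently holds
        if rule["trigger"](state):
            print("FIRE:", rule["action"])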