I am currently working on pixel-level segmentation of human hands from RGBD images and its applications to object tracking and human state estimation. The two videos below show the hand segmentation model in action.
My research in this area focuses on single-shot techniques that let non-expert users solve perceptual problems common to robot manipulation.
Custom Landmarks is a flexible system that lets users create detectors for a wide variety of object parts and scene geometry. In this system, users annotate object parts or parts of a scene (e.g., the edge of a table, an empty gripper, or an object handle) by drawing a 3D box around a point cloud view of the scene. The robot can then detect these landmarks in new scenes. Landmarks detected by this system can be used as reference frames for programming manipulation actions in other systems that I have developed. The system and a user evaluation of it are described in an IROS 2017 paper.
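To give a sense of the approach, here is an illustrative sketch only, not the actual implementation (see the IROS 2017 paper for that). It assumes the Open3D library, an axis-aligned user-drawn box, and a rough initial pose: the box crops a template cloud out of the annotated scene, and at detection time the template is aligned against a new scene, with the alignment quality indicating whether the landmark is present.

```python
import numpy as np
import open3d as o3d

# Illustrative sketch only: crop the user-drawn 3D box out of the annotated
# scene, then try to align that template against a new scene with ICP.
# The real Custom Landmarks detector works differently; see the paper.

def crop_landmark(scene_cloud, box_min, box_max):
    """Keep only the points inside the user-drawn (axis-aligned) 3D box."""
    box = o3d.geometry.AxisAlignedBoundingBox(np.asarray(box_min, dtype=float),
                                              np.asarray(box_max, dtype=float))
    return scene_cloud.crop(box)

def detect_landmark(template, new_scene, initial_pose=np.eye(4), max_dist=0.02):
    """Align the template to the new scene; high fitness suggests a detection."""
    result = o3d.pipelines.registration.registration_icp(
        template, new_scene, max_dist, initial_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.fitness, result.transformation
```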
Custom Landmarks 2D is an extension of Custom Landmarks that I am working on with a mentee. Together, we are applying the same paradigm of one-shot creation of arbitrary landmarks to 2D detection. This enables the robot to detect flat objects that cannot be found by Custom Landmarks, such as elevator buttons, light switches, and stickers or fiducials. A video demo of this work is linked below.
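For illustration only (the actual Custom Landmarks 2D detector may work differently), one simple way to find a flat, user-annotated landmark in a new image is normalized cross-correlation template matching with OpenCV; the threshold below is an assumption.

```python
import cv2

# Illustrative sketch only: treat the user's annotated image patch as a
# template and search for it in new camera images with normalized
# cross-correlation. This is not necessarily the Custom Landmarks 2D method.

def detect_2d_landmark(scene_bgr, template_bgr, threshold=0.8):
    """Return the (x, y) top-left corner of the best match, or None."""
    scene = cv2.cvtColor(scene_bgr, cv2.COLOR_BGR2GRAY)
    template = cv2.cvtColor(template_bgr, cv2.COLOR_BGR2GRAY)
    scores = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(scores)
    return max_loc if max_val >= threshold else None
```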
Multi-surface detection: I am working with another mentee on detecting surfaces in shelf scenes, which are commonly encountered in mobile manipulation scenarios. We are developing a new technique and comparing it to an existing system. Additionally, we are looking into making it easy for users to optimize algorithm parameters for specific tasks.
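As background for what surface detection in a shelf scene involves (this is neither our technique nor the baseline, just a common starting point), shelf surfaces are often extracted by repeatedly fitting planes to the point cloud with RANSAC, for example with Open3D:

```python
import open3d as o3d

# Standard iterative RANSAC plane extraction, shown for illustration only;
# our technique and the system we compare against differ from this sketch.

def extract_surfaces(cloud, max_surfaces=5, dist_thresh=0.01, min_points=200):
    surfaces = []
    remaining = cloud
    for _ in range(max_surfaces):
        if len(remaining.points) < min_points:
            break
        # Fit one plane to the remaining points and split off its inliers.
        _, inliers = remaining.segment_plane(
            distance_threshold=dist_thresh, ransac_n=3, num_iterations=1000)
        surfaces.append(remaining.select_by_index(inliers))
        remaining = remaining.select_by_index(inliers, invert=True)
    return surfaces
```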
Based on my interactions with non-expert users who tried our lab's programming by demonstration system, I wrote a new programming by demonstration system called Rapid PbD, which implements a variety of useful features.
Shortly after its development, Rapid PbD was used to teach a week-long programming workshop for a group of high school students with disabilities. The students were able to use their own laptops to access Rapid PbD and CodeIt to program a Fetch robot to do tasks in a mock grocery store setting.
During my summer 2015 internship at Savioke, I developed a web-based system for programming the Savioke Relay robot (screenshots and video below). It allows designers, technicians, and other users who are not software engineers to rapidly experiment with new use cases for the robot using a drag-and-drop interface. We wrote about the system and our findings from a user evaluation in an HRI 2016 paper. The system continues to be used for sales demos, trade shows, and internal testing, and it was also used by outside researchers conducting a study related to elder care.
An open-source version of the system is available under the name CodeIt. CodeIt has been implemented for the PR2, Fetch, and Turtlebot robots, and was used for robot programming workshops with high school students. We showed that we could integrate CodeIt, Custom Landmarks, and programming by demonstration into a single robot programming system, which we described and evaluated in the HRI 2017 paper, Code3: A System for End-to-End Programming of Mobile Manipulator Robots for Novices and Experts.
I presented these web components and <ros-rviz> at ROSCon 2017 in Vancouver, BC. My slides and a video recording of the talk are on the conference website.
Robot Web Server (RWS) is a web-based application launcher that I built to envision the end-user experience with the robot. RWS runs 24/7 on our lab's PR2 and Fetch robots and provides an interface for developers to add web-based applications to the robot. It allows us to launch applications such as Rapid PbD or CodeIt from any web browser, including on smartphones.
My second project during my Savioke internship was to build a system for visual global localization of the Relay robot. When Monte-Carlo localization (MCL) with a laser scanner fails, a typical recovery strategy is to distribute particles randomly across the map, but this often does not recover the robot's pose. In our system, the robot saved camera images from previous deliveries to a database, representing each image with features from a pre-trained convolutional neural network. At runtime, the robot searched the database for similar-looking images and seeded MCL based on the results. The database managed itself to keep data fresh and to limit its size on disk, which scaled with the area of the map rather than the amount of data collected. In 10 scenarios where MCL global localization failed, our system correctly localized the robot 7 times, or 9 times with a larger dataset. The system was implemented in C++ using OpenCV, Caffe, and ROS. With GPU support, it ran in real time, at approximately 5 ms per image lookup.
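The retrieval step is roughly sketched below in Python; the real system was C++ with Caffe and ROS, and the model file names, normalization, and database format here are placeholders. Each saved image is reduced to a CNN feature vector, and at failure time the most similar saved images supply poses around which MCL particles are seeded.

```python
import numpy as np
import cv2

# Rough sketch of the image-retrieval step; the production system was C++
# with Caffe/ROS, and the file names and pose format below are placeholders.
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "weights.caffemodel")

def extract_features(image_bgr):
    """Run the image through the pre-trained CNN and return a unit-norm vector."""
    blob = cv2.dnn.blobFromImage(image_bgr, scalefactor=1.0, size=(224, 224))
    net.setInput(blob)
    feat = net.forward().flatten()
    return feat / np.linalg.norm(feat)

def seed_poses(query_bgr, database, k=5):
    """database: list of (unit_feature_vector, map_pose) from past deliveries."""
    q = extract_features(query_bgr)
    # Rank saved images by cosine similarity to the query image.
    ranked = sorted(database, key=lambda entry: -float(np.dot(q, entry[0])))
    # The poses of the top-k matches are used to seed MCL particles.
    return [pose for _, pose in ranked[:k]]
```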
During Spring 2015, I worked on the Amazon Picking Challenge along with other members of the lab. We used a PR2 robot.
I was the primary developer on the team and worked on several components of the challenge.
In Winter 2015, I worked on a project investigating trigger-action programming interfaces. We worked to characterize the confusions people might have when creating and understanding trigger-action rules. I built a trigger-action programming interface and conducted two studies on Mechanical Turk. Our paper, Supporting Mental Model Accuracy in Trigger-Action Programming, was accepted to UbiComp 2015. A video of the interface is linked below.
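For readers unfamiliar with the format, a trigger-action rule pairs a condition with an action. The example below is hypothetical and only meant to show the shape of the rules participants worked with; it is not taken from the study materials.

```python
from dataclasses import dataclass

# Hypothetical example of a trigger-action rule, only to illustrate the
# rule format studied; not taken from the actual study materials.

@dataclass
class Rule:
    trigger: str  # e.g., "I arrive home" or "the front door opens"
    action: str   # e.g., "turn on the living room lights"

rule = Rule(trigger="I arrive home", action="turn on the living room lights")
print(f"IF {rule.trigger} THEN {rule.action}")
```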
In Autumn 2014, I was a co-author on The Privacy-Utility Tradeoff for Remotely Teleoperated Robots. I helped design privacy-preserving interfaces for teleoperation, run a user study, and analyze data from the study.
Prior to this (Spring 2014), I worked on a study for a human-robot interaction class in which I recorded users' interactions with a teleoperation interface and analyzed the difficulties they had. I wrote a paper about my results.