Chief Scientist, Amap (AutoNavi) @ Alibaba
Bellevue, WA, USA
x dot ren at alibaba-inc dot com
I am currently chief scientist of Amap (AutoNavi), an Alibaba subsidiary and China's leading mapping, navigation, and location-based service provider. I joined Alibaba in 2017, initially as chief scientist and associate dean of the Institute of Data Science and Technology (iDST), Alibaba's AI R&D division.
Alibaba has a fast-growing site in Bellevue (yes, I am still in sunny Seattle, and I love it), and I am hiring in computer vision (Seattle and the Bay Area, as well as Beijing and Hangzhou)!
(CV, Google Scholar).
Prior to Alibaba, I was a senior principal scientist at Amazon. From 2013 to 2017, I was the lead scientist at Amazon Go, using computer vision and machine learning to re-invent retail.
We launched the first Just Walk Out store, which automatically figures out purchases without customer effort, completely eliminating checkout (that "unnecessary" and annoying wait).
I was a research scientist at Intel Labs from 2008 to 2013, working closely with faculty and students at the University of Washington. From 2006 to 2008, I was a research assistant professor at the Toyota Technological Institute at Chicago (TTI-C). I received my Ph.D. from U.C. Berkeley in 2006, under the supervision of Jitendra Malik.
Code available: I have put together C++ implementations of both our kernel descriptor features (NIPS10) and the more recent hierarchical sparse coding features (NIPS11) in a live webcam demo, released here under a BSD license. What's more, the demo is now running on an Android phone!!
Progress on contour detection: move beyond Pb and use sparse codes to compute local oriented gradients. F=0.74 on BSDS500 (up from gPb's 0.71), a large step forward (human=0.80). Great RGB-D results: F=0.62 (vs. gPb's 0.53) on NYU Depth (v2).
Check out the demo video for our Ubicomp paper on fine-grained kitchen activity recognition and tracking. News article at New Scientist.
A C++ implementation of our kernel descriptor features (NIPS10, IROS11), ~5 times faster than the Matlab version, with a live webcam demo. Please try it out and let me know your comments and suggestions.
Upcoming papers at ISER, Ubicomp and the Robotics and Automation Magazine (RAM).
CVPR paper accepted on scene labeling on both RGB-D (indoor) and image-only (outdoor) scenes. Preprint available.
Liefeng's NIPS paper on hierarchical orthogonal matching pursuit for learning image features.
The (Matlab) code for kernel descriptors is now available. Please try it out!!
Check out our Ubicomp final video on interactive mapping on YouTube.
Co-organized the 2nd RGB-D workshop at RSS on advanced perception using depth cameras; it was a great success: 18 presentations, 7 demos, and over 70 attendees. All the papers, videos, and slides will be available online.
Two BMVC papers to appear on material recognition and video segmentation.
Two IROS papers to appear on depth kernel descriptors and object discovery based on scene changes.
Our interactive mapping work (online user interaction in real-time mapping) to appear at Ubicomp 2011.
Depth Enhancement via Low-rank Matrix Completion.
[pdf][project]
Si Lu, Xiaofeng Ren, Feng Liu, in CVPR 2014.
Ever unhappy with poor-quality depth data? Si and Feng have a solution!
Histograms of Sparse Codes for Object Detection.
[abstract][pdf][code]
Xiaofeng Ren and Deva Ramanan, at CVPR 2013.
Move beyond HOG! Use learned sparse code dictionaries to significantly improve object detection accuracy
Multipath Sparse Coding Using Hierarchical Matching Pursuit.
[abstract][pdf][code] Liefeng Bo, Xiaofeng Ren and Dieter Fox, at CVPR 2013.
Extend hierarchical matching pursuit (NIPS11) to a reconfigurable architecture that captures structures of varying scale and deformation
RGB-D Flow: Dense 3-D Motion Estimation Using Color and Depth.
[abstract][pdf][code] Evan Herbst, Xiaofeng Ren and Dieter Fox, at ICRA 2013.
General motion estimation using Kinect; extend variational optical flow to RGB-D data
Discriminatively Trained Sparse Code Gradients for Contour Detection.
[abstract][pdf][test code][training code]
Xiaofeng Ren and Liefeng Bo, at NIPS 2012.
Pushing the limit of local features for contour detection; a large step forward (0.71=>0.74 on BSDS500, human=0.8)
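The F numbers quoted above are the standard boundary-detection F-measure, the harmonic mean of precision and recall. A minimal sketch of the formula (illustrative values, not the benchmark's matching-based evaluation code):

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative: a detector with balanced P = R = 0.74 attains F = 0.74.
print(round(f_measure(0.74, 0.74), 2))  # 0.74
```

On BSDS500 the reported F is taken at the optimal dataset-scale threshold, with precision and recall computed by matching predicted boundary pixels to human annotations.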
Unsupervised Feature Learning for RGB-D Based Object Recognition.
[abstract][pdf][C++ code] Liefeng Bo, Xiaofeng Ren, and Dieter Fox, at ISER 2012.
Extending our fast feature learning using hierarchical matching pursuit (NIPS '11) to RGB-D data
RGB-(D) Scene Labeling: Features and Algorithms.
[abstract][pdf][code]
Xiaofeng Ren, Liefeng Bo, and Dieter Fox, at CVPR 2012.
RGB (and D) scene labeling, 76% on NYU Depth (up from 56%) and 83% on Stanford Background (from 79%)
Hierarchical Matching Pursuit for Image Classification: Architecture and Fast Algorithms.
[abstract][pdf][C++ code][Android code] Liefeng Bo, Xiaofeng Ren and Dieter Fox, at NIPS 2011.
Learning patch features using matching pursuit (sparse coding), 2 orders of magnitude faster than prior work
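Matching pursuit greedily encodes a patch as a sparse combination of dictionary atoms. A toy orthogonal matching pursuit sketch in that spirit (random dictionary and made-up signal for illustration; not the paper's implementation):

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: greedily select k dictionary atoms.

    D: (d, n) dictionary with unit-norm columns; x: (d,) signal.
    Returns a length-n sparse coefficient vector.
    """
    residual = x.copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coeffs[support] = sol
    return coeffs

rng = np.random.default_rng(0)
D = rng.normal(size=(8, 16))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
x = 2.0 * D[:, 3] - 1.0 * D[:, 7]       # signal built from two atoms
code = omp(D, x, k=2)
print(np.nonzero(code)[0])              # the recovered support
```

The hierarchical version in the paper applies such sparse coding recursively, pooling codes over increasingly larger regions; the batch tree-structured pursuit is what makes it orders of magnitude faster.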
Toward Robust Material Recognition for Everyday Objects.
[abstract][pdf] Diane Hu, Liefeng Bo, Xiaofeng Ren, at BMVC 2011.
Real-world material recognition, 54% on MIT Flickr Dataset (up from 45%)
Combining Self Training and Active Learning for Video Segmentation.
[pdf] Alireza Fathi, Maria Florina Balcan, Xiaofeng Ren and Jim Rehg, at BMVC 2011.
Simple but effective semi-supervised learning for segmenting out objects in video
RGB-D Object Discovery via Multi-Scene Analysis.
[pdf] Evan Herbst, Xiaofeng Ren and Dieter Fox, at IROS 2011.
Enable a robot to automatically discover and cluster objects via multiple visits to a scene; works on all objects
Interactive 3D Modeling of Indoor Environments with a Consumer Depth Camera.
[abstract][pdf][video] Hao Du, Peter Henry, Xiaofeng Ren, Marvin Cheng, Dan Goldman, Steve Seitz, Dieter Fox, at Ubicomp 2011.
First interactive system for RGB-D mapping, runs near real-time and allows user feedback and control on-the-spot
A Scalable Tree-based Approach for Joint Object and Pose Recognition.
[abstract][pdf] Kevin Lai,
Liefeng Bo, Xiaofeng Ren and
Dieter
Fox, at AAAI 2011.
Joint tree model for category, instance and pose recognition; sequential decision.
Object Recognition with Hierarchical Kernel Descriptors.
[abstract][pdf] Liefeng Bo, Kevin Lai, Xiaofeng Ren and Dieter Fox, at CVPR 2011.
Apply the kernel descriptor framework (NIPS10) at both the pixel->patch and patch->image levels
Learning to Recognize Objects in Egocentric Activities.
[abstract][pdf] Alireza Fathi, Xiaofeng Ren and Jim Rehg, at CVPR 2011.
Focus on object-in-hand in egocentric video; clustering and discovery of objects in activities
Sparse Distance Learning for Object Recognition Combining RGB and Depth Information.
(Best Vision Paper award)
[abstract][pdf] Kevin Lai, Liefeng Bo, Xiaofeng Ren and Dieter Fox, at ICRA 2011.
Local distance learning and feature selection using instance-to-class distance
Toward Object Discovery and Modeling via 3-D Scene Comparison.
[abstract][pdf] Evan Herbst, Xiaofeng Ren and Dieter Fox, at ICRA 2011.
How can a robot discover objects robustly? By revisiting a scene and detecting what has changed
Discriminative Mixture-of-Templates for Viewpoint Classification.
[abstract][pdf] Chunhui Gu and Xiaofeng Ren, at ECCV 2010, Crete, Greece, 2010.
First paper on discriminative models for viewpoint/pose recognition, large improvement: 57%=>74% on 3DObject
Manipulator and Object Tracking for In Hand Model Acquisition.
[pdf] Michael Krainin, Peter Henry, Xiaofeng Ren and Dieter Fox, at the Mobile Manipulation and Best Practices in Robotics Workshops at ICRA 2010.
Figure-Ground Segmentation Improves Handled Object Recognition in Egocentric Video.
[abstract][pdf][video][dataset]
Xiaofeng Ren and Chunhui Gu, at CVPR 2010, San Francisco, 2010.
Egocentric recognition can work!! ~90% accuracy on a very challenging dataset for objects-in-hand. Check out the videos.
Egocentric Recognition of Handled Objects: Benchmark and Analysis.
[abstract][pdf][dataset]
Xiaofeng Ren and Matthai Philipose, in Egovision Workshop '09, Miami, 2009.
Can we recognize objects in a user's hand? A large benchmark (43 objects, 2 hours of video) using a wearable camera
Multi-Scale Improves Boundary Detection in Natural Images.
[abstract][pdf]
Xiaofeng Ren, in ECCV '08, Marseille, 2008.
Multi-scale does help contour detection on natural images - extensive benchmarking and analysis
A Probabilistic Multi-scale Model for Contour Completion Based on Image Statistics.
[abstract][pdf][ps][bibtex]
Xiaofeng Ren and Jitendra Malik, in ECCV '02, volume 1, pages 312-327, Copenhagen 2002.