Develop an object recognition system that uses "parts" found by interest operators to recognize classes of objects. You may use any interest operator from the literature, together with the SIFT descriptor, which can describe regions identified by the SIFT detector or by another interest operator [1][2].
Your system should learn its object classes from training images. For each object, there should be a set of images that contain the object and another set that do not. From these, the system should learn which of the detected regions are likely parts of the object. We suggest you follow the learning methodology in Yi's papers [4-6] on the generative/discriminative approach. In phase 1, the EM algorithm [8] is used to cluster the parts; in phase 2, a discriminative classifier is trained to recognize the object. In Yi's work, the discriminative classifier was trained on vectors that gave, for each image, an aggregated response to each EM component. A related alternative is a vector that is a histogram of the components found in the image. The WEKA package [7] contains many useful classifiers; Yi used neural nets.
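The two per-image feature vectors described above can be sketched as follows. This is a minimal illustration, not Yi's implementation: it assumes phase 1 has already produced, for every detected region, a list of posterior probabilities over the K EM components, and it assumes the "aggregated response" is the maximum over regions (other aggregations, such as the mean, are equally plausible). The function names are hypothetical.

```python
from typing import List


def max_response_vector(posteriors: List[List[float]], k: int) -> List[float]:
    """Aggregated-response vector for one image.

    posteriors[i][j] is the posterior of EM component j for region i.
    ASSUMPTION: the aggregate is the max over regions for each component.
    """
    if not posteriors:
        return [0.0] * k
    return [max(region[j] for region in posteriors) for j in range(k)]


def histogram_vector(posteriors: List[List[float]], k: int) -> List[float]:
    """Histogram vector for one image.

    Each region votes for its most likely component; the counts are
    normalized so that images with different region counts are comparable.
    """
    counts = [0] * k
    for region in posteriors:
        best = max(range(k), key=lambda j: region[j])
        counts[best] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * k


# Toy image with 3 regions and K = 3 components (made-up numbers).
example = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
]
print(max_response_vector(example, 3))
print(histogram_vector(example, 3))
```

Either vector has one entry per EM component, so the same phase 2 classifier can be trained on it without modification.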
You are welcome to choose any image sets available online or to generate your own image set for the object recognition project. The Caltech image sets [3] are a good place to start. For instance, the system could be tested on the Caltech256 image set (motorbikes, faces, airplanes, and watches) and any other objects you wish to include. For each of the 4 chosen object categories, you should have a set of at least 200 images that include the object and an equal number that do not. In both sets, randomly select 2/3 of the images for training and leave 1/3 for testing. Your test will then show, for each object, the percentage of the test set that was correctly classified. Report these numbers and show examples of correctly and incorrectly classified images of each class.
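The split and the reported percentage can be sketched as below. This is only an illustration under stated assumptions: the file names are invented placeholders, the fixed seed is there to make the split reproducible, and the helper names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_train_test(
    items: Sequence[str], train_fraction: float = 2 / 3, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Randomly split items into (train, test) without overlap."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def percent_correct(true_labels: Sequence[int], predicted: Sequence[int]) -> float:
    """Percentage of test images whose predicted label matches the true label."""
    if len(true_labels) != len(predicted):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("test set is empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted))
    return 100.0 * correct / len(true_labels)


# Hypothetical file names: 200 images with the object, 200 without.
positives = [f"pos_{i:03d}.jpg" for i in range(200)]
negatives = [f"neg_{i:03d}.jpg" for i in range(200)]

# Split each set separately so both keep the 2/3 : 1/3 ratio.
pos_train, pos_test = split_train_test(positives)
neg_train, neg_test = split_train_test(negatives)
print(len(pos_train), len(pos_test))
```

Splitting the positive and negative sets separately, as the text specifies, keeps the training and test sets balanced between the two.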
Sample Timeline
Extra Credit
Develop a simple model for the spatial relationships among the parts and train your system to recognize objects according to both parts and relationships.
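One possible starting point, offered only as a sketch and not as a prescribed method, is to describe each pair of detected parts by a scale-normalized distance and an orientation, and to append these values to the per-image feature vector. The sketch assumes each part type has already been reduced to a single (x, y) center per image; the function name and the choice of normalization are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Tuple

Point = Tuple[float, float]


def pairwise_relations(centers: Dict[int, Point]) -> Dict[Tuple[int, int], Tuple[float, float]]:
    """Map each pair of part ids (a, b), a < b, to (normalized distance, angle).

    ASSUMPTION: distances are divided by the largest pairwise distance in the
    image, which makes them invariant to uniform scaling of the object.
    The angle is the direction from part a to part b in radians.
    """
    pairs = list(combinations(sorted(centers), 2))
    dists = {
        (a, b): math.hypot(centers[b][0] - centers[a][0], centers[b][1] - centers[a][1])
        for a, b in pairs
    }
    scale = max(dists.values(), default=0.0)
    relations = {}
    for (a, b), d in dists.items():
        dx = centers[b][0] - centers[a][0]
        dy = centers[b][1] - centers[a][1]
        relations[(a, b)] = (d / scale if scale > 0 else 0.0, math.atan2(dy, dx))
    return relations


# Toy example: three part centers lying on a line (made-up coordinates).
print(pairwise_relations({0: (0.0, 0.0), 1: (3.0, 4.0), 2: (6.0, 8.0)}))
```

The angle is not rotation invariant, so this sketch suits image sets in which objects appear in a roughly consistent orientation.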
Reference URLs
What your report should contain