Part-Based Models for Computer Vision

About:

This page is for the demonstration I gave for my Computer Vision class in Spring 2009.  It is a modified version of R. Fergus's Simple Parts and Structure Object Detector.

It learns to detect objects from a few manually trained examples by using the alignment of individual parts.  It also runs Expectation Maximization (EM) to improve the model.

 

Download:

 

Usage:

To run, extract all the files into a directory, then edit the value BASE_PATH in "common/config.m" to point to the new directory.

Then execute "run_test" from within the "common" dir.  This will prompt you to train the initial model on several training examples, and then run the "parts_overlap" detection method to find faces.  The program will run for a few iterations, then plot the final results.

Almost everything can be controlled from the config.m file.  To test the various detection models, uncomment the appropriate EXPERIMENT_TYPE line in config.m.

 

Additions/changes:

  • Iterative EM - run the models iteratively, using the best-scoring results from the previous iteration to train the next iteration.
  • HOG - get_hog.m computes a Histogram of Oriented Gradients (HOG) descriptor for any image.  HOG similarity computes a similarity metric between two HOG descriptors.
  • Use of HOG similarity to prevent part filters from degenerating over iterations.
  • Overlap test - a variant of the Hough transform that shifts the response images and finds the best overlap of response peaks across the different filters.  This method works best with EM.
  • Support for Caltech datasets -- just make a new directory under images, and it will convert the groundtruth for you when it resizes the images.
  • A few bug fixes, lots of abstraction and removal of redundant code.
  • More and better plots.
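
To make the HOG items above concrete, here is a minimal sketch in Python (the project itself is MATLAB; Python is used here only for illustration). It is not a port of get_hog.m: it builds a single unsigned orientation histogram over the whole patch, and it assumes cosine similarity as the metric, which may differ from what the actual code uses. The names orientation_histogram and hog_similarity are hypothetical.

```python
import math


def orientation_histogram(image, bins=9):
    """Single-cell, unsigned gradient-orientation histogram.

    `image` is a 2D list of grayscale values. A real HOG descriptor
    splits the image into cells and blocks; this sketch treats the
    whole patch as one cell (an assumption made for brevity).
    """
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central-difference gradients.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold the angle into [0, pi) so opposite directions share a bin.
            angle = math.atan2(gy, gx) % math.pi
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def hog_similarity(a, b):
    """Cosine similarity between two descriptors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0
```

A similarity score of this kind could be compared against a threshold after each iteration, so that a part filter drifting too far from its initial appearance is rejected; the threshold and the exact check used in the project are not specified on this page.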

 

Screenshots:

 

Example of training face:

 

Heatmap responses.  The final plot shows the face correctly detected by shifting all responses and taking the max of mins across all responses.
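
The shift-and-overlap idea in this caption can be sketched as follows, in Python for illustration only. The sketch assumes each part has one response map (a 2D list) and one expected integer offset from the object center; the function name overlap_detect and the handling of out-of-bounds positions are assumptions, not details taken from the project's "parts_overlap" code.

```python
def overlap_detect(responses, offsets):
    """Find the object center by the max of mins across shifted responses.

    responses: list of 2D lists, one filter response map per part.
    offsets:   list of (dy, dx), each part's expected position relative
               to the object center (hypothetical representation).
    Returns ((y, x), score), or (None, -inf) if no center is valid.
    """
    rows, cols = len(responses[0]), len(responses[0][0])
    best_score, best_center = float("-inf"), None
    for y in range(rows):
        for x in range(cols):
            # Score of a candidate center = weakest part response there.
            score = float("inf")
            for response, (dy, dx) in zip(responses, offsets):
                py, px = y + dy, x + dx
                inside = 0 <= py < rows and 0 <= px < cols
                value = response[py][px] if inside else float("-inf")
                score = min(score, value)
            if score > best_score:
                best_score, best_center = score, (y, x)
    return best_center, best_score
```

Taking the minimum means a candidate center only scores well when every part responds near its expected position, so a single strong but isolated peak in one response map does not win.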

 

 

Results ranked by score and colored by category:

 

Model part positions over iterations.  Variance generally decreases with iterations:

Relative Part Locations
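
One way to picture the per-iteration update of part positions is the sketch below, written in Python for illustration. It assumes that each iteration keeps the highest-scoring detections and re-estimates the mean and variance of a part's offset from them; the function name reestimate_part_offset, the keep parameter, and the use of a plain mean and population variance are all assumptions rather than the project's actual EM update.

```python
from statistics import mean, pvariance


def reestimate_part_offset(detections, keep=3):
    """Re-estimate one part's relative position from the best detections.

    detections: list of (score, (dy, dx)) pairs for a single part, where
                (dy, dx) is the part's offset from the object center.
    keep:       how many top-scoring detections to train on (hypothetical).
    Returns ((mean_dy, mean_dx), (var_dy, var_dx)).
    """
    top = sorted(detections, key=lambda d: d[0], reverse=True)[:keep]
    dys = [offset[0] for _, offset in top]
    dxs = [offset[1] for _, offset in top]
    return (mean(dys), mean(dxs)), (pvariance(dys), pvariance(dxs))
```

Discarding low-scoring detections before re-estimating is one plausible reason for the variance shrinking over iterations, since outlying part positions stop contributing to the model.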

 

Filters over iterations.  Noisy filters converge into smoothed composites over iterations, with the largest change happening between iterations 1 and 2.

 

3D plot of recall-precision (RPC) curves over iterations.  Because only one guess is taken per image, recall does not necessarily reach 1, since an incorrect match may rank first.
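
The effect on recall can be seen in a small Python sketch of how a recall-precision curve is built from one detection per image. The function name rpc_curve and the (score, is_correct) representation are hypothetical and chosen for illustration; the project's own plotting code may compute the curve differently.

```python
def rpc_curve(detections, num_positives):
    """Recall-precision points from ranked detections.

    detections:    list of (score, is_correct), one entry per image.
    num_positives: number of images that actually contain the object.
    Returns a list of (recall, precision), one point per rank.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    curve = []
    true_positives = 0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        true_positives += int(is_correct)
        recall = true_positives / num_positives
        precision = true_positives / rank
        curve.append((recall, precision))
    return curve


# Four images, each containing the object, one guess per image.
points = rpc_curve(
    [(0.9, True), (0.8, False), (0.7, True), (0.2, False)],
    num_positives=4,
)
print(points[-1][0])  # final recall: 0.5
```

With a single guess per image, an image whose guess is wrong can never contribute a true positive, so the final recall equals the fraction of images detected correctly.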

RPC Curves Over Iterations