Conventional Image Processing Methods

Posted: 6 January 2016

I was first introduced to Computer Vision in October 2015 and found it fascinating. My first Computer Vision project was: Where’s Wally?

Where's Wally?

I was new to this field and had to do quite a bit of reading to figure out how to find Wally. I read about Computer Vision and found some promising methods that I could use to find Wally:

  1. SIFT
  2. SURF
  3. Template Matching
  4. Image Gradients
  5. Canny Edge Detection
  6. Histogram of Oriented Gradients
  7. Hough Line Transform
  8. Hough Circle Transform
  9. Gabor Filters

I experimented with all of these methods using OpenCV in Python and found that Gabor Filters gave the best results, followed by Template Matching. Below, I describe how I found Wally.

Gabor Filter

I thought the best way to approach this problem would be to try and convert human intuition into code. What would I do when trying to find Wally manually? Well, I would first look out for his red and white t-shirt. This worked well for simpler images, but in certain images, the only thing visible is his face and maybe a little of his hat. In those pictures, I looked out for his spectacles. From this, I concluded that Wally's most defining feature is his spectacles. With that, I set out to explore how to construct a spectacles filter.

The simplest spectacles filter would be a matrix of 1’s and 0’s written out by hand, with the 1’s tracing the shape of the frames. That would work, but I preferred something with more theoretical grounding. I chanced upon some work on the Gabor annulus (a ring-shaped Gabor filter) and thought that would be a good fit.

Gabor Spectacles
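To make the idea concrete, here is a minimal sketch of a ring-shaped, Gabor-style kernel and a two-lens spectacles filter built from it. It is only an illustration: the exact Gabor annulus formula from the work mentioned above is not given in this post, so the radial form used here (a Gaussian envelope around a ring, multiplied by a cosine along the radius) is an assumption, and the function names and every parameter value (`size`, `radius`, `sigma`, `wavelength`, `gap`) are hypothetical.

```python
import numpy as np


def gabor_annulus(size, radius, sigma, wavelength):
    """Ring-shaped kernel (assumed form, not taken from the post).

    A Gaussian envelope centred on the circle r = radius is multiplied by a
    cosine carrier that varies along the radial direction. `size` should be odd.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r = np.hypot(x, y)              # distance of each pixel from the centre
    d = r - radius                  # signed distance from the ring
    kernel = np.exp(-d ** 2 / (2 * sigma ** 2)) * np.cos(2 * np.pi * d / wavelength)
    return kernel - kernel.mean()   # zero mean: flat regions give no response


def spectacles_filter(size=15, radius=5.0, sigma=1.5, wavelength=6.0, gap=1):
    """Two rings side by side, separated by `gap` columns of zeros."""
    lens = gabor_annulus(size, radius, sigma, wavelength)
    bridge = np.zeros((size, gap))
    return np.hstack([lens, bridge, lens])
```

With these defaults, `spectacles_filter()` returns a 15 x 31 array. The values would need hand tuning so that the ring radius matches the size of Wally's lenses in the image at hand.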

The spectacles alone are insufficient, since Wally also has a distinctive skin colour. The most direct way to estimate his average skin colour would be to manually sample a pixel in each of the 15 images. However, I didn’t do that as it was tedious. Instead, I used k-means with k = 3 (KMeans-3) to find 3 defining colours, and took the middle one to be his defining colour. With that, our face filter is complete!

15 Samples

KMeans 3

Face Filters!

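It’s now time to slide the face filter over the entire image and find the areas where the match is strongest. I used three different metrics (SQDIFF_NORMED, CCOEFF_NORMED and plain convolution) and linearly combined the results. Note that SQDIFF_NORMED measures a difference, so a lower value means a better match, unlike the other two.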


Surprisingly, it worked very well! I’m quite impressed by it. However, the approach is quite manual, as it requires hand-tuned filters. When the scale of the image changes, the filters have to be changed too, and even then a hit is not guaranteed. I’ve read about how amazing Neural Networks are. I should try them on Wally.