Posted: 4 June 2016
Part 1: 4th June 2016
For the longest time, I had this hypothesis:
Take a video of an object from all perspectives, segment each frame to extract only the object, and train a neural network on the resulting images. The network should then be able to recognize the object from any orientation.
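As a rough sketch of the data-preparation half of that hypothesis: the naive background-subtraction segmentation, the function names `segment_object` and `build_dataset`, and the synthetic frames below are all assumptions made for illustration, not the setup used in this experiment.

```python
import numpy as np

def segment_object(frame, background, threshold=30):
    """Keep only pixels that differ from the background (naive background subtraction)."""
    # Use a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    mask = diff.max(axis=-1) > threshold  # True where any channel differs enough
    return frame * mask[..., None].astype(frame.dtype)

def build_dataset(frames, background, label):
    """Turn the frames of one video into (segmented image, label) training pairs."""
    images = np.stack([segment_object(f, background) for f in frames])
    labels = np.full(len(frames), label, dtype=np.int64)
    return images, labels

# Stand-in for a real video: a bright square moving across a grey background.
background = np.full((32, 32, 3), 100, dtype=np.uint8)
frames = []
for x in range(0, 24, 4):
    frame = background.copy()
    frame[8:16, x:x + 8] = 220
    frames.append(frame)

images, labels = build_dataset(frames, background, label=0)
```

The resulting `images` and `labels` arrays are what would then be fed to whatever classifier is being trained. A real video would need a far more robust segmentation step than a fixed threshold.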
I seem to have disproved this hypothesis today. Perhaps Spatial Transformer Networks (Jaderberg et al.) would help with this problem. I should try them soon.
Here are the images I tried: