The basic features that we used for correlation were corners. We tried several techniques for detecting them. The first was the method proposed by Rangarajan, Shah and Van Brackle. The basic idea of that method is to classify the types of corners that occur in an image by considering the quadrants occupied by the cone portion of the corner. For that purpose they defined 12 kernels for the possible corner types. We found that designing kernels for the different corner types and then convolving the image with each of them was too costly for our purpose. Moreover, matching does not require every kind of corner to be detected.

We therefore next tried simpler kernels that respond only to right-angle corners. The kernels were 9x9 windows with 1's along the two sides of the right angle and zeros everywhere else, with one kernel for each rotation of the right angle. We then performed template matching with these kernels. This method was computationally more efficient than the previous one, since the binary kernels required no multiplication. However, the edge images contained many parallel lines that lay so close together that they caused false corner detections.

Finally, we tried a simple heuristic approach. The observation that motivated it was that our edges are a single pixel thick, which makes edge following easy. The method follows the edge pixels inside a 13x13 window and looks for a connected component that enters the window from its east or west side and leaves it from its north or south side. We discard components whose length is less than the side length of the window. In this way we avoid detecting duplicate corners and false corners. The method also ran quickly in practice.
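The window test used by the heuristic detector can be illustrated with a short Python sketch. This is not our original implementation but a minimal reconstruction under stated assumptions: the window is a square list of lists holding a binary, single-pixel-thick edge map, connectivity is 8-neighbour, the "length" of a component is approximated by its pixel count, and the four corner pixels of the window are not counted as belonging to any side. The names `window_components` and `is_corner_window` are hypothetical.

```python
from collections import deque


def window_components(window):
    """Return the 8-connected components of nonzero pixels as lists of (row, col)."""
    n_rows, n_cols = len(window), len(window[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    components = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not window[r][c] or seen[r][c]:
                continue
            # Breadth-first edge following from an unvisited edge pixel.
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < n_rows and 0 <= nx < n_cols
                                and window[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(comp)
    return components


def is_corner_window(window):
    """True if some component touches the east/west side and the north/south side.

    Assumption: components with fewer pixels than the window side length are
    discarded, and window corner pixels do not count toward any side.
    """
    size = len(window)
    last = size - 1
    for comp in window_components(window):
        if len(comp) < size:
            continue  # too short to be a reliable corner
        touches_ew = any(x in (0, last) and 0 < y < last for y, x in comp)
        touches_ns = any(y in (0, last) and 0 < x < last for y, x in comp)
        if touches_ew and touches_ns:
            return True
    return False


def empty_window(size=13):
    return [[0] * size for _ in range(size)]


# An L-shaped edge: enters from the west along row 6, then leaves to the north.
corner = empty_window()
for x in range(7):
    corner[6][x] = 1
for y in range(7):
    corner[y][6] = 1

# A straight horizontal edge: enters from the west and leaves to the east.
line = empty_window()
for x in range(13):
    line[6][x] = 1

print(is_corner_window(corner), is_corner_window(line))
```

In a full detector this test would be applied to a window slid over the edge image. How the window is stepped and how a corner location is assigned within it are not specified here and are left open in this sketch.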
One improvement that we introduced in both the corner detection and the matching stages was an assumption about the direction of motion. Since the dominant motion was in the x direction, we limited the area in which we detect features. Because of the movement, some features fall outside the next image; detecting these features is unnecessary, since they cannot be matched.
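One way to picture this restriction is as a range of image columns in which detection is allowed. The sketch below is illustrative only. It assumes a hypothetical bound `max_shift` on the horizontal displacement in pixels and a flag `moves_left` stating whether image content shifts toward smaller x between frames. Neither the bound nor the sign convention comes from our experiments, and the function names are made up for the example.

```python
def detection_columns(width, max_shift, moves_left=True):
    """Return the half-open column range [x_min, x_max) to search for features.

    Assumption: content shifts by at most `max_shift` pixels along x, so a
    feature closer than that to the trailing image border may leave the next
    image and is not worth detecting.
    """
    if not 0 <= max_shift < width:
        raise ValueError("max_shift must satisfy 0 <= max_shift < width")
    if moves_left:
        # Features near the left border may move out of the next image.
        return max_shift, width
    # Features near the right border may move out of the next image.
    return 0, width - max_shift


def filter_features(features, width, max_shift, moves_left=True):
    """Keep only the (x, y) features that lie inside the allowed column range."""
    x_min, x_max = detection_columns(width, max_shift, moves_left)
    return [(x, y) for x, y in features if x_min <= x < x_max]


# Hypothetical 640-pixel-wide image with an assumed shift of at most 40 pixels.
candidates = [(10, 5), (100, 5), (630, 5)]
print(filter_features(candidates, width=640, max_shift=40))
```

Applying the range before detection, rather than filtering afterwards as in the usage line, saves the cost of running the detector over the excluded columns.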