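The main idea of HOG (Histograms of Oriented Gradients) is to describe a keypoint using the gradients of the pixels around it. A representative algorithm is SIFT (Scale-Invariant Feature Transform), whose computation proceeds roughly as follows: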
1. First, keypoints are detected in the image using an approach called "Laplacian-of-Gaussian (LoG)", which is based on second-degree intensity derivatives. The LoG is applied to various scale levels of the image and tends to detect blobs instead of corners. In addition to a unique scale level, keypoints are also assigned an orientation based on the intensity gradients in a local neighborhood around the keypoint.
2. Second, for every keypoint, its surrounding area is transformed by removing the orientation, thus ensuring a canonical orientation. The area is also resized to 16 x 16 pixels, providing a normalized patch.
3. Third, the orientation and magnitude of each pixel within the normalized patch are computed based on the intensity gradients Ix and Iy.
4. Fourth, the normalized patch is divided into a grid of 4 x 4 cells. Within each cell, the orientations of pixels which exceed a threshold on magnitude are collected in a histogram consisting of 8 bins.
5. Last, the 8-bin histograms of all 16 cells are concatenated into a 128-dimensional vector (the descriptor), which is used to uniquely represent the keypoint.
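The last three steps (per-pixel gradients, 4 x 4 cells, 8-bin histograms) can be sketched in plain Python. This is only an illustrative sketch under simplifying assumptions: the patch is already rotated and resized to 16 x 16, gradients use clamped central differences, and every pixel above the threshold casts one unweighted vote. The function name `patch_descriptor` and the parameter `mag_threshold` are hypothetical and do not come from SIFT or OpenCV.

```python
import math


def patch_descriptor(patch, mag_threshold=0.0):
    """Build a 128-element orientation descriptor from a 16x16 patch.

    Simplified illustration only: one unweighted vote per pixel, no
    interpolation between bins, no final normalization.
    """
    size, cell_px, bins = 16, 4, 8
    assert len(patch) == size and all(len(r) == size for r in patch)

    # hist[cell_row][cell_col][bin]: 4 x 4 cells, 8 orientation bins each.
    hist = [[[0.0] * bins for _ in range(4)] for _ in range(4)]

    for y in range(size):
        for x in range(size):
            # Central differences Ix, Iy, clamped at the patch border.
            ix = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            iy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]

            magnitude = math.hypot(ix, iy)
            if magnitude <= mag_threshold:
                continue  # pixels with weak gradients do not vote

            # Orientation mapped to [0, 2*pi), then to one of 8 bins.
            angle = math.atan2(iy, ix) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            hist[y // cell_px][x // cell_px][b] += 1.0

    # Concatenate the 16 histograms: 16 cells * 8 bins = 128 values.
    return [v for cell_row in hist for cell_hist in cell_row for v in cell_hist]
```

For a patch whose intensity increases only from left to right, every gradient points along the positive x axis, so all votes land in the first bin of each cell.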
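The gradient-orientation-histogram descriptor above copes well with changes in scale, rotation, brightness, and contrast, but it is slow, and commercial use historically required paying patent fees (the SIFT patent expired in March 2020). Binary descriptors are faster and can run in real time with only a slight drop in accuracy, so they are more promising in practice. Their main difference from HOG is that they use the intensity values of the surrounding pixels rather than the gradients, which makes them faster to compute and to match. For the details of each algorithm, consult the respective papers; the aim here is only to introduce the concept. GitHub: SFND_2D_Feature_Tracking shows how to apply them and compares their performance. Currently, the most popular binary descriptors are BRIEF, BRISK, ORB, FREAK and AKAZE, the binary variant of KAZE (all available in the OpenCV library).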
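To make the idea of an intensity-based binary descriptor concrete, here is a minimal BRIEF-style sketch in plain Python. It is an assumption-laden illustration, not the algorithm as implemented in OpenCV: the sampling pattern is uniformly random, there is no smoothing or orientation compensation, and the helper names `make_test_pairs` and `binary_descriptor` are hypothetical.

```python
import random


def make_test_pairs(n_bits, patch_size, seed=0):
    """Pick n_bits random pairs of (row, col) positions inside the patch."""
    rng = random.Random(seed)

    def pick():
        return (rng.randrange(patch_size), rng.randrange(patch_size))

    return [(pick(), pick()) for _ in range(n_bits)]


def binary_descriptor(patch, pairs):
    """Pack one bit per pair: 1 if the first pixel is darker than the second."""
    bits = 0
    for (y1, x1), (y2, x2) in pairs:
        bit = 1 if patch[y1][x1] < patch[y2][x2] else 0
        bits = (bits << 1) | bit
    return bits


# Usage: the same pairs must be reused for every keypoint so that
# descriptors from different patches are comparable bit by bit.
pairs = make_test_pairs(n_bits=32, patch_size=8)
patch = [[(3 * r + 5 * c) % 17 for c in range(8)] for r in range(8)]
descriptor = binary_descriptor(patch, pairs)
```

Because each bit only records which of two pixels is darker, adding a constant brightness offset to the whole patch leaves the descriptor unchanged, and no gradients need to be computed at all.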
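What is the matching criterion? It is the difference between keypoint descriptors: the smaller the difference between two descriptors, the better the match. The descriptors described above are vectors of a fixed length, so the difference is computed element by element and then accumulated. There are three main measures: SAD (sum of absolute differences), SSD (sum of squared differences), and HD (Hamming distance, used mainly for binary descriptors).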
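The three distance measures are short enough to write out directly. The sketch below assumes that gradient-based descriptors are lists of numbers and that binary descriptors are packed into Python integers. The brute-force helper `best_match` is a hypothetical name added for illustration, not a library function.

```python
def sad(a, b):
    """Sum of absolute differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences (squared L2 distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hamming(a, b):
    """Hamming distance: number of bit positions where a and b differ."""
    return bin(a ^ b).count("1")


def best_match(query, candidates, distance):
    """Return the index of the candidate with the smallest distance."""
    return min(range(len(candidates)),
               key=lambda i: distance(query, candidates[i]))


# Usage: match one descriptor against a small set of candidates.
index = best_match([1, 1], [[5, 5], [1, 2], [0, 0]], sad)
```

The Hamming distance reduces to an XOR followed by a bit count, which is a large part of why binary descriptors are so cheap to match.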
Udacity Sensor Fusion course notes