Abstract:
A first video frame and a second video frame in a stream of digital video frames are encoded by extracting, for each frame, a respective set of keypoints and descriptors, with each descriptor including a plurality of orientation histograms for a patch of pixels centred on the respective keypoint. A pair of linked descriptors is identified, one for the first frame and one for the second frame, having the minimum distance among the distances between any one of the descriptors of the first frame and any one of the descriptors of the second frame. The differences between the histograms of the linked descriptors are then calculated, and the linked descriptors are encoded as the set including one of the linked descriptors and the histogram differences. The histogram differences are subjected to thresholding, which sets to zero all the differences below a certain threshold, then to quantization, and then to run-length encoding. The run-length encoding is followed by a further encoding chosen from among a Huffman encoding, an arithmetic encoding, and a type encoding.
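The pairing, differencing, thresholding, quantization, and run-length steps described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes each descriptor is a flat list of histogram bin values, uses an L1 distance, applies the threshold to the magnitude of each difference, and uses a uniform quantization step. The function names and the `threshold` and `step` parameters are hypothetical, and the final entropy-coding stage is omitted.

```python
from itertools import product


def l1_distance(a, b):
    # Assumed distance measure; the abstract does not name one.
    return sum(abs(x - y) for x, y in zip(a, b))


def find_linked_pair(desc_first, desc_second):
    """Return indices (i, j) of the closest descriptor pair across the two frames."""
    return min(
        product(range(len(desc_first)), range(len(desc_second))),
        key=lambda ij: l1_distance(desc_first[ij[0]], desc_second[ij[1]]),
    )


def threshold_and_quantize(diffs, threshold, step):
    """Zero out small differences, then quantize the rest with a uniform step."""
    out = []
    for d in diffs:
        if abs(d) < threshold:  # assumption: threshold applies to magnitude
            out.append(0)
        else:
            out.append(int(round(d / step)))
    return out


def run_length_encode(values):
    """Collapse consecutive equal symbols into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


def encode_linked_pair(desc_first, desc_second, threshold=2, step=2):
    """Encode the linked pair as one descriptor plus run-length coded differences."""
    i, j = find_linked_pair(desc_first, desc_second)
    diffs = [b - a for a, b in zip(desc_first[i], desc_second[j])]
    symbols = threshold_and_quantize(diffs, threshold, step)
    return i, j, desc_first[i], run_length_encode(symbols)


# Toy descriptors: two per frame, four histogram bins each.
first = [[10, 0, 5, 5], [0, 0, 0, 40]]
second = [[0, 30, 0, 0], [11, 0, 5, 9]]
result = encode_linked_pair(first, second)
```

Thresholding before run-length encoding tends to produce long runs of zeros, which is what makes the run-length stage effective on the histogram differences.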
Abstract:
One embodiment is a method for selecting and grouping key points extracted by applying a feature detector to a scene being analyzed. The method includes grouping the extracted key points into clusters that enforce a geometric relation between members of a cluster, scoring and sorting the clusters, identifying and discarding clusters that consist of points representing background noise in the image, and sub-sampling the remaining clusters to provide a smaller number of key points for the scene.
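The four stages named above (group, score and sort, discard, sub-sample) can be sketched as follows. The abstract does not specify the geometric relation, the scoring rule, or the noise test, so every concrete choice here is an assumption made for illustration: key points are `(x, y, response)` tuples, clusters are formed by a shared grid cell, a cluster's score is the sum of its detector responses, and clusters scoring below `min_score` are treated as background noise. The function name and all parameters are hypothetical.

```python
from collections import defaultdict


def select_keypoints(keypoints, cell=32, min_score=1.0, keep_per_cluster=2):
    """Reduce a list of (x, y, response) key points via clustering and sub-sampling."""
    # 1. Group: key points falling in the same grid cell form one cluster
    #    (a stand-in for the unspecified geometric relation).
    clusters = defaultdict(list)
    for x, y, response in keypoints:
        clusters[(int(x // cell), int(y // cell))].append((x, y, response))

    # 2. Score and sort: assumed score is the summed detector response.
    scored = sorted(
        ((sum(r for _, _, r in pts), pts) for pts in clusters.values()),
        key=lambda sp: sp[0],
        reverse=True,
    )

    # 3. Discard: low-scoring clusters are treated as background noise.
    kept = [pts for score, pts in scored if score >= min_score]

    # 4. Sub-sample: keep only the strongest points of each remaining cluster.
    selected = []
    for pts in kept:
        strongest = sorted(pts, key=lambda p: p[2], reverse=True)
        selected.extend(strongest[:keep_per_cluster])
    return selected


points = [
    (1, 1, 0.9), (2, 3, 0.8), (5, 5, 0.7),  # dense, strong cluster
    (100, 100, 0.2),                         # isolated weak point
    (200, 40, 0.6), (201, 41, 0.5),          # second cluster
]
reduced = select_keypoints(points)
```

In this toy input the isolated weak point is dropped as noise and the dense cluster is trimmed to its two strongest members, leaving four key points out of six.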