Abstract:
Disclosed embodiments are directed to methods, systems, and circuits for generating compact descriptors for transmission over a communications network. A method according to one embodiment includes receiving an uncompressed descriptor, performing zero-thresholding on the uncompressed descriptor to generate a zero-threshold-delimited descriptor, quantizing the zero-threshold-delimited descriptor to generate a quantized descriptor, and coding the quantized descriptor to generate a compact descriptor for transmission over a communications network. The uncompressed and compact descriptors may be 3D descriptors, such as where the uncompressed descriptor is a SHOT descriptor. The coding operation can be, for example, ZeroFlag coding, ExpGolomb coding, or Arithmetic coding.
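The three stages named above (zero-thresholding, quantization, coding) can be illustrated with a minimal Python sketch. It assumes descriptor components are non-negative and normalized to [0, 1], uses a uniform quantizer, and picks order-0 Exp-Golomb as the coder; the function names, the threshold of 0.1, and the 8 quantization levels are hypothetical choices for illustration, not values taken from the disclosure.

```python
def zero_threshold(values, threshold):
    """Set every component whose magnitude is below `threshold` to zero."""
    return [0.0 if abs(v) < threshold else v for v in values]


def quantize(values, levels):
    """Uniformly map components in [0, 1] to integers in [0, levels - 1].

    Assumption: components are non-negative and at most 1.
    """
    return [min(levels - 1, int(v * levels)) for v in values]


def exp_golomb(n):
    """Order-0 Exp-Golomb code word for a non-negative integer, as a bit string."""
    binary = bin(n + 1)[2:]                      # binary form of n + 1
    return "0" * (len(binary) - 1) + binary      # prefix of len - 1 zeros


def encode_descriptor(values, threshold=0.1, levels=8):
    """Zero-threshold, quantize, then Exp-Golomb code a descriptor."""
    quantized = quantize(zero_threshold(values, threshold), levels)
    return "".join(exp_golomb(q) for q in quantized)
```

Because zero is coded with the single bit "1", a descriptor with many components forced to zero by the thresholding step yields a short bit string, which is the reason for combining the two steps in this sketch.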
Abstract:
Searches performed in a database using image descriptors of query images are managed via a mobile communication device, such as a smartphone, a tablet, etc., by: extracting at the mobile device grayscale and color descriptors of query images; sending the grayscale descriptors as compressed grayscale descriptors of query images from the mobile device to a server for searching in the database; receiving at the mobile device results of the search; and using the color descriptors of query images in disambiguating the results by: i) sending the color descriptors as compressed color descriptors of query images from the mobile device to the server and receiving at the mobile device disambiguated search results from the server, or ii) receiving at the mobile device non-disambiguated search results from the server and disambiguating the search results by means of the color descriptors extracted at the mobile device to produce disambiguated search results.
Abstract:
A neural network classifies an input signal. For example, an accelerometer signal may be classified to detect human activity. In a first convolutional layer, two-valued weights are applied to the input signal. In a first two-valued function layer coupled at input to an output of the first convolutional layer, a two-valued function is applied. In a second convolutional layer coupled at input to an output of the first two-valued function layer, weights of the second convolutional layer are applied. In a fully-connected layer coupled at input to an output of the second convolutional layer, two-valued weights of the fully-connected layer are applied. In a second two-valued function layer coupled at input to an output of the fully-connected layer, a two-valued function of the second two-valued function layer is applied. A classifier classifies the input signal based on an output signal of the second two-valued function layer.
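The layer chain described above can be traced with a small forward-pass sketch in Python. It assumes a one-dimensional signal, a single kernel per convolutional layer, the sign function as the two-valued function, and an argmax as the classifier; all of these, along with the function names, are illustrative assumptions rather than details given in the abstract.

```python
def sign(x):
    """Two-valued function: map a real number to +1 or -1 (0 maps to +1)."""
    return 1 if x >= 0 else -1


def conv1d(signal, kernel):
    """Valid 1-D sliding dot product of a signal with one kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def dense(inputs, weight_rows):
    """Fully-connected layer: one dot product per row of weights."""
    return [sum(x * w for x, w in zip(inputs, row)) for row in weight_rows]


def classify(signal, conv1_kernel, conv2_kernel, fc_weights):
    """Run the layer chain and return (two-valued outputs, class index)."""
    h = conv1d(signal, conv1_kernel)   # first conv layer, weights in {-1, +1}
    h = [sign(v) for v in h]           # first two-valued function layer
    h = conv1d(h, conv2_kernel)        # second conv layer
    h = dense(h, fc_weights)           # fully-connected layer, weights in {-1, +1}
    h = [sign(v) for v in h]           # second two-valued function layer
    # Hypothetical classifier: index of the largest output, first one on ties.
    label = max(range(len(h)), key=lambda i: h[i])
    return h, label
```

With two-valued weights and activations, every multiplication in the later layers is between values in {-1, +1}, which is what makes such networks attractive for low-power devices that carry accelerometers.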
Abstract:
First and second video frames in a flow of digital video frames are encoded by extracting respective sets of keypoints and descriptors, each descriptor including a plurality of orientation histograms regarding a patch of pixels centered on the respective keypoint. Once a pair of linked descriptors has been identified, one for each frame, which have a minimum distance from among the distances between any one of the descriptors of the first frame and any one of the descriptors of the second frame, the differences of the histograms of the descriptors linked in the pair are calculated, and the descriptors linked in the pair are encoded as the set including one of the linked descriptors and the histogram differences by subjecting the histogram differences to thresholding, which sets to zero all the differences below a certain threshold, to quantization, and to run-length encoding.
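The encoding of the histogram differences for one linked pair can be sketched as follows. The sketch covers only the difference, thresholding, quantization, and run-length steps and leaves out keypoint extraction and pair linking. Histograms are treated as flat lists of bin values, the threshold is applied to the magnitude of each difference, and the quantizer is a uniform one with a hypothetical `step` parameter; these are assumptions made for illustration.

```python
def histogram_differences(reference, linked):
    """Bin-by-bin difference between two linked descriptors' histograms."""
    return [b - a for a, b in zip(reference, linked)]


def threshold_and_quantize(diffs, threshold, step):
    """Zero differences with magnitude below `threshold`, then quantize
    the rest to integer multiples of `step` (uniform quantizer, assumed)."""
    return [0 if abs(d) < threshold else round(d / step) for d in diffs]


def run_length_encode(symbols):
    """Collapse consecutive equal symbols into (symbol, run length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            runs.append((s, 1))
    return runs


def encode_pair(reference, linked, threshold, step):
    """Encode a linked pair as the reference descriptor plus coded differences."""
    diffs = histogram_differences(reference, linked)
    return reference, run_length_encode(
        threshold_and_quantize(diffs, threshold, step))
```

Thresholding tends to produce long runs of zeros when the two frames are similar, and those runs are exactly what run-length encoding compresses well.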
Abstract:
An image processing system includes a first processor that acquires frames of image data. For each frame of data, the first processor generates a Gaussian pyramid for the frame of data, extracts histogram of oriented gradient (HOG) descriptors for each level of the Gaussian pyramid, compresses the HOG descriptors, and sends the compressed HOG descriptors. A second processor is coupled to the first processor and is configured to receive the compressed HOG descriptors, aggregate the compressed HOG descriptors into windows, compare data of each window to at least one stored model, and generate output based upon the comparison.
Abstract:
Image-processing apparatus and methods to adaptively vary an interest point threshold value and control a number of interest points identified in an image frame are described. Sub-regions of an image frame may be processed in a sequence, and an interest point threshold value calculated for each sub-region. The calculated value of the interest point threshold may depend upon pre-selected values and values determined from the processing of one or more prior sub-regions. By using adaptive thresholding, a number of interest points detected for each frame in a sequence of image frames may remain substantially constant, even though objects within the frames may vary appreciably.
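One way to picture the adaptive thresholding described above is a simple feedback rule, sketched here in Python. The abstract does not give the update formula, so the proportional rule below, the clamping bounds, and every name (`next_threshold`, `process_frame`, `target_per_region`) are assumptions chosen for illustration; the `detect` callback stands in for whatever interest point detector is used.

```python
def next_threshold(current, detected, target, t_min=1.0, t_max=255.0):
    """Scale the threshold by detected/target and clamp it to [t_min, t_max].

    Too many points in the prior sub-region raises the threshold;
    too few lowers it. This rule is a hypothetical example.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = max(detected, 1) / target   # avoid collapsing to zero
    return min(t_max, max(t_min, current * ratio))


def process_frame(sub_regions, detect, initial_threshold, target_per_region):
    """Process sub-regions in sequence, adapting the threshold after each.

    Returns the total number of interest points and the threshold
    that was used for each sub-region.
    """
    threshold = initial_threshold       # pre-selected starting value
    total = 0
    used = []
    for region in sub_regions:
        points = detect(region, threshold)
        total += len(points)
        used.append(threshold)
        threshold = next_threshold(threshold, len(points), target_per_region)
    return total, used
```

A toy detector such as `lambda region, t: [r for r in region if r >= t]`, applied to lists of corner responses, is enough to watch the threshold rise in busy sub-regions and fall in sparse ones.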
Abstract:
Image-processing apparatus and methods to adaptively control a size and/or location of a visual search window used for feature matching in a machine-vision system are described. A search window controller may receive motion vector data and image recognition rate data, and compute a search window size and/or search window location based on the received data. The computed search window size may be a portion of an image frame. The motion vector data and image recognition rate data may be computed from one or more images in a video image sequence. By adaptively controlling search window size and location, an appreciable reduction in data processing burden for feature matching may be achieved.
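A search window controller of the kind described above might be sketched as follows. The abstract does not specify how motion vectors and recognition rate are combined, so this Python sketch makes its own assumptions: the window is shifted by the mean motion vector, it shrinks when the recognition rate meets a target and grows otherwise, and it is clamped to the frame. The function name and the `target_rate`, `scale_step`, and `min_size` parameters are hypothetical.

```python
def update_search_window(window, motion_vectors, recognition_rate, frame_size,
                         target_rate=0.8, scale_step=0.25, min_size=16):
    """Return a new (x, y, width, height) search window.

    window          -- current (x, y, width, height), top-left origin
    motion_vectors  -- list of (dx, dy) pairs from recent frames
    recognition_rate -- fraction of features recognized, in [0, 1]
    frame_size      -- (frame_width, frame_height)
    """
    x, y, w, h = window
    frame_w, frame_h = frame_size

    # Shift the window centre by the mean motion vector.
    if motion_vectors:
        dx = sum(v[0] for v in motion_vectors) / len(motion_vectors)
        dy = sum(v[1] for v in motion_vectors) / len(motion_vectors)
    else:
        dx = dy = 0.0

    # Grow when recognition is poor, shrink when it is good enough.
    factor = 1 + scale_step if recognition_rate < target_rate else 1 - scale_step
    new_w = min(frame_w, max(min_size, w * factor))
    new_h = min(frame_h, max(min_size, h * factor))

    # Re-centre and keep the window inside the frame.
    cx = x + w / 2 + dx
    cy = y + h / 2 + dy
    new_x = min(max(0, cx - new_w / 2), frame_w - new_w)
    new_y = min(max(0, cy - new_h / 2), frame_h - new_h)
    return (new_x, new_y, new_w, new_h)
```

Feature matching is then restricted to the returned window, so a smaller window directly reduces the number of candidate features to compare.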
Abstract:
Compact descriptors of digital images are produced by detecting interest points representative of the digital images and selecting, out of the interest points, key points for producing, e.g., local and global compact descriptors of the images. The digital images are decomposed into blocks, an energy (variance) is computed for each block, and the blocks are subjected to culling by rejecting those blocks whose energy fails to pass an energy threshold. The interest points are detected only in the blocks resulting from culling, and the key points for producing the compact descriptors are selected out of the interest points thus detected, possibly by using different selection thresholds for local and global compact descriptors, respectively. The number of key points for producing the compact descriptors may be varied, e.g., by adaptively varying the number of the interest points detected in the blocks resulting from culling.
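The block culling step can be sketched in a few lines of Python. The sketch assumes a grayscale image stored as a list of rows, square non-overlapping blocks, and that a block passes when its variance is at or above the threshold; the function names and the handling of partial edge blocks (they are dropped) are illustrative assumptions.

```python
def block_variance(block):
    """Energy of a block, taken as the variance of its pixel values."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def split_into_blocks(image, size):
    """Map (top, left) to each full size-by-size block of the image.

    Partial blocks at the right and bottom edges are ignored here.
    """
    blocks = {}
    for top in range(0, len(image) - size + 1, size):
        for left in range(0, len(image[0]) - size + 1, size):
            blocks[(top, left)] = [row[left:left + size]
                                   for row in image[top:top + size]]
    return blocks


def cull_blocks(image, size, energy_threshold):
    """Return positions of blocks whose energy passes the threshold."""
    return [pos for pos, block in split_into_blocks(image, size).items()
            if block_variance(block) >= energy_threshold]
```

Interest point detection would then run only on the returned block positions, so flat regions such as sky or blank walls cost nothing beyond the variance computation.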
Abstract:
An embodiment of a method for computing pyramids of input images (I) in a transformed domain, e.g., for search and retrieval purposes, includes: arranging input images in blocks to produce input image blocks; and subjecting the input image blocks to block processing that includes transforming the blocks into a transformed domain, filtering the transformed blocks, and applying to the filtered blocks the inverse of the previous transform, thus producing a set of processed blocks. The set of processed blocks, which is recomposable to an image pyramid, may be used, e.g., in detecting extrema points in images in the pyramid, extracting a patch of given size around the extrema points detected, and processing the patch to obtain local descriptors such as SIFT descriptors of a feature.