Abstract:
A method includes adapting a universal generative model of local descriptors to a first camera to obtain a first camera-dependent generative model. The same universal generative model is also adapted to a second camera to obtain a second camera-dependent generative model. From a first image captured by the first camera, a first image-level descriptor is extracted using the first camera-dependent generative model. From a second image captured by the second camera, a second image-level descriptor is extracted using the second camera-dependent generative model. A similarity is computed between the first image-level descriptor and the second image-level descriptor. Information is output based on the computed similarity. The adaptation shifts differences between the image-level descriptors toward deviations in image content rather than differences in imaging conditions.
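The abstract leaves the form of the generative model and of the image-level descriptor open. The sketch below assumes a Gaussian mixture as the universal model, MAP-style mean adaptation per camera, and a simple soft-assignment statistic as a stand-in for a Fisher-vector-style descriptor; the adaptation rule, constant tau, and all names are illustrative assumptions, not the claimed method.

```python
import copy
import numpy as np
from sklearn.mixture import GaussianMixture

def adapt_model(universal_gmm, camera_descriptors, tau=10.0):
    """MAP-style adaptation of the universal GMM means toward one camera's
    local descriptors (the adaptation rule is an assumption, not the claim)."""
    resp = universal_gmm.predict_proba(camera_descriptors)          # (N, K) posteriors
    n_k = resp.sum(axis=0)                                          # soft counts per component
    mean_k = (resp.T @ camera_descriptors) / np.maximum(n_k, 1e-8)[:, None]
    alpha = (n_k / (n_k + tau))[:, None]                            # relevance factor
    adapted = copy.deepcopy(universal_gmm)                          # keep weights/covariances
    adapted.means_ = alpha * mean_k + (1.0 - alpha) * universal_gmm.means_
    return adapted

def image_level_descriptor(camera_gmm, local_descriptors):
    """Aggregate an image's local descriptors into an image-level descriptor:
    here the L2-normalized mean posterior per mixture component."""
    stats = camera_gmm.predict_proba(local_descriptors).mean(axis=0)
    return stats / np.linalg.norm(stats)

# Usage sketch (universal_gmm is a GaussianMixture fitted on pooled training descriptors):
# gmm_cam1 = adapt_model(universal_gmm, descriptors_from_camera1)
# gmm_cam2 = adapt_model(universal_gmm, descriptors_from_camera2)
# similarity = float(image_level_descriptor(gmm_cam1, descs_img1)
#                    @ image_level_descriptor(gmm_cam2, descs_img2))
```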
Abstract:
A method for recognition of an identifier such as a license plate includes storing first visual signatures, each extracted from a first image of a respective object, such as a vehicle, captured at a first location, together with first information associated with the first captured image, such as a time stamp. A second visual signature is extracted from a second image of a second object captured at a second location, and second information associated with the second captured image is acquired. A measure of similarity is computed between the second visual signature and at least some of the first visual signatures to identify a matching one. A test is performed which is a function of the first and second information associated with the matching signatures. Identifier recognition to identify the identifier of the second object is performed only when the test is confirmed to have been met.
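A minimal sketch of this match-then-test flow, assuming L2-normalized signature vectors, a cosine-similarity match, and a time-stamp plausibility test; the threshold, travel-time bounds, and recognize_identifier are hypothetical placeholders.

```python
import numpy as np

def find_match(query_signature, stored_signatures, threshold=0.8):
    """Return the index of the most similar stored signature, or None.
    Signatures are assumed L2-normalized, so a dot product is cosine similarity."""
    sims = stored_signatures @ query_signature
    best = int(np.argmax(sims))
    return best if sims[best] >= threshold else None

def test_met(first_info, second_info, min_travel_s=60, max_travel_s=3600):
    """Illustrative test: the elapsed time between the two captures must be
    plausible for travel between the first and second locations."""
    elapsed = second_info["timestamp"] - first_info["timestamp"]
    return min_travel_s <= elapsed <= max_travel_s

# Identifier recognition runs only when both the match and the test succeed:
# idx = find_match(signature2, stored_signatures)
# if idx is not None and test_met(stored_info[idx], info2):
#     identifier = recognize_identifier(image2)   # hypothetical OCR step
```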
Abstract:
A system and method for comparing a text image and a character string are provided. The method includes embedding a character string into a vectorial space by extracting a set of features from the character string and generating a character string representation based on the extracted features, such as a spatial pyramid bag of characters (SPBOC) representation. A text image is embedded into a vectorial space by extracting a set of features from the text image and generating a text image representation based on the text image extracted features. A compatibility between the text image representation and the character string representation is computed, which includes computing a function of the text image representation and the character string representation.
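An illustrative sketch of the two embeddings and the compatibility computation, assuming a simple spatial pyramid bag of characters over a fixed alphabet and a bilinear compatibility function; the alphabet, pyramid depth, and the matrix W (which would be learned from matched image/string pairs) are assumptions.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"

def spboc(string, levels=3):
    """Spatial pyramid bag of characters: character histograms over the whole
    string and over progressively finer substrings, concatenated."""
    features = []
    for level in range(levels):
        for part in np.array_split(list(string.lower()), 2 ** level):
            hist = np.zeros(len(ALPHABET))
            for ch in part:
                if ch in ALPHABET:
                    hist[ALPHABET.index(ch)] += 1
            features.append(hist / max(len(part), 1))
    return np.concatenate(features)

def compatibility(text_image_repr, char_string_repr, W):
    """Bilinear compatibility between the text image representation and the
    character string representation."""
    return float(text_image_repr @ W @ char_string_repr)
```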
Abstract:
In image classification, each class of a set of classes is embedded in an attribute space where each dimension of the attribute space corresponds to a class attribute. The embedding generates a class attribute vector for each class of the set of classes. A set of parameters of a prediction function operating in the attribute space is optimized with respect to a set of training images annotated with classes of the set of classes, such that the prediction function with the optimized set of parameters optimally predicts the annotated classes for the set of training images. The prediction function with the optimized set of parameters is applied to an input image to generate at least one class label for the input image. The image classification does not include applying a class attribute classifier to the input image.
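The abstract does not spell out the prediction function; the sketch below assumes a bilinear compatibility f(x, c) = x^T W a_c between an image feature x and a class attribute vector a_c, with W learned by a small ranking-style SGD loop. The objective, margin, and hyper-parameters are illustrative.

```python
import numpy as np

def predict_class(x, W, class_attributes):
    """Label an input image feature x with the class whose attribute vector
    scores highest under f(x, c) = x @ W @ a_c (no per-attribute classifier)."""
    scores = x @ W @ class_attributes.T            # one score per class
    return int(np.argmax(scores))

def optimize_parameters(X, y, class_attributes, lr=0.1, epochs=10):
    """Tiny ranking-style SGD: push the annotated class above the best-scoring
    wrong class by a unit margin (an assumed objective, not the claimed one)."""
    W = np.zeros((X.shape[1], class_attributes.shape[1]))
    for _ in range(epochs):
        for x, c in zip(X, y):
            scores = x @ W @ class_attributes.T
            true_score = scores[c]
            scores[c] = -np.inf
            wrong = int(np.argmax(scores))
            if true_score < 1.0 + scores[wrong]:
                W += lr * np.outer(x, class_attributes[c] - class_attributes[wrong])
    return W
```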
Abstract:
A system and method for computing confidence in an output of a text recognition system includes performing character recognition on an input text image with a text recognition system to generate a candidate string of characters. A first representation is generated, based on the candidate string of characters, and a second representation is generated based on the input text image. A confidence in the candidate string of characters is computed based on a computed similarity between the first and second representations in a common embedding space.
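A minimal sketch of the confidence computation, assuming both representations have already been projected into the common embedding space; embed_text_image, embed_string, and the acceptance threshold are hypothetical.

```python
import numpy as np

def confidence(image_representation, string_representation):
    """Confidence in the candidate string: cosine similarity between the two
    representations in the common embedding space."""
    a = image_representation / np.linalg.norm(image_representation)
    b = string_representation / np.linalg.norm(string_representation)
    return float(a @ b)

# Accept the recognizer's output only when the two embeddings agree strongly:
# if confidence(embed_text_image(text_image), embed_string(candidate)) > 0.7:
#     accept(candidate)
```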
Abstract:
Authentication methods are disclosed for determining whether a person or object to be authenticated is a member of a set of authorized persons or objects. A query signature is acquired, comprising a vector whose elements store values of an ordered set of features for the person or object to be authenticated. The query signature is compared with an aggregate signature comprising a vector whose elements store values of the ordered set of features for the set of authorized persons or objects. The individual signatures for the authorized persons or objects are not stored; only the aggregate signature is stored. It is determined whether the person or object to be authenticated is a member of the set of authorized persons or objects based on the comparison. The comparing may comprise computing an inner product of the query signature and the aggregate signature, with the determining being based on the inner product.
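A small sketch of the aggregate-signature scheme, assuming the aggregate is the element-wise sum of the individual signatures (the abstract does not fix the aggregation rule) and that membership is decided by thresholding the inner product.

```python
import numpy as np

def build_aggregate(signatures):
    """Aggregate signature for the authorized set: element-wise sum of the
    individual signatures, which can then be discarded."""
    return np.sum(signatures, axis=0)

def is_authorized(query_signature, aggregate_signature, threshold):
    """Membership decision based on the inner product of the query signature
    with the aggregate signature."""
    return float(query_signature @ aggregate_signature) >= threshold
```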
Abstract:
A method for diagnosis assistance exploits similarity between a new medical case and existing medical cases and experts when these are embedded in a common embedding space. Different types of queries are provided for, including a query-by-cases and a query-by-experts. These may be associated with different cost structures that encourage the requester to use the query-by-cases first and seek expert assistance only if this proves unsuccessful. Depending on whether the query-by-cases or the query-by-experts is requested, a subset of the existing cases or experts is identified based on the similarity of their representations, in the embedding space, with a representation of the new case in the embedding space. Provision may then be made for communicating the new case to one or more experts selected from the subset, so that the expert can attempt to provide a diagnosis.
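A minimal sketch of both query types, assuming cases and experts are represented as vectors in the common embedding space and retrieval is by cosine similarity; the helper names and the fallback logic are assumptions.

```python
import numpy as np

def query_by_similarity(new_case_repr, item_reprs, k=5):
    """Return indices of the k items (existing cases or experts) whose
    embedding-space representations are most similar to the new case."""
    normed = item_reprs / np.linalg.norm(item_reprs, axis=1, keepdims=True)
    q = new_case_repr / np.linalg.norm(new_case_repr)
    return np.argsort(-(normed @ q))[:k]

# Cheaper query-by-cases first; query-by-experts only if it proves unsuccessful:
# similar_cases = query_by_similarity(new_case, case_representations)
# if not diagnosis_found(similar_cases):                  # hypothetical check
#     experts = query_by_similarity(new_case, expert_representations)
#     send_case_to(experts[0], new_case)                  # hypothetical communication step
```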
Abstract:
A method for generating an image representation includes generating a set of embedded descriptors, comprising, for each of a set of patches of an image, extracting a patch descriptor which is representative of the pixels in the patch and embedding the patch descriptor in a multidimensional space to form an embedded descriptor. An image representation is generated by aggregating the set of embedded descriptors. In the aggregation, each embedded descriptor is weighted with a respective weight from a set of weights computed based on the patch descriptors for the image. Information based on the image representation is output. At least one of extracting the patch descriptors, embedding the patch descriptors, and generating the image representation is performed with a computer processor.
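An illustrative sketch of the weighted aggregation, assuming the per-patch weights come from a softmax over a linear score of each patch descriptor; the embedding function and scoring vector are placeholders for whatever embedding and weighting the method actually uses.

```python
import numpy as np

def image_representation(patch_descriptors, embed, score_vector):
    """Aggregate embedded patch descriptors into a single image representation,
    weighting each by a value computed from its own patch descriptor."""
    embedded = np.stack([embed(d) for d in patch_descriptors])   # (P, D) embedded descriptors
    scores = patch_descriptors @ score_vector                    # one score per patch
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                     # softmax weights
    representation = weights @ embedded                          # weighted aggregation
    return representation / np.linalg.norm(representation)

# Usage sketch with a random linear embedding as a stand-in:
# P_matrix = np.random.randn(128, 64)                            # hypothetical projection
# rep = image_representation(descs, lambda d: P_matrix @ d, np.random.randn(64))
```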