Abstract:
Sounds are captured from a subject's body, e.g., using a smartphone or a worn array of microphones. Plural features are derived from the captured audio, and serve as fingerprint information. One such feature may be a time interval over which a threshold part of spectral energy in the audio is expressed. Another may be a frequency bandwidth within which a second threshold part of the spectral energy is expressed. Such fingerprint information is provided to a knowledge base that contains reference fingerprint data and associated metadata. The knowledge base matches the fingerprint with reference fingerprint data, and provides associated metadata in return, which can comprise diagnostic information related to the captured sounds. In some arrangements, an audio signal or pressure waveform stimulates the body at one location, and is sensed at another, to discern information about the intervening transmission medium. A great variety of other features and arrangements are also detailed.
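The two example features named above (a time interval and a frequency bandwidth that each hold a threshold part of the energy) can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: each feature is taken as the shortest contiguous run of samples or frequency bins that holds the given fraction of total energy, the spectrum is a naive DFT, and the names `smallest_span`, `power_spectrum`, and `fingerprint_features` and the 0.9 defaults are hypothetical.

```python
import cmath
import math


def smallest_span(energies, fraction):
    """Return (start, end) of the shortest contiguous run of bins whose
    summed energy is at least `fraction` of the total (end is exclusive)."""
    target = fraction * sum(energies)
    best = (0, len(energies))
    running = 0.0
    start = 0
    for end, energy in enumerate(energies, start=1):
        running += energy
        # Drop bins from the left while the window still holds enough energy.
        while start < end and running - energies[start] >= target:
            running -= energies[start]
            start += 1
        if running >= target and end - start < best[1] - best[0]:
            best = (start, end)
    return best


def power_spectrum(samples):
    """Naive O(n^2) DFT power for the non-negative frequency bins."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples))) ** 2
        for k in range(n // 2 + 1)
    ]


def fingerprint_features(samples, sample_rate,
                         time_fraction=0.9, band_fraction=0.9):
    """Assumed feature pair: energy-holding time interval and bandwidth."""
    t0, t1 = smallest_span([x * x for x in samples], time_fraction)
    k0, k1 = smallest_span(power_spectrum(samples), band_fraction)
    bin_hz = sample_rate / len(samples)  # width of one DFT bin in Hz
    return {
        "interval_s": (t1 - t0) / sample_rate,
        "bandwidth_hz": (k1 - k0) * bin_hz,
    }
```

A practical system would likely use an FFT and windowed frames rather than a whole-signal DFT; the sketch only shows how the two thresholds could turn a recording into a pair of numbers for matching.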
Abstract:
In some arrangements, product packaging is digitally watermarked over most of its extent to facilitate high-throughput item identification at retail checkouts. Imagery captured by conventional or plenoptic cameras can be processed (e.g., by GPUs) to derive several different perspective-transformed views—further minimizing the need to manually reposition items for identification. Crinkles and other deformations in product packaging can be optically sensed, allowing such surfaces to be virtually flattened to aid identification. Piles of items can be 3D-modeled and virtually segmented into geometric primitives to aid identification, and to discover locations of obscured items. Other data (e.g., including data from sensors in aisles, shelves and carts, and gaze tracking for clues about visual saliency) can be used in assessing identification hypotheses about an item. Logos may be identified and used—or ignored—in product identification. A great variety of other features and arrangements are also detailed.
Abstract:
Portions of the disclosure relate generally to shadow analysis, e.g., on mobile platforms. One claim recites a mobile phone comprising: a camera for capturing images and video; memory for buffering captured images and video; means for identifying a shadow cast by the mobile phone on a subject being imaged by said camera by analyzing buffered captured images and video; and means for determining proximity to the subject based on an analysis of the shadow. Of course, other claims and combinations are provided too.
Abstract:
Portable devices are equipped with a variety of technologies by which existing functionality can be improved, and new functionality can be provided. One claim recites a method comprising: receiving data representing imagery that depicts an object, the imagery captured by a portable device, the portable device comprising a camera and a microphone; determining one or more descriptors relating to the object in the imagery, the determining including collecting descriptors associated with other imagery or audio; processing the descriptors in discerning whether the imagery depicts an object that is likely of a first class or a second class or a third class, the processing being performed by one or more electronic processors configured to perform such act; and taking an action dependent on whether the imagery depicts an object that is likely of a first class or a second class or a third class. Of course, other claims and combinations are provided as well.
Abstract:
The disclosed technology generally relates to methods for identifying audio and video entertainment content. One claim recites a method comprising: receiving data representing audio uploaded to a network server, the audio having been transformed prior to receipt; analyzing the data representing audio with a fingerprint generator to yield fingerprint data; determining whether the fingerprint data incurs a potential match in a fingerprint data repository, the potential match indicating an unreliability in the match below a predetermined threshold; upon a condition of unreliability in the match, and via an application program interface, issuing a call requesting at least a first reviewer and a second reviewer to review the data representing audio; receiving results from the first reviewer and results from the second reviewer through the application program interface; and weighting results from the first reviewer differently than results from the second reviewer in determining whether to allow public access to the data representing audio. Of course, other combinations and claims are provided.
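The final step of the recited method, weighting the two reviewers' results differently, might be sketched as a weighted vote. The abstract does not say how the weights are chosen or combined, so the function name `allow_public_access`, the `(weight, approves)` input format, and the 0.5 threshold are all assumptions made for illustration.

```python
def allow_public_access(reviews, threshold=0.5):
    """Combine reviewer verdicts with per-reviewer weights.

    `reviews` is a list of (weight, approves) pairs, where `weight` is a
    positive number (hypothetical, e.g. reflecting reviewer track record)
    and `approves` is True if that reviewer would allow access.
    """
    total_weight = sum(weight for weight, _ in reviews)
    if total_weight <= 0:
        raise ValueError("at least one reviewer with positive weight is required")
    approving_weight = sum(weight for weight, approves in reviews if approves)
    # Access is allowed when the weighted share of approvals meets the threshold.
    return approving_weight / total_weight >= threshold
```

With two reviewers weighted 0.8 and 0.2, the first reviewer's verdict decides the outcome whenever the two disagree.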
Abstract:
Reference imagery of dermatological conditions is compiled in a crowd-sourced database (contributed by clinicians and/or the lay public), together with associated diagnosis information. A user later submits a query image to the system (e.g., captured with a smartphone). Image-based derivatives for the query image are determined (e.g., color histograms, FFT-based metrics, etc.), and are compared against similar derivatives computed from the reference imagery. This comparison identifies diseases that are not consistent with the query image, and such information is reported to the user. Depending on the size of the database, and the specificity of the data, 90% or more of candidate conditions may be effectively ruled out, possibly sparing the user from expensive and painful biopsy procedures, and granting some peace of mind (e.g., knowledge that an emerging pattern of small lesions on a forearm is probably not caused by shingles, bedbugs, malaria or AIDS). A great number of other features and arrangements are also detailed.
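The rule-out comparison described above can be illustrated with color histograms, one of the derivatives the abstract names. This is a rough sketch under stated assumptions: images are given as non-empty lists of 8-bit `(r, g, b)` tuples, similarity is measured by histogram intersection, and a condition is ruled out when none of its reference images reaches a hypothetical `min_similarity` cutoff. None of these choices comes from the abstract itself.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of 8-bit (r, g, b) pixels."""
    step = 256 // bins_per_channel
    counts = {}
    for r, g, b in pixels:
        key = (r // step, g // step, b // step)
        counts[key] = counts.get(key, 0) + 1
    return {key: count / len(pixels) for key, count in counts.items()}


def intersection(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(value, hist_b.get(key, 0.0)) for key, value in hist_a.items())


def ruled_out(query_pixels, references, min_similarity=0.2):
    """Return the conditions whose reference images all differ from the query.

    `references` maps a condition name to a non-empty list of reference
    images, each a list of pixels (an assumed layout).
    """
    query_hist = color_histogram(query_pixels)
    excluded = []
    for condition, images in references.items():
        best = max(intersection(query_hist, color_histogram(img)) for img in images)
        if best < min_similarity:
            excluded.append(condition)
    return sorted(excluded)
```

A real system would presumably combine several derivatives (the abstract also mentions FFT-based metrics) before excluding a condition; a single color cue is used here only to keep the example short.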
Abstract:
Cell phones and other portable devices are equipped with a variety of technologies by which existing functionality can be improved, and new functionality can be provided. Some relate to visual search capabilities, and determining appropriate actions responsive to different image inputs. Others relate to processing of image data. Still others concern metadata generation, processing, and representation. Yet others relate to coping with fixed focus limitations of cell phone cameras, e.g., in reading digital watermark data. Still others concern user interface improvements. A great number of other features and arrangements are also detailed.
Abstract:
The present disclosure relates generally to cell phones and cameras, and to shadow analysis in images captured by such cell phones and cameras. One claim recites a smart phone comprising: a camera; and one or more processors programmed for: i) identifying a shadow cast by the smart phone or camera on a subject being imaged by a camera; and ii) determining a proximity of the camera to the subject based on an analysis of the shadow. Of course, other claims and combinations are provided too.
Abstract:
The present disclosure relates generally to cell phones and cameras, and to shadow analysis in imagery captured by such cell phones and cameras. One claim recites a method comprising: identifying a shadow cast by a cell phone on a subject being imaged by a camera included in the cell phone; and using a programmed electronic processor, determining proximity to the subject based on an analysis of the shadow. Another claim recites a mobile phone comprising: a camera for capturing images and video; memory; and one or more processors programmed for: identifying a shadow cast by a cell phone on a subject being imaged by said camera; and determining proximity to the subject based on an analysis of the shadow. Of course, other claims and combinations are provided too.
Abstract:
A smart phone senses audio, imagery, and/or other stimulus from a user's environment, and acts autonomously to fulfill inferred or anticipated user desires. In one aspect, the detailed technology concerns phone-based cognition of a scene viewed by the phone's camera. The image processing tasks applied to the scene can be selected from among various alternatives by reference to resource costs, resource constraints, other stimulus information (e.g., audio), task substitutability, etc. The phone can apply more or fewer resources to an image processing task depending on how successfully the task is proceeding, or based on the user's apparent interest in the task. In some arrangements, data may be referred to the cloud for analysis, or for gleaning. Cognition, and identification of appropriate device response(s), can be aided by collateral information, such as context. A great number of other features and arrangements are also detailed.
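Selecting image processing tasks by reference to resource costs and constraints could take many forms. The sketch below shows one simple possibility, a greedy pick by expected benefit per unit cost within a fixed budget. The function name `select_tasks`, the `(name, expected_benefit, cost)` tuple layout, and the greedy rule are illustrative assumptions, not the method the disclosure details.

```python
def select_tasks(tasks, budget):
    """Greedily choose tasks with the best benefit-to-cost ratio.

    `tasks` is a list of (name, expected_benefit, cost) tuples with cost > 0;
    `budget` is the total resource allowance (e.g. an abstract cost unit).
    """
    chosen = []
    remaining = budget
    ranked = sorted(tasks, key=lambda task: task[1] / task[2], reverse=True)
    for name, _benefit, cost in ranked:
        if cost <= remaining:  # skip tasks that no longer fit the budget
            chosen.append(name)
            remaining -= cost
    return chosen
```

Re-running the selection with updated benefit estimates as tasks progress would be one way to reflect the idea of applying more or fewer resources depending on how well a task is going.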