Abstract:
Video sequence processing is described in which various filtering rules are applied to extract dominant features for content-based video sequence identification. Active regions are determined in the video frames of a video sequence, and video frames are selected in response to temporal statistical characteristics of those active regions. A two-pass analysis detects a set of initial interest points and interest regions in the selected video frames. This reduces the effective image area that must be refined by complex filters, which provide accurate region characterizations that are resistant to image distortion and are used to identify the video frames in the video sequence. The extracted features and descriptors are robust with respect to image scaling, aspect ratio change, rotation, camera viewpoint change, illumination and contrast change, video compression/decompression artifacts, and noise. Compact, representative signatures are generated for video sequences to provide effective query video matching and retrieval in a large video database.
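The two-pass idea can be illustrated with a minimal sketch: a cheap filter proposes candidate points across the whole frame, and a costlier filter is evaluated only at those candidates. The abstract does not name the filters, so the choice of a 4-neighbour Laplacian for the first pass and a Hessian determinant for the second is an assumption made for illustration, as are the function names and thresholds.

```python
def laplacian(img, y, x):
    # Cheap first-pass response: 4-neighbour discrete Laplacian (assumed filter).
    return img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]


def hessian_det(img, y, x):
    # Costlier second-pass response: determinant of the discrete Hessian (assumed filter).
    dxx = img[y][x - 1] - 2 * img[y][x] + img[y][x + 1]
    dyy = img[y - 1][x] - 2 * img[y][x] + img[y + 1][x]
    dxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return dxx * dyy - dxy * dxy


def two_pass_interest_points(img, coarse_thresh, fine_thresh):
    """Return (candidates, refined) lists of (row, col) positions.

    Pass 1 scans every interior pixel with the cheap filter.
    Pass 2 applies the costlier filter only to pass-1 candidates.
    """
    h, w = len(img), len(img[0])
    candidates = [(y, x)
                  for y in range(1, h - 1)
                  for x in range(1, w - 1)
                  if abs(laplacian(img, y, x)) >= coarse_thresh]
    refined = [(y, x) for (y, x) in candidates
               if hessian_det(img, y, x) >= fine_thresh]
    return candidates, refined
```

On a frame with a single bright blob, the first pass keeps the blob and its immediate neighbours, and the second pass narrows that set to the blob centre, so the expensive filter runs on a handful of pixels rather than the whole image.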
Abstract:
Content identification methods for consumer devices determine robust audio fingerprints that are resilient to audio distortions. One method generates signatures representing audio content based on a constant Q-factor transform (CQT). A 2D spectral representation of a 1D audio signal facilitates generation of region-based signatures within frequency octaves and across the entire 2D signal representation. In addition, points of interest are detected within the 2D audio signal representation, and interest regions are determined around selected points of interest. Another method generates audio descriptors using an accumulating filter function on bands of the audio spectrum and generates audio transform coefficients. A response of each spectral band is computed, and transform coefficients are determined by filtering, by accumulating derivatives with different lags, and by computing second-order derivatives. Additionally, time-based and frequency-based onset detection determines audio descriptors at events and enhances the descriptors with information related to an event.
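For readers unfamiliar with the constant-Q transform, the following is a minimal, unoptimized sketch of one CQT frame. It assumes geometrically spaced centre frequencies, a Hann window, and a window length proportional to Q divided by the centre frequency; these are textbook choices, not details taken from the abstract, and all function and parameter names are hypothetical.

```python
import cmath
import math


def cqt_center_frequencies(f_min, bins_per_octave, n_bins):
    # Geometric spacing: every bin has the same ratio of frequency to bandwidth.
    return [f_min * 2 ** (k / bins_per_octave) for k in range(n_bins)]


def naive_cqt_magnitudes(signal, sample_rate, f_min, bins_per_octave, n_bins):
    """Magnitude of one constant-Q frame taken from the start of `signal`."""
    q = 1.0 / (2 ** (1.0 / bins_per_octave) - 1.0)  # constant quality factor
    magnitudes = []
    for f_k in cqt_center_frequencies(f_min, bins_per_octave, n_bins):
        # Lower bins get longer windows, so each window spans about Q cycles.
        n_k = min(len(signal), int(math.ceil(q * sample_rate / f_k)))
        acc = 0j
        for n in range(n_k):
            window = 0.5 - 0.5 * math.cos(2 * math.pi * n / n_k)  # Hann window
            acc += signal[n] * window * cmath.exp(-2j * math.pi * f_k * n / sample_rate)
        magnitudes.append(abs(acc) / n_k)
    return magnitudes
```

Stacking such frames over time gives the 2D time-frequency representation on which region-based signatures and interest points could then be computed. A production system would more likely use an FFT-based kernel than this direct summation.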
Abstract:
A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content with its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are either transmitted to a remotely located search server or used with a search function on the mobile device for content search and identification. Audio features are extracted, and global onset detection on the audio signal is used to align input audio frames. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency-domain entropy, and the maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints that are combined with the audio fingerprints for more reliable content identification.
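Two of the audio features mentioned here, onset detection and frequency-domain entropy, can be sketched in a few lines. The energy-ratio onset rule and the Shannon entropy over a normalized power spectrum shown below are common textbook formulations and are assumed for illustration; the abstract does not specify either computation, and the names and default threshold are hypothetical.

```python
import math


def spectral_entropy(power_spectrum):
    """Shannon entropy (in bits) of a frame's normalized power spectrum."""
    total = sum(power_spectrum)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for p in power_spectrum:
        if p > 0:
            q = p / total
            entropy -= q * math.log2(q)
    return entropy


def detect_onsets(frame_energies, ratio=2.0, floor=1e-9):
    """Indices of frames whose energy is at least `ratio` times the previous frame's."""
    onsets = []
    for i in range(1, len(frame_energies)):
        previous = max(frame_energies[i - 1], floor)  # avoid division by zero
        if frame_energies[i] / previous >= ratio:
            onsets.append(i)
    return onsets
```

A flat spectrum yields the maximum entropy for its length, while a single dominant bin yields zero, which is why entropy can serve as a compact per-frame feature. Detected onsets give the query and reference streams common anchor points for frame alignment.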
Abstract:
Techniques for efficient database formation and search in applications embedded in a media device are provided. The search may be performed synchronously with presentation of media programming content on a nearby media presentation device. A mobile media device captures temporal fragments of the presented audio/video content with its microphone and camera, and then generates query fingerprints for the captured fragments. A local reference database resides on the mobile media device, and a master reference database resides on a remote server, with the most recent chunk of reference fingerprints transferred dynamically to the local mobile media device. A chunk of the query fingerprints generated locally on the mobile media device is searched against the local reference database for continuous content search and identification. The method presented automatically switches between the local search on the mobile media device and a remote search on an external search server.
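The switch between local and remote search can be sketched as a simple fallback rule. The sketch below assumes fingerprints are integers compared by Hamming distance and that the device falls back to the server when too few query fingerprints match locally; the switching criterion, thresholds, and all names (`identify`, `remote_search`, `min_hit_ratio`) are hypothetical, since the abstract does not describe them.

```python
def hamming(a, b):
    # Number of differing bits between two integer fingerprints.
    return bin(a ^ b).count("1")


def local_search(query_fps, local_db, max_bit_errors):
    """Return (best_content_id, hit_count) over the locally cached reference chunk."""
    best_id, best_hits = None, 0
    for content_id, ref_fps in local_db.items():
        hits = sum(
            1 for q in query_fps
            if any(hamming(q, r) <= max_bit_errors for r in ref_fps)
        )
        if hits > best_hits:
            best_id, best_hits = content_id, hits
    return best_id, best_hits


def identify(query_fps, local_db, remote_search, max_bit_errors=2, min_hit_ratio=0.6):
    """Try the local chunk first; fall back to the remote server on a weak match."""
    content_id, hits = local_search(query_fps, local_db, max_bit_errors)
    if query_fps and hits / len(query_fps) >= min_hit_ratio:
        return "local", content_id
    return "remote", remote_search(query_fps)
```

Here `remote_search` stands in for whatever network call reaches the master reference database. Keeping it as an injected callable lets the switching logic be tested without a server.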
Abstract:
Scalable video sequence processing with various filtering rules is applied to extract dominant features and to generate a unique set of signatures based on video content. Video sequence structuring and subsequent video sequence characterization are performed by tracking statistical changes in the content of a succession of video frames and selecting suitable frames for further treatment by region-based intra-frame segmentation and contour tracing and description. Compact, representative signatures are generated at the video sequence structural level as well as at the selected video frame level, resulting in efficient video database formation and search.
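One simple way to track statistical changes across frames is to compare intensity histograms and select a frame whenever the histogram departs sufficiently from the last selected one. This is a minimal sketch under that assumption; the abstract does not say which statistic is tracked, and the bin count, L1 distance, and function names are illustrative only.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // max_value, bins - 1)] += 1
    n = len(frame)
    return [c / n for c in counts]


def select_frames(frames, threshold):
    """Indices of frames whose histogram differs from the last selected frame's.

    The first frame is always selected; the L1 histogram distance ranges from 0 to 2.
    """
    if not frames:
        return []
    selected = [0]
    reference = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        change = sum(abs(a - b) for a, b in zip(current, reference))
        if change >= threshold:
            selected.append(i)
            reference = current
    return selected
```

The selected indices mark candidate frames for the costlier segmentation and contour tracing stages, and the sequence of selections itself describes the structure of the video.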
Abstract:
A multi-dimensional database, together with indexes and operations on that database, is described, with applications that include video search and other similar sequence or structure searches. Traversal indexes utilize highly discriminative information about images and video sequences or about object shapes. Global and local signatures around keypoints are used to provide compact, robust retrieval and discriminative information content for images or video sequences of interest. For other objects or structures, relevant signatures of the pattern or structure are used for the traversal indexes. Traversal indexes are stored in leaf nodes along with distance measures and the occurrence of similar images in the database. During a sequence query, correlation scores are calculated for a single frame, for a frame sequence, and for video clips, or for other objects or structures.
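A much-simplified sketch of a leaf-based index follows. It assumes fixed-width integer signatures, uses the leading bits of a signature as the traversal key that selects a leaf, and scores a query sequence by the fraction of its frames that match each reference video. The class, its parameters, and the scoring rule are hypothetical stand-ins for the traversal indexes and correlation scores the abstract mentions.

```python
from collections import defaultdict


class TraversalIndex:
    def __init__(self, sig_bits=16, prefix_bits=4):
        # The leading `prefix_bits` of a signature select the leaf node.
        self.shift = sig_bits - prefix_bits
        self.leaves = defaultdict(list)

    def add(self, signature, video_id, frame_no):
        self.leaves[signature >> self.shift].append((signature, video_id, frame_no))

    def query_frame(self, signature, max_distance):
        """Return (video_id, frame_no, hamming_distance) for matches in one leaf."""
        matches = []
        for ref_sig, video_id, frame_no in self.leaves.get(signature >> self.shift, []):
            distance = bin(signature ^ ref_sig).count("1")
            if distance <= max_distance:
                matches.append((video_id, frame_no, distance))
        return matches


def sequence_scores(index, query_signatures, max_distance):
    """Fraction of query frames that matched each reference video."""
    hits = defaultdict(int)
    for sig in query_signatures:
        # Count each video at most once per query frame.
        for video_id in {m[0] for m in index.query_frame(sig, max_distance)}:
            hits[video_id] += 1
    n = len(query_signatures)
    return {vid: count / n for vid, count in hits.items()} if n else {}
```

One limitation of this prefix-keyed layout is that a bit error inside the prefix sends the query to the wrong leaf, so a practical index would need to probe neighbouring leaves or use several keys per signature.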