摘要:
The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions is described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure, to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy.
摘要:
An architecture for a multimedia search system is described. To perform similarity matching of multimedia query frames against reference content, reference database comprising of a cluster index using cluster keys to perform similarity matching and a multimedia index to perform sequence matching is built. Methods to update and maintain the reference database that enables addition and removal of the multimedia contents, including portions of multimedia content, from the reference database in a running system are described. Hierarchical multi-level partitioning methods to organize the reference database are presented. Smart partitioning of the reference multimedia content according to the nature of the multimedia content, and according to the popularity among the social media, that supports scalable fast multimedia identification is also presented. A caching mechanism for multimedia search queries in a centralized or in a decentralized distributed system and a client based local multimedia search system enabling multimedia tracking are described.
摘要:
Video sequence processing with various filtering rules is applied to extract dominant spatial features and generate unique set of signatures describing video content. Accurate active regions are determined for each video sequence frame. Subsequently, a video sequence is structured by tracking statistical changes in the content of a succession of video frames, and suitable frames are selected for further spatial processing. Selected video frames are processed for feature extraction and description, and compact representative signatures are generated, resulting in an efficient video database formation and search.
摘要:
Video sequence processing with various filtering rules is applied to extract dominant spatial features and generate unique set of signatures describing video content. Accurate active regions are determined for each video sequence frame. Subsequently, a video sequence is structured by tracking statistical changes in the content of a succession of video frames, and suitable frames are selected for further spatial processing. Selected video frames are processed for feature extraction and description, and compact representative signatures are generated, resulting in an efficient video database formation and search.
摘要:
Techniques for efficient database formation and search in applications embedded in a media device are provided. The search may be performed synchronously with presentation of media programming content on a nearby media presentation device. A mobile media device captures some temporal fragments of the presented audio/video content on its microphone and camera, and then generates query fingerprints for the captured fragment. A local reference database resides on the mobile media device and a master reference database resides on a remote server with a most recent chunk of reference fingerprints transferred dynamically to the local mobile media device. A chunk of the query fingerprints generated locally on the mobile media device are searched on the local reference database for continuous content search and identification. The method presented automatically switches between the local search on the mobile media device and a remote search on an external search server.
摘要:
The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions is described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure, to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy.
摘要:
Scaleable video sequence processing with various filtering rules is applied to extract dominant features, and generate unique set of signatures based on video content. Video sequence structuring and subsequent video sequence characterization is performed by tracking statistical changes in the content of a succession of video frames and selecting suitable frames for further treatment by region based intra-frame segmentation and contour tracing and description. Compact representative signatures are generated on the video sequence structural level as well as on the selected video frame level, resulting in an efficient video database formation and search.
摘要:
Video sequence processing is described with various filtering rules applied to extract dominant features for content based video sequence identification. Active regions are determined in video frames of a video sequence. Video frames are selected in response to temporal statistical characteristics of the determined active regions. A two pass analysis is used to detect a set of initial interest points and interest regions in the selected video frames to reduce the effective area of images that are refined by complex filters that provide accurate region characterizations resistant to image distortion for identification of the video frames in the video sequence. Extracted features and descriptors are robust with respect to image scaling, aspect ratio change, rotation, camera viewpoint change, illumination and contrast change, video compression/decompression artifacts and noise. Compact, representative signatures are generated for video sequences to provide effective query video matching and retrieval in a large video database.
摘要:
A multi-dimensional database and indexes and operations on the multi-dimensional database are described which include video search applications or other similar sequence or structure searches. Traversal indexes utilize highly discriminative information about images and video sequences or about object shapes. Global and local signatures around keypoints are used for compact and robust retrieval and discriminative information content of images or video sequences of interest. For other objects or structures relevant signature of pattern or structure are used for traversal indexes. Traversal indexes are stored in leaf nodes along with distance measures and occurrence of similar images in the database. During a sequence query, correlation scores are calculated for single frame, for frame sequence, and video clips, or for other objects or structures.
摘要:
An efficient large scale search system for video and multi-media content using a distributed database and search, and tiered search servers is described. Selected content is stored at the distributed local database and tier1 search server(s). Content matching frequent queries, and frequent unidentified queries are cached at various levels in the search system. Content is classified using feature descriptors and geographical aspects, at feature level and in time segments. Queries not identified at clients and tier1 search server(s) are queried against tier2 or lower search server(s). Search servers use classification and geographical partitioning to reduce search cost. Methods for content tracking and local content searching are executed on clients. The client performs local search, monitoring and/or tracking of the query content with the reference content and local search with a database of reference fingerprints. This shifts the content search workload from central servers to the distributed monitoring clients.