Abstract:
A value is computed for a feature in an instance of query content and compared to a threshold value. Based on the comparison, first and second bits in a hash value, which is derived from the query content feature, are determined. Conditional probability values are computed for the likelihood that quantized values of the first and second bits equal the corresponding quantized bit values of a target or reference feature value. The conditional probabilities are compared, and a relative strength that corresponds directly to the conditional probability is determined for each of the first and second bits. The bit with the lowest strength is selected as the weakbit. The value of the weakbit is toggled to generate a variation of the query hash value. The query may be extended using the query hash value variation.
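The weakbit steps can be sketched as follows. This is a minimal illustration, not the claimed method: the function names are hypothetical, and the distance of a feature value from the threshold is an assumed stand-in for the conditional probability, which the abstract does not define.

```python
def distance_strength(value, threshold):
    # Assumed proxy for the conditional probability: a feature value far from
    # the threshold is less likely to quantize to a different bit.
    return abs(value - threshold)


def weakbit_variation(feature_values, threshold, strength=distance_strength):
    """Return (query_bits, variation_bits, weak_index)."""
    # Quantize each feature value to one hash bit by comparing it to the threshold.
    bits = [1 if v > threshold else 0 for v in feature_values]
    # Score every bit and pick the one with the lowest strength as the weakbit.
    strengths = [strength(v, threshold) for v in feature_values]
    weak_index = min(range(len(bits)), key=strengths.__getitem__)
    # Toggle only the weakbit to obtain a variation of the query hash.
    variation = list(bits)
    variation[weak_index] ^= 1
    return bits, variation, weak_index


bits, variation, weak_index = weakbit_variation([0.9, 0.52, 0.1], threshold=0.5)
```

A search could then look up both `bits` and `variation`, which is one way to read "the query may be extended".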
Abstract:
Multiple candidate feature components of media content or projection matrices (or other hash functions, e.g., non-linear projections) are identified. Each of the candidate projection matrices (or other hash functions) includes an array of coefficients that relate to the candidate features. A subgroup of the candidate features or the projection matrices (or other hash functions) is selected based at least partially on an optimized combination of at least two characteristics of the candidate features or projection matrices (or other hash functions). Media fingerprints that uniquely identify the media content are derived from the selected optimized subgroup. Optimal projection matrices (or other hash functions) may be designed. Performance or sensitivity (e.g., search time) characteristics of the fingerprints are thus balanced with robustness characteristics thereof.
Abstract:
Features are extracted from video and audio content that have a known temporal relationship with one another. The extracted features are used to generate video and audio signatures, which are assembled with an indication of the temporal relationship into a synchronization signature construct. The construct may be used to calculate synchronization errors between video and audio content received at a remote destination. Measures of confidence are generated at the remote destination to optimize processing and to provide an indication of reliability of the calculated synchronization error.
Abstract:
A portion of media content is accessed. Components are sampled from a first spatial region and from each subsequent spatial region of the media content. Each spatial region has an unsegmented area. Each subsequent spatial region includes the regions within its area as elements thereof, or the spatial regions may partially overlap. The regions may overlap independent of a hierarchical relationship between the regions. A media fingerprint that reliably corresponds to the media content portion, e.g., over geometric attacks such as rotation, is derived from the components of each of the spatial regions.
Abstract:
Deriving a fingerprint of an image corresponding to media content involves selecting at least two different regions of the same image, determining a relationship between the two regions, and deriving a fingerprint of the image based on the relationship between the two regions of the image.
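One way to picture a fingerprint built on a relationship between regions is sketched below. The abstract does not say which relationship is used; comparing mean intensities is an assumption made here for illustration, and the function names and the `(top, left, height, width)` region layout are hypothetical.

```python
def region_mean(image, region):
    """Mean pixel value of a rectangular region (top, left, height, width)."""
    top, left, height, width = region
    total = 0
    for row in image[top:top + height]:
        total += sum(row[left:left + width])
    return total / (height * width)


def relation_fingerprint(image, region_pairs):
    """One bit per region pair: 1 if the first region is brighter on average."""
    bits = []
    for first, second in region_pairs:
        bits.append(1 if region_mean(image, first) > region_mean(image, second) else 0)
    return bits


image = [
    [10, 10, 0, 0],
    [10, 10, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 5, 5],
]
pairs = [((0, 0, 2, 2), (0, 2, 2, 2)), ((0, 2, 2, 2), (2, 2, 2, 2))]
fingerprint = relation_fingerprint(image, pairs)

# Doubling every pixel value leaves the ordering of region means unchanged.
brighter = [[2 * p for p in row] for row in image]
```

Because only the ordering of the two means matters in this sketch, a uniform brightness scaling of the image yields the same bits.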
Abstract:
Metadata comprising a set of gain values for creating a dominance effect is automatically generated. Automatically generating the metadata includes receiving multiple audio streams and a dominance criterion for at least one of the audio streams. A set of gains is computed for one or more audio streams based on the dominance criterion for the at least one audio stream and metadata is generated with the set of gains.
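A rough sketch of the gain computation follows. It assumes, purely for illustration, that the dominance criterion is a level margin in decibels by which the dominant stream must exceed every other stream, and that level is measured as RMS. The function names and the metadata layout are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def dominance_gains(streams, dominant, margin_db):
    """Build metadata holding linear gains that keep `dominant` on top."""
    levels = {name: rms(samples) for name, samples in streams.items()}
    # Highest level another stream may have under the assumed criterion.
    ceiling = levels[dominant] * 10 ** (-margin_db / 20)
    gains = {}
    for name, level in levels.items():
        if name == dominant or level <= ceiling:
            gains[name] = 1.0  # already satisfies the criterion
        else:
            gains[name] = ceiling / level  # attenuate down to the ceiling
    return {"dominant": dominant, "gains": gains}


streams = {
    "dialog": [0.5, -0.5, 0.5, -0.5],
    "music": [0.5, -0.5, 0.5, -0.5],
    "effects": [0.01, -0.01],
}
metadata = dominance_gains(streams, dominant="dialog", margin_db=20)
```

Here the music stream is attenuated by roughly 20 dB, while the quiet effects stream already sits below the ceiling and keeps unity gain.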
Abstract:
Techniques for scene change detection around seed points in media data are provided. Media features of many different types may be extracted from the media data. One or more statistical patterns of media features in a plurality of time-wise intervals around a plurality of seed time points of the media data may be determined using one or more types of features extractable from the media data. At least one of the one or more types of features comprises a type of features that captures structural properties, tonality including harmony and melody, timbre, rhythm, loudness, stereo mix, or a quantity of sound sources as related to the media data. A plurality of beginning scene change points and a plurality of ending scene change points in the media data may be detected, based on the one or more statistical patterns, for the plurality of seed time points in the media data.
Abstract:
Attributes are identified in media content. A classification value of the media content is computed based on the identified attributes. Thereafter, a fingerprint derived from the media content is stored or searched for based on the classification value of the media content.
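The store-or-search step can be illustrated with a small bucketed index. This is a sketch under assumptions: fingerprints are represented as integers, the classification value is simply a sorted tuple of attribute pairs, and the class and function names are hypothetical.

```python
from collections import defaultdict


def classification_value(attributes):
    # Assumed classification: a hashable, order-independent view of the attributes.
    return tuple(sorted(attributes.items()))


class ClassifiedFingerprintIndex:
    """Keeps fingerprints in buckets keyed by classification value."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def store(self, classification, fingerprint, content_id):
        self._buckets[classification].append((fingerprint, content_id))

    def search(self, classification, fingerprint, max_distance=0):
        # Only the bucket for this classification is scanned.
        matches = []
        for stored, content_id in self._buckets.get(classification, []):
            distance = bin(stored ^ fingerprint).count("1")  # Hamming distance
            if distance <= max_distance:
                matches.append(content_id)
        return matches


index = ClassifiedFingerprintIndex()
sports = classification_value({"genre": "sports", "has_speech": True})
news = classification_value({"genre": "news", "has_speech": True})
index.store(sports, 0b10110010, "clip-a")
index.store(news, 0b10110010, "clip-b")
hits = index.search(sports, 0b10110011, max_distance=1)
```

Restricting the scan to one bucket is the point of the sketch: a query classified as sports never touches fingerprints stored under news.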
Abstract:
Content identification and quality monitoring are provided. The method involves obtaining a first fingerprint derived from a first media content, processing the first media content to generate a second media content, obtaining a second fingerprint derived from the second media content, and comparing the first fingerprint and the second fingerprint to determine one or more of: a similarity between the first fingerprint and the second fingerprint that indicates that the second media content is generated from the first media content or a difference between the first fingerprint and the second fingerprint to identify a quality degradation between the first media content and the second media content.
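The comparison step might look like the sketch below. It assumes fixed-width binary fingerprints held as integers and uses the fraction of differing bits as both the similarity measure and a crude degradation indicator. The function name, the bit width, and the match threshold are illustrative assumptions.

```python
def compare_fingerprints(first, second, bits=64, match_threshold=0.75):
    """Compare two fixed-width binary fingerprints given as integers."""
    mask = (1 << bits) - 1
    differing = bin((first ^ second) & mask).count("1")  # Hamming distance
    difference = differing / bits
    similarity = 1 - difference
    return {
        # High similarity suggests the second content was generated from the first.
        "same_source": similarity >= match_threshold,
        "similarity": similarity,
        # The remaining difference serves here as a proxy for quality degradation.
        "degradation": difference,
    }


report = compare_fingerprints(0b11110000, 0b11110001, bits=8)
```

In this example one of eight bits differs, so the pair is treated as the same source with a small measured degradation.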