Abstract:
Systems and methods for noise-based interest point density pruning are disclosed herein. The systems include determining an amount of noise in an audio sample and adjusting the number of interest points within an audio sample fingerprint based on the amount of noise. Samples containing high amounts of noise correspondingly generate fingerprints with more interest points. The disclosed systems and methods allow reference fingerprints to be reduced in size while increasing the size of sample fingerprints. The benefits in scalability do not compromise the accuracy of an audio matching system using noise-based interest point density pruning.
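The relationship between noise and interest point count can be illustrated with a minimal sketch. It assumes a noise estimate already normalized to the range 0 to 1 and interest points given as (time, frequency, strength) tuples; the function names, the linear mapping, and the default budgets are hypothetical and are not taken from the disclosure.

```python
def points_to_keep(noise_level, min_points=2, max_points=6):
    """Map a noise estimate in [0, 1] to an interest point budget.

    The linear mapping is an illustrative assumption.
    """
    noise = min(max(noise_level, 0.0), 1.0)  # clamp to [0, 1]
    return min_points + round(noise * (max_points - min_points))


def prune_by_noise(points, noise_level, min_points=2, max_points=6):
    """Keep the strongest points, with a larger budget for noisier samples.

    points: list of (time, frequency, strength) tuples.
    """
    budget = points_to_keep(noise_level, min_points, max_points)
    strongest = sorted(points, key=lambda p: p[2], reverse=True)[:budget]
    return sorted(strongest)  # restore time order
```

With this sketch, a clean sample keeps only its few strongest points, while a noisy sample keeps more of them, which mirrors the behavior described above.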
Abstract:
Systems and methods for characterizing interest points within a descriptor are disclosed herein. The systems include generating a set of interest points related to an audio sample. A set of gradients relating to respective interest points in the set of interest points can be generated. A set of descriptors can then be generated based upon the set of interest points and the set of gradients and used in comparison to reference descriptors to identify the audio sample. The disclosed systems and methods provide for an audio matching system robust to pitch-shift distortion by using gradients that characterize the time-frequency neighborhood around an interest point rather than solely relying on interest points themselves. Thus, the disclosed systems and methods result in more accurate audio identification.
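One way to picture a gradient-based descriptor is sketched below. It assumes a spectrogram stored as a list of rows indexed as spec[time][frequency], interior interest points only, central differences as the gradient estimate, and a descriptor built from the gradient signs. All of these choices and names are assumptions for illustration, not the disclosed method.

```python
def gradient_at(spec, t, f):
    """Central-difference gradient of spec[t][f] along time and frequency.

    Assumes (t, f) is not on the border of the spectrogram.
    """
    d_time = (spec[t + 1][f] - spec[t - 1][f]) / 2.0
    d_freq = (spec[t][f + 1] - spec[t][f - 1]) / 2.0
    return d_time, d_freq


def gradient_descriptors(spec, points):
    """Describe each interest point by the signs of its local gradients."""
    described = []
    for t, f in points:
        d_time, d_freq = gradient_at(spec, t, f)
        described.append((t, f, d_time > 0, d_freq > 0))
    return described
```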
Abstract:
A technique for inverted client side fingerprinting and matching provides the benefits of disposable fingerprinting to identify multiple content streams from multiple clients without overloading a fingerprinting system. Rather than tasking a fingerprinting system with the generation and comparison of all fingerprints, the technique distributes some fingerprinting tasks to the clients receiving the content streams. As a result, the fingerprinting system is not bottlenecked by fingerprinting tasks. In one embodiment, the fingerprinting system can provide additional services to the clients.
Abstract:
A matching system receives probe audio samples for comparison to references of a data store. Comparisons are generated to determine a sufficient match for a portion, or first amount, of the probe sample. Ranking scores are assigned to the resulting match references. The match references are retained unless they meet a score threshold. Comparisons are continually generated with second amounts of the probe sample, and the retained references are updated with further matching references that are assigned ranking scores. The retained results are merged and, once determined to satisfy a score threshold, released as output results identifying the matching references.
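The retain, merge, and release loop can be sketched as follows. The sketch assumes each successive portion of the probe has already been compared against the data store, yielding a mapping from reference identifier to ranking score; merging by summing scores and the function name are hypothetical choices made only for illustration.

```python
def progressive_match(portion_scores, release_threshold):
    """Accumulate per-portion scores and release once a threshold is met.

    portion_scores: iterable of dicts mapping reference id -> ranking score,
    one dict per successive portion of the probe sample.
    Returns released reference ids, best first, or [] if none qualify.
    """
    retained = {}
    for scores in portion_scores:
        # Merge the new scores into the retained references (assumed: sum).
        for ref_id, score in scores.items():
            retained[ref_id] = retained.get(ref_id, 0.0) + score
        released = {r: s for r, s in retained.items() if s >= release_threshold}
        if released:
            return sorted(released, key=released.get, reverse=True)
    return []
```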
Abstract:
Systems and methods are provided herein relating to audio matching. The density and quality of interest points can be controlled to assure a small but uniform number of high quality interest points. By scoring interest points based on quality and comparing them over time, those interest points that maintain a high quality when compared with a varying number of neighboring interest points can be retained, while those interest points that do not maintain a high quality can be discarded. Thus, the scalability of an audio matching system can be improved while retaining accuracy.
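A minimal sketch of quality-based retention is given below. It assumes interest points given as (time, quality) pairs and treats a point as maintaining high quality if it stays among the best-scoring points inside several time windows of increasing radius. The window radii, the top_k rule, and the function name are illustrative assumptions.

```python
def persistent_points(points, radii=(1, 2, 4), top_k=1):
    """Keep points that rank in the top_k by quality at every window radius.

    points: list of (time, quality) pairs.
    """
    kept = []
    for t, q in points:
        survives = True
        for radius in radii:
            # Count neighbours within the window that beat this point.
            better = sum(
                1 for t2, q2 in points if abs(t2 - t) <= radius and q2 > q
            )
            if better >= top_k:
                survives = False
                break
        if survives:
            kept.append((t, q))
    return kept
```

Because a point must win against small and large neighborhoods alike, the surviving points tend to be sparse and spread out in time.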
Abstract:
Systems and methods for generating unique pitch-resistant descriptors for audio clips are provided. In one or more embodiments, a descriptor for an audio clip is generated as a function of relative magnitudes between interest points within the audio clip's time-frequency representation. A number of techniques for leveraging the relative magnitudes to generate descriptors are considered. These techniques include ordering of interest points as a function of ascending or descending magnitude, creation of binary vectors based on magnitude comparisons between pairs of points, and calculation of quantized magnitude ratios between pairs of points. Descriptors generated based on relative magnitudes according to the techniques disclosed herein are relatively invariant to common transformations to the original audio clip, such as pitch shifting, time stretching, global volume changes, equalization, and/or dynamic range compression.
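The three techniques named above can be sketched in a few lines. The sketch assumes the interest point magnitudes are available as a list of positive numbers; the function names, the log-scale quantization, and the number of levels are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def rank_order(mags):
    """Indices of the interest points sorted by descending magnitude."""
    return sorted(range(len(mags)), key=lambda i: mags[i], reverse=True)


def pairwise_bits(mags):
    """Binary vector: 1 where the first point of a pair is the larger one."""
    return [1 if mags[i] > mags[j] else 0
            for i, j in combinations(range(len(mags)), 2)]


def quantized_ratios(mags, levels=8, max_log_ratio=3.0):
    """Quantize log2 magnitude ratios of each pair into `levels` bins.

    Assumes all magnitudes are positive.
    """
    out = []
    for i, j in combinations(range(len(mags)), 2):
        r = math.log2(mags[i] / mags[j])
        r = max(-max_log_ratio, min(max_log_ratio, r))  # clip extremes
        level = int((r + max_log_ratio) / (2 * max_log_ratio) * (levels - 1) + 0.5)
        out.append(level)
    return out
```

Because each output depends only on how magnitudes compare to one another, multiplying every magnitude by the same gain (a global volume change) leaves all three descriptors unchanged.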
Abstract:
Systems and methods for characterizing interest points within a fingerprint are disclosed herein. The systems include generating a set of interest points and an anchor point related to an audio sample. A quantized absolute frequency of an anchor point can be calculated and used to calculate a set of quantized ratios. A fingerprint can then be generated based upon the set of quantized ratios and used in comparison to reference fingerprints to identify the audio sample. The disclosed systems and methods provide for an audio matching system robust to pitch-shift distortion by using quantized ratios within fingerprints rather than solely using absolute frequencies of interest points. Thus, the disclosed systems and methods result in more accurate audio identification.
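A minimal sketch of a ratio-based fingerprint follows. It assumes frequencies in hertz, a coarse logarithmic quantization of the anchor frequency, and a finer logarithmic quantization of each point-to-anchor ratio. The step sizes and function names are hypothetical, and the sketch derives the ratios from the unquantized anchor frequency, which may differ from the disclosed calculation.

```python
import math


def quantize_log(value, step):
    """Quantize log2(value) to the nearest multiple of `step` (as an index)."""
    return round(math.log2(value) / step)


def ratio_fingerprint(anchor_hz, point_hz, anchor_step=0.25, ratio_step=1 / 12):
    """Build (quantized anchor frequency, quantized frequency ratios).

    anchor_hz: frequency of the anchor point.
    point_hz: frequencies of the other interest points.
    """
    anchor_q = quantize_log(anchor_hz, anchor_step)
    ratios_q = tuple(quantize_log(f / anchor_hz, ratio_step) for f in point_hz)
    return anchor_q, ratios_q
```

A pitch shift multiplies every frequency by the same factor, so the ratios are unaffected, and the coarse anchor quantization absorbs small shifts in the absolute frequency.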
Abstract:
An audio recognition service recognizes an audio sample across multiple content types. At least a partial set of results generated by the service are returned to a client while the audio sample is still being recorded and/or transmitted. The client additionally displays the results in real-time or near real-time to the user. The audio sample can be sent over a first HTTP connection and the results can be returned over a second HTTP connection. The audio recognition service further processes check-in selections received from the client for content items indicated by the results. Responsive to receiving the check-in selections, the service determines whether a user is eligible for a reward. If the user is eligible, the service provides the reward.
Abstract:
This disclosure relates to dynamic display of content consumption by geographic location. A recognition component recognizes content being consumed by a set of users, and identifies geographic locations of the consumption and a set of characteristics associated with the consumption. An aggregation component ranks the consumed content based on a subset of the characteristics associated with the consumption, and a display component generates a map displaying subsets of the consumed content as a function of respective rankings and geographic location.
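The aggregation step can be sketched as a per-location ranking. The sketch assumes recognition events arrive as (location, content) pairs and ranks content by how often it was recognized at each location; the disclosure refers more generally to a subset of consumption characteristics, so the count-based ranking and the function name are assumptions.

```python
from collections import Counter, defaultdict


def top_content_by_location(events, top_n=2):
    """Return the top_n most frequently consumed items for each location.

    events: iterable of (location, content) pairs.
    """
    counts = defaultdict(Counter)
    for location, content in events:
        counts[location][content] += 1
    return {
        location: [content for content, _ in counter.most_common(top_n)]
        for location, counter in counts.items()
    }
```

A display component could then place each location's ranked list on a map at that location's coordinates.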
Abstract:
Systems and methods for audio matching using interest point overlap are disclosed herein. The systems include determining at least one matching reference segment based on a probe segment. Interest points for both the at least one matching reference segment and the probe segment can be generated. Probe segment interest points and matching reference segment interest points can be time aligned and frequency aligned. A count can be generated based on a number of overlapping interest points between each set of reference interest points and the set of probe segment interest points. The disclosed systems and methods allow false positive references to be identified and eliminated based on the count. Eliminating false positive matches improves the accuracy of an audio matching system.
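The overlap count can be sketched as follows. The sketch assumes interest points given as integer (time bin, frequency bin) pairs, alignment expressed as known time and frequency offsets for each candidate reference, and a small tolerance when testing for overlap. The names, the tolerance, and the minimum overlap are hypothetical.

```python
def overlap_count(probe, reference, time_offset=0, freq_offset=0, tol=1):
    """Count probe points that land on a reference point after alignment."""
    ref = set(reference)
    count = 0
    for t, f in probe:
        t2, f2 = t + time_offset, f + freq_offset  # align probe to reference
        if any((t2 + dt, f2 + df) in ref
               for dt in range(-tol, tol + 1)
               for df in range(-tol, tol + 1)):
            count += 1
    return count


def drop_false_positives(probe, candidates, min_overlap):
    """Keep candidate references whose overlap count reaches min_overlap.

    candidates: dict mapping name -> (points, time_offset, freq_offset).
    """
    return [name for name, (points, dt, df) in candidates.items()
            if overlap_count(probe, points, dt, df) >= min_overlap]
```

A candidate that matched on descriptors alone but shares few aligned interest points with the probe receives a low count and is discarded.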