-
Publication No.: US08805560B1
Publication date: 2014-08-12
Application No.: US13276318
Filing date: 2011-10-18
CPC classification: G06F17/30743, G06F17/30758, G10L25/54
Abstract: Systems and methods for noise based interest point density pruning are disclosed herein. The systems include determining an amount of noise in an audio sample and adjusting the amount of interest points within an audio sample fingerprint based on the amount of noise. Samples containing high amounts of noise correspondingly generate fingerprints with more interest points. The disclosed systems and methods allow reference fingerprints to be reduced in size while increasing the size of sample fingerprints. The benefits in scalability do not compromise the accuracy of an audio matching system using noise based interest point density pruning.
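The density adjustment described in this abstract can be illustrated with a minimal Python sketch. It assumes the noise amount has already been estimated and normalized to the range 0 to 1, and that interest points are (time, frequency, magnitude) tuples. The function names, the linear mapping, and the `min_points`/`max_points` defaults are hypothetical and do not come from the patent.

```python
def target_point_count(noise_level, min_points=20, max_points=100):
    """Map a normalized noise level in [0, 1] to a number of interest points.

    Assumption: a linear mapping, so noisier samples keep more points.
    """
    clamped = min(max(noise_level, 0.0), 1.0)
    return round(min_points + clamped * (max_points - min_points))


def prune_by_noise(points, noise_level, min_points=20, max_points=100):
    """Keep the strongest interest points, with the count set by the noise level.

    `points` is a list of (time, frequency, magnitude) tuples.
    """
    keep = target_point_count(noise_level, min_points, max_points)
    strongest = sorted(points, key=lambda p: p[2], reverse=True)[:keep]
    return sorted(strongest)  # restore time order for fingerprinting


if __name__ == "__main__":
    demo = [(t, 100 + t, float(t)) for t in range(200)]
    print(len(prune_by_noise(demo, 0.0)), len(prune_by_noise(demo, 1.0)))
```

Under these assumptions a clean sample keeps few points and a noisy one keeps many, which mirrors the abstract's statement that sample fingerprints grow while reference fingerprints shrink.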
-
Publication No.: US09202472B1
Publication date: 2015-12-01
Application No.: US13434832
Filing date: 2012-03-29
IPC classification: G06F17/30, G10L19/018
CPC classification: G10L19/018, G10L25/03
Abstract: Systems and methods for generating unique pitch-resistant descriptors for audio clips are provided. In one or more embodiments, a descriptor for an audio clip is generated as a function of relative magnitudes between interest points within the audio clip's time-frequency representation. A number of techniques for leveraging the relative magnitudes to generate descriptors are considered. These techniques include ordering of interest points as a function of ascending or descending magnitude, creation of binary vectors based on magnitude comparisons between pairs of points, and calculation of quantized magnitude ratios between pairs of points. Descriptors generated based on relative magnitudes according to the techniques disclosed herein are relatively invariant to common transformations to the original audio clip, such as pitch shifting, time stretching, global volume changes, equalization, and/or dynamic range compression.
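The three relative-magnitude techniques named in this abstract (ordering, pairwise binary vectors, quantized ratios) can be sketched in Python as follows. This is an illustration only: the function names, the log-scale quantizer, and the `levels`/`max_log2` parameters are assumptions, and magnitudes are assumed to be positive.

```python
import math
from itertools import combinations


def rank_order(magnitudes):
    """Indices of interest points sorted by descending magnitude."""
    return sorted(range(len(magnitudes)), key=lambda i: magnitudes[i], reverse=True)


def pairwise_bits(magnitudes):
    """One bit per pair of points: 1 if the first point is stronger."""
    return [1 if magnitudes[i] > magnitudes[j] else 0
            for i, j in combinations(range(len(magnitudes)), 2)]


def quantized_ratios(magnitudes, levels=8, max_log2=3.0):
    """Quantize log2(m_i / m_j) for each pair into `levels` uniform bins."""
    step = 2 * max_log2 / levels
    codes = []
    for i, j in combinations(range(len(magnitudes)), 2):
        log_ratio = math.log2(magnitudes[i] / magnitudes[j])
        clamped = max(-max_log2, min(max_log2, log_ratio))
        codes.append(min(levels - 1, int((clamped + max_log2) / step)))
    return codes


if __name__ == "__main__":
    mags = [0.9, 0.2, 0.5, 0.7]
    louder = [2 * m for m in mags]  # global volume change
    print(rank_order(mags) == rank_order(louder))
    print(pairwise_bits(mags) == pairwise_bits(louder))
    print(quantized_ratios(mags) == quantized_ratios(louder))
```

The usage lines only demonstrate invariance to a global volume change, because scaling every magnitude by the same factor leaves orderings and ratios untouched. The other transformations listed in the abstract are not exercised by this sketch.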
-
Publication No.: US08886543B1
Publication date: 2014-11-11
Application No.: US13296899
Filing date: 2011-11-15
Applicant: Matthew Sharifi, George Tzanetakis, Annie Chen, Dominik Roblek
Inventor: Matthew Sharifi, George Tzanetakis, Annie Chen, Dominik Roblek
IPC classification: G10L11/00
CPC classification: G10L19/018
Abstract: System and methods for characterizing interest points within a fingerprint are disclosed herein. The systems include generating a set of interest points and an anchor point related to an audio sample. A quantized absolute frequency of an anchor point can be calculated and used to calculate a set of quantized ratios. A fingerprint can then be generated based upon the set of quantized ratios and used in comparison to reference fingerprints to identify the audio sample. The disclosed systems and methods provide for an audio matching system robust to pitch-shift distortion by using quantized ratios within fingerprints rather than solely using absolute frequencies of interest points. Thus, the disclosed system and methods result in more accurate audio identification.
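A small Python sketch can show why frequency ratios relative to an anchor point resist pitch shifting while absolute frequencies do not. The log-scale quantizer, the `steps_per_octave` parameter, and the function names are assumptions made for illustration; the patent abstract does not specify the quantization scheme.

```python
import math


def quantize_log(value, steps_per_octave=12):
    """Quantize a positive value on a log2 scale (12 steps per octave assumed)."""
    return round(steps_per_octave * math.log2(value))


def ratio_fingerprint(anchor_freq, point_freqs, steps_per_octave=12):
    """Return (quantized anchor frequency, quantized ratios to the anchor)."""
    anchor_bin = quantize_log(anchor_freq, steps_per_octave)
    ratios = [quantize_log(f / anchor_freq, steps_per_octave) for f in point_freqs]
    return anchor_bin, ratios


if __name__ == "__main__":
    original = ratio_fingerprint(440.0, [660.0, 880.0, 1100.0])
    shift = 1.05  # hypothetical pitch-shift factor applied to every frequency
    shifted = ratio_fingerprint(440.0 * shift, [f * shift for f in (660.0, 880.0, 1100.0)])
    print(original, shifted)
```

In this example the quantized anchor frequency changes under the shift, while the quantized ratios stay the same, which is the property the abstract relies on for robustness to pitch-shift distortion.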
-
Publication No.: US08738633B1
Publication date: 2014-05-27
Application No.: US13362905
Filing date: 2012-01-31
Applicant: Matthew Sharifi, Sergey Ioffe, Jay Yagnik, Gheorghe Postelnicu, Dominik Roblek, George Tzanetakis
Inventor: Matthew Sharifi, Sergey Ioffe, Jay Yagnik, Gheorghe Postelnicu, Dominik Roblek, George Tzanetakis
IPC classification: G06F17/30
CPC classification: G06K9/6267, G06F17/3002, G06F17/30244, G06K9/00013
Abstract: This disclosure relates to transformation invariant media matching. A fingerprinting component can generate a transformation invariant identifier for media content by adaptively encoding the relative ordering of interest points in media content. The interest points can be grouped into subsets, and stretch invariant descriptors can be generated for the subsets based on ratios of coordinates of interest points included in the subsets. The stretch invariant descriptors can be aggregated into a transformation invariant identifier. An identification component compares the identifier against a set of identifiers for known media content, and the media content can be matched or identified as a function of the comparison.
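One way to picture a stretch invariant descriptor is with ratios of time differences inside small subsets of interest points. The Python sketch below assumes subsets are consecutive triples of strictly increasing time coordinates and that each ratio is quantized into `levels` bins. Both choices, and the function name, are illustrative assumptions rather than the method claimed in the patent.

```python
def stretch_invariant_descriptor(times, levels=16):
    """Quantized ratio (t2 - t1) / (t3 - t1) for each consecutive triple.

    `times` must be strictly increasing, so every ratio lies in (0, 1).
    """
    codes = []
    for t1, t2, t3 in zip(times, times[1:], times[2:]):
        ratio = (t2 - t1) / (t3 - t1)
        codes.append(int(ratio * levels))
    return codes


if __name__ == "__main__":
    times = [0.0, 1.0, 3.0, 4.0, 8.0]
    stretched = [t * 1.25 for t in times]  # uniform time stretch
    print(stretch_invariant_descriptor(times))
    print(stretch_invariant_descriptor(stretched))
```

A uniform stretch multiplies every time difference by the same factor, so each ratio, and therefore each code, is unchanged. Aggregating the per-subset codes into a single identifier is left out of this sketch.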
-
Publication No.: US09098576B1
Publication date: 2015-08-04
Application No.: US13274725
Filing date: 2011-10-17
CPC classification: G06F17/30743, G06F17/3074
Abstract: Systems and methods for audio matching are disclosed herein. In one embodiment, a system includes both interest point mixing and fingerprint mixing by using multiple interest point detection methods in parallel. Since multiple interest point detection methods are used in parallel, accuracy of audio matching is improved across a wide variety of audio signals. In addition the scalability of the disclosed audio matching system is increased by matching the fingerprint of an audio sample with a fingerprint of a reference sample versus matching an entire spectrogram. Accordingly, a more accurate and more general solution to audio matching can be accomplished.
-
Publication No.: US08831763B1
Publication date: 2014-09-09
Application No.: US13276316
Filing date: 2011-10-18
Abstract: System and methods for intelligently pruning interest points are disclosed herein. The systems include generating a plurality of distorted audio samples and associated distorted interest points based upon a clean audio sample. Interest points that are common to sets of distorted interest points are retained with interest points not robust to distortion discarded. The disclosed systems and methods therefore can provide for a scalable audio matching solution by eliminating interest points in reference sample fingerprints. The set of pruned interest points are robust to distortion and the benefits of both scalability and accuracy can be had.
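The retention rule in this abstract (keep interest points that survive distortion, discard the rest) can be sketched in Python. The sketch assumes interest points are already quantized to hashable (time_bin, frequency_bin) tuples and that the distorted samples were produced elsewhere. The function name and the `min_fraction` threshold are hypothetical.

```python
from collections import Counter


def prune_interest_points(clean_points, distorted_point_sets, min_fraction=1.0):
    """Keep clean interest points that reappear in enough distorted versions.

    With min_fraction=1.0 a point must be common to every distorted set.
    """
    clean = set(clean_points)
    counts = Counter()
    for points in distorted_point_sets:
        counts.update(clean & set(points))
    needed = min_fraction * len(distorted_point_sets)
    return [p for p in clean_points if counts[p] >= needed]


if __name__ == "__main__":
    clean = [(0, 10), (1, 20), (2, 30), (3, 40)]
    distorted = [
        [(0, 10), (1, 20), (3, 40)],
        [(0, 10), (2, 30), (3, 40)],
        [(0, 10), (3, 40), (5, 50)],
    ]
    print(prune_interest_points(clean, distorted))
```

Lowering `min_fraction` trades fingerprint size for tolerance: more points are kept, but some of them are less robust to distortion.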
-
Publication No.: US20110035035A1
Publication date: 2011-02-10
Application No.: US12906584
Filing date: 2010-10-18
Applicant: Rehan M. Khan, George Tzanetakis
Inventor: Rehan M. Khan, George Tzanetakis
IPC classification: G06F17/00
CPC classification: G06F16/683, G06F16/634, G06F16/635, G06F16/639, G06F16/68, Y10S707/99945, Y10S707/99948
Abstract: A fingerprint is generated from an unknown audio signal by dividing the unknown audio signal into bins, where each bin includes points representing a feature space. Each of the points is mapped to one of a plurality of predetermined cluster centers based on the distance between each point and the plurality of cluster centers, each cluster center being associated with an element of a codebook. A string of elements is generated based on the mapping and compressed.
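The codebook mapping in this abstract can be sketched in Python as nearest-center assignment followed by compression of the resulting string. The sketch assumes one feature point per bin, Euclidean distance, single-character codebook elements, and run-length encoding as the compression step. None of these specifics, nor the function names, come from the patent.

```python
import math
from itertools import groupby


def nearest_center(point, centers):
    """Index of the cluster center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def fingerprint_string(points, centers, codebook):
    """Map each point to the codebook element of its nearest cluster center."""
    return "".join(codebook[nearest_center(p, centers)] for p in points)


def run_length_encode(symbols):
    """Compress a symbol string, e.g. 'AAB' becomes 'A2B1'."""
    return "".join(f"{s}{len(list(g))}" for s, g in groupby(symbols))


if __name__ == "__main__":
    centers = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    codebook = "ABC"  # one element per cluster center
    points = [(0.1, 0.2), (0.0, 0.1), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9), (5.0, 5.0)]
    raw = fingerprint_string(points, centers, codebook)
    print(raw, run_length_encode(raw))
```

`math.dist` requires Python 3.8 or later. Run-length encoding is used here only because consecutive bins often map to the same center; any lossless compressor would serve the same illustrative purpose.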
-
Publication No.: US07853344B2
Publication date: 2010-12-14
Application No.: US11839768
Filing date: 2007-08-16
Applicant: Rehan M. Khan, George Tzanetakis
Inventor: Rehan M. Khan, George Tzanetakis
CPC classification: G06F17/30758, G06F17/30743, G06F17/30749, G06F17/30761, G06F17/30772, Y10S707/99945, Y10S707/99948
Abstract: A method and system for analyzing audio files is provided. Plural audio file feature vector values based on an audio file's content are determined and the audio file feature vectors are stored in a database that also stores other pre-computed audio file features. The process determines if the audio files feature vectors match the stored audio file vectors. The process also associates a plurality of known attributes to the audio file.
-
Publication No.: US20070282935A1
Publication date: 2007-12-06
Application No.: US11839768
Filing date: 2007-08-16
Applicant: Rehan Khan, George Tzanetakis
Inventor: Rehan Khan, George Tzanetakis
IPC classification: G06F1/02
CPC classification: G06F17/30758, G06F17/30743, G06F17/30749, G06F17/30761, G06F17/30772, Y10S707/99945, Y10S707/99948
Abstract: A method and system for analyzing audio files is provided. Plural audio file feature vector values based on an audio file's content are determined and the audio file feature vectors are stored in a database that also stores other pre-computed audio file features. The process determines if the audio files feature vectors match the stored audio file vectors. The process also associates a plurality of known attributes to the audio file.
-
Publication No.: US07277766B1
Publication date: 2007-10-02
Application No.: US09695457
Filing date: 2000-10-24
Applicant: Rehan M. Khan, George Tzanetakis
Inventor: Rehan M. Khan, George Tzanetakis
IPC classification: G06F17/00
CPC classification: G06F17/30758, G06F17/30743, G06F17/30749, G06F17/30761, G06F17/30772, Y10S707/99945, Y10S707/99948
Abstract: A method and system for analyzing audio files is provided. Plural audio file feature vector values based on an audio file's content are determined and the audio file feature vectors are stored in a database that also stores other pre-computed audio file features. The process determines if the audio files feature vectors match the stored audio file vectors. The process also associates a plurality of known attributes to the audio file.