Method, apparatus, and medium for detecting frequency extension coding in the coding history of an audio signal
    41.
    Granted invention patent
    Legal status: In force

    Publication No.: US09117440B2

    Publication date: 2015-08-25

    Application No.: US14116113

    Filing date: 2012-04-30

    Abstract: The present document relates to audio forensics, notably the blind detection of traces of parametric audio encoding/decoding. In particular, the present document relates to the detection of parametric frequency extension audio coding, such as spectral band replication (SBR) or spectral extension (SPX), from uncompressed waveforms such as PCM (pulse code modulation) encoded waveforms. A method for detecting frequency extension coding history in a time domain audio signal is described. The method may comprise transforming the time domain audio signal into a frequency domain, thereby generating a plurality of subband signals in a corresponding plurality of subbands comprising low and high frequency subbands; determining a degree of relationship between subband signals in the low frequency subbands and subband signals in the high frequency subbands, wherein the degree of relationship is determined based on the plurality of subband signals; and determining frequency extension coding history if the degree of relationship is greater than a relationship threshold.
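The steps named in this abstract (transform to subbands, measure a low/high relationship, compare against a threshold) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the patented method: a naive DFT over non-overlapping frames stands in for the transform, Pearson correlation of subband magnitude envelopes stands in for the "degree of relationship", and the names `frame_len`, `crossover`, and `threshold` are hypothetical.

```python
import cmath
import math

def subband_magnitudes(signal, frame_len):
    """Return, for each DFT bin (subband), its magnitude in every frame."""
    n_frames = len(signal) // frame_len
    n_bins = frame_len // 2
    bands = [[0.0] * n_frames for _ in range(n_bins)]
    for f in range(n_frames):
        frame = signal[f * frame_len:(f + 1) * frame_len]
        for k in range(n_bins):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            bands[k][f] = abs(acc)
    return bands

def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0

def degree_of_relationship(bands, crossover):
    """Assumed measure: largest correlation between a low and a high subband."""
    low, high = bands[:crossover], bands[crossover:]
    return max(correlation(lo, hi) for lo in low for hi in high)

def has_frequency_extension_history(signal, frame_len=16, crossover=4,
                                    threshold=0.9):
    """Flag the signal if the low/high relationship exceeds the threshold."""
    bands = subband_magnitudes(signal, frame_len)
    return degree_of_relationship(bands, crossover) > threshold
```

A practical detector would also have to ignore near-silent subbands, whose envelopes consist mostly of numerical noise; this sketch does not attempt that.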

    AUDIO ENCODING METHOD AND SYSTEM FOR GENERATING A UNIFIED BITSTREAM DECODABLE BY DECODERS IMPLEMENTING DIFFERENT DECODING PROTOCOLS
    42.
    Invention patent application
    Legal status: In force

    Publication No.: US20140358554A1

    Publication date: 2014-12-04

    Application No.: US14009503

    Filing date: 2012-04-05

    IPC classification: G10L19/002

    CPC classification: G10L19/002 G10L19/167

    Abstract: In a class of embodiments, an audio encoding system (typically, a perceptual encoding system) is configured to generate a single (“unified”) bitstream that is compatible with (i.e., decodable by) both a first decoder configured to decode audio data encoded in accordance with a first encoding protocol (e.g., the multichannel Dolby Digital Plus, or DD+, protocol) and a second decoder configured to decode audio data encoded in accordance with a second encoding protocol (e.g., the stereo AAC, HE AAC v1, or HE AAC v2 protocol). The unified bitstream can include both encoded data (e.g., bursts of data) decodable by the first decoder (and ignored by the second decoder) and encoded data (e.g., other bursts of data) decodable by the second decoder (and ignored by the first decoder). In effect, the second encoding format is hidden within the unified bitstream when the bitstream is decoded by the first decoder, and the first encoding format is hidden within the unified bitstream when the bitstream is decoded by the second decoder. The format of the unified bitstream generated in accordance with the invention may eliminate the need for transcoding elements throughout an entire media chain and/or ecosystem. Other aspects of the invention are an encoding method performed by any embodiment of the inventive encoder, a decoding method performed by any embodiment of the inventive decoder, and a computer readable medium (e.g., disc) which stores code for implementing any embodiment of the inventive method.
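The idea of bursts that one decoder reads and the other ignores can be pictured with a toy sketch. It is not the DD+ or AAC bitstream syntax: the tags `"first"` and `"second"`, the tuple layout, and both function names are hypothetical, and the abstract does not say how a real decoder recognizes the bursts meant for it.

```python
def build_unified_bitstream(bursts_first, bursts_second):
    """Interleave bursts for two hypothetical protocols into one stream."""
    stream = []
    for i in range(max(len(bursts_first), len(bursts_second))):
        if i < len(bursts_first):
            stream.append(("first", bursts_first[i]))
        if i < len(bursts_second):
            stream.append(("second", bursts_second[i]))
    return stream

def decode(stream, protocol):
    """Keep only the bursts for this protocol; all other bursts are ignored."""
    return b"".join(payload for tag, payload in stream if tag == protocol)
```

Each decoder sees a complete stream in its own format, which is why no transcoding step is needed between them in this picture.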

    Robust media fingerprints
    43.
    Granted invention patent
    Legal status: In force

    Publication No.: US08700194B2

    Publication date: 2014-04-15

    Application No.: US13060032

    Filing date: 2009-08-26

    IPC classification: G06F17/00

    CPC classification: G10L19/018

    Abstract: Robust media fingerprints are derived from a portion of audio content. A portion of content in an audio signal is categorized. The audio content is characterized based, at least in part, on one or more of its features. The features may include a component that relates to one of several sound categories, e.g., speech and/or noise, which may be mixed with the audio signal. Upon categorizing the audio content as free of the speech or noise related components, the audio signal component is processed. Upon categorizing the audio content as including the speech related component and/or the noise related components, the speech or noise related components are separated from the audio signal. The audio signal is processed independent of the speech related component and/or the noise related component. Processing the audio signal includes computing the audio fingerprint, which reliably corresponds to the audio signal.

    Scene Change Detection Around a Set of Seed Points in Media Data
    45.
    Invention patent application
    Legal status: In force

    Publication No.: US20130287214A1

    Publication date: 2013-10-31

    Application No.: US13997860

    Filing date: 2011-12-15

    IPC classification: H04R29/00

    Abstract: Techniques for scene change detection around seed points in media data are provided. Media features of many different types may be extracted from the media data. One or more statistical patterns of media features in a plurality of time-wise intervals around a plurality of seed time points of the media data may be determined using one or more types of features extractable from the media data. At least one of the one or more types of features comprises a type of features that captures structural properties, tonality including harmony and melody, timbre, rhythm, loudness, stereo mix, or a quantity of sound sources as related to the media data. A plurality of beginning scene change points and a plurality of ending scene change points in the media data may be detected, based on the one or more statistical patterns, for the plurality of seed time points in the media data.

    Scalable media fingerprint extraction
    46.
    Granted invention patent
    Legal status: In force

    Publication No.: US08571255B2

    Publication date: 2013-10-29

    Application No.: US13142355

    Filing date: 2010-01-07

    IPC classification: G06K9/00

    Abstract: Derivation of a fingerprint includes generating feature matrices based on one or more training images, generating projection matrices based on the feature matrices in a training process, and deriving a fingerprint for one or more images by, at least in part, projecting a feature matrix based on the one or more images onto the projection matrices generated in the training process.
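The projection step in this abstract can be sketched as follows. The training process that produces the projection matrices is not shown, and turning each projected value into one bit by its sign is an assumption added for illustration; the abstract itself only says that a feature matrix is projected onto the trained matrices. The names `matmul` and `fingerprint` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def fingerprint(feature_matrix, projection_matrices):
    """Project the feature matrix onto each projection matrix and keep one
    bit per projected value (1 if positive, else 0; an assumed rule)."""
    bits = []
    for projection in projection_matrices:
        projected = matmul(projection, feature_matrix)
        for row in projected:
            bits.extend(1 if value > 0 else 0 for value in row)
    return bits
```

For example, with the feature matrix `[[1.0, -2.0], [3.0, 0.5]]` and the two single-row projection matrices `[[1.0, 0.0]]` and `[[-1.0, 1.0]]`, the projected values are `[1.0, -2.0]` and `[2.0, 2.5]`, giving the bits `[1, 0, 1, 1]`.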

    Repetition Detection in Media Data
    47.
    Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20130275421A1

    Publication date: 2013-10-17

    Application No.: US13997847

    Filing date: 2011-12-15

    IPC classification: G06F17/30

    Abstract: Techniques for repetition detection in media data are provided. Media features of many different types may be extracted from the media data. Query sequences of fingerprints may be selected from time intervals that begin at query times. Matched sequences of fingerprints may be determined. A set of offset values may be determined based on the matched sequences of fingerprints. This set of offset values may be further refined into a set of significant time points using a relatively targeted search and comparison method based on the media features of a second type extracted from the media data.
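The query, match, and offset steps can be sketched under simplifying assumptions: fingerprints are compared for exact equality (a practical system would more likely tolerate small bit differences), every position serves as a query time, and an offset is kept when enough queries vote for it. The refinement with a second feature type is not shown. The names `candidate_offsets`, `query_len`, and `min_votes` are hypothetical.

```python
from collections import Counter

def candidate_offsets(fingerprints, query_len, min_votes):
    """Return offsets at which query sequences repeat, with enough votes."""
    votes = Counter()
    last_start = len(fingerprints) - query_len
    for query_time in range(last_start + 1):
        query = fingerprints[query_time:query_time + query_len]
        # Only look forward in time so each repetition is counted once.
        for match_time in range(query_time + 1, last_start + 1):
            if fingerprints[match_time:match_time + query_len] == query:
                votes[match_time - query_time] += 1
    return sorted(offset for offset, count in votes.items()
                  if count >= min_votes)
```

Voting over many query times is what separates a genuine repeated segment from a chance match of a single short sequence.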

    Multimode coding of speech-like and non-speech-like signals
    48.
    Granted invention patent
    Legal status: In force

    Publication No.: US08392179B2

    Publication date: 2013-03-05

    Application No.: US12921752

    Filing date: 2009-03-12

    IPC classification: G10L11/06

    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select codevectors and/or associated gain factors from the codebooks by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
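The selection by error minimization can be illustrated with a generic analysis-by-synthesis search. This is a simplified sketch, not the patented encoder: it picks a single best codevector across all codebooks rather than combining one excitation per codebook, uses squared error as the difference measure, and omits perceptual weighting. The names `lpc_synthesis` and `search_codebooks` are hypothetical.

```python
def lpc_synthesis(excitation, lpc):
    """All-pole synthesis filter: y[n] = x[n] - sum_k lpc[k] * y[n-1-k]."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out

def search_codebooks(target, codebooks, lpc):
    """Return (error, codebook index, codevector index, gain) for the
    codevector whose scaled synthesis is closest to the target."""
    best = None
    for cb_index, codebook in enumerate(codebooks):
        for cv_index, codevector in enumerate(codebook):
            synth = lpc_synthesis(codevector, lpc)
            energy = sum(s * s for s in synth)
            if energy == 0:
                continue
            # Least-squares optimal gain for this codevector.
            gain = sum(t * s for t, s in zip(target, synth)) / energy
            error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
            if best is None or error < best[0]:
                best = (error, cb_index, cv_index, gain)
    return best
```

A decoder following the same assumptions would rebuild the signal by scaling the indexed codevector by the transmitted gain and passing it through `lpc_synthesis`.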

    Media fingerprints that reliably correspond to media content
    49.
    Granted invention patent
    Legal status: In force

    Publication No.: US08351643B2

    Publication date: 2013-01-08

    Application No.: US12681598

    Filing date: 2008-10-06

    IPC classification: G06K9/00

    Abstract: Quantized energy values are accessed to initially represent a temporally related group of content elements in a media sequence. The values are accessed over a matrix of regions into which the initial representation is partitioned. The initial representation may be downsampled and/or cropped from the content. A basis vector set is estimated in a dimensional space from the values. The initial representation is transformed into a subsequent representation, which is in another dimensional space. The subsequent representation projects the initial representation, based on the basis vectors. The subsequent representation reliably corresponds to the media content portion over a change in a geometric orientation thereof. This is repeated for other media content portions of the group, and the subsequent representations of the first and other portions are averaged or transformed over time. The averaged/transformed values reliably correspond to the content portion over speed changes. The initial representation may include spatial or transform related information.
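Only the first stage of this abstract, quantized energy values over a matrix of regions, is sketched here. Basis vector estimation and projection are not shown. The uniform grid, the sum-of-squares energy, and the fixed quantization step are assumptions, and the names `region_energies` and `quantize` are hypothetical.

```python
def region_energies(frame, rows, cols):
    """Partition a 2-D frame into rows x cols regions and sum the squared
    sample values in each region. Assumes the frame divides evenly."""
    region_h = len(frame) // rows
    region_w = len(frame[0]) // cols
    energies = []
    for r in range(rows):
        row = []
        for c in range(cols):
            row.append(sum(frame[y][x] ** 2
                           for y in range(r * region_h, (r + 1) * region_h)
                           for x in range(c * region_w, (c + 1) * region_w)))
        energies.append(row)
    return energies

def quantize(values, step):
    """Uniformly quantize each energy to an integer level."""
    return [[int(v // step) for v in row] for row in values]
```

For a 4x4 frame with rows `[1, 1, 2, 2]`, `[1, 1, 2, 2]`, `[0, 0, 3, 3]`, `[0, 0, 3, 3]` and a 2x2 grid, the region energies are `[[4, 16], [0, 36]]`, which quantize to `[[0, 2], [0, 4]]` with a step of 8.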
