Patent search: ap:("Ajay Divakaran" OR "Ziyou Xiong" OR "Regunathan Radhakrishnan") AND inv:"Regunathan Radhakrishnan" (page 5)

41.

Granted invention patent
Method, apparatus, and medium for detecting frequency extension coding in the coding history of an audio signal (legal status: in force)

Publication No.: US09117440B2

Publication date: 2015-08-25

Application No.: US14116113

Filing date: 2012-04-30

Applicants: Harald H. Mundt, Arijit Biswas, Regunathan Radhakrishnan

Inventors: Harald H. Mundt, Arijit Biswas, Regunathan Radhakrishnan

IPC classification: G10L19/00, G10L21/02, G10L25/03, G10L19/008, G10L21/038

CPC classification: G10L19/00, G10L19/008, G10L21/02, G10L21/038, G10L25/03

Abstract: The present document relates to audio forensics, notably the blind detection of traces of parametric audio encoding/decoding. In particular, the present document relates to the detection of parametric frequency extension audio coding, such as spectral band replication (SBR) or spectral extension (SPX), from uncompressed waveforms such as PCM (pulse code modulation) encoded waveforms. A method for detecting frequency extension coding history in a time domain audio signal is described. The method may comprise transforming the time domain audio signal into a frequency domain, thereby generating a plurality of subband signals in a corresponding plurality of subbands comprising low and high frequency subbands; determining a degree of relationship between subband signals in the low frequency subbands and subband signals in the high frequency subbands, wherein the degree of relationship is determined based on the plurality of subband signals; and determining frequency extension coding history if the degree of relationship is greater than a relationship threshold.
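
The three steps named in this abstract (transform, degree of relationship, threshold) can be illustrated with a minimal Python sketch. It is not the patented method: the naive per-frame DFT, the use of a Pearson correlation between low-band and high-band magnitude envelopes as the "degree of relationship", the 0.9 threshold, and all function names (including the `two_tone` test-signal helper) are assumptions made for illustration.

```python
import cmath
import math


def band_envelopes(signal, frame_len=32, split=8):
    """Return per-frame magnitude sums for the low and high subband groups.

    A naive DFT over non-overlapping frames stands in for the filterbank
    (assumption). Bins 1..split-1 form the low band, bins split..frame_len/2-1
    the high band.
    """
    low_env, high_env = [], []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        mags = [
            abs(sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(frame)))
            for k in range(frame_len // 2)
        ]
        low_env.append(sum(mags[1:split]))  # skip the DC bin
        high_env.append(sum(mags[split:]))
    return low_env, high_env


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def has_frequency_extension_history(signal, threshold=0.9, frame_len=32, split=8):
    """Flag the signal when the low/high envelope correlation exceeds the threshold."""
    low_env, high_env = band_envelopes(signal, frame_len, split)
    return pearson(low_env, high_env) > threshold


def two_tone(env_low, env_high, frame_len=32, low_bin=3, high_bin=12):
    """Hypothetical test signal: one low and one high tone with per-frame gains."""
    samples = []
    for gain_low, gain_high in zip(env_low, env_high):
        for n in range(frame_len):
            samples.append(
                gain_low * math.cos(2 * math.pi * low_bin * n / frame_len)
                + gain_high * math.cos(2 * math.pi * high_bin * n / frame_len)
            )
    return samples
```

A signal whose high band follows the low-band envelope (as a band-replication decoder would tend to produce) is flagged, while one whose high band varies independently is not. A practical detector would likely use a proper filterbank and a more robust relationship measure.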


42.

Invention application
AUDIO ENCODING METHOD AND SYSTEM FOR GENERATING A UNIFIED BITSTREAM DECODABLE BY DECODERS IMPLEMENTING DIFFERENT DECODING PROTOCOLS (legal status: in force)

Publication No.: US20140358554A1

Publication date: 2014-12-04

Application No.: US14009503

Filing date: 2012-04-05

Applicants: Jeffrey C. Riedmiller, Farhad Farahani, Michael Schug, Regunathan Radhakrishnan, Mark S. Vinton

Inventors: Jeffrey C. Riedmiller, Farhad Farahani, Michael Schug, Regunathan Radhakrishnan, Mark S. Vinton

IPC classification: G10L19/002

CPC classification: G10L19/002, G10L19/167

Abstract: In a class of embodiments, an audio encoding system (typically, a perceptual encoding system) is configured to generate a single (“unified”) bitstream that is compatible with (i.e., decodable by) a first decoder configured to decode audio data encoded in accordance with a first encoding protocol (e.g., the multichannel Dolby Digital Plus, or DD+, protocol) and a second decoder configured to decode audio data encoded in accordance with a second encoding protocol (e.g., the stereo AAC, HE AAC v1, or HE AAC v2 protocol). The unified bitstream can include both encoded data (e.g., bursts of data) decodable by the first decoder (and ignored by the second decoder) and encoded data (e.g., other bursts of data) decodable by the second decoder (and ignored by the first decoder). In effect, the second encoding format is hidden within the unified bitstream when the bitstream is decoded by the first decoder, and the first encoding format is hidden within the unified bitstream when the bitstream is decoded by the second decoder. The format of the unified bitstream generated in accordance with the invention may eliminate the need for transcoding elements throughout an entire media chain and/or ecosystem. Other aspects of the invention are an encoding method performed by any embodiment of the inventive encoder, a decoding method performed by any embodiment of the inventive decoder, and a computer readable medium (e.g., disc) which stores code for implementing any embodiment of the inventive method.


43.

Granted invention patent
Robust media fingerprints (legal status: in force)

Publication No.: US08700194B2

Publication date: 2014-04-15

Application No.: US13060032

Filing date: 2009-08-26

Applicants: Claus Bauer, Regunathan Radhakrishnan

Inventors: Claus Bauer, Regunathan Radhakrishnan

IPC classification: G06F17/00

CPC classification: G10L19/018

Abstract: Robust media fingerprints are derived from a portion of audio content. A portion of content in an audio signal is categorized. The audio content is characterized based, at least in part, on one or more of its features. The features may include a component that relates to one of several sound categories, e.g., speech and/or noise, which may be mixed with the audio signal. Upon categorizing the audio content as free of the speech or noise related components, the audio signal component is processed. Upon categorizing the audio content as including the speech related component and/or the noise related components, the speech or noise related components are separated from the audio signal. The audio signal is processed independent of the speech related component and/or the noise related component. Processing the audio signal includes computing the audio fingerprint, which reliably corresponds to the audio signal.


44.

Granted invention patent
Extracting features of audio signal content to provide reliable identification of the signals (legal status: in force)

Publication No.: US08626504B2

Publication date: 2014-01-07

Application No.: US13599992

Filing date: 2012-08-30

Applicants: Regunathan Radhakrishnan, Claus Bauer, Kent Bennett Terry, Brian David Link, Hyung-Suk Kim, Eric Gsell

Inventors: Regunathan Radhakrishnan, Claus Bauer, Kent Bennett Terry, Brian David Link, Hyung-Suk Kim, Eric Gsell

IPC classification: G10L21/00, H04N11/02

CPC classification: G06T1/005, G06F17/30743, G06F17/30787, G06F17/30799, G06K9/00744, G06K9/00758, G06T1/0028, G10L25/18, G10L25/54, G11B2020/10537

Abstract: Signatures that can be used to identify video and audio content are generated from the content by generating measures of dissimilarity between features of corresponding groups of pixels in frames of video content and by generating low-resolution time-frequency representations of audio segments. The signatures are generated by applying a hash function to intermediate values derived from the measures of dissimilarity and to the low-resolution time-frequency representations. The generated signatures may be used in a variety of applications such as restoring synchronization between video and audio content streams and identifying copies of original video and audio content. The generated signatures can provide reliable identifications despite intentional and unintentional modifications to the content.
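
The audio half of this abstract (a low-resolution time-frequency representation followed by a hash) can be sketched in Python as follows. The block-averaging scheme, the median-threshold bit hash, and the names `low_res_tf` and `signature` are assumptions chosen for illustration; the abstract does not say which hash function is used, and the video (pixel-group dissimilarity) path is not covered here.

```python
import statistics


def low_res_tf(spectrogram, t_blocks, f_blocks):
    """Average a spectrogram down to a t_blocks x f_blocks grid.

    `spectrogram` is a list of frames, each a list of bin magnitudes.
    """
    n_t, n_f = len(spectrogram), len(spectrogram[0])
    grid = []
    for i in range(t_blocks):
        t0, t1 = i * n_t // t_blocks, (i + 1) * n_t // t_blocks
        row = []
        for j in range(f_blocks):
            f0, f1 = j * n_f // f_blocks, (j + 1) * n_f // f_blocks
            cells = [spectrogram[t][f] for t in range(t0, t1) for f in range(f0, f1)]
            row.append(sum(cells) / len(cells))
        grid.append(row)
    return grid


def signature(grid):
    """Hash the grid to bits: 1 where a cell exceeds the grid median, else 0."""
    flat = [value for row in grid for value in row]
    median = statistics.median(flat)
    return [1 if value > median else 0 for value in flat]
```

Because each bit depends only on how a cell compares with the median, this particular hash is unchanged by a uniform gain change, which is one simple instance of the robustness to modification that the abstract claims.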


45.

Invention application
Scene Change Detection Around a Set of Seed Points in Media Data (legal status: in force)

Publication No.: US20130287214A1

Publication date: 2013-10-31

Application No.: US13997860

Filing date: 2011-12-15

Applicants: Barbara Resch, Regunathan Radhakrishnan, Arijit Biswas, Jonas Engdegard

Inventors: Barbara Resch, Regunathan Radhakrishnan, Arijit Biswas, Jonas Engdegard

IPC classification: H04R29/00

CPC classification: H04R29/00, G06F17/00, G06F17/3053, G10H1/0008, G10H2210/061, G10H2240/151, G10L25/48

Abstract: Techniques for scene change detection around seed points in media data are provided. Media features of many different types may be extracted from the media data. One or more statistical patterns of media features in a plurality of time-wise intervals around a plurality of seed time points of the media data may be determined using one or more types of features extractable from the media data. At least one of the one or more types of features comprises a type of features that captures structural properties, tonality including harmony and melody, timbre, rhythm, loudness, stereo mix, or a quantity of sound sources as related to the media data. A plurality of beginning scene change points and a plurality of ending scene change points in the media data may be detected, based on the one or more statistical patterns, for the plurality of seed time points in the media data.


46.

Granted invention patent
Scalable media fingerprint extraction (legal status: in force)

Publication No.: US08571255B2

Publication date: 2013-10-29

Application No.: US13142355

Filing date: 2010-01-07

Applicants: Claus Bauer, Regunathan Radhakrishnan, Wenyu Jiang, Glenn N. Dickins

Inventors: Claus Bauer, Regunathan Radhakrishnan, Wenyu Jiang, Glenn N. Dickins

IPC classification: G06K9/00

CPC classification: G06K9/00744, G06K9/46, G06K9/623

Abstract: Derivation of a fingerprint includes generating feature matrices based on one or more training images, generating projection matrices based on the feature matrices in a training process, and deriving a fingerprint for one or more images by, at least in part, projecting a feature matrix based on the one or more images onto the projection matrices generated in the training process.


47.

Invention application
Repetition Detection in Media Data (legal status: under examination, published)

Publication No.: US20130275421A1

Publication date: 2013-10-17

Application No.: US13997847

Filing date: 2011-12-15

Applicants: Barbara Resch, Regunathan Radhakrishnan, Arijit Biswas, Jonas Engdegard

Inventors: Barbara Resch, Regunathan Radhakrishnan, Arijit Biswas, Jonas Engdegard

IPC classification: G06F17/30

CPC classification: G06F16/24578, G06F17/00, G10H1/0008, G10H2210/061, G10H2240/151, G10L25/48, H04R29/00

Abstract: Techniques for repetition detection in media data are provided. Media features of many different types may be extracted from the media data. Query sequences of fingerprints may be selected in time intervals that begin at query times. Matched sequences of fingerprints may be determined. A set of offset values may be determined based on the matched sequences of fingerprints. This set of offset values may be further refined into a set of significant time points using a relatively targeted search and comparison method based on the media features of a second type extracted from the media data.
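
The query, match, and offset steps in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the claimed technique: fingerprints are modelled as plain integers, matching is exact equality (a real system would more likely tolerate bit errors), and the function name `candidate_offsets` and its parameters are hypothetical. The later refinement step using a second feature type is not shown.

```python
from collections import Counter


def candidate_offsets(fingerprints, query_len=3, min_votes=2):
    """Return time offsets (in frames) at which fingerprint sequences repeat.

    Every window of `query_len` fingerprints is used as a query and compared
    with all later windows. Each match votes for its offset, and offsets with
    at least `min_votes` votes are returned in ascending order.
    """
    votes = Counter()
    last_start = len(fingerprints) - query_len
    for query_time in range(last_start + 1):
        query = fingerprints[query_time:query_time + query_len]
        for match_time in range(query_time + 1, last_start + 1):
            if fingerprints[match_time:match_time + query_len] == query:
                votes[match_time - query_time] += 1
    return sorted(offset for offset, count in votes.items() if count >= min_votes)
```

For the sequence `[1, 2, 3, 4, 9, 8, 1, 2, 3, 4, 7]`, the run `1, 2, 3, 4` recurs six frames later, so the windows starting at frames 0 and 1 both vote for offset 6 and the function returns `[6]`.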


48.

Granted invention patent
Multimode coding of speech-like and non-speech-like signals (legal status: in force)

Publication No.: US08392179B2

Publication date: 2013-03-05

Application No.: US12921752

Filing date: 2009-03-12

Applicants: Rongshan Yu, Regunathan Radhakrishnan, Robert Andersen, Grant Davidson

Inventors: Rongshan Yu, Regunathan Radhakrishnan, Robert Andersen, Grant Davidson

IPC classification: G10L11/06

CPC classification: G10L19/18, G10L19/093, G10L19/12, G10L2019/0004, G10L2019/0005

Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.


49.

Granted invention patent
Media fingerprints that reliably correspond to media content (legal status: in force)

Publication No.: US08351643B2

Publication date: 2013-01-08

Application No.: US12681598

Filing date: 2008-10-06

Applicants: Regunathan Radhakrishnan, Claus Bauer

Inventors: Regunathan Radhakrishnan, Claus Bauer

IPC classification: G06K9/00

CPC classification: G06T1/0028, G06F17/30799, G06K9/00744, G06K9/4642, G06T1/005, G06T2201/0051, G06T2201/0061, H04N21/44008

Abstract: Quantized energy values are accessed to initially represent a temporally related group of content elements in a media sequence. The values are accessed over a matrix of regions into which the initial representation is partitioned. The initial representation may be downsampled and/or cropped from the content. A basis vector set is estimated in a dimensional space from the values. The initial representation is transformed into a subsequent representation, which is in another dimensional space. The subsequent representation projects the initial representation, based on the basis vectors. The subsequent representation reliably corresponds to the media content portion over a change in a geometric orientation thereof. Repeated for other media content portions of the group, subsequent representations of the first and other portions are averaged or transformed over time. The averaged/transformed values reliably correspond to the content portion over speed changes. The initial representation may include spatial or transform related information.


50.

Granted invention patent
Adaptive audio processing based on forensic detection of media processing history (legal status: in force)

Publication No.: US09311923B2

Publication date: 2016-04-12

Application No.: US14117576

Filing date: 2012-05-15

Applicants: Regunathan Radhakrishnan, Sevinc Bayram, Jeffrey Riedmiller

Inventors: Regunathan Radhakrishnan, Sevinc Bayram, Jeffrey Riedmiller

IPC classification: G06F17/00, G10L19/018, G10L25/06, H04S5/00, G10L19/008, H04S3/02

CPC classification: G10L19/018, G10L19/008, G10L25/06, H04S3/02, H04S5/005, H04S2400/01

Abstract: A media signal is accessed, which has been generated with one or more first processing operations. The media signal includes one or more sets of artifacts, which respectively result from the one or more processing operations. One or more features are extracted from the accessed media signal. The extracted features each respectively correspond to the one or more artifact sets. Based on the extracted features, a conditional probability score and/or a heuristically based score is computed, which relates to the one or more first processing operations.

