Multimode coding of speech-like and non-speech-like signals
    1.
    Granted patent
    Status: In force

    Publication No.: US08392179B2

    Publication date: 2013-03-05

    Application No.: US12921752

    Filing date: 2009-03-12

    IPC classification: G10L11/06

    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select codevectors and/or associated gain factors from the codebooks by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
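
The selection step described in this abstract can be illustrated with a minimal analysis-by-synthesis sketch. It is not the patented method: it assumes a single codevector is chosen across all codebooks, uses plain squared error as the difference measure, and the names `lpc_synthesize`, `search_codebooks`, and the toy codebooks are hypothetical.

```python
def lpc_synthesize(excitation, lpc):
    """All-pole synthesis filter 1/A(z), with A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def search_codebooks(target, lpc, codebooks):
    """Return (error, codebook name, codevector index, gain) with the least squared error."""
    best = None
    for name, vectors in codebooks.items():
        for index, codevector in enumerate(vectors):
            synth = lpc_synthesize(codevector, lpc)
            energy = dot(synth, synth)
            if energy == 0.0:
                continue  # a silent codevector cannot be scaled to match anything
            gain = dot(target, synth) / energy  # least-squares optimal gain
            err = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
            if best is None or err < best[0]:
                best = (err, name, index, gain)
    return best


# Toy codebooks (assumed values, for illustration only).
codebooks = {
    "speech_like": [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    "non_speech_like": [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5]],
}
lpc = [-0.5]  # assumed first-order LPC coefficient
# Build a target that is exactly twice one synthesized codevector.
target = [2.0 * s for s in lpc_synthesize(codebooks["non_speech_like"][1], lpc)]

err, name, index, gain = search_codebooks(target, lpc, codebooks)
print(name, index, gain, err)
```

A practical encoder would typically search the codebooks jointly or sequentially and apply perceptual weighting to the error; both are omitted here for brevity.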

    MULTIMODE CODING OF SPEECH-LIKE AND NON-SPEECH-LIKE SIGNALS
    2.
    Patent application
    Status: In force

    Publication No.: US20110010168A1

    Publication date: 2011-01-13

    Application No.: US12921752

    Filing date: 2009-03-12

    IPC classification: G10L19/04

    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select codevectors and/or associated gain factors from the codebooks by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.

    Scene change detection around a set of seed points in media data
    4.
    Granted patent
    Status: In force

    Publication No.: US09317561B2

    Publication date: 2016-04-19

    Application No.: US13997860

    Filing date: 2011-12-15

    Abstract: Techniques for scene change detection around seed points in media data are provided. Media features of many different types may be extracted from the media data. One or more statistical patterns of media features in a plurality of time-wise intervals around a plurality of seed time points of the media data may be determined using one or more types of features extractable from the media data. At least one of the one or more types of features comprises a type of features that captures structural properties, tonality including harmony and melody, timbre, rhythm, loudness, stereo mix, or a quantity of sound sources as related to the media data. A plurality of beginning scene change points and a plurality of ending scene change points in the media data may be detected, based on the one or more statistical patterns, for the plurality of seed time points in the media data.

    Projection based hashing that balances robustness and sensitivity of media fingerprints
    6.
    Granted patent
    Status: Lapsed

    Publication No.: US08542869B2

    Publication date: 2013-09-24

    Application No.: US13115542

    Filing date: 2011-05-25

    IPC classification: G06K9/00

    CPC classification: G06K9/00744 G06K9/6232

    Abstract: Multiple candidate feature components of media content or projection matrices (or other hash functions, e.g., non-linear projections) are identified. Each of the candidate projection matrices (or other hash functions) includes an array of coefficients that relate to the candidate features. A subgroup of the candidate features or the projection matrices (or other hash functions) is selected based at least partially on an optimized combination of at least two characteristics of the candidate features or projection matrices (or other hash functions). Media fingerprints that uniquely identify the media content are derived from the selected optimized subgroup. Optimal projection matrices (or other hash functions) may be designed. Performance or sensitivity (e.g., search time) characteristics of the fingerprints are thus balanced with robustness characteristics thereof.

    Content identification and quality monitoring
    7.
    Granted patent
    Status: In force

    Publication No.: US08428301B2

    Publication date: 2013-04-23

    Application No.: US13059839

    Filing date: 2009-08-21

    IPC classification: G06K9/00 B42D15/00

    CPC classification: H04N17/004

    Abstract: Content identification and quality monitoring are provided. The method involves obtaining a first fingerprint derived from a first media content, processing the first media content to generate a second media content, obtaining a second fingerprint derived from the second media content, and comparing the first fingerprint and the second fingerprint to determine one or more of: a similarity between the first fingerprint and the second fingerprint that indicates that the second media content is generated from the first media content or a difference between the first fingerprint and the second fingerprint to identify a quality degradation between the first media content and the second media content.
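
As a rough illustration of the comparison step in this abstract, the sketch below treats fingerprints as equal-length bit lists and uses the fraction of differing bits both as a similarity test and as a crude degradation score. The bit-list representation, the Hamming measure, the `match_threshold` value, and the function names are all assumptions; the abstract does not specify them.

```python
def hamming_fraction(fp_a, fp_b):
    """Fraction of bit positions at which two equal-length fingerprints differ."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    differing = sum(a != b for a, b in zip(fp_a, fp_b))
    return differing / len(fp_a)


def compare_fingerprints(fp_first, fp_second, match_threshold=0.25):
    """Report whether the second content plausibly derives from the first,
    and how far the fingerprints have drifted apart."""
    distance = hamming_fraction(fp_first, fp_second)
    return {
        "same_source": distance <= match_threshold,  # similarity decision
        "degradation": distance,                     # larger means more change
    }


first = [1, 0, 1, 1, 0, 0, 1, 0]      # fingerprint of the original content
second = [1, 0, 1, 0, 0, 0, 1, 0]     # fingerprint after processing (one bit flipped)
unrelated = [0, 1, 0, 0, 1, 1, 0, 1]  # fingerprint of different content

print(compare_fingerprints(first, second))
print(compare_fingerprints(first, unrelated))
```

Using one distance for both decisions keeps the sketch short; a real system might use separate measures for identification and for quality scoring.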

    Scalable Media Fingerprint Extraction
    8.
    Patent application
    Status: In force

    Publication No.: US20110268315A1

    Publication date: 2011-11-03

    Application No.: US13142355

    Filing date: 2010-01-07

    IPC classification: G06K9/00

    Abstract: Derivation of a fingerprint includes generating feature matrices based on one or more training images, generating projection matrices based on the feature matrices in a training process, and deriving a fingerprint for one or more images by, at least in part, projecting a feature matrix based on the one or more images onto the projection matrices generated in the training process.
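
The final projection step of this abstract can be sketched as follows. The training process is not reproduced: the projection matrices are fixed, hand-picked values, and the one-bit thresholding of each projection is an assumption rather than something the abstract states. The names `project` and `fingerprint` are hypothetical.

```python
def project(feature, projection):
    """Element-wise (Frobenius) inner product of a feature matrix and a projection matrix."""
    return sum(
        f * p
        for feature_row, projection_row in zip(feature, projection)
        for f, p in zip(feature_row, projection_row)
    )


def fingerprint(feature, projections, threshold=0.0):
    """One bit per projection matrix: 1 if the projection exceeds the threshold."""
    return [1 if project(feature, p) > threshold else 0 for p in projections]


# Stand-ins for projection matrices that a training process would produce.
projections = [
    [[1, 1], [-1, -1]],  # top rows versus bottom rows
    [[1, -1], [1, -1]],  # left columns versus right columns
    [[1, -1], [-1, 1]],  # main diagonal versus anti-diagonal
]
feature = [[9, 7], [1, 4]]  # toy 2x2 feature matrix from one image

print(fingerprint(feature, projections))
```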

    Task specific audio classification for identifying video highlights
    9.
    Patent application
    Status: In force

    Publication No.: US20070162924A1

    Publication date: 2007-07-12

    Application No.: US11326818

    Filing date: 2006-01-06

    IPC classification: H04N7/16 H04H9/00

    Abstract: A method classifies segments of a video using an audio signal of the video and a set of classes. Selected classes of the set are combined as a subset of important classes, the subset of important classes being important for a specific highlighting task; the remaining classes of the set are combined as a subset of other classes. The subset of important classes and the subset of other classes are trained with training audio data to form a task specific classifier. Then, the audio signal can be classified using the task specific classifier as either important or other to identify highlights in the video corresponding to the specific highlighting task. The classified audio signal can be used to segment and summarize the video.

    Identifying video highlights using audio-visual objects
    10.
    Patent application
    Status: Under examination (published)

    Publication No.: US20060059120A1

    Publication date: 2006-03-16

    Application No.: US10928829

    Filing date: 2004-08-27

    IPC classification: G06F17/30

    Abstract: A method identifies highlight segments in a video including a sequence of frames. Audio objects are detected to identify frames associated with audio events in the video, and visual objects are detected to identify frames associated with visual events. Selected visual objects are matched with an associated audio object to form an audio-visual object only if the selected visual object matches the associated audio object, the audio-visual object identifying a candidate highlight segment. The candidate highlight segments are further refined, using low level features, to eliminate false highlight segments.