Greedy algorithm for identifying values for vocal tract resonance vectors
    101.
    Granted invention patent
    Legal status: in force

    Publication number: US07475011B2

    Publication date: 2009-01-06

    Application number: US10925585

    Filing date: 2004-08-25

    IPC classification: G10L19/06

    CPC classification: G10L25/48 G10L15/02 G10L25/15

    Abstract: A method and apparatus identify values for components of a vocal tract resonance vector by sequentially determining values for each component of the vocal tract resonance vector. To determine a value for a component, the other components are set to static values. A plurality of values for a function are then determined using a plurality of values for the component that is being determined while using the static values for all of the other components. One of the plurality of values for the component is then selected based on the plurality of values for the function.
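    The component-by-component selection described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `greedy_vector_search`, the candidate grids, the toy scoring function, and the choice that larger function values are better are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def greedy_vector_search(
    score: Callable[[Sequence[float]], float],
    candidates: Sequence[Sequence[float]],
    initial: Sequence[float],
    sweeps: int = 1,
) -> List[float]:
    """Choose each component in turn while the others stay fixed."""
    vector = list(initial)
    for _ in range(sweeps):
        for i, values in enumerate(candidates):
            # Evaluate the function once per candidate value of component i,
            # holding every other component at its current (static) value.
            scored = []
            for v in values:
                trial = vector[:i] + [v] + vector[i + 1:]
                scored.append((score(trial), v))
            # Assumption: the candidate with the largest function value wins.
            vector[i] = max(scored)[1]
    return vector


# Hypothetical example: two resonance components on coarse grids (Hz).
target = [500.0, 1500.0]


def toy_score(vec: Sequence[float]) -> float:
    # Toy function: negative squared distance to a made-up target vector.
    return -sum((a - b) ** 2 for a, b in zip(vec, target))


grids = [[300.0, 500.0, 700.0], [1000.0, 1500.0, 2000.0]]
best = greedy_vector_search(toy_score, grids, initial=[300.0, 1000.0])
```

    Each sweep costs the sum of the grid sizes in function evaluations rather than their product, which is the usual motivation for a greedy search of this kind.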


    SENSOR ARRAY BEAMFORMER POST-PROCESSOR
    102.
    Invention patent application
    Legal status: in force

    Publication number: US20080288219A1

    Publication date: 2008-11-20

    Application number: US11750319

    Filing date: 2007-05-17

    IPC classification: H04B15/00 G06F15/00

    CPC classification: H04B7/0854

    Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability. The present beamforming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction, and applies a time-varying, gain-based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction, resulting in minimal artifacts and musical noise.
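    A rough sketch of a direction-dependent, time-varying gain is shown below. The abstract does not give the probability model or the filter, so everything here is an assumption for illustration: the Gaussian-shaped score around the target angle, the recursive smoothing, the gain floor, and the name `direction_gains` are all hypothetical.

```python
import math
from typing import List, Sequence


def direction_gains(
    doa_estimates: Sequence[float],
    target_angle: float,
    spread: float = 0.3,
    smoothing: float = 0.7,
    floor: float = 0.1,
) -> List[float]:
    """Return one gain per frame from per-frame direction estimates (radians)."""
    gains = []
    previous = 1.0
    for theta in doa_estimates:
        # Wrapped angular distance between the estimate and the target.
        diff = math.atan2(math.sin(theta - target_angle),
                          math.cos(theta - target_angle))
        # Assumed score in (0, 1]: close to 1 near the target direction.
        score = math.exp(-0.5 * (diff / spread) ** 2)
        # Smooth over time and keep a floor; both are common ways to limit
        # abrupt gain changes, which tend to be heard as musical noise.
        gain = smoothing * previous + (1.0 - smoothing) * max(score, floor)
        gains.append(gain)
        previous = gain
    return gains


on_target = direction_gains([0.0, 0.0, 0.0], target_angle=0.0)
off_target = direction_gains([math.pi / 2] * 5, target_angle=0.0)
```

    In this sketch frames arriving from the target direction keep a gain near 1, while frames from other directions decay smoothly toward the floor.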


    Noise robust speech recognition with a switching linear dynamic model
    103.
    Granted invention patent
    Legal status: in force

    Publication number: US07418383B2

    Publication date: 2008-08-26

    Application number: US10933763

    Filing date: 2004-09-03

    IPC classification: G10L15/00 G10L15/06

    CPC classification: G10L15/20

    Abstract: A unified, nonlinear, non-stationary, stochastic model is disclosed for estimating and removing effects of background noise on speech cepstra. Generally stated, the model is a union of dynamic system equations for speech and noise, and a model describing how speech and noise are mixed. Embodiments also pertain to related methods for enhancement.
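    The abstract does not state which mixing model is used. As an illustration only, the sketch below uses one widely used approximation for log power spectra, in which powers add and the phase term is ignored, so that y = log(exp(x) + exp(n)). The function name `mix_log_spectra` is hypothetical.

```python
import math
from typing import List, Sequence


def mix_log_spectra(speech_log: Sequence[float],
                    noise_log: Sequence[float]) -> List[float]:
    """Combine log-power speech and noise per bin: y = log(exp(x) + exp(n))."""
    mixed = []
    for x, n in zip(speech_log, noise_log):
        # Numerically stable form: factor out the larger of the two terms.
        larger = max(x, n)
        mixed.append(larger + math.log1p(math.exp(-abs(x - n))))
    return mixed


noisy = mix_log_spectra([0.0, 10.0], [0.0, -10.0])
```

    When the noise is far below the speech in a bin, the mixed value is almost equal to the speech value. This nonlinearity is why a model of this kind is described as nonlinear.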


    Annotating programs for automatic summary generations
    105.
    Granted invention patent
    Legal status: in force

    Publication number: US07403894B2

    Publication date: 2008-07-22

    Application number: US11081118

    Filing date: 2005-03-15

    IPC classification: G10L21/00

    Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.
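    One way a receiver could turn per-segment excitement probabilities into a summary is sketched below. The abstract does not specify the selection rule, so the `Segment` type, the `build_summary` function, and the fixed time budget are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    start: float       # seconds from the start of the program
    end: float         # seconds from the start of the program
    excitement: float  # probability from the meta data that the segment is exciting


def build_summary(segments: Sequence[Segment], max_duration: float) -> List[Segment]:
    """Pick the most exciting segments that fit within a time budget."""
    chosen = []
    used = 0.0
    # Assumed rule: consider segments from most to least exciting.
    for seg in sorted(segments, key=lambda s: s.excitement, reverse=True):
        length = seg.end - seg.start
        if used + length <= max_duration:
            chosen.append(seg)
            used += length
    # Play the selected segments back in their original order.
    return sorted(chosen, key=lambda s: s.start)


# Hypothetical meta data for a short program.
meta = [
    Segment(0.0, 30.0, 0.2),
    Segment(30.0, 50.0, 0.9),
    Segment(50.0, 90.0, 0.6),
    Segment(90.0, 100.0, 0.8),
]
summary = build_summary(meta, max_duration=35.0)
```

    Because the meta data travels separately from the content, a selection like this can run on the receiver without re-analyzing the audio.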


    NOISE SUPPRESSOR FOR SPEECH RECOGNITION
    106.
    Invention patent application
    Legal status: in force

    Publication number: US20080114593A1

    Publication date: 2008-05-15

    Application number: US11560210

    Filing date: 2006-11-15

    IPC classification: G10L21/02

    Abstract: A noise suppressor for altering a speech signal is trained based on a speech recognition system. An objective function can be utilized to adjust parameters of the noise suppressor. The noise suppressor can be used to alter speech signals for the speech recognition system.


    Removing noise from feature vectors
    107.
    Granted invention patent
    Legal status: in force

    Publication number: US07310599B2

    Publication date: 2007-12-18

    Application number: US11185159

    Filing date: 2005-07-20

    IPC classification: G10L15/20 G10L15/10

    CPC classification: G10L15/02 G10L15/20

    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. Aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.


    Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
    108.
    Granted invention patent
    Legal status: in force

    Publication number: US07206741B2

    Publication date: 2007-04-17

    Application number: US11294858

    Filing date: 2005-12-06

    IPC classification: G10L15/04

    CPC classification: G10L15/12 G10L2015/025

    Abstract: A speech signal is decoded by determining a production-related value for a current state based on an optimal production-related value at the end of a preceding state, the optimal production-related value being selected from a set of continuous values. The production-related value is used to determine a likelihood of a phone being represented by a set of observation vectors that are aligned with a path between the preceding state and the current state. The likelihood of the phone is combined with a score from the preceding state to determine a score for the current state, the score from the preceding state being associated with a discrete class of production-related values wherein the class matches the class of the optimal production-related value.


    Method of noise estimation using incremental bayes learning
    109.
    Granted invention patent
    Legal status: in force

    Publication number: US07165026B2

    Publication date: 2007-01-16

    Application number: US10403638

    Filing date: 2003-03-31

    IPC classification: G10L15/00 G10L21/00

    CPC classification: G10L21/0208

    Abstract: A method and apparatus estimate additive noise in a noisy signal using incremental Bayes learning, where a time-varying noise prior distribution is assumed and hyperparameters (mean and variance) are updated recursively using an approximation for the posterior computed at the preceding time step. The additive noise in the time domain is represented in the log-spectrum or cepstrum domain before applying incremental Bayes learning. The results of both the mean and variance estimates for the noise for each separate frame are used to perform speech feature enhancement in the same log-spectrum or cepstrum domain.
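    The recursive hyperparameter update can be illustrated with a much simpler model than the one the patent addresses. The sketch below assumes that each frame gives a direct Gaussian observation of the noise value in one log-spectral bin, so the posterior is available in closed form, and it models the time-varying prior by widening the previous posterior with a forgetting factor. The function name `incremental_bayes_update`, the forgetting factor, and the sample values are all assumptions.

```python
from typing import Tuple


def incremental_bayes_update(
    mean: float,
    var: float,
    observation: float,
    obs_var: float,
    forgetting: float = 0.9,
) -> Tuple[float, float]:
    """One recursive update of the Gaussian hyperparameters (mean, variance)."""
    # The previous posterior becomes the new prior; dividing the variance by
    # the forgetting factor widens it so the estimate can track drifting noise.
    prior_var = var / forgetting
    # Standard conjugate Gaussian update: precisions add.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (mean / prior_var + observation / obs_var)
    return post_mean, post_var


# Hypothetical log-spectral noise observations for one frequency bin.
mean, var = 0.0, 10.0
for frame_value in [4.8, 5.1, 5.3, 4.9, 5.0]:
    mean, var = incremental_bayes_update(mean, var, frame_value, obs_var=1.0)
```

    Both returned values matter: the mean is the running noise estimate and the variance expresses how much to trust it, matching the abstract's point that both estimates feed the enhancement step.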
