Method and apparatus using spectral addition for speaker recognition

    Publication No.: US20060253285A1

    Publication date: 2006-11-09

    Application No.: US11483574

    Filing date: 2006-07-10

    IPC classification: G10L17/00

    Abstract: A method and apparatus for speaker recognition are provided that match the noise in training data to the noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments add to the mean and variance of the training data and test data rather than subtracting from them.
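The matching step described in this abstract can be sketched for a single frequency component as follows. This is a minimal illustration, not the patented procedure: the function name `match_component`, the choice of the larger mean and variance as the target, and the use of added Gaussian noise to raise the variance are all assumptions made for the example.

```python
import math
import random
from statistics import fmean, pvariance


def match_component(train, test, seed=0):
    """Match mean and variance of one frequency component by addition only.

    Assumption: the target for both signals is the larger of the two means
    and the larger of the two variances, so nothing is ever subtracted.
    """
    rng = random.Random(seed)
    target_mean = max(fmean(train), fmean(test))
    target_var = max(pvariance(train), pvariance(test))

    def raise_to_target(values):
        # Variance still missing from this signal (zero for the noisier one).
        extra_var = max(target_var - pvariance(values), 0.0)
        # Constant offset that lifts the mean to the target (never negative).
        offset = target_mean - fmean(values)
        # Zero-mean noise scaled so that it contributes the missing variance.
        noise = [rng.gauss(0.0, 1.0) for _ in values]
        noise_mean = fmean(noise)
        scale = math.sqrt(extra_var)
        return [v + offset + scale * (n - noise_mean)
                for v, n in zip(values, noise)]

    return raise_to_target(train), raise_to_target(test)
```

In a full system the same adjustment would be applied independently to each frequency component. Because the added noise is random, the variances match only approximately for a finite number of frames.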

    Method of noise reduction using correction and scaling vectors with partitioning of the acoustic space in the domain of noisy speech
    22.
    Granted patent (status: expired)

    Publication No.: US07003455B1

    Publication date: 2006-02-21

    Application No.: US09688764

    Filing date: 2000-10-16

    IPC classification: G10L15/20

    CPC classification: G10L21/0208

    Abstract: A method and apparatus are provided for reducing noise in a training signal and/or test signal. The noise reduction technique uses a stereo signal formed of two channel signals, each channel containing the same pattern signal. One of the channel signals is “clean” and the other includes additive noise. Using feature vectors from these channel signals, a collection of noise correction and scaling vectors is determined. When a feature vector of a noisy pattern signal is later received, it is multiplied by the best scaling vector for that feature vector and the best correction vector is added to the product to produce a noise-reduced feature vector. Under one embodiment, the best scaling and correction vectors are identified by choosing an optimal mixture component for the noisy feature vector. The optimal mixture component is selected based on a distribution of noisy channel feature vectors associated with each mixture component.
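The run-time step in this abstract (pick a mixture component, scale, then add a correction) can be sketched as below. This is a hedged illustration only: the dictionary layout, the function names, and the use of diagonal-covariance Gaussians with mixture weights to score each component are assumptions, and the training of the scaling and correction vectors from stereo data is not shown.

```python
import math


def log_gauss_diag(y, mean, var):
    """Log density of y under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (a - m) ** 2 / v)
               for a, m, v in zip(y, mean, var))


def denoise(y, components):
    """Return a noise-reduced feature vector for noisy vector y.

    Each component (hypothetical layout) holds the distribution of noisy
    feature vectors assigned to it ('weight', 'mean', 'var') plus its
    'scale' and 'correction' vectors.
    """
    # Choose the component whose noisy-speech distribution best explains y.
    best = max(
        components,
        key=lambda c: math.log(c["weight"]) + log_gauss_diag(y, c["mean"], c["var"]),
    )
    # Multiply element-wise by the scaling vector, then add the correction.
    return [s * a + r for a, s, r in zip(y, best["scale"], best["correction"])]
```

Selecting the component in the noisy domain means the clean signal is never needed at run time; only the noisy feature vector is required to look up the vectors to apply.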

    Method of noise reduction using correction and scaling vectors with partitioning of the acoustic space in the domain of noisy speech
    23.
    Patent application (status: in force)

    Publication No.: US20050149325A1

    Publication date: 2005-07-07

    Application No.: US11059036

    Filing date: 2005-02-16

    IPC classification: G10L15/20 G10L21/02 G10L21/00

    CPC classification: G10L21/0208

    Abstract: A method and apparatus are provided for reducing noise in a training signal and/or test signal. The noise reduction technique uses a stereo signal formed of two channel signals, each channel containing the same pattern signal. One of the channel signals is “clean” and the other includes additive noise. Using feature vectors from these channel signals, a collection of noise correction and scaling vectors is determined. When a feature vector of a noisy pattern signal is later received, it is multiplied by the best scaling vector for that feature vector and the best correction vector is added to the product to produce a noise-reduced feature vector. Under one embodiment, the best scaling and correction vectors are identified by choosing an optimal mixture component for the noisy feature vector. The optimal mixture component is selected based on a distribution of noisy channel feature vectors associated with each mixture component.

    Use of a unified language model
    24.
    Patent application

    Publication No.: US20050080611A1

    Publication date: 2005-04-14

    Application No.: US11003089

    Filing date: 2004-12-03

    IPC classification: G10L15/18 G06F17/28

    CPC classification: G10L15/193 G10L15/197

    Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.

    Multi-sensory speech detection system
    25.
    Patent application (status: expired)

    Publication No.: US20050027515A1

    Publication date: 2005-02-03

    Application No.: US10629278

    Filing date: 2003-07-29

    Abstract: The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.

    Fuzzy keyboard
    26.
    Granted patent (status: in force)

    Publication No.: US06654733B1

    Publication date: 2003-11-25

    Application No.: US09484095

    Filing date: 2000-01-18

    IPC classification: G06F9/44

    CPC classification: G06F3/04886 G06F3/0237

    Abstract: Fuzzy keyboards, to determine a most-likely-to-be-intended keystroke or keystrokes, are disclosed. In one embodiment, a method adds each of one or more keys to each of a current list of key sequence hypotheses, to create a new list of key sequence hypotheses. The method determines a likelihood probability for each hypothesis in the new list, and removes any hypothesis failing to satisfy any of one or more thresholds. The most likely key sequence of the new list may then be displayed. Some embodiments of the invention relate specifically to soft keyboards, while other embodiments relate specifically to real, physical and hard keyboards.
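The hypothesis-extension loop in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `extend_hypotheses`, the per-keystroke key probabilities, and the two particular thresholds (a probability ratio relative to the best hypothesis and a cap on the list size) are hypothetical choices, since the abstract does not say which thresholds are used or how the likelihoods are computed.

```python
def extend_hypotheses(hypotheses, key_probs, min_ratio=0.01, max_hyps=10):
    """Extend every key-sequence hypothesis with every candidate key, then prune.

    hypotheses: dict mapping a key sequence (str) to its probability.
    key_probs:  dict mapping each candidate key for the newest keystroke
                to the probability that it was the intended key.
    """
    # Add each candidate key to each current hypothesis.
    new = {seq + key: p * kp
           for seq, p in hypotheses.items()
           for key, kp in key_probs.items()}
    # Threshold 1: drop hypotheses far less likely than the best one.
    best = max(new.values())
    kept = {seq: p for seq, p in new.items() if p >= min_ratio * best}
    # Threshold 2: keep at most max_hyps hypotheses, most likely first.
    ranked = sorted(kept.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:max_hyps])
```

After each keystroke, the sequence to display would be the highest-probability key of the returned dictionary, for example `max(hyps, key=hyps.get)`.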

    Senone tree representation and evaluation
    27.
    Granted patent (status: expired)

    Publication No.: US5794197A

    Publication date: 1998-08-11

    Application No.: US850061

    Filing date: 1997-05-02

    Abstract: A speech recognition method provides improved modeling in recognition accuracy using hidden Markov models. During training, the method creates a senone tree for each state of each phoneme encountered in a data set of training words. All output distributions received for a selected state of a selected phoneme in the set of training words are clustered together in a root node of a senone tree. Each node of the tree beginning with the root node is divided into two nodes by asking linguistic questions regarding the phonemes immediately to the left and right of a central phoneme of a triphone. At a predetermined point, the tree creation stops, resulting in leaves representing clustered output distributions known as senones. The senone trees allow all possible triphones to be mapped into a sequence of senones simply by traversing the senone trees associated with the central phoneme of the triphone. As a result, unseen triphones not encountered in the training data can be modeled with senones created using the triphones actually found in the training data.
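The traversal step in this abstract, which maps any triphone to a sequence of senones, can be sketched as below. This is a minimal sketch with assumed data structures: the `Node` layout, the representation of a linguistic question as "is the left or right neighbor in this phoneme class?", and the `trees` dictionary keyed by central phoneme are all hypothetical, and tree construction during training is not shown.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Node:
    senone: Optional[str] = None               # set only on leaf nodes
    side: str = "left"                         # neighbor the question asks about
    phone_class: FrozenSet[str] = frozenset()  # "is the neighbor in this class?"
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def find_senone(node, left, right):
    """Walk one senone tree from its root to a leaf for the given context."""
    while node.senone is None:
        neighbor = left if node.side == "left" else right
        node = node.yes if neighbor in node.phone_class else node.no
    return node.senone


def triphone_to_senones(trees, left, center, right):
    """Map a triphone to one senone per state of its central phoneme.

    trees: dict mapping a central phoneme to a list of root nodes,
           one senone tree per HMM state.
    """
    return [find_senone(root, left, right) for root in trees[center]]
```

Because the questions test only membership in phoneme classes, a context that never occurred in training still reaches some leaf, which is how unseen triphones receive senones.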
