Spatial noise suppression for a microphone array
    91.
    Patent application
    Spatial noise suppression for a microphone array (status: granted)

    Publication No.: US20070150268A1

    Publication date: 2007-06-28

    Application No.: US11316002

    Filing date: 2005-12-22

    IPC classification: G10L21/02

    Abstract: A microphone array having at least three microphones provides a captured signal. Spatial noise suppression estimates a desired signal from the captured signal using the spatio-temporal distribution of the speech and the noise. In particular, spatial information indicative of at least two quantities of direction is used. A first quantity is based on a first combination of the signals from the at least three microphones; a second quantity is based on a second combination of the signals from the at least three microphones.


    Indexing and searching speech with text meta-data
    92.
    Patent application
    Indexing and searching speech with text meta-data (status: granted)

    Publication No.: US20070106509A1

    Publication date: 2007-05-10

    Application No.: US11269872

    Filing date: 2005-11-08

    IPC classification: G10L15/00

    Abstract: An index for searching spoken documents having speech data and text meta-data is created by obtaining probabilities of occurrence of words and positional information of the words of the speech data and combining them with at least positional information of the words in the text meta-data. A single index can be created because the speech data and the text meta-data are treated the same and are considered merely different categories.
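    The single-index idea in the abstract can be illustrated with a minimal sketch. The function name `build_index`, the category labels, and the posting layout are hypothetical and are not taken from the application; text meta-data words are assumed to carry probability 1.0, while speech words carry a recognizer probability.

```python
from collections import defaultdict


def build_index(documents):
    """Build one inverted index over speech data and text meta-data.

    `documents` maps a document id to a dict of category -> list of
    (word, position, probability) tuples. The layout is an assumption
    made for illustration only.
    """
    index = defaultdict(list)
    for doc_id, categories in documents.items():
        for category, hits in categories.items():
            for word, position, prob in hits:
                # Speech and text postings share one structure; only the
                # category label tells them apart.
                index[word].append((doc_id, category, position, prob))
    return dict(index)


docs = {
    "talk1": {
        "title": [("speech", 0, 1.0), ("indexing", 1, 1.0)],
        "speech": [("speech", 0, 0.9), ("beach", 0, 0.1), ("indexing", 1, 0.8)],
    }
}
index = build_index(docs)
```

    Because every posting has the same shape, a query can rank hits from the title and from the recognized audio with one lookup and weight them by category afterwards.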


    Configurable grammar templates
    93.
    Patent application
    Configurable grammar templates (status: under examination, published)

    Publication No.: US20070055492A1

    Publication date: 2007-03-08

    Application No.: US11259475

    Filing date: 2005-10-26

    IPC classification: G06F17/27

    Abstract: To provide application developers with the ability to easily form customized grammars, grammar extensions are provided that allow application developers to selectively include portions of grammar templates and to easily combine grammar elements to form new grammar structures.


    Method and apparatus for indexing speech

    Publication No.: US20060265222A1

    Publication date: 2006-11-23

    Application No.: US11133515

    Filing date: 2005-05-20

    IPC classification: G10L15/00

    CPC classification: G10L15/26

    Abstract: A method of indexing a speech segment includes identifying at least two alternative word sequences based on the speech segment. For each word in the alternative sequences, information is placed in an entry for the word in the index. The information indicates the position of the word in at least one of the alternative sequences.
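
    The indexing step described in the abstract can be sketched as follows. This is an illustrative reading, not the claimed method: the function name `index_segment` and the posting layout (segment id, alternative number, word position) are assumptions.

```python
from collections import defaultdict


def index_segment(segment_id, alternatives):
    """Index every word of every alternative word sequence for one segment.

    `alternatives` is a list of word lists, for example the n-best
    hypotheses of a recognizer (an assumption for this sketch).
    """
    index = defaultdict(list)
    for alt_id, words in enumerate(alternatives):
        for position, word in enumerate(words):
            # Record where the word occurs within this alternative.
            index[word].append((segment_id, alt_id, position))
    return dict(index)


index = index_segment(
    "seg0",
    [["recognize", "speech"], ["wreck", "a", "nice", "beach"]],
)
```

    Keeping the alternative number alongside the position lets a search match a phrase within one hypothesis without mixing words from competing hypotheses.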

    Method of automatically ranking speech dialog states and transitions to aid in performance analysis in speech applications
    96.
    Patent application
    Method of automatically ranking speech dialog states and transitions to aid in performance analysis in speech applications (status: granted)

    Publication No.: US20060178883A1

    Publication date: 2006-08-10

    Application No.: US11054096

    Filing date: 2005-02-09

    IPC classification: G10L15/06

    CPC classification: G10L15/01 G10L15/083

    Abstract: A method of identifying problems in a speech recognition application is provided and includes the step of obtaining a speech application call log containing log data on question-answer (QA) states and transitions. Then, in accordance with the method, for each of multiple transitions between states, a parameter is generated which is indicative of a gain in a success rate of the speech recognition application if all calls passing through the transition passed instead through other transitions. In exemplary embodiments, the parameter is an Arc Cut Gain in Success Rate (ACGSR) parameter. Methods of generating the ACGSR, as well as systems and tools for aiding developers, are also disclosed.
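
    The abstract does not give a formula for the ACGSR, so the sketch below uses a simple stand-in estimate: it assumes that calls rerouted away from a transition would succeed at the same rate as the calls that already avoid it. The function name `arc_cut_gain` and the call-log layout are hypothetical.

```python
def arc_cut_gain(calls, transition):
    """Estimate the success-rate gain from cutting one transition.

    `calls` is a list of (transitions_used, succeeded) pairs taken from a
    call log. Assumption: rerouted calls succeed at the rate observed for
    calls that never used `transition`. Returns None when no estimate is
    possible (empty log, or every call used the transition).
    """
    if not calls:
        return None
    overall = sum(ok for _, ok in calls) / len(calls)
    avoiding = [ok for arcs, ok in calls if transition not in arcs]
    if not avoiding:
        return None
    return sum(avoiding) / len(avoiding) - overall


calls = [
    ({"A->B", "B->C"}, True),
    ({"A->B", "B->D"}, False),
    ({"A->C"}, True),
    ({"A->B", "B->D"}, False),
]
gain_bd = arc_cut_gain(calls, "B->D")
gain_bc = arc_cut_gain(calls, "B->C")
```

    Sorting transitions by this value in descending order would put the transitions most associated with failed calls at the top of a developer's review list.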


    Acoustic models with structured hidden dynamics with integration over many possible hidden trajectories

    Publication No.: US20060100862A1

    Publication date: 2006-05-11

    Application No.: US11071904

    Filing date: 2005-03-01

    IPC classification: G10L11/04

    CPC classification: G10L15/02 G10L2015/025

    Abstract: A method is provided for producing at least one possible sequence of vocal tract resonance (VTR) for a fixed sequence of phonetic units, and for producing the acoustic observation probability by integrating over such distributions. The method includes identifying a sequence of target distributions for a VTR sequence corresponding to a phone sequence with a given segmentation. The sequence of target distributions is applied to a finite impulse response filter to produce distributions for possible VTR trajectories. These distributions are then applied to a linearized nonlinear function to produce the acoustic observation probability for the given sequence of phonetic units. This acoustic observation probability is used for phonetic recognition.
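
    The filtering step can be illustrated with a minimal sketch under strong assumptions: each frame has a one-dimensional Gaussian target, the per-frame targets are treated as independent (a simplification), the filter is symmetric and centered, and frames beyond the ends repeat the edge targets. The function name `fir_filter_gaussians` and the filter coefficients are hypothetical; the later linearized mapping to acoustic observations is not shown.

```python
def fir_filter_gaussians(means, variances, coeffs):
    """Pass a sequence of Gaussian targets through an FIR filter.

    A linear combination of independent Gaussians is Gaussian, with mean
    sum(c * mean) and variance sum(c**2 * variance). Indices outside the
    sequence are clamped to the first or last frame.
    """
    half = len(coeffs) // 2
    last = len(means) - 1
    out_means, out_vars = [], []
    for t in range(len(means)):
        mean = var = 0.0
        for k, c in enumerate(coeffs):
            src = min(max(t + k - half, 0), last)  # clamp at the edges
            mean += c * means[src]
            var += c * c * variances[src]
        out_means.append(mean)
        out_vars.append(var)
    return out_means, out_vars


# Two phones with illustrative VTR targets of 500 Hz and 1500 Hz.
target_means = [500.0, 500.0, 1500.0, 1500.0, 1500.0]
target_vars = [100.0] * 5
traj_means, traj_vars = fir_filter_gaussians(
    target_means, target_vars, [0.25, 0.5, 0.25]
)
```

    The filtered means move gradually from one target to the next instead of jumping at the phone boundary, which is the smoothing behavior the filter is meant to model.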

    Hidden conditional random field models for phonetic classification and speech recognition
    98.
    Patent application
    Hidden conditional random field models for phonetic classification and speech recognition (status: granted)

    Publication No.: US20060085190A1

    Publication date: 2006-04-20

    Application No.: US10966047

    Filing date: 2004-10-15

    IPC classification: G10L15/14

    CPC classification: G10L15/14

    Abstract: A method and apparatus are provided for training and using a hidden conditional random field model for speech recognition and phonetic classification. The hidden conditional random field model uses features, at least one of which is based on a hidden state in a phonetic unit. Values for the features are determined from a segment of speech, and these values are used to identify a phonetic unit for the segment of speech.
