Method and system for building a phonotactic model for domain independent speech recognition
    52.
    Type: Granted patent
    Status: In force

    Publication No.: US08392188B1

    Publication date: 2013-03-05

    Application No.: US09956907

    Filing date: 2001-09-21

    Applicant: Giuseppe Riccardi

    Inventor: Giuseppe Riccardi

    IPC classification: G10L17/00

    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
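The update step described in this abstract can be illustrated with a minimal sketch. The abstract does not say what form the phonotactic model takes, so the count-based phone bigram model below, the class name `PhoneBigramModel`, and the ARPAbet-style phone labels are all assumptions made for illustration only.

```python
from collections import Counter


class PhoneBigramModel:
    """Hypothetical phonotactic model: relative-frequency phone bigrams."""

    def __init__(self):
        self.bigrams = Counter()   # counts of (previous phone, phone) pairs
        self.contexts = Counter()  # counts of each previous phone

    def update(self, phones):
        """Add the phone sequence of one detected morpheme to the counts."""
        padded = ["<s>", *phones, "</s>"]
        for prev, cur in zip(padded, padded[1:]):
            self.bigrams[(prev, cur)] += 1
            self.contexts[prev] += 1

    def prob(self, prev, cur):
        """Unsmoothed estimate of P(cur | prev); 0.0 for unseen contexts."""
        if self.contexts[prev] == 0:
            return 0.0
        return self.bigrams[(prev, cur)] / self.contexts[prev]


# Update the model with two detected morphemes (illustrative phone strings).
model = PhoneBigramModel()
model.update(["k", "ao", "l"])
model.update(["k", "ae", "t"])
```

In a deployed system the updated counts would then be persisted so that the next interaction starts from the new model; storage is left out of the sketch.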

    Recognizing the numeric language in natural spoken dialogue
    53.
    Type: Granted patent
    Status: In force

    Publication No.: US08050925B2

    Publication date: 2011-11-01

    Application No.: US12612871

    Filing date: 2009-11-05

    IPC classification: G10L15/14 G10L15/18

    CPC classification: G10L15/142

    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
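The word-string-to-digits conversion and the validation step can be sketched as follows. The abstract does not list the actual rule classes, so the two used here (plain digit words and "double"/"triple" multipliers), the function names, and the in-memory set standing in for the validation database are assumptions.

```python
# Hypothetical rule classes; the patent's real rules are not given in the abstract.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
MULTIPLIERS = {"double": 2, "triple": 3}


def words_to_digits(words):
    """Convert a recognized word string into a digit sequence."""
    digits = []
    repeat = 1
    for word in words:
        token = word.lower()
        if token in MULTIPLIERS:
            repeat = MULTIPLIERS[token]      # applies to the next digit word
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token] * repeat)
            repeat = 1
        else:
            repeat = 1                       # non-numeric word: skip it
    return "".join(digits)


def is_valid(digit_string, valid_sequences):
    """Compare the digit sequence against the stored valid sequences."""
    return digit_string in valid_sequences


recognized = "my card number is four double five oh one".split()
digit_string = words_to_digits(recognized)
valid = is_valid(digit_string, {"45501", "12345"})
```

Non-numeric words are simply skipped here, which is one way to cope with unconstrained input speech; a real system would likely need further rules for "hundred", "teen" forms, and self-corrections.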

    Recognizing the Numeric Language in Natural Spoken Dialogue
    55.
    Type: Patent application
    Status: In force

    Publication No.: US20100049519A1

    Publication date: 2010-02-25

    Application No.: US12612871

    Filing date: 2009-11-05

    IPC classification: G10L15/14

    CPC classification: G10L15/142

    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.

    Generating confidence scores from word lattices
    56.
    Type: Granted patent
    Status: In force

    Publication No.: US07562010B1

    Publication date: 2009-07-14

    Application No.: US11426225

    Filing date: 2006-06-23

    CPC classification: G10L15/08

    Abstract: Systems and methods for determining word confidence scores. Speech recognition systems generate a word lattice for speech input. Posterior probabilities of the words in the word lattice are determined using a forward-backward algorithm. Next, time slots are defined for the word lattice, and for all transitions that at least partially overlap a particular time slot, the posterior probabilities of transitions that have the same word label are combined for those transitions. The combined posterior probabilities are used as confidence scores. A local entropy can be computed on the competitor transitions of a particular time slot and also used as a confidence score.
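The scoring steps in this abstract can be sketched on a toy lattice. Everything concrete below is an assumption: the `Arc` tuple, the requirement that node ids are topologically ordered (every arc goes from a lower to a higher id), the caller-supplied time slot, and the choice to normalize the combined scores before computing the entropy. The abstract does not specify how time slots are chosen or how the entropy is normalized.

```python
import math
from collections import defaultdict, namedtuple

# One lattice transition: source node, destination node, word label,
# start/end time in seconds, and transition probability.
Arc = namedtuple("Arc", "src dst word t_start t_end prob")


def arc_posteriors(arcs, start_node, final_node):
    """Forward-backward over a lattice whose node ids are topologically ordered."""
    alpha = defaultdict(float)
    alpha[start_node] = 1.0
    for arc in sorted(arcs, key=lambda a: a.src):
        alpha[arc.dst] += alpha[arc.src] * arc.prob
    beta = defaultdict(float)
    beta[final_node] = 1.0
    for arc in sorted(arcs, key=lambda a: a.src, reverse=True):
        beta[arc.src] += arc.prob * beta[arc.dst]
    total = alpha[final_node]
    return [alpha[a.src] * a.prob * beta[a.dst] / total for a in arcs]


def slot_confidences(arcs, posteriors, slot_start, slot_end):
    """Sum posteriors of same-word arcs that overlap the time slot."""
    scores = defaultdict(float)
    for arc, post in zip(arcs, posteriors):
        if arc.t_start < slot_end and arc.t_end > slot_start:
            scores[arc.word] += post
    return dict(scores)


def local_entropy(scores):
    """Entropy of the (normalized) competing word scores in one slot."""
    total = sum(scores.values())
    return -sum((s / total) * math.log(s / total)
                for s in scores.values() if s > 0)


# Toy lattice: "call" appears on two arcs with different end times.
LATTICE = [
    Arc(0, 1, "call", 0.0, 0.4, 0.5),
    Arc(0, 1, "tall", 0.0, 0.4, 0.3),
    Arc(0, 2, "call", 0.0, 0.5, 0.2),
    Arc(1, 2, "uh", 0.4, 0.5, 1.0),
    Arc(2, 3, "home", 0.5, 1.0, 1.0),
]
posteriors = arc_posteriors(LATTICE, start_node=0, final_node=3)
first_slot = slot_confidences(LATTICE, posteriors, 0.0, 0.4)
first_slot_entropy = local_entropy(first_slot)
```

The two "call" arcs have different end times but both overlap the first slot, so their posteriors are pooled into one confidence score for "call", while "tall" remains its competitor in that slot.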

    System and method of spoken language understanding in a spoken dialog service
    57.
    Type: Granted patent
    Status: In force

    Publication No.: US07451089B1

    Publication date: 2008-11-11

    Application No.: US11675166

    Filing date: 2007-02-15

    IPC classification: G10L11/00 G10L21/00

    CPC classification: G10L15/22 G06F3/167

    Abstract: A voice-enabled help desk service is disclosed. The service comprises an automatic speech recognition module for recognizing speech from a user, a spoken language understanding module for understanding the output from the automatic speech recognition module, a dialog management module for generating a response to speech from the user, a natural voices text-to-speech synthesis module for synthesizing speech to generate the response to the user, and a frequently asked questions module. The frequently asked questions module handles frequently asked questions from the user by changing voices and providing predetermined prompts to answer the frequently asked question.

    SYSTEMS AND METHODS FOR REDUCING ANNOTATION TIME
    58.
    Type: Patent application
    Status: In force

    Publication No.: US20080270130A1

    Publication date: 2008-10-30

    Application No.: US12165755

    Filing date: 2008-07-01

    IPC classification: G10L15/00

    Abstract: Systems and methods for annotating speech data. The present invention reduces the time required to annotate speech data by selecting utterances for annotation that will be of greatest benefit. A selection module uses speech models, including speech recognition models and spoken language understanding models, to identify utterances that should be annotated based on criteria such as confidence scores generated by the models. These utterances are placed in an annotation list along with a type of annotation to be performed for the utterances and an order in which the annotation should proceed. The utterances in the annotation list can be annotated for speech recognition purposes, spoken language understanding purposes, labeling purposes, etc. The selection module can also select utterances for annotation based on previously annotated speech data and deficiencies in the various models.
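The selection module's behavior can be sketched as confidence-based selection. The `Utterance` fields, the two thresholds, the annotation type names, and the "least confident first" ordering are all assumptions; the abstract only says that selection uses criteria such as model confidence scores and that the list records an annotation type and an order.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    utt_id: str
    asr_confidence: float   # score from the speech recognition model
    slu_confidence: float   # score from the spoken language understanding model


def build_annotation_list(utterances, asr_threshold=0.6, slu_threshold=0.6):
    """Return (utterance id, annotation type) pairs, least confident first."""
    entries = []
    for utt in utterances:
        if utt.asr_confidence < asr_threshold:
            # Recognition is doubtful: ask for a transcription.
            entries.append((utt.asr_confidence, utt.utt_id, "transcription"))
        elif utt.slu_confidence < slu_threshold:
            # Words look fine but the understanding is doubtful.
            entries.append((utt.slu_confidence, utt.utt_id, "semantic labeling"))
    entries.sort()  # lowest confidence is annotated first
    return [(utt_id, kind) for _, utt_id, kind in entries]


pool = [
    Utterance("u1", 0.9, 0.9),
    Utterance("u2", 0.3, 0.8),
    Utterance("u3", 0.8, 0.5),
    Utterance("u4", 0.1, 0.2),
]
annotation_list = build_annotation_list(pool)
```

Utterances that both models already score confidently (`u1` here) never reach the list, which is where the annotation time is saved.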

    METHOD AND SYSTEM FOR AUTOMATICALLY DETECTING MORPHEMES IN A TASK CLASSIFICATION SYSTEM USING LATTICES
    59.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20080215328A1

    Publication date: 2008-09-04

    Application No.: US11854706

    Filing date: 2007-09-13

    IPC classification: G10L15/04

    CPC classification: G10L15/08

    Abstract: The invention concerns a method and system for detecting morphemes in a user's communication. The method may include recognizing a lattice of phone strings from the user's input communication, the lattice representing a distribution over the phone strings, and detecting morphemes in the user's input communication using the lattice. The morphemes may be acoustic and/or non-acoustic. The morphemes may represent any unit or sub-unit of communication including phones, diphones, phone-phrases, syllables, grammars, words, gestures, tablet strokes, body movements, mouse clicks, etc. The training speech may be verbal, non-verbal, a combination of verbal and non-verbal, or multimodal.
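Detection over a distribution of phone strings can be sketched in a simplified form. Instead of a real lattice, the sketch assumes the distribution has been expanded into an explicit list of (phone string, probability) pairs; the morpheme inventory, the phone labels, and the 0.5 detection threshold are hypothetical.

```python
def contains(phones, pattern):
    """True if `pattern` occurs as a contiguous sub-sequence of `phones`."""
    n = len(pattern)
    return any(tuple(phones[i:i + n]) == tuple(pattern)
               for i in range(len(phones) - n + 1))


def detect_morphemes(distribution, morphemes, threshold=0.5):
    """Keep morphemes whose total probability mass reaches the threshold.

    `distribution` is a list of (phone sequence, probability) pairs, standing
    in for the lattice; `morphemes` maps a name to its phone sequence.
    """
    detected = {}
    for name, pattern in morphemes.items():
        mass = sum(p for phones, p in distribution if contains(phones, pattern))
        if mass >= threshold:
            detected[name] = mass
    return detected


# Three competing phone strings for the same input (illustrative values).
DISTRIBUTION = [
    (("k", "ao", "l", "eh", "k", "t"), 0.5),
    (("k", "ao", "l", "ih", "k", "t"), 0.3),
    (("t", "ao", "l", "eh", "k", "t"), 0.2),
]
MORPHEMES = {"k-ao-l": ("k", "ao", "l"), "t-ao-l": ("t", "ao", "l")}
detected = detect_morphemes(DISTRIBUTION, MORPHEMES)
```

Scoring against the whole distribution lets a morpheme be detected even when the single best phone string is wrong, since its probability mass is pooled across all hypotheses that contain it.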

    METHOD AND SYSTEM FOR AUTOMATIC DETECTING MORPHEMES IN A TASK CLASSIFICATION SYSTEM USING LATTICES
    60.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20080177544A1

    Publication date: 2008-07-24

    Application No.: US11854717

    Filing date: 2007-09-13

    IPC classification: G10L15/04

    CPC classification: G10L15/08

    Abstract: The invention concerns a method and system for detecting morphemes in a user's communication. The method may include recognizing a lattice of phone strings from the user's input communication, the lattice representing a distribution over the phone strings, and detecting morphemes in the user's input communication using the lattice. The morphemes may be acoustic and/or non-acoustic. The morphemes may represent any unit or sub-unit of communication including phones, diphones, phone-phrases, syllables, grammars, words, gestures, tablet strokes, body movements, mouse clicks, etc. The training speech may be verbal, non-verbal, a combination of verbal and non-verbal, or multimodal.
