Method and apparatus for quantizing model parameters
    2.
    Granted patent
    Status: In force

    Publication No.: US07272557B2

    Publication date: 2007-09-18

    Application No.: US10427215

    Filing date: 2003-05-01

    Applicant: Julian J. Odell

    Inventor: Julian J. Odell

    IPC classification: G10L19/00

    CPC classification: G10L15/142

    Abstract: A method of quantizing a model parameter includes applying the model parameter to a non-linear scaling function to produce a scaled model parameter and quantizing the scaled model parameter to form a quantized model parameter. In further embodiments, likelihoods for multiple frames of input feature vectors are determined for each retrieval of quantized model parameters from memory.
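The two steps named in this abstract (non-linear scaling, then quantization) can be illustrated with a minimal sketch. The abstract does not say which scaling function is used; the signed logarithmic curve, the `scale` range, and the 8-bit code width below are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math


def quantize(value, scale=8.0, levels=256):
    """Map a real-valued parameter to a small signed integer code.

    Assumption: a signed log curve stands in for the patent's unspecified
    non-linear scaling function; `scale` is the largest magnitude that is
    represented without clipping.
    """
    # Step 1: non-linear scaling into [-1, 1]; small values get finer resolution.
    scaled = math.copysign(math.log1p(abs(value)) / math.log1p(scale), value)
    scaled = max(-1.0, min(1.0, scaled))
    # Step 2: uniform quantization of the scaled value.
    half = (levels - 1) // 2
    return round(scaled * half)


def dequantize(code, scale=8.0, levels=256):
    """Invert quantize() up to rounding error."""
    half = (levels - 1) // 2
    scaled = code / half
    return math.copysign(math.expm1(abs(scaled) * math.log1p(scale)), scaled)
```

With these assumed settings, values near zero round-trip with a smaller absolute error than values near `scale`, which is the usual reason for scaling before quantizing.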


    Confidence threshold tuning
    3.
    Granted patent
    Status: In force

    Publication No.: US08396715B2

    Publication date: 2013-03-12

    Application No.: US11168278

    Filing date: 2005-06-28

    IPC classification: G10L21/00 G10L15/00

    CPC classification: G10L15/08

    Abstract: An expected dialog-turn (ED) value is estimated for evaluating a speech application. Parameters such as a confidence threshold setting can be adjusted based on the expected dialog-turn value. In a particular example, recognition results and corresponding confidence scores are used to estimate the expected dialog-turn value. The recognition results can be associated with a possible outcome for the speech application and a cost for the possible outcome can be used to estimate the expected dialog-turn value.
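One way to read this abstract is as a cost-weighted average over logged recognition results. The sketch below assumes three possible outcomes per utterance and assigns each a hypothetical cost in dialog turns; the outcome names, the cost values, and the function names are illustrative assumptions, not taken from the patent.

```python
# Hypothetical cost, in dialog turns, of each possible outcome.
COSTS = {"correct_accept": 1.0, "false_accept": 3.0, "reject": 2.0}


def expected_turns(samples, threshold, costs=COSTS):
    """Estimate the expected dialog-turn value for one threshold setting.

    `samples` is a list of (confidence, is_correct) pairs from logged
    recognition results.
    """
    total = 0.0
    for confidence, is_correct in samples:
        if confidence < threshold:
            total += costs["reject"]          # result rejected, user re-prompted
        elif is_correct:
            total += costs["correct_accept"]  # result accepted and right
        else:
            total += costs["false_accept"]    # result accepted but wrong
    return total / len(samples)


def tune_threshold(samples, candidates, costs=COSTS):
    """Pick the candidate threshold with the lowest estimated value."""
    return min(candidates, key=lambda t: expected_turns(samples, t, costs))
```

For example, with `samples = [(0.9, True), (0.8, True), (0.4, False), (0.3, False)]`, a threshold of 0.5 rejects both wrong results and yields a lower estimate than accepting everything or rejecting everything.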


    Voice user interface authoring tool
    4.
    Granted patent
    Status: In force

    Publication No.: US08315874B2

    Publication date: 2012-11-20

    Application No.: US11401823

    Filing date: 2006-04-11

    IPC classification: G10L21/00

    CPC classification: G10L2015/228

    Abstract: A voice user interface authoring tool is configured to use categorized example caller responses, from which callflow paths, automatic speech recognition, and natural language processing control files can be generated automatically within a single, integrated authoring user interface. A voice user interface (VUI) design component allows an author to create an application incorporating various types of action nodes, including Prompt/Response Processing (PRP) nodes. At runtime, the system uses the information from each PRP node to prompt a user to say something, and to process the user's response in order to extract its meaning. An Automatic Speech Recognition/Natural Language Processing (ASR/NLP) Control Design component allows the author to associate sample inputs with each possible meaning, and automatically generates the necessary ASR and NLP runtime control files. The VUI design component allows the author to associate the appropriate ASR and NLP control files with each PRP node, and to associate an action node with each possible meaning, as indicated by the NLP control file.


    Automatic identification of dialog timing problems for an interactive speech dialog application using speech log data indicative of cases of barge-in and timing problems
    5.
    Granted patent
    Status: In force

    Publication No.: US07930183B2

    Publication date: 2011-04-19

    Application No.: US11392339

    Filing date: 2006-03-29

    IPC classification: G10L15/22

    CPC classification: G10L15/26

    Abstract: A method of analyzing dialog between a user and an interactive application having dialog turns is provided. The method includes accessing information indicative of a plurality of dialog turns between the application and at least one user and identifying instances where the application determined a response was received before an associated prompt had completed. The accessed information includes information related to operation of the application with a first grammar to recognize the response. The method includes identifying whether the response was received in a particular limited time period from when the associated prompt began. If the response was received in the limited time period, the method determines whether the response included one or more terms from the associated prompt by performing recognition on the response using a second grammar having more information related to grammar of a language than the first grammar.
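The filtering steps in this abstract can be sketched over a simple log structure. Everything below is an assumption made for illustration: the `Turn` record, its field names, the one-second `early_window`, and the use of a plain word overlap in place of the second-grammar recognition pass that the abstract describes.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Hypothetical log record for one dialog turn (times in seconds)."""
    prompt_text: str
    prompt_start: float
    prompt_end: float
    response_time: float
    # Assumed to be the transcript produced by a broader second-pass grammar.
    response_text: str


def suspicious_barge_ins(turns, early_window=1.0):
    """Return (turn, echoed_terms) for barge-ins that arrive very early.

    A turn is flagged when the response arrived before the prompt finished
    and within `early_window` seconds of the prompt starting. The echoed
    terms are the words shared by the prompt and the response.
    """
    flagged = []
    for turn in turns:
        barged_in = turn.response_time < turn.prompt_end
        early = (turn.response_time - turn.prompt_start) <= early_window
        if barged_in and early:
            prompt_words = set(turn.prompt_text.lower().split())
            response_words = set(turn.response_text.lower().split())
            flagged.append((turn, sorted(prompt_words & response_words)))
    return flagged
```

A flagged turn whose echoed-term list is non-empty suggests the application may have recognized its own prompt rather than the caller, which is the kind of timing problem the title refers to.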


    Combination and federation of local and remote speech recognition
    6.
    Granted patent
    Status: In force

    Publication No.: US08892439B2

    Publication date: 2014-11-18

    Application No.: US12503191

    Filing date: 2009-07-15

    IPC classification: G10L15/18 G10L15/30

    CPC classification: G10L15/30 G10L2015/221

    Abstract: Techniques to provide automatic speech recognition at a local device are described. An apparatus may include an audio input to receive audio data indicating a task. The apparatus may further include a local recognizer component to receive the audio data, to pass the audio data to a remote recognizer while receiving the audio data, and to recognize speech from the audio data. The apparatus may further include a federation component operative to receive one or more recognition results from the local recognizer and/or the remote recognizer, and to federate a plurality of recognition results to produce a most likely result. The apparatus may further include an application to perform the task indicated by the most likely result. Other embodiments are described and claimed.
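The federation step can be pictured as merging two scored hypothesis lists. The abstract does not state how results are combined; the weighted sum of confidences below, along with the function name and weight parameters, is an assumed rule used only to make the idea concrete.

```python
from collections import defaultdict


def federate(local_results, remote_results, local_weight=1.0, remote_weight=1.0):
    """Combine (text, confidence) hypotheses and return the most likely text.

    Assumption: hypotheses that both recognizers agree on accumulate
    weighted confidence. Returns None when neither recognizer produced
    a result.
    """
    scores = defaultdict(float)
    for text, confidence in local_results:
        scores[text] += local_weight * confidence
    for text, confidence in remote_results:
        scores[text] += remote_weight * confidence
    if not scores:
        return None
    return max(scores, key=scores.get)
```

Because either list may be empty, the same function covers the "and/or" case in the abstract, for instance when the remote recognizer is unreachable and only local results arrive.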


    Lightweight windowing method for screening harvested data for novelty
    7.
    Granted patent
    Status: In force

    Publication No.: US08069032B2

    Publication date: 2011-11-29

    Application No.: US11494301

    Filing date: 2006-07-27

    IPC classification: G06F17/27

    CPC classification: G06F17/2715

    Abstract: Biasing of language model customization due to repetitious data is substantially reduced by introducing novelty screening to data harvesting process. Novelty detection based filtering is added to ensure that an adaptation system gives more weight to representative adaptation data that is not repetitious. The value of the adaptation data is preserved and the process prevented from being polluted when the same data is seen multiple times, such as the original posting in an email thread, various versions of the same document, and the like. The screening technique may be built on top of existing data harvesting mechanisms as already seen data is used to determine the novelty of a particular portion of the data. A window into the new data, fixed or variable size, is compared against the already collected data to determine the likelihood that the data is novel.
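The windowed comparison described in this abstract can be sketched with fixed-size word windows. The window size, the novelty threshold, the choice of word n-grams as the window unit, and the class and method names are all assumptions for illustration; the abstract also allows variable-size windows, which this sketch does not cover.

```python
def windows(text, size=5):
    """Return the set of `size`-word windows in `text` (one short window if fewer words)."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


class NoveltyScreen:
    """Keeps the windows seen so far and scores new text against them."""

    def __init__(self, size=5, min_novelty=0.5):
        self.size = size
        self.min_novelty = min_novelty  # hypothetical acceptance threshold
        self.seen = set()

    def novelty(self, text):
        """Fraction of the text's windows not present in already collected data."""
        current = windows(text, self.size)
        if not current:
            return 0.0
        return len(current - self.seen) / len(current)

    def offer(self, text):
        """Score the text, record its windows, and report whether it is novel enough."""
        score = self.novelty(text)
        self.seen |= windows(text, self.size)
        return score >= self.min_novelty
```

An email reply that quotes the original posting in full would share most of its windows with text already collected, so its novelty score would be low and it would contribute little to adaptation.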


    Voice user interface authoring tool
    8.
    Patent application
    Status: In force

    Publication No.: US20070156406A1

    Publication date: 2007-07-05

    Application No.: US11401823

    Filing date: 2006-04-11

    IPC classification: G10L15/18

    CPC classification: G10L2015/228

    Abstract: A voice user interface authoring tool is configured to use categorized example caller responses, from which callflow paths, automatic speech recognition, and natural language processing control files can be generated automatically within a single, integrated authoring user interface. A voice user interface (VUI) design component allows an author to create an application incorporating various types of action nodes, including Prompt/Response Processing (PRP) nodes. At runtime, the system uses the information from each PRP node to prompt a user to say something, and to process the user's response in order to extract its meaning. An Automatic Speech Recognition/Natural Language Processing (ASR/NLP) Control Design component allows the author to associate sample inputs with each possible meaning, and automatically generates the necessary ASR and NLP runtime control files. The VUI design component allows the author to associate the appropriate ASR and NLP control files with each PRP node, and to associate an action node with each possible meaning, as indicated by the NLP control file.


    Speech Recognition System with Display Information
    9.
    Patent application
    Status: In force

    Publication No.: US20100100384A1

    Publication date: 2010-04-22

    Application No.: US12255270

    Filing date: 2008-10-21

    IPC classification: G10L15/18

    CPC classification: G10L15/22 G10L15/19

    Abstract: A language processing system may determine a display form of a spoken word by analyzing the spoken form using a language model that includes dictionary entries for display forms of homonyms. The homonyms may include trade names as well as given names and other phrases. The language processing system may receive spoken language and produce a display form of the language while displaying the proper form of the homonym. Such a system may be used in search systems where audio input is converted to a graphical display of a portion of the spoken input.


    Recognizing multiple semantic items from single utterance
    10.
    Patent application
    Status: In force

    Publication No.: US20090228270A1

    Publication date: 2009-09-10

    Application No.: US12042460

    Filing date: 2008-03-05

    IPC classification: G10L15/00

    CPC classification: G10L15/1815

    Abstract: Semantically distinct items are extracted from a single utterance by repeatedly recognizing the same utterance using constraints provided by semantic items already recognized. User feedback for selection or correction of partially recognized utterance may be used in a hierarchical, multi-modal, or single step manner. An accuracy of recognition is preserved while the less structured and more natural single utterance recognition form is allowed to be used.
