Method, system, and apparatus for limiting available selections in a speech recognition system
    1.
    Granted invention patent
    Status: in force

    Publication No.: US07010490B2

    Publication date: 2006-03-07

    Application No.: US09770577

    Filing date: 2001-01-26

    IPC classification: G10L15/00

    CPC classification: G10L15/26 G10L2015/228

    Abstract: A method and system for completing user input in a speech recognition system. The method can include a series of steps which can include receiving a user input. The user input can specify an attribute of a selection. The method can include comparing the user input with a set of selections in the speech recognition system. Also, the method can include limiting the set of selections to an available set of selections which can correspond to the received user input. The step of matching a received user spoken utterance with the selection in the available set of selections also can be included.

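    The two steps named in the abstract (limiting the set of selections by an attribute, then matching an utterance against the reduced set) can be sketched as follows. This is only an illustration under stated assumptions, not the patented method: the `Selection` type, its `category` attribute, the sample catalog, and the use of text similarity from Python's `difflib` in place of a real speech recognizer are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Selection:
    name: str       # what the user would say to pick this item
    category: str   # hypothetical attribute used to narrow the set


def limit_selections(selections, attribute_value):
    """Keep only the selections whose attribute matches the user input."""
    wanted = attribute_value.strip().lower()
    return [s for s in selections if s.category.lower() == wanted]


def match_utterance(utterance, available):
    """Return the available selection closest to the utterance text, or None."""
    if not available:
        return None
    spoken = utterance.strip().lower()
    # Text similarity stands in for acoustic matching in this sketch.
    return max(
        available,
        key=lambda s: SequenceMatcher(None, spoken, s.name.lower()).ratio(),
    )


# Hypothetical catalog of selections known to the system.
CATALOG = [
    Selection("Boston", "city"),
    Selection("Austin", "city"),
    Selection("Texas", "state"),
    Selection("Boston Terrier", "dog breed"),
]

available = limit_selections(CATALOG, "city")   # attribute supplied by the user
best = match_utterance("bostin", available)     # imperfectly recognized utterance
print([s.name for s in available], best.name)
```

    Narrowing the set first means the matcher never considers "Boston Terrier" or "Texas", which is the benefit the abstract appears to be after: fewer candidates for the recognizer to confuse.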

    Explicitly registering markup based on verbal commands and exploiting audio context
    2.
    Granted invention patent
    Status: in force

    Publication No.: US07240006B1

    Publication date: 2007-07-03

    Application No.: US09670646

    Filing date: 2000-09-27

    IPC classification: G10L15/22

    CPC classification: G06F17/30896 G10L2015/228

    Abstract: A generic way of encoding information needed by an application to register voice commands and enable a speech engine are used to tell a browser what to present to the user and what options are available to the user to interact with an application. This is accomplished by enhancements to a markup language which register and enable voice commands that are needed by an application to the speech engine, and provide an audio context for the page scope command by adding a context option to make the page much more flexible and usable. The action of the application can be altered based on the current audio context by adding a context option. The application remains independent of the browser and separate from interaction with the speech engine. The application can accommodate both verbal and visual interactions by registering the verbal commands and identifying to what those commands will translate.

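    One way to picture the registration idea in the abstract is a small lookup table in which a voice command can be registered at page scope and optionally overridden for a particular audio context. The sketch below is an assumption-laden illustration, not the markup language the patent describes: the class name `VoiceCommandRegistry`, the action strings, and the context identifier `item-42` are all hypothetical.

```python
class VoiceCommandRegistry:
    """Maps spoken phrases to application actions, optionally per audio context."""

    def __init__(self):
        # Key: (lowercased phrase, context id or None for page scope).
        self._commands = {}

    def register(self, phrase, action, context=None):
        """Register a phrase at page scope, or for one specific audio context."""
        self._commands[(phrase.lower(), context)] = action

    def resolve(self, phrase, current_context=None):
        """Prefer a context-specific action, then fall back to page scope."""
        key = phrase.lower()
        if (key, current_context) in self._commands:
            return self._commands[(key, current_context)]
        return self._commands.get((key, None))


registry = VoiceCommandRegistry()
registry.register("details", "show_page_help")                          # page scope
registry.register("details", "show_item_details", context="item-42")    # context option

in_context = registry.resolve("Details", current_context="item-42")
elsewhere = registry.resolve("details", current_context="header")
unknown = registry.resolve("checkout")
print(in_context, elsewhere, unknown)
```

    Here the same spoken phrase leads to a different action depending on what is currently being read aloud, which mirrors the abstract's statement that the action can be altered based on the current audio context.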

    Method and system for synchronizing audio and visual presentation in a multi-modal content renderer
    3.
    Granted invention patent
    Status: in force

    Publication No.: US06745163B1

    Publication date: 2004-06-01

    Application No.: US09670800

    Filing date: 2000-09-27

    IPC classification: G10L13/00

    CPC classification: G10L13/00

    Abstract: A system and method for a multi-modal browser/renderer that simultaneously renders content visually and verbally in a synchronized manner are provided without having the server applications change. The system and method receives a document via a computer network, parses the text in the document, provides an audible component associated with the text, simultaneously transmits to output the text and the audible component. The desired behavior for the renderer is that when some section of that content is being heard by the user, that section is visible on the screen and, furthermore, the specific visual content being audibly rendered is somehow highlighted visually. In addition, the invention also reacts to input from either the visual component or the aural component. The invention also allows any application or server to be accessible to someone via audio instead of visual means by having the browser handle the Embedded Browser Markup Language (EBML) disclosed herein so that it is audibly read to the user. Existing EBML statements can also be combined so that what is audibly read to the user is related to, but not identical to, the EBML text. The present invention also solves the problem of synchronizing audio and visual presentation of existing content via markup language changes rather than by application code changes.

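    The desired renderer behavior described in the abstract (the section being heard is visible and highlighted) can be illustrated with a minimal loop that pairs each section's highlight with its audio. This is a sketch under assumptions, not the patented renderer and not EBML: the input is plain text split on blank lines, and the `speak` and `highlight` callbacks are hypothetical stand-ins for a text-to-speech engine and a display.

```python
import re


def split_sections(text):
    """Split a document into sections on blank lines (a simplifying assumption)."""
    return [s.strip() for s in re.split(r"\n\s*\n", text) if s.strip()]


def render_synchronized(text, speak, highlight):
    """Highlight each section just before it is spoken, keeping both in step."""
    for index, section in enumerate(split_sections(text)):
        highlight(index)   # make the section visible and mark it on screen
        speak(section)     # in this sketch, returns once the section has been read


# Record the calls instead of driving real audio or display hardware.
log = []
render_synchronized(
    "First paragraph.\n\nSecond paragraph.\n",
    speak=lambda section: log.append(("speak", section)),
    highlight=lambda index: log.append(("highlight", index)),
)
print(log)
```

    Driving both outputs from one loop over the parsed sections is what keeps them aligned here; a real renderer would also need to react to user input from either modality, which this sketch leaves out.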