Speech-Enabled Content Navigation And Control Of A Distributed Multimodal Browser
    41.
    Type: Invention application
    Status: In force

    Publication No.: US20080255851A1

    Publication date: 2008-10-16

    Application No.: US11734445

    Filing date: 2007-04-12

    IPC classification: G10L21/00

    Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed, the browser providing an execution environment for a multimodal application, the browser including a graphical user agent (‘GUA’) and a voice user agent (‘VUA’), the GUA operating on a multimodal device, the VUA operating on a voice server, that includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
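The message flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class names, the event names, and the text lookup that stands in for speech recognition are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VoiceUserAgent:
    """Hypothetical stand-in for the VUA running on the voice server."""
    links: Dict[str, str] = field(default_factory=dict)

    def receive_link_message(self, links: Dict[str, str]) -> None:
        # The link message maps each voice command to a browser event.
        self.links = {cmd.lower(): event for cmd, event in links.items()}

    def recognize(self, utterance: str) -> Optional[Dict[str, str]]:
        # Assumption: a text lookup replaces real speech recognition.
        event = self.links.get(utterance.strip().lower())
        return {"event": event} if event else None


@dataclass
class GraphicalUserAgent:
    """Hypothetical stand-in for the GUA running on the multimodal device."""
    vua: VoiceUserAgent
    handled: List[str] = field(default_factory=list)

    def start(self) -> None:
        # Step 1: transmit the link message to the VUA.
        self.vua.receive_link_message(
            {"go back": "history.back", "scroll down": "scroll.down"}
        )

    def on_utterance(self, utterance: str) -> Optional[str]:
        # Steps 2-4: forward the utterance and receive an event message.
        message = self.vua.recognize(utterance)
        if message is None:
            return None
        # Step 5: control the browser according to the event (recorded here).
        self.handled.append(message["event"])
        return message["event"]
```

In this sketch the GUA never interprets speech itself; it only reacts to the event named in the reply, which mirrors the division of labor the abstract describes.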

    Ordering Recognition Results Produced By An Automatic Speech Recognition Engine For A Multimodal Application
    42.
    Type: Invention application
    Status: In force

    Publication No.: US20080208585A1

    Publication date: 2008-08-28

    Application No.: US11679284

    Filing date: 2007-02-27

    IPC classification: G10L21/00

    Abstract: Ordering recognition results produced by an automatic speech recognition (‘ASR’) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.
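The weighting and sorting steps can be illustrated with a short sketch. The function names, the result shape, and the weight table below are assumptions made for illustration; in the abstract the weights come from semantic interpretation scripts in the grammar, which a plain Python callable only approximates.

```python
from typing import Callable, Dict, List

Result = Dict[str, str]


def order_results(results: List[Result],
                  weigh: Callable[[Result], float]) -> List[Result]:
    """Sort recognition results by descending weight.

    sorted() is stable, so results with equal weight keep the order
    in which the recognizer returned them.
    """
    return sorted(results, key=weigh, reverse=True)


def weigh_city(result: Result) -> float:
    # Hypothetical stand-in for a semantic interpretation script.
    preferred = {"Austin": 2.0, "Boston": 1.0}
    return preferred.get(result["city"], 0.0)


n_best = [{"city": "Boston"}, {"city": "Houston"}, {"city": "Austin"}]
ordered = order_results(n_best, weigh_city)
```

Here `ordered` lists Austin first, then Boston, then Houston, because the assumed weights are 2.0, 1.0, and 0.0.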

    Method and system for voice-enabled autofill
    43.
    Type: Invention application
    Status: In force

    Publication No.: US20060064302A1

    Publication date: 2006-03-23

    Application No.: US10945112

    Filing date: 2004-09-20

    IPC classification: G10L15/26

    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
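A rough sketch of the two modules named in the abstract follows. The grammar shape, the trigger phrases, and the `field=value` form of the semantic interpretation string are assumptions for illustration; the abstract does not specify them.

```python
from typing import Dict


def generate_grammar(field_name: str, profile: Dict[str, str]) -> Dict[str, object]:
    """Build a per-field grammar from the user profile (hypothetical shape)."""
    return {
        # Phrases the user may say to trigger auto-fill for this field.
        "phrases": {f"my {field_name}", f"fill in my {field_name}"},
        # Assumed encoding of the semantic interpretation string.
        "interpretation": f"{field_name}={profile[field_name]}",
    }


def auto_fill_event(form: Dict[str, str], grammar: Dict[str, object],
                    utterance: str) -> bool:
    """Fill the form field if the utterance matches the grammar."""
    if utterance.strip().lower() not in grammar["phrases"]:
        return False
    # Split on the first "=" only, so values may themselves contain "=".
    name, value = str(grammar["interpretation"]).split("=", 1)
    form[name] = value
    return True


profile = {"address": "12 Main St"}
form: Dict[str, str] = {}
grammar = generate_grammar("address", profile)
filled = auto_fill_event(form, grammar, "My address")
```

After this runs, `form` holds the profile's address under the `address` key; an utterance outside the grammar leaves the form unchanged.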

    SYSTEMS AND METHODS FOR PROMPTING USER SPEECH IN MULTIMODAL DEVICES
    44.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20130227417A1

    Publication date: 2013-08-29

    Application No.: US13847974

    Filing date: 2013-03-20

    IPC classification: G06F3/16

    Abstract: A method for prompting user input for a multimodal interface including the steps of providing a multimodal interface to a user, where the interface includes a visual interface having a plurality of input regions, each having at least one input field; selecting an input region and processing a multi-token speech input provided by the user, where the processed speech input includes at least one value for at least one input field of the selected input region; and storing at least one value in at least one input field.
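One way to picture the multi-token step is the sketch below. The region layout, the field names, and the whitespace tokenization are assumptions; a real system would rely on a speech recognizer and a grammar rather than `str.split`.

```python
from typing import Dict, List

# Hypothetical visual interface: each input region groups related fields.
REGIONS: Dict[str, List[str]] = {
    "date": ["month", "day", "year"],
    "name": ["first", "last"],
}


def fill_region(form: Dict[str, str], region: str, utterance: str) -> Dict[str, str]:
    """Store one token per field of the selected region, in field order."""
    tokens = utterance.split()
    # zip() stops at the shorter sequence, so extra tokens are ignored
    # and fields without a token are left untouched.
    for field_name, token in zip(REGIONS[region], tokens):
        form[field_name] = token
    return form


form = fill_region({}, "date", "March 5 2007")
```

A single utterance thus fills all three fields of the selected `date` region at once, which is the point of accepting multi-token input.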

    System and methods for prompting user speech in multimodal devices
    45.
    Type: Granted patent
    Status: In force

    Publication No.: US08417529B2

    Publication date: 2013-04-09

    Application No.: US11616682

    Filing date: 2006-12-27

    IPC classification: G10L21/00

    Abstract: A method for prompting user input for a multimodal interface including the steps of providing a multimodal interface to a user, where the interface includes a visual interface having a plurality of input regions, each having at least one input field; selecting an input region and processing a multi-token speech input provided by the user, where the processed speech input includes at least one value for at least one input field of the selected input region; and storing at least one value in at least one input field.

    Altering Behavior Of A Multimodal Application Based On Location
    46.
    Type: Invention application
    Status: In force

    Publication No.: US20080208593A1

    Publication date: 2008-08-28

    Application No.: US11679301

    Filing date: 2007-02-27

    IPC classification: G10L21/00

    CPC classification: G10L15/22 G10L15/24

    Abstract: Methods, apparatus, and products are disclosed for altering behavior of a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application, including a voice mode and one or more non-voice modes. The voice mode of user interaction with the multimodal application is supported by a voice interpreter. Altering behavior of a multimodal application based on location includes: receiving a location change notification in the voice interpreter from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device; updating, by the voice interpreter, location-based environment parameters for the voice interpreter in dependence upon the current location of the multimodal device; and interpreting, by the voice interpreter, the multimodal application in dependence upon the location-based environment parameters.
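The notify, update, and interpret sequence might look like the sketch below. The parameter table, the country codes used as locations, and the prompt texts are assumptions; the abstract does not say which environment parameters depend on location.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical location-based environment parameters.
PARAMS_BY_COUNTRY: Dict[str, Dict[str, str]] = {
    "US": {"language": "en-US", "units": "imperial"},
    "FR": {"language": "fr-FR", "units": "metric"},
}
PROMPTS = {"en-US": "How can I help?", "fr-FR": "Comment puis-je vous aider ?"}


@dataclass
class VoiceInterpreter:
    """Hypothetical voice interpreter holding location-based parameters."""
    env: Dict[str, str] = field(
        default_factory=lambda: dict(PARAMS_BY_COUNTRY["US"])
    )

    def on_location_change(self, country: str) -> None:
        # Called by the device location manager with the current location.
        # Unknown locations leave the current parameters in place.
        self.env.update(PARAMS_BY_COUNTRY.get(country, {}))

    def interpret_prompt(self) -> str:
        # Interpretation depends on the location-based parameters.
        return PROMPTS[self.env["language"]]
```

The application itself is unchanged in this sketch; only the interpreter's environment differs after a location change, so the same prompt is rendered in a different language.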

    Method and system of building a grammar rule with baseforms generated dynamically from user utterances
    47.
    Type: Invention application
    Status: In force

    Publication No.: US20060047510A1

    Publication date: 2006-03-02

    Application No.: US10924520

    Filing date: 2004-08-24

    IPC classification: G10L15/26

    CPC classification: G10L15/187 G10L2015/0631

    Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
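The record, generate, and add-to-rule loop can be sketched as below. Everything here is hypothetical: the `GrammarRule` class, the `enroll` helper, and especially the `to_baseform` callable, which stands in for an acoustic front end that would derive a phonetic baseform from recorded audio.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GrammarRule:
    """Hypothetical grammar rule mapping an entry label to its baseforms."""
    name: str
    baseforms: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, label: str, baseform: str) -> None:
        # Create the entry on first use, then append further baseforms.
        self.baseforms.setdefault(label, []).append(baseform)


def enroll(rule: GrammarRule, label: str, recording: bytes,
           to_baseform: Callable[[bytes], str]) -> GrammarRule:
    """Generate a baseform from a recording and add it to the rule."""
    rule.add(label, to_baseform(recording))
    return rule


def fake_baseform(recording: bytes) -> str:
    # Assumption: decoding bytes replaces real phonetic analysis.
    return recording.decode("ascii").upper()


phonebook = GrammarRule("phonebook")
enroll(phonebook, "alice", b"ae l ih s", fake_baseform)
enroll(phonebook, "bob", b"b aa b", fake_baseform)
```

Calling `enroll` once per recorded name corresponds to the repeated form visits the abstract mentions for building a phonebook.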
