AUTOMATIC SPEECH RECOGNITION WITH A SELECTION LIST
    1.
    Invention application (status: granted)

    Publication number: US20080162136A1

    Publication date: 2008-07-03

    Application number: US11619209

    Filing date: 2007-01-03

    IPC classification: G10L15/18 G10L15/04 G10L21/00

    Abstract: Methods, apparatus, and computer program products are described for automatic speech recognition (‘ASR’) that include accepting, by a multimodal application, speech input and visual input for selecting or deselecting items in a selection list, the speech input enabled by a speech recognition grammar; providing, from the multimodal application to a grammar interpreter, the speech input and the speech recognition grammar; receiving, by the multimodal application from the grammar interpreter, interpretation results including matched words from the grammar that correspond to items in the selection list and a semantic interpretation token that specifies whether to select or deselect items in the selection list; and determining, by the multimodal application in dependence upon the value of the semantic interpretation token, whether to select or deselect items in the selection list that correspond to the matched words.
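
The last two steps of this abstract (receiving matched words plus a select/deselect token, then updating the list) can be illustrated with a minimal sketch. The `InterpretationResult` class, the token values "select" and "deselect", and the example items are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class InterpretationResult:
    """Hypothetical shape of what a grammar interpreter might return."""
    matched_words: list  # words from the grammar that name list items
    token: str           # semantic interpretation token: "select" or "deselect"


def apply_to_selection_list(selected, items, result):
    """Return the new set of selected items after applying one speech result."""
    # Only matched words that correspond to items in the list are acted on.
    targets = {w for w in result.matched_words if w in items}
    if result.token == "select":
        return selected | targets
    if result.token == "deselect":
        return selected - targets
    raise ValueError(f"unknown token: {result.token!r}")


items = ["pepperoni", "mushrooms", "olives"]
selected = apply_to_selection_list(
    set(), items, InterpretationResult(["pepperoni", "olives"], "select"))
selected = apply_to_selection_list(
    selected, items, InterpretationResult(["olives"], "deselect"))
print(sorted(selected))
```

Keeping the token separate from the matched words means one grammar can serve both operations, with the application branching only on the token value.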

    Enabling Global Grammars For A Particular Multimodal Application
    2.
    Invention application (status: granted)

    Publication number: US20080208591A1

    Publication date: 2008-08-28

    Application number: US11679279

    Filing date: 2007-02-27

    IPC classification: G10L21/00

    CPC classification: G10L15/19

    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page and determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
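
The load, maintain, and unload rule in this abstract can be sketched as a single function. The function name, the use of URL paths to identify pages, and the grammar names are hypothetical; the abstract does not say how pages or grammars are identified.

```python
def enable_global_grammars(page_url, page_grammars, app_pages, loaded):
    """Return the set of global grammars loaded after `page_url` is loaded.

    page_grammars: global grammars identified in the loaded page
    app_pages:     pages that belong to the particular multimodal application
    loaded:        global grammars currently loaded
    """
    if page_url in app_pages:
        # Maintain previously loaded grammars and load the unloaded ones.
        return set(loaded) | set(page_grammars)
    # The page is outside the application: unload every global grammar.
    return set()


app_pages = {"/home", "/search"}
loaded = enable_global_grammars("/home", {"help", "navigate"}, app_pages, set())
loaded = enable_global_grammars("/search", {"help", "filter"}, app_pages, loaded)
inside = sorted(loaded)
outside = enable_global_grammars("/other-site", {"x"}, app_pages, loaded)
```

After the two in-application page loads, `inside` holds the union of all grammars seen so far; loading a page outside the application leaves `outside` empty.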

    Web Service Support For A Multimodal Client Processing A Multimodal Application
    3.
    Invention application (status: lapsed)

    Publication number: US20080249782A1

    Publication date: 2008-10-09

    Application number: US11696230

    Filing date: 2007-04-04

    IPC classification: G10L21/00

    Abstract: Web service support for a multimodal client processing a multimodal application, the multimodal client providing an execution environment for the application and operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the application stored on an application server, includes: receiving, by the server, an application request from the client that specifies the application and device characteristics; determining, by a multimodal adapter of the server, modality requirements for the application; selecting, by the adapter, a modality web service in dependence upon the modality requirements and the characteristics for the device; determining, by the adapter, whether the device supports VoIP in dependence upon the characteristics; providing, by the server, the application to the client; and providing, by the adapter to the client in dependence upon whether the device supports VoIP, access to the modality web service for processing the application.

    Enabling Natural Language Understanding In An X+V Page Of A Multimodal Application
    4.
    Invention application (status: under examination, published)

    Publication number: US20080208586A1

    Publication date: 2008-08-28

    Application number: US11679292

    Filing date: 2007-02-27

    IPC classification: G10L21/00

    CPC classification: G10L2015/228

    Abstract: Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon the action identifier.
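
The action-classifier step (mapping a free-form recognition result to an action identifier) can be illustrated with a toy keyword-overlap classifier. This is a stand-in chosen for brevity, not the classifier the application describes; the action identifiers and keyword sets are invented for the example.

```python
# Hypothetical action identifiers and the keywords that suggest each one.
ACTION_KEYWORDS = {
    "check_balance": {"balance", "much", "money"},
    "transfer_funds": {"transfer", "move", "send"},
}


def classify_action(recognition_result):
    """Return the action identifier whose keywords best overlap the result."""
    words = set(recognition_result.lower().split())
    scores = {action: len(words & keywords)
              for action, keywords in ACTION_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # No keyword matched at all: report that no action was identified.
    return best if scores[best] > 0 else None


first = classify_action("how much money do I have")
second = classify_action("please send fifty dollars")
third = classify_action("hello there")
```

A real system would more likely use a statistically trained classifier; the point here is only the interface: recognized text in, one action identifier out.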

    ENABLING GRAMMARS IN WEB PAGE FRAME
    5.
    Invention application (status: granted)

    Publication number: US20080140410A1

    Publication date: 2008-06-12

    Application number: US11567235

    Filing date: 2006-12-06

    IPC classification: H04M1/64 G10L21/00

    Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining, by the multimodal application, content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating, by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling, by the multimodal application, all the generated grammars for speech recognition.
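
The grammar-generation step can be sketched as a loop over the navigable elements of every frame's content document. Plain dictionaries stand in for the grammar markup, and the `(link_text, href, target_frame)` tuple layout is an assumption; the abstract does not give the markup format.

```python
def generate_grammars(frames):
    """Build one grammar record per navigable element in every frame.

    frames maps a frame name to the (link_text, href, target_frame) tuples
    found in that frame's content document (hypothetical layout).
    """
    grammars = []
    for links in frames.values():
        for link_text, href, target_frame in links:
            grammars.append({
                "words": link_text.lower().split(),  # words to match in speech
                "content": href,                     # content shown on a match
                "frame": target_frame,               # frame that shows it
                "enabled": True,                     # all grammars are enabled
            })
    return grammars


frames = {
    "menu": [("Contact Us", "contact.html", "main")],
    "main": [("Next Page", "page2.html", "main"), ("Home", "index.html", "main")],
}
grammars = generate_grammars(frames)
```

Because each record carries both the content and the target frame, a spoken match in any frame can update the right frame without further lookup.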

    Speech-Enabled Content Navigation And Control Of A Distributed Multimodal Browser
    6.
    Invention application (status: granted)

    Publication number: US20080255851A1

    Publication date: 2008-10-16

    Application number: US11734445

    Filing date: 2007-04-12

    IPC classification: G10L21/00

    Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed. The browser provides an execution environment for a multimodal application and includes a graphical user agent (‘GUA’) operating on a multimodal device and a voice user agent (‘VUA’) operating on a voice server. Speech-enabled content navigation and control includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
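
The message exchange between the two agents can be sketched in-process. Both classes, the command phrases, and the event names "back" and "forward" are hypothetical, and a text string stands in for the audio that a real VUA would recognize on a voice server.

```python
class VoiceUserAgent:
    """Stand-in for the VUA; a real one would run speech recognition."""

    def __init__(self):
        self.commands = {}

    def receive_link_message(self, link_message):
        # The link message maps each voice command to its event.
        self.commands = dict(link_message)

    def recognize(self, utterance):
        # Text stands in for audio; return the event for the command, if any.
        return self.commands.get(utterance.strip().lower())


class GraphicalUserAgent:
    """Stand-in for the GUA; browser state is a page list and an index."""

    def __init__(self, vua, pages):
        self.vua = vua
        self.pages = pages
        self.index = 0
        vua.receive_link_message({"go back": "back", "go forward": "forward"})

    def handle_utterance(self, utterance):
        event = self.vua.recognize(utterance)  # event message from the VUA
        if event == "back":
            self.index = max(self.index - 1, 0)
        elif event == "forward":
            self.index = min(self.index + 1, len(self.pages) - 1)
        return self.pages[self.index]


gua = GraphicalUserAgent(VoiceUserAgent(), ["a.html", "b.html", "c.html"])
visited = [gua.handle_utterance(u)
           for u in ("Go forward", "go forward", "go back", "open sesame")]
```

An unrecognized utterance yields no event, so the browser stays on the current page.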

    Invoking Tapered Prompts In A Multimodal Application
    7.
    Invention application (status: granted)

    Publication number: US20080208588A1

    Publication date: 2008-08-28

    Application number: US11678920

    Filing date: 2007-02-26

    IPC classification: G10L11/00

    Abstract: Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.
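
One way to picture attribute-driven prompt selection is the `count` rule used for prompts in VoiceXML 2.0: among the prompts whose count does not exceed the current visit count, the one with the highest count is played. Whether the application relies on this particular attribute is an assumption; the abstract only says the prompt is played according to its attributes. The prompt texts are invented.

```python
def select_prompt(prompts, visit_count):
    """Pick the prompt with the highest count that is <= visit_count.

    prompts is a list of (count, text) pairs.
    """
    eligible = [p for p in prompts if p[0] <= visit_count]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p[0])[1]


prompts = [
    (1, "Please say the name of the city you are flying to."),
    (2, "Which city?"),
    (4, "City?"),
]
played = [select_prompt(prompts, n) for n in (1, 2, 3, 4)]
```

The prompts get shorter as the visit count grows, which is the tapering effect: a returning user hears less instruction each time.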

    Altering Behavior Of A Multimodal Application Based On Location
    9.
    Invention application (status: granted)

    Publication number: US20080208593A1

    Publication date: 2008-08-28

    Application number: US11679301

    Filing date: 2007-02-27

    IPC classification: G10L21/00

    CPC classification: G10L15/22 G10L15/24

    Abstract: Methods, apparatus, and products are disclosed for altering behavior of a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application, including a voice mode and one or more non-voice modes. The voice mode of user interaction with the multimodal application is supported by a voice interpreter. Altering behavior of a multimodal application based on location includes: receiving a location change notification in the voice interpreter from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device; updating, by the voice interpreter, location-based environment parameters for the voice interpreter in dependence upon the current location of the multimodal device; and interpreting, by the voice interpreter, the multimodal application in dependence upon the location-based environment parameters.
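
The three steps (receive a notification, update parameters, interpret accordingly) can be sketched as follows. The `VoiceInterpreter` class, the region codes, and the single "language" parameter are assumptions for illustration; the abstract does not list which environment parameters depend on location.

```python
class VoiceInterpreter:
    """Toy interpreter whose behavior depends on location-based parameters."""

    def __init__(self, params_by_region, default_params):
        self.params_by_region = params_by_region
        self.default_params = default_params
        self.env = dict(default_params)

    def on_location_change(self, notification):
        # The notification specifies the device's current location.
        region = notification["region"]
        self.env = dict(self.params_by_region.get(region, self.default_params))

    def interpret(self, prompts_by_language):
        # Interpretation depends on the current environment parameters.
        return prompts_by_language[self.env["language"]]


interpreter = VoiceInterpreter(
    params_by_region={"FR": {"language": "fr"}, "US": {"language": "en"}},
    default_params={"language": "en"},
)
prompts = {"en": "Welcome", "fr": "Bienvenue"}
before = interpreter.interpret(prompts)
interpreter.on_location_change({"region": "FR"})
after = interpreter.interpret(prompts)
```

The application itself is unchanged between the two calls; only the interpreter's environment differs, which is what alters the observed behavior.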

    Pausing A VoiceXML Dialog Of A Multimodal Application
    10.
    Invention application (status: granted)

    Publication number: US20080208584A1

    Publication date: 2008-08-28

    Application number: US11679236

    Filing date: 2007-02-27

    IPC classification: G10L13/00 G10L11/00

    Abstract: Pausing a VoiceXML dialog of a multimodal application, including generating, by the multimodal application, a pause event; responsive to the pause event, temporarily pausing the dialog by the VoiceXML interpreter; generating, by the multimodal application, a resume event; and responsive to the resume event, resuming the dialog. Embodiments are implemented with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the VoiceXML interpreter is interpreting the VoiceXML dialog to be paused.
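
The pause and resume behavior can be sketched with a toy interpreter that steps through a dialog and stops advancing while paused. The `DialogInterpreter` class, the event names as plain strings, and the dialog steps are hypothetical and do not represent a real VoiceXML interpreter API.

```python
class DialogInterpreter:
    """Toy interpreter that walks a list of dialog steps."""

    def __init__(self, steps):
        self.steps = steps
        self.position = 0
        self.paused = False

    def on_event(self, event):
        # Events are generated by the multimodal application.
        if event == "pause":
            self.paused = True
        elif event == "resume":
            self.paused = False

    def step(self):
        # While paused, the dialog keeps its position and produces nothing.
        if self.paused or self.position >= len(self.steps):
            return None
        current = self.steps[self.position]
        self.position += 1
        return current


dialog = DialogInterpreter(["ask name", "ask date", "confirm"])
trace = [dialog.step()]
dialog.on_event("pause")
trace.append(dialog.step())
dialog.on_event("resume")
trace.append(dialog.step())
```

Because the position is preserved across the pause, the dialog resumes at the step it would have reached next rather than restarting.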
