Improving speech capabilities of a multimodal application
    1.
    Granted patent
    Status: in force

    Publication number: US08380513B2

    Publication date: 2013-02-19

    Application number: US12468166

    Filing date: 2009-05-19

    CPC classification number: G10L15/22 G10L15/187 G10L15/19 G10L2015/228

    Abstract: Improving speech capabilities of a multimodal application including receiving, by the multimodal browser, a media file having a metadata container; retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser; determining whether the speech artifact includes a grammar rule or a pronunciation rule; if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule; and if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule.
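
The branching described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `SpeechEngine` class, the `speech_artifacts` key, and the artifact field names are all hypothetical, and the metadata container is modeled as a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechEngine:
    """Hypothetical stand-in for a speech engine's grammar and lexicon."""
    grammar: list[str] = field(default_factory=list)
    lexicon: dict[str, str] = field(default_factory=dict)


def apply_speech_artifacts(metadata: dict, engine: SpeechEngine) -> None:
    """Route each artifact in the metadata container to the grammar or the lexicon."""
    for artifact in metadata.get("speech_artifacts", []):
        kind = artifact.get("type")
        if kind == "grammar_rule":
            engine.grammar.append(artifact["rule"])
        elif kind == "pronunciation_rule":
            engine.lexicon[artifact["word"]] = artifact["phonemes"]
        # Artifacts of any other type are ignored in this sketch.


# Assumed layout of a media file's metadata container.
media_metadata = {
    "title": "Example Track",
    "speech_artifacts": [
        {"type": "grammar_rule", "rule": "play example track"},
        {"type": "pronunciation_rule", "word": "Sade", "phonemes": "SH AA D EY"},
    ],
}

engine = SpeechEngine()
apply_speech_artifacts(media_metadata, engine)
```

After the call, the grammar holds the new rule and the lexicon holds the new pronunciation, so content named in the media file becomes recognizable by voice.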

    METHOD AND ARRANGEMENT FOR MANAGING GRAMMAR OPTIONS IN A GRAPHICAL CALLFLOW BUILDER
    2.
    Patent application
    Status: in force

    Publication number: US20120209613A1

    Publication date: 2012-08-16

    Application number: US13344193

    Filing date: 2012-01-05

    CPC classification number: G10L2015/228

    Abstract: A method (10) in a speech recognition application callflow can include the steps of assigning (11) an individual option and a pre-built grammar to a same prompt, treating (15) the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase (12) or an annotation (13) in the pre-built grammar, and treating (14) the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.
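
The decision in this abstract can be sketched as a single lookup. The sketch assumes a pre-built grammar can be modeled as a mapping from recognition phrase to annotation, and that a case-insensitive exact match counts as a "potential valid match"; the function name and both assumptions are illustrative, not taken from the patent.

```python
def classify_option(option: str, prebuilt_grammar: dict[str, str]) -> str:
    """Return 'prebuilt' if the option matches a phrase or annotation, else 'independent'."""
    normalized = option.strip().lower()
    phrases = {phrase.lower() for phrase in prebuilt_grammar}
    annotations = {annotation.lower() for annotation in prebuilt_grammar.values()}
    if normalized in phrases or normalized in annotations:
        return "prebuilt"      # treat as a valid output of the pre-built grammar
    return "independent"       # treat as its own grammar alongside the pre-built one


# Hypothetical yes/no grammar: recognition phrase -> annotation.
yes_no_grammar = {"yes": "Y", "yeah": "Y", "no": "N", "nope": "N"}

results = {
    option: classify_option(option, yes_no_grammar)
    for option in ["Yeah", "n", "operator"]
}
```

Here "Yeah" matches a recognition phrase, "n" matches an annotation, and "operator" matches neither, so only "operator" becomes an independent grammar.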

    Reducing recording time when constructing a concatenative TTS voice using a reduced script and pre-recorded speech assets
    4.
    Granted patent
    Status: in force

    Publication number: US08019605B2

    Publication date: 2011-09-13

    Application number: US11748256

    Filing date: 2007-05-14

    CPC classification number: G10L13/04

    Abstract: The present invention discloses a system and a method for creating a reduced script, which is read by a voice talent to create a concatenative text-to-speech (TTS) voice. The method can automatically process pre-recorded audio to derive speech assets for a concatenative TTS voice. The pre-recorded audio can include sets of recorded phrases used by a speech user interface (SUI). A set of unfulfilled speech assets needed for full phonetic coverage of the concatenative TTS voice can be determined. A reduced script can be constructed that includes a set of phrases, which when read by a voice talent result in a reduced corpus. When the reduced corpus is automatically processed, a reduced set of speech assets results. The reduced set includes each of the unfulfilled speech assets. When this reduced corpus is combined with existing speech assets the result will be a voice with a complete set of speech assets.
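
One way to picture the reduced-script construction is as a coverage problem. The sketch below uses a greedy set-cover heuristic, which is a technique chosen here for illustration and is not stated in the abstract; the function name, the unit labels, and the candidate phrases are all hypothetical.

```python
def build_reduced_script(required, existing, candidates):
    """Pick phrases that cover the speech assets missing from the pre-recorded audio.

    required   -- set of speech assets needed for full phonetic coverage
    existing   -- set of assets already derived from pre-recorded audio
    candidates -- dict mapping a candidate phrase to the set of assets it yields
    Returns (script, still_missing).
    """
    unfulfilled = set(required) - set(existing)
    script = []
    while unfulfilled:
        # Greedy choice: the phrase that covers the most still-missing assets.
        best = max(candidates, key=lambda p: len(candidates[p] & unfulfilled), default=None)
        if best is None:
            break  # no candidate phrases at all
        gained = candidates[best] & unfulfilled
        if not gained:
            break  # remaining assets cannot be covered by any candidate
        script.append(best)
        unfulfilled -= gained
    return script, unfulfilled


required_assets = {"a-b", "b-c", "c-d", "d-e"}
existing_assets = {"a-b"}  # already available from recorded prompts
candidate_phrases = {
    "phrase one": {"b-c", "c-d"},
    "phrase two": {"c-d"},
    "phrase three": {"d-e", "a-b"},
}

script, missing = build_reduced_script(required_assets, existing_assets, candidate_phrases)
```

With these inputs the script needs only two of the three candidate phrases, and nothing is left uncovered. A greedy pass does not guarantee the shortest possible script, but it keeps the sketch short.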

    Disambiguation systems and methods for use in generating grammars
    5.
    Granted patent
    Status: in force

    Publication number: US08010343B2

    Publication date: 2011-08-30

    Application number: US11304964

    Filing date: 2005-12-15

    CPC classification number: G06F17/2795 G06F17/30731

    Abstract: A method and system for addressing disambiguation issues in interactive applications by creating a disambiguation system for generating complex grammars that includes homonym detection and grouping, and provides optimization feedback that eliminates time-consuming and repetitive iterative steps during the grammar generation portion of the interactive application configuration.
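
The abstract does not say how homonyms are detected. As a rough sketch only, one plausible approach groups grammar entries that share a pronunciation; the function name and the pronunciation strings below are assumptions made for illustration.

```python
from collections import defaultdict


def group_homonyms(pronunciations: dict[str, str]) -> list[list[str]]:
    """Group entries that share a pronunciation; singletons are not homonyms."""
    by_pronunciation = defaultdict(list)
    for entry, pronunciation in pronunciations.items():
        by_pronunciation[pronunciation].append(entry)
    return [sorted(group) for group in by_pronunciation.values() if len(group) > 1]


# Hypothetical grammar entries with assumed phoneme strings.
entries = {
    "Reid": "R IY D",
    "Reed": "R IY D",
    "Read": "R IY D",
    "Smith": "S M IH TH",
}

homonym_groups = group_homonyms(entries)
```

Each resulting group marks a set of entries that a recognizer cannot tell apart by sound alone, which is where a follow-up disambiguation prompt would be needed.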

    Records Disambiguation In A Multimodal Application Operating On A Multimodal Device
    6.
    Patent application
    Status: in force

    Publication number: US20090271199A1

    Publication date: 2009-10-29

    Application number: US12109167

    Filing date: 2008-04-24

    CPC classification number: G10L15/22 G10L15/00 G10L15/08 G10L15/183

    Abstract: Methods, apparatus, and products are disclosed for record disambiguation in a multimodal application operating on a multimodal device, the multimodal device supporting multiple modes of interaction including at least a voice mode and a visual mode, that include: prompting, by the multimodal application, a user to identify a particular record among a plurality of records; receiving, by the multimodal application in response to the prompt, a voice utterance from the user; determining, by the multimodal application, that the voice utterance ambiguously identifies more than one of the plurality of records; generating, by the multimodal application, a user interaction to disambiguate the records ambiguously identified by the voice utterance in dependence upon record attributes of the records ambiguously identified by the voice utterance; and selecting, by the multimodal application for further processing, one of the records ambiguously identified by the voice utterance in dependence upon the user interaction.
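
The steps in this abstract (match, detect ambiguity, ask a follow-up based on record attributes, select) can be sketched as below. Substring matching on a `name` field, the attribute-selection rule, and the `ask` callback standing in for the user interaction are all assumptions made for illustration.

```python
def find_matches(records, utterance):
    """Return every record whose name contains the recognized utterance."""
    needle = utterance.lower()
    return [r for r in records if needle in r["name"].lower()]


def distinguishing_attribute(matches, attributes):
    """Pick the first attribute whose value is different for every ambiguous record."""
    for attr in attributes:
        values = [r.get(attr) for r in matches]
        if len(set(values)) == len(values):
            return attr
    return None


def select_record(records, utterance, attributes, ask):
    """Resolve an utterance to one record, asking a follow-up question if needed."""
    matches = find_matches(records, utterance)
    if len(matches) <= 1:
        return matches[0] if matches else None
    attr = distinguishing_attribute(matches, attributes)
    if attr is None:
        return None  # no single attribute separates the records in this sketch
    options = ", ".join(str(r[attr]) for r in matches)
    answer = ask(f"I found several matches. Which {attr}: {options}?")
    for r in matches:
        if str(r[attr]).lower() == answer.strip().lower():
            return r
    return None


contacts = [
    {"name": "John Smith", "department": "Sales", "city": "Austin"},
    {"name": "John Smith", "department": "Support", "city": "Austin"},
    {"name": "Mary Jones", "department": "Sales", "city": "Boston"},
]

# The lambda simulates the user's reply to the disambiguation prompt.
chosen = select_record(contacts, "john smith", ["city", "department"], lambda prompt: "Support")
```

Both John Smith records share a city, so the sketch skips `city` and asks about `department` instead.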

    Dynamically Publishing Directory Information For A Plurality Of Interactive Voice Response Systems
    7.
    Patent application
    Status: in force

    Publication number: US20090268883A1

    Publication date: 2009-10-29

    Application number: US12109214

    Filing date: 2008-04-24

    CPC classification number: H04M3/493

    Abstract: Methods, apparatus, and products are disclosed for dynamically publishing directory information for a plurality of interactive voice response (‘IVR’) systems through an IVR directory service that include: providing a description of a web services publication interface for the IVR directory service; receiving, on behalf of one or more IVR systems, web services publication requests through the publication interface; determining, in response to the web services publication requests, directory information for each IVR system requesting publication; adding the directory information for each IVR system to an IVR system directory; generating a voice mode user interface to reflect the directory information for each IVR system added to the IVR system directory; and interacting, using the voice mode user interface, with a caller to identify a particular IVR system in dependence upon the IVR system directory and query information provided by the caller and to connect the caller with the identified IVR system.
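
A toy in-memory model of the directory flow is sketched below. It leaves out the web services layer entirely; the `IvrDirectory` class, its method names, the keyword matching, and the phone numbers are hypothetical and serve only to show publication, menu generation, and caller lookup in one place.

```python
class IvrDirectory:
    """Hypothetical in-memory IVR system directory."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, number, keywords):
        """Handle a publication request made on behalf of one IVR system."""
        self._entries[name] = {
            "number": number,
            "keywords": {k.lower() for k in keywords},
        }

    def voice_menu(self):
        """Generate the voice prompt so that it reflects the current directory."""
        names = ", ".join(sorted(self._entries))
        return f"Which service would you like? You can say: {names}."

    def resolve(self, caller_query):
        """Return the number of the first IVR system matching the caller's query."""
        query = caller_query.lower()
        words = set(query.split())
        for name, entry in self._entries.items():
            if name.lower() in query or words & entry["keywords"]:
                return entry["number"]
        return None


directory = IvrDirectory()
directory.publish("Billing", "555-0100", ["invoice", "payment"])
directory.publish("Flight Status", "555-0101", ["flight", "arrival"])

menu = directory.voice_menu()
target = directory.resolve("I have a payment question")
```

Because the menu is rebuilt from the directory on each call, a newly published IVR system shows up in the prompt without any change to the voice interface code.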

    DYNAMICALLY TRANSLATING A SOFTWARE APPLICATION TO A USER SELECTED TARGET LANGUAGE THAT IS NOT NATIVELY PROVIDED BY THE SOFTWARE APPLICATION
    8.
    Patent application
    Status: under examination (published)

    Publication number: US20080077384A1

    Publication date: 2008-03-27

    Application number: US11534500

    Filing date: 2006-09-22

    CPC classification number: G06F17/289 G06F9/454

    Abstract: The present solution includes a method for dynamically translating application prompts to internationalize software applications for a non-native language that is not specifically supported by the application. In the solution, application prompts can be identified that are associated with a software application. Each application prompt can include text written in an original language. An attempt of the software application to render one of the application prompts can be intercepted and dynamically translated. The translated text can be substituted for the original text. The application prompt can then be rendered.
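
The intercept-translate-substitute-render sequence can be sketched as a wrapper around the application's render call. The function names are hypothetical, and a small lookup table stands in for whatever translation service an actual system would use.

```python
def make_translating_renderer(render, translate):
    """Wrap a render function so each prompt is translated before it is shown."""
    def intercepted(prompt_text):
        # Intercept the render attempt, substitute the translated text, then render.
        return render(translate(prompt_text))
    return intercepted


# Stand-in translation table (assumption); unknown prompts pass through unchanged.
SPANISH = {"Open file": "Abrir archivo", "Save": "Guardar"}


def translate_to_spanish(text):
    return SPANISH.get(text, text)


rendered = []


def render(text):
    """Stand-in for the application's original prompt renderer."""
    rendered.append(text)


render_prompt = make_translating_renderer(render, translate_to_spanish)
render_prompt("Open file")
render_prompt("Quit")
```

The application keeps calling what it believes is its normal renderer, which is why the target language does not need to be natively supported.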
