System and method for assessing TV-related information over the internet
    1.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US06901366B1

    Publication date: 2005-05-31

    Application number: US09383763

    Filing date: 1999-08-26

    CPC classification: G06F17/30663 G10L2015/228

    Abstract: The system retrieves information from the internet using multiple search engines that are simultaneously launched by the search engine commander. The commander is responsive to a speech-enabled system including a speech recognizer and natural language parser. The user speaks to the system in natural language requests, and the parser extracts the semantic content from the user's speech, based on a set of goal-oriented grammars. The preferred system includes a fixed grammar and an updatable or downloaded grammar, allowing the system to be used without extensive training and yet capable of being customized for a particular user's purposes. Results obtained from the search engines are filtered based on information extracted from an electronic program guide and from prestored user profile data. The results may be displayed on screen or through synthesized speech.

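The flow described in this abstract (launch several engines at once, merge, then filter against the program guide and the user profile) can be sketched in Python. This is a minimal illustration, not the patented implementation: `engine_a`, `engine_b`, `search_commander`, `filter_results`, the result fields, and the filtering rule are all hypothetical, and the speech recognizer and parser are left out.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real search engines: each takes a query
# string and returns a list of result dicts.
def engine_a(query):
    return [{"title": "Evening News", "genre": "news", "source": "A"},
            {"title": "Cooking Hour", "genre": "food", "source": "A"}]

def engine_b(query):
    return [{"title": "Late Movie", "genre": "film", "source": "B"},
            {"title": "Evening News", "genre": "news", "source": "B"}]

def search_commander(query, engines):
    """Launch every engine concurrently and merge their results in engine order."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    return [hit for hits in result_lists for hit in hits]

def filter_results(results, epg_titles, profile_genres):
    """Keep hits that appear in the program guide and match the user profile."""
    seen = set()
    kept = []
    for hit in results:
        if hit["title"] in epg_titles and hit["genre"] in profile_genres:
            if hit["title"] not in seen:  # drop duplicates across engines
                seen.add(hit["title"])
                kept.append(hit)
    return kept

epg_titles = {"Evening News", "Late Movie"}  # assumed guide contents
profile_genres = {"news"}                    # assumed user preference
hits = search_commander("news tonight", [engine_a, engine_b])
filtered = filter_results(hits, epg_titles, profile_genres)
print([hit["title"] for hit in filtered])
```

Threads are used here only because they are the simplest way to issue several blocking queries at once; the abstract does not say how the commander launches its engines.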

    Universal remote control allowing natural language modality for television and multimedia searches and requests
    2.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US06553345B1

    Publication date: 2003-04-22

    Application number: US09383762

    Filing date: 1999-08-26

    IPC classification: G10L15/22

    Abstract: The remote control unit supports multi-modal dialog with the user, through which the user can easily select programs for viewing or recording. The remote control houses a microphone into which the user can input natural language speech. The input speech is recognized and interpreted by a natural language parser that extracts the semantic content of the user's speech. The parser works in conjunction with an electronic program guide, through which the remote control system is able to ascertain what programs are available for viewing or recording and supply appropriate prompts to the user. In one embodiment, the remote control includes a touch screen display upon which the user may view prompts or make selections by pen input or tapping. Selections made on the touch screen automatically limit the context of the ongoing dialog between user and remote control, allowing the user to interact naturally with the unit. The remote control unit can control virtually any audio-video component, including those designed before the current technology. The remote control system can be packaged entirely within the remote control handheld unit, or components may be distributed in other systems attached to the user's multimedia equipment.

    Automatic filtering of TV contents using speech recognition and natural language
    3.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US06330537B1

    Publication date: 2001-12-11

    Application number: US09383758

    Filing date: 1999-08-26

    IPC classification: G10L15/18

    Abstract: Speech recognition and natural language parsing components are used to extract the meaning of the user's spoken input. The system stores a semantic representation of an electronic program guide, and the contents of the program guide can be mapped into the grammars used by the natural language parser. Thus, when the user wishes to navigate through the complex menu structure of the electronic program guide, he or she only needs to speak in natural language sentences. The system automatically filters the contents of the program guide and supplies the user with on-screen display or synthesized speech responses to the user's request.

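One way to picture "mapping the guide contents into the grammar" is to build the parser's vocabulary directly from the guide entries and then filter the guide with whatever constraints the parser finds. The Python sketch below assumes the speech has already been recognized as text; the guide fields, `build_vocabulary`, `parse_request`, and `filter_guide` are illustrative names, and simple phrase spotting stands in for a real natural language parser.

```python
# Toy program guide; the field names are illustrative, not from the patent.
EPG = [
    {"title": "Evening News", "genre": "news", "channel": 4, "start": "18:00"},
    {"title": "Late Movie", "genre": "movie", "channel": 7, "start": "22:00"},
    {"title": "Morning News", "genre": "news", "channel": 4, "start": "07:00"},
    {"title": "Space Film", "genre": "movie", "channel": 9, "start": "20:00"},
]

def build_vocabulary(epg, fields=("genre", "title")):
    """Map each guide value (lowercased) to the field and value it came from."""
    vocab = {}
    for entry in epg:
        for field in fields:
            vocab[str(entry[field]).lower()] = (field, entry[field])
    return vocab

def parse_request(text, vocab):
    """Toy parser: collect a constraint for every guide phrase found in the text."""
    text = text.lower()
    frame = {}
    for phrase, (field, value) in vocab.items():
        if phrase in text:
            frame[field] = value
    return frame

def filter_guide(epg, frame):
    """Return the guide entries that satisfy every constraint in the frame."""
    return [e for e in epg if all(e[f] == v for f, v in frame.items())]

vocab = build_vocabulary(EPG)
frame = parse_request("Show me a movie tonight", vocab)
matches = filter_guide(EPG, frame)
print(frame, [e["title"] for e in matches])
```

Because the vocabulary is derived from the guide, a new guide download changes what the toy parser can recognize without any change to the code.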

    System for identifying and adapting a TV-user profile by means of speech technology
    4.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US06415257B1

    Publication date: 2002-07-02

    Application number: US09383797

    Filing date: 1999-08-26

    IPC classification: G10L15/22

    Abstract: Speech input supplied by the user is evaluated by the speaker verification/identification module, and based on the evaluation, parameters are retrieved from a user profile database. These parameters adapt the speech models of the speech recognizer and also supply the natural language parser with customized dialog grammars. The user's speech is then interpreted by the speech recognizer and natural language parser to determine the meaning of the user's spoken input in order to control the television tuner. The parser works in conjunction with a command module that mediates the dialog with the user, providing on-screen prompts or synthesized speech queries to elicit further input from the user when needed. The system integrates with an electronic program guide, so that the natural language parser is made aware of what programs are available when conducting the synthetic dialog with the user.

    Method for goal-oriented speech translation in hand-held devices using meaning extraction and dialogue
    6.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US06233561B1

    Publication date: 2001-05-15

    Application number: US09290628

    Filing date: 1999-04-12

    IPC classification: G10L15/22

    CPC classification: G10L15/1822 G10L15/1815

    Abstract: A computer-implemented method and apparatus are provided for processing a spoken request from a user. A speech recognizer converts the spoken request into a digital format. A frame data structure associates semantic components of the digitized spoken request with predetermined slots. The slots are indicative of data which are used to achieve a predetermined goal. A speech understanding module which is connected to the speech recognizer and to the frame data structure determines semantic components of the spoken request. The slots are populated based upon the determined semantic components. A dialog manager which is connected to the speech understanding module may determine at least one slot which is unpopulated based upon the determined semantic components and in a preferred embodiment may provide confirmation of the populated slots. A computer-generated request is formulated in order for the user to provide data related to the unpopulated slot. The method and apparatus are well-suited (but not limited) to use in a hand-held speech translation device.

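The frame-and-slot mechanism in this abstract lends itself to a short Python sketch. The `Frame` class, the `next_prompt` function, the goal name, and the slot names are all hypothetical; the semantic components are passed in as a plain dict, standing in for the output of a speech understanding module.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A goal with named slots; None marks a slot that is not yet populated."""
    goal: str
    slots: dict = field(default_factory=dict)

    def populate(self, components):
        """Fill known slots from the semantic components; ignore unknown names."""
        for name, value in components.items():
            if name in self.slots:
                self.slots[name] = value

    def unpopulated(self):
        return [name for name, value in self.slots.items() if value is None]

def next_prompt(frame):
    """Dialog-manager step: ask for the first missing slot, or confirm."""
    missing = frame.unpopulated()
    if missing:
        return f"Please tell me the {missing[0]}."
    filled = ", ".join(f"{name}={value}" for name, value in frame.slots.items())
    return f"Confirming {frame.goal}: {filled}."

# Hypothetical goal and slot names.
frame = Frame("book_hotel", {"city": None, "arrival date": None, "room type": None})
frame.populate({"city": "Paris", "room type": "single"})
first = next_prompt(frame)
frame.populate({"arrival date": "May 3"})
second = next_prompt(frame)
print(first)
print(second)
```

The first call produces a request for the missing slot, and the second call, once every slot is filled, produces the confirmation that the abstract mentions as a preferred embodiment.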

    Eigenvoice re-estimation technique of acoustic models for speech recognition, speaker identification and speaker verification
    7.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US06895376B2

    Publication date: 2005-05-17

    Application number: US09849174

    Filing date: 2001-05-04

    IPC classification: G10L15/06 G10L17/00

    CPC classification: G10L15/07 G10L17/02

    Abstract: A reduced-dimensionality eigenvoice analytical technique is used during training to develop context-dependent acoustic models for allophones. Re-estimation processes are performed to more strongly separate speaker-dependent and speaker-independent components of the speech model. The eigenvoice technique is also used during run time upon the speech of a new speaker. The technique removes individual speaker idiosyncrasies, to produce more universally applicable and robust allophone models. In one embodiment the eigenvoice technique is used to identify the centroid of each speaker, which may then be "subtracted out" of the recognition equation.

    Adaptation system and method for E-commerce and V-commerce applications
    8.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US06341264B1

    Publication date: 2002-01-22

    Application number: US09258113

    Filing date: 1999-02-25

    IPC classification: G10L15/28

    Abstract: Electronic commerce (E-commerce) and Voice commerce (V-commerce) proceed by having the user speak into the system. The user's speech is converted by a speech recognizer into a form required by the transaction processor that effects the electronic commerce operation. A dimensionality reduction processor converts the user's input speech into a reduced-dimensionality set of values termed eigenvoice parameters. These parameters are compared with a set of previously stored eigenvoice parameters representing a speaker population (the eigenspace representing speaker space) and the comparison is used by the speech model adaptation system to rapidly adapt the speech recognizer to the user's speech characteristics. The user's eigenvoice parameters are also stored for subsequent use by the speaker verification and speaker identification modules.

    Maximum likelihood method for finding an adapted speaker model in eigenvoice space
    9.
    Type: Granted invention patent
    Legal status: Lapsed

    Publication number: US06263309B1

    Publication date: 2001-07-17

    Application number: US09070054

    Filing date: 1998-04-30

    IPC classification: G10L15/08

    CPC classification: G10L15/07

    Abstract: A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principal component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.

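The idea of constraining a new speaker's supervector to lie in the eigenvoice space can be illustrated with a small Python sketch. Two simplifications are assumed and should be read as such: the eigenvoices are taken as given and orthonormal (the principal component analysis step is omitted), and a plain orthogonal projection replaces the maximum likelihood estimation that the abstract describes, so this shows the constraint but not the patent's estimator. The function names and the toy numbers are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def adapt_supervector(observed, mean, eigenvoices):
    """Project (observed - mean) onto orthonormal eigenvoices and rebuild.

    Returns the coefficients in eigenvoice space and the adapted supervector
    mean + sum_j weights[j] * eigenvoices[j].
    """
    centered = [o - m for o, m in zip(observed, mean)]
    weights = [dot(centered, e) for e in eigenvoices]
    adapted = list(mean)
    for w, e in zip(weights, eigenvoices):
        adapted = [a + w * x for a, x in zip(adapted, e)]
    return weights, adapted

# Toy 4-dimensional supervectors with a 2-dimensional eigenvoice space.
mean = [1.0, 1.0, 1.0, 1.0]
eigenvoices = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0]]
observed = [3.0, 0.0, 5.0, 1.0]  # rough estimate from adaptation data

weights, adapted = adapt_supervector(observed, mean, eigenvoices)
print(weights, adapted)
```

The third component of `observed` lies outside the eigenvoice space, so the adapted supervector falls back to the mean there; only the two coefficients in `weights` are specific to the new speaker.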

    Method for generating spelling-to-pronunciation decision tree
    10.
    Type: Granted invention patent
    Legal status: Lapsed

    Publication number: US06230131B1

    Publication date: 2001-05-08

    Application number: US09069308

    Filing date: 1998-04-29

    IPC classification: G10L13/08

    CPC classification: G10L13/08

    Abstract: Decision trees are used to store a series of yes-no questions that can be used to convert spelled-word letter sequences into pronunciations. Letter-only trees, having internal nodes populated with questions about letters in the input sequence, generate one or more pronunciations based on probability data stored in the leaf nodes of the tree. The pronunciations may then be improved by processing them using mixed trees which are populated with questions about letters in the sequence and also questions about phonemes associated with those letters. The mixed tree screens out pronunciations that would not occur in natural speech, thereby greatly improving the results of the letter-to-pronunciation transformation.

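A letter-only tree of the kind this abstract describes can be sketched in Python as nested nodes, each holding a yes-no question about the letter context, with phoneme probabilities at the leaves. The tree below is invented for illustration (one question for the letter "c", made-up probabilities, and "s" and "k" as informal phoneme labels); the mixed-tree rescoring stage is not shown.

```python
def make_leaf(probs):
    return {"leaf": probs}

def make_node(question, yes, no):
    return {"question": question, "yes": yes, "no": no}

# Hypothetical tree for the letter "c": soft before e, i, or y; hard otherwise.
C_TREE = make_node(
    lambda word, i: i + 1 < len(word) and word[i + 1] in "eiy",
    make_leaf({"s": 0.9, "k": 0.1}),
    make_leaf({"k": 0.95, "s": 0.05}),
)

def classify(tree, word, i):
    """Walk from the root to a leaf, answering each question for letter i."""
    node = tree
    while "leaf" not in node:
        node = node["yes"] if node["question"](word, i) else node["no"]
    return node["leaf"]

def best_phoneme(tree, word, i):
    """Return the most probable phoneme stored in the leaf that was reached."""
    probs = classify(tree, word, i)
    return max(probs, key=probs.get)

print(best_phoneme(C_TREE, "cent", 0))
print(best_phoneme(C_TREE, "cat", 0))
```

Keeping the full probability table at each leaf, rather than a single answer, is what lets a later stage rank several candidate pronunciations instead of committing to one.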