Voice quality control for high quality speech reconstruction
    1.
    Patent application
    Status: under examination (published)

    Publication No.: US20070129945A1

    Publication date: 2007-06-07

    Application No.: US11294959

    Filing date: 2005-12-06

    IPC classification: G10L15/04

    CPC classification: G10L25/69 G10L15/26

    Abstract: A method and apparatus are provided for reproducing a speech sequence of a user through a communication device of the user. The method includes the steps of detecting a speech sequence from the user through the communication device, recognizing a phoneme sequence within the detected speech sequence, and forming a confidence level of each phoneme within the recognized phoneme sequence. The method further includes the steps of audibly reproducing the recognized phoneme sequence for the user through the communication device and gradually highlighting or degrading a voice quality of at least some phonemes of the recognized phoneme sequence based upon the formed confidence level of the at least some phonemes.
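
The confidence-driven playback described in this abstract can be illustrated with a minimal sketch. It assumes a linear mapping from recognizer confidence to a playback quality factor; the abstract does not specify the mapping, and the names `Phoneme`, `quality_scale`, `plan_playback` and the `floor` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Phoneme:
    symbol: str
    confidence: float  # recognizer confidence, expected in [0, 1]


def quality_scale(confidence: float, floor: float = 0.25) -> float:
    """Map a confidence in [0, 1] linearly onto a quality factor in [floor, 1].

    Assumption: a linear ramp; the patent abstract only says quality is
    gradually highlighted or degraded according to confidence.
    """
    clamped = min(max(confidence, 0.0), 1.0)
    return floor + (1.0 - floor) * clamped


def plan_playback(phonemes, floor: float = 0.25):
    """Return (symbol, quality factor) pairs for audible reproduction."""
    return [(p.symbol, quality_scale(p.confidence, floor)) for p in phonemes]


if __name__ == "__main__":
    sequence = [Phoneme("k", 1.0), Phoneme("ae", 0.5), Phoneme("t", 0.0)]
    for symbol, quality in plan_playback(sequence):
        print(symbol, quality)
```

In this sketch a low-confidence phoneme is played back with reduced quality, so the user can hear which parts of the utterance the recognizer was unsure about.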

    High quality speech reconstruction for a dialog method and system
    2.
    Patent application
    Status: under examination (published)

    Publication No.: US20070129946A1

    Publication date: 2007-06-07

    Application No.: US11294964

    Filing date: 2005-12-06

    IPC classification: G10L15/14

    Abstract: An electronic device (400) for speech dialog includes functions that receive (405, 205) a speech phrase that includes an instantiated variable (315), generate pitch and voicing characteristics (330) of the instantiated variable, and perform voice recognition (410, 220) of the instantiated variable to determine a most likely set of recognition acoustic states (335). A trained map (358) is established (115) that maps recognition feature vectors derived from training speech (105) to synthesis feature vectors derived from the same training speech (110). Recognition feature vectors that represent the most likely set of recognition acoustic states for the recognized instantiated variable are converted to a most likely set of synthesis acoustic states (420) in accordance with the map. The electronic device may generate (421, 440, 445) a synthesized value of the instantiated variable using the most likely set of synthesis acoustic states and the pitch and voicing characteristics extracted from the instantiated variable.

    Speech dialog method and system
    3.
    Patent application
    Status: in force

    Publication No.: US20060247921A1

    Publication date: 2006-11-02

    Application No.: US11118670

    Filing date: 2005-04-29

    IPC classification: G10L11/04

    Abstract: An electronic device (300) for speech dialog includes functions that receive (305, 105) a speech phrase that comprises a request phrase that includes an instantiated variable (215), generate (335, 115) pitch and voicing characteristics (315) of the instantiated variable, and perform voice recognition (319, 125) of the instantiated variable to determine a most likely set of acoustic states (235). The electronic device may generate (335, 140) a synthesized value of the instantiated variable using the most likely set of acoustic states and the pitch and voicing characteristics of the instantiated variable. To disambiguate (425, 430) a newly received instantiated variable, the electronic device may use a table of previously entered values of variables that have been determined to be unique, in which each value is associated with a most likely set of acoustic states and the pitch and voicing characteristics determined at the receipt of that value.

    Tailored speaker-independent voice recognition system
    4.
    Patent application
    Status: in force

    Publication No.: US20060085186A1

    Publication date: 2006-04-20

    Application No.: US10967957

    Filing date: 2004-10-19

    Applicant: Changxue Ma, Yan Cheng

    Inventor: Changxue Ma, Yan Cheng

    IPC classification: G10L15/08

    CPC classification: G10L15/063 G10L2015/0631

    Abstract: A tailored speaker-independent voice recognition system has a speech recognition dictionary (360) with at least one word (371). That word (371) has at least two transcriptions (373), each transcription (373) having a probability factor (375) and an indicator (377) of whether the transcription is active. When a speech utterance is received (510), the voice recognition system determines (520, 530) the word signified by the speech utterance, evaluates (540) the speech utterance against the transcriptions of the correct word, updates (550) the probability factors for each transcription, and inactivates (570) any transcription that has an updated probability factor that is less than a threshold.
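
The update-and-inactivate cycle in this abstract can be sketched as follows. The update rule here is an assumption (an exponential moving average over normalized match scores); the abstract does not say how the probability factors are recomputed. The names `Transcription`, `update_word`, `rate` and `threshold` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transcription:
    phonemes: str
    probability: float   # probability factor for this transcription
    active: bool = True  # indicator of whether the transcription is active


def update_word(transcriptions, scores, rate=0.5, threshold=0.1):
    """Update probability factors from per-transcription match scores.

    Assumption: each new factor is a moving average of the old factor and
    the normalized score of the utterance against that transcription.
    Any transcription whose updated factor falls below `threshold` is
    marked inactive.
    """
    if len(transcriptions) != len(scores):
        raise ValueError("one score is required per transcription")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    for transcription, score in zip(transcriptions, scores):
        share = score / total
        transcription.probability = (1 - rate) * transcription.probability + rate * share
        if transcription.probability < threshold:
            transcription.active = False
    return transcriptions


if __name__ == "__main__":
    word = [Transcription("t ah m ey t ow", 0.5), Transcription("t ah m aa t ow", 0.5)]
    for _ in range(3):  # the user keeps pronouncing the word the first way
        update_word(word, scores=[1.0, 0.0])
    for t in word:
        print(t.phonemes, t.probability, t.active)
```

With these settings the unused pronunciation drops from 0.5 to 0.25, 0.125, and then 0.0625, at which point it falls below the threshold and is inactivated.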

    Method and apparatus for generating a voice tag
    5.
    Patent application
    Status: under examination (published)

    Publication No.: US20060287867A1

    Publication date: 2006-12-21

    Application No.: US11155944

    Filing date: 2005-06-17

    Applicant: Yan Cheng, Changxue Ma

    Inventor: Yan Cheng, Changxue Ma

    IPC classification: G10L21/00

    Abstract: A method and apparatus for generating a voice tag (140) includes a means (110) for combining (205) a plurality of utterances (106, 107, 108) into a combined utterance (111) and a means (120) for extraction (210) of the voice tag as a sequence of phonemes having a high likelihood of representing the combined utterance, using a set of stored phonemes (115) and the combined utterance.

    Method and system for interpreting verbal inputs in multimodal dialog system
    6.
    Patent application
    Status: in force

    Publication No.: US20060229862A1

    Publication date: 2006-10-12

    Application No.: US11100185

    Filing date: 2005-04-06

    IPC classification: G06F17/28

    Abstract: A method, a system and a computer program product for interpreting a verbal input in a multimodal dialog system are provided. The method includes assigning (302) a confidence value to at least one word generated by a verbal recognition component. The method further includes generating (304) a semantic unit confidence score for the verbal input. The generation of a semantic unit confidence score is based on the confidence value of at least one word and at least one semantic confidence operator.
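
One way to read the combination of word confidence values with a semantic confidence operator is sketched below. The operators offered here (minimum, mean, product) are illustrative assumptions; the abstract does not enumerate them, and `OPERATORS` and `semantic_unit_confidence` are hypothetical names.

```python
import math
from statistics import mean

# Assumed operator set; the abstract does not list the actual operators.
OPERATORS = {
    "min": min,            # the unit is only as reliable as its weakest word
    "mean": mean,          # average reliability of the words
    "product": math.prod,  # treats word confidences as independent
}


def semantic_unit_confidence(word_confidences, operator="min"):
    """Combine per-word confidence values into one semantic unit score."""
    if not word_confidences:
        raise ValueError("at least one word confidence is required")
    return OPERATORS[operator](word_confidences)


if __name__ == "__main__":
    confidences = [0.5, 1.0]  # e.g. two words forming one semantic unit
    for name in OPERATORS:
        print(name, semantic_unit_confidence(confidences, name))
```

Which operator suits a given semantic unit is a design choice: a minimum is conservative, while a mean tolerates one poorly recognized word.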

    Content item retrieval based on a free text entry
    7.
    Granted patent
    Status: lapsed

    Publication No.: US08041700B2

    Publication date: 2011-10-18

    Application No.: US12419341

    Filing date: 2009-04-07

    Applicant: Changxue Ma

    Inventor: Changxue Ma

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30675

    Abstract: A method and apparatus for textual searching of a database is provided herein. During operation, a user will input a letter into a search engine. The search engine will score words based on the letter and display results of the highest-scored words. Another letter will again be received and the process repeated. In situations where titles are returned to the user, additional steps of associating the words with a title and scoring the title take place. The highest-scored titles are provided to the user as the displayed results.
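
The letter-by-letter loop in this abstract can be sketched as below. The scoring function is an assumption (prefix match, scored by the fraction of the word already typed), and a title is scored by its best-matching word; the abstract does not define either rule. `score_word` and `rank_titles` are hypothetical names.

```python
def score_word(word: str, typed: str) -> float:
    """Assumed score: share of the word covered by the typed prefix, else 0."""
    word, typed = word.lower(), typed.lower()
    if not typed or not word.startswith(typed):
        return 0.0
    return len(typed) / len(word)


def rank_titles(titles, typed: str, limit: int = 3):
    """Associate words with their titles, score each title, return the best."""
    scored = []
    for title in titles:
        best = max((score_word(w, typed) for w in title.split()), default=0.0)
        if best > 0:
            scored.append((best, title))
    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


if __name__ == "__main__":
    catalog = ["Blue Moon", "Moonlight Sonata", "Blackbird"]
    typed = ""
    for letter in "moonl":  # each new letter triggers a fresh ranking
        typed += letter
        print(typed, rank_titles(catalog, typed))
```

Each additional letter re-scores the catalog, so the displayed list narrows as the user types.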

    METHOD AND APPARATUS FOR ORDERING RESULTS OF A QUERY
    8.
    Patent application
    Status: under examination (published)

    Publication No.: US20110071826A1

    Publication date: 2011-03-24

    Application No.: US12564968

    Filing date: 2009-09-23

    IPC classification: G10L15/26 G06F17/30

    CPC classification: G10L15/083 G06F16/3343

    Abstract: A method and apparatus for ordering results from a query is provided herein. During operation, a spoken query is received and converted to a textual representation, such as a word lattice. Search strings are then created from the word lattice. For example, a set of search strings may be created from the n-grams, such as unigrams and bigrams, of the word lattice. The search strings may be ordered and truncated based on confidence values assigned to the n-grams by the speech recognition system. The set of search strings is sent to at least one search engine, and search results are obtained. The search results are then re-arranged or reordered based on a semantic similarity between the search results and the word lattice.
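
The step of building, ordering, and truncating search strings can be sketched as follows. For brevity the word lattice is reduced to a single path of (word, confidence) pairs, and an n-gram's confidence is assumed to be the product of its word confidences; both simplifications are assumptions, not the patent's method. `search_strings` and `max_strings` are hypothetical names.

```python
import math


def search_strings(path, max_strings=3):
    """Build unigram and bigram search strings from one recognized path.

    `path` is a list of (word, confidence) pairs standing in for a word
    lattice. Strings are ordered by confidence (highest first) and the
    list is truncated to `max_strings` entries.
    """
    grams = []
    for n in (1, 2):
        for start in range(len(path) - n + 1):
            window = path[start:start + n]
            text = " ".join(word for word, _ in window)
            confidence = math.prod(conf for _, conf in window)
            grams.append((confidence, text))
    # Ties on confidence are broken alphabetically for stable output.
    grams.sort(key=lambda gram: (-gram[0], gram[1]))
    return [text for _, text in grams[:max_strings]]


if __name__ == "__main__":
    recognized = [("jazz", 0.5), ("piano", 1.0), ("music", 0.25)]
    print(search_strings(recognized))
```

The resulting strings would then be sent to a search engine; the semantic reordering of the returned results is outside the scope of this sketch.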

    Method and Apparatus for Voice Searching for Stored Content Using Uniterm Discovery
    9.
    Patent application
    Status: in force

    Publication No.: US20090210226A1

    Publication date: 2009-08-20

    Application No.: US12032258

    Filing date: 2008-02-15

    Applicant: Changxue Ma

    Inventor: Changxue Ma

    IPC classification: G10L15/08

    Abstract: A method, system and communication device for enabling voice-to-voice searching and ordered content retrieval via audio tags assigned to individual content, which tags generate uniterms that are matched against components of a voice query. The method includes storing content and tagging at least one item of the content with an audio tag. The method further includes receiving a voice query to retrieve content stored on the device. When the voice query is received, the method completes a voice-to-voice search utilizing uniterms of the audio tag, scored against the phoneme latent lattice model generated by the voice query to identify matching terms within the audio tags and corresponding stored content. The retrieved content associated with the identified audio tags having uniterms that score within the phoneme lattice model is output in an order corresponding to an order in which the uniterms are structured within the voice query.
