Abstract:
In accordance with the present invention, a speech recognition method is disclosed (10). It uses a microphone to receive audible sounds input by a user into a first computing device (28) having a program with a database (16) comprising (i) digital representations of known audible sounds and associated alphanumeric representations of those sounds and, for the first time, (ii) digital representations of known audible sounds corresponding to mispronunciations resulting from known classes of mispronounced words and phrases. The method is performed by receiving the audible sounds in the form of the electrical output of the microphone (28). A particular audible sound to be recognized is converted into a digital representation of the audible sound (30). The digital representation of the particular audible sound is then compared to the digital representations of the known audible sounds in the database to determine which of those known audible sounds is most likely to be the particular audible sound (30).
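The matching step described above can be sketched as a nearest-match lookup against a database that also contains known mispronunciation variants. The feature vectors and the Euclidean distance are illustrative assumptions, not details from the abstract.

```python
# Hypothetical sketch of the comparison step (30): the input's digital
# representation is compared against stored representations, including
# known mispronunciation variants, and the closest match is returned.

def recognize(input_features, database):
    """Return the alphanumeric label of the closest known sound."""
    best_label, best_distance = None, float("inf")
    for entry in database:
        # Euclidean distance as a stand-in for a real acoustic comparison.
        distance = sum((a - b) ** 2 for a, b in
                       zip(input_features, entry["features"])) ** 0.5
        if distance < best_distance:
            best_label, best_distance = entry["label"], distance
    return best_label

database = [
    {"label": "nuclear", "features": [0.9, 0.1, 0.4]},  # standard form
    {"label": "nuclear", "features": [0.7, 0.5, 0.2]},  # known mispronunciation
    {"label": "library", "features": [0.1, 0.8, 0.9]},
]
print(recognize([0.72, 0.48, 0.25], database))  # matches a "nuclear" entry
```

Because the mispronounced variant carries the same label as the standard form, a user who mispronounces the word still gets the correct alphanumeric output.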
Abstract:
A preferred embodiment of a method for converting text to speech using a computing device having a memory is disclosed. The inventive method comprises examining a text to be spoken to an audience for a specific communications purpose, followed by marking up the text according to a phonetic markup system, such as the Lessac System pronunciation rules notations. A set of rules is defined to control a text-to-speech generator based on speech principles, such as Lessac principles. Such rules are of the type normally implemented on prior art text-to-speech engines, and control the operation of the software and the characteristics of the speech generated by a computer using the software. A computer is used to speak the marked-up text expressively. The step of using a computer to speak the marked-up text expressively is repeated using alternative pronunciations of the selected style of expression, in which each of the tonal, structural, and consonant energies has a different balance in the speech. The spoken speech generated by the computer is played to trained speech practitioners, who listen to it and evaluate it for consistency with style criteria and/or expressiveness. An audience is then assembled and the spoken speech generated by the computer is played back to the audience. Audience comprehension of the spoken speech generated by the computer is evaluated and correlated to a particular implemented rule or rules, and those rules which resulted in relatively high audience comprehension are selected.
Abstract:
In accordance with the present invention, speech recognition (10) and training (110) methods and systems are disclosed. A microphone receives audible sounds input (28) from a user into a first computing device having a program with a database (16). The database consists of digital representations of known audible sounds, associated alphanumeric representations of those sounds, and mispronunciations. The program compares the digital representation of an input sound to the digital representations of known audible sounds in the database (30) to determine the likely desired output. If an error in recognition occurs (32), the user can indicate the proper alphanumeric representation of the particular audible sound (34). This allows the system to determine whether the error is the result of a known type or instance of mispronunciation (36). In response to a determination of the error's nature, the system presents an interactive training program from the computer to the user to enable the user to correct such mispronunciation (45). The present invention has the advantage of improving both voice recognition and the speech patterns of the user by focusing on the user during error correction, thus improving the user's oral communication skills.
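The error-correction loop can be sketched as follows: when recognition fails, the user supplies the intended word, and the system checks whether the error matches a known class of mispronunciation before offering training. The mispronunciation table and helper names are illustrative assumptions.

```python
# Hypothetical sketch of steps (32)-(45): classify a recognition error and,
# if it matches a known mispronunciation, offer interactive training.

KNOWN_MISPRONUNCIATIONS = {
    # intended word -> set of commonly recognized (wrong) outputs
    "ask": {"aks"},
    "espresso": {"expresso"},
}

def is_known_mispronunciation(recognized, intended):
    """Step (36): does the error match a known mispronunciation class?"""
    return recognized in KNOWN_MISPRONUNCIATIONS.get(intended, set())

def handle_recognition_error(recognized, intended):
    if is_known_mispronunciation(recognized, intended):
        # Step (45): present an interactive training program.
        return f"training: practice pronouncing '{intended}'"
    return "no known mispronunciation; unrecognized error type"

print(handle_recognition_error("aks", "ask"))
```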
Abstract:
A dictation command voice multitasking interface is illustrated by the GUI computer training template to be displayed and implemented by the creation of a question and multiple answer database where, for example, a first box (10) labeled "print question" receives text for question A. The "RecQ" box (11) is selected by a mouse, whereupon the trainer records the voice equivalent of the question. The system is thereby made responsive to recognized spoken words, such as for the alternative questions illustrated by box (10) and box (18) and corresponding stored voice equivalents illustrated by box (11) and box (19). A voice equivalent of the printed answer in box (12) is stored as "RecA" in box (13). Corresponding descriptive text is stored in box (14). The process is interactive in storing the voice equivalents, as shown by decision box (34), which queries the trainer for more questions to be stored in the database, wherein the computer interrupt handler (25) waits for further input from voice (24). A practical application of the system would enable a doctor, with hands and eyes occupied in performing a clinical procedure, to input voiced queries to the computer in order to create a report during the clinical procedure.
Abstract:
A speech reference enrollment method involves the following steps: (a) requesting a user speak a vocabulary word; (b) detecting a first utterance (354); (c) requesting the user speak the vocabulary word; (d) detecting a second utterance (358); (e) determining a first similarity between the first utterance and the second utterance (362); (f) when the first similarity is less than a predetermined similarity, requesting the user speak the vocabulary word; (g) detecting a third utterance (366); (h) determining a second similarity between the first utterance and the third utterance (370); and (i) when the second similarity is greater than or equal to the predetermined similarity, creating a reference (364).
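Steps (a) through (i) above can be sketched directly as a control flow. The `similarity()` measure, the threshold value, and the averaging used to build the reference are illustrative assumptions; the abstract specifies only that a similarity is computed and compared against a predetermined value.

```python
# A minimal sketch of the enrollment flow in steps (a)-(i), assuming an
# illustrative similarity measure and a capture() callback that prompts
# the user and returns one utterance as a feature list.

def similarity(u1, u2):
    # Placeholder metric: fraction of positions that nearly match.
    matches = sum(1 for a, b in zip(u1, u2) if abs(a - b) < 0.1)
    return matches / max(len(u1), len(u2))

def make_reference(u1, u2):
    # Average the two consistent utterances into a reference template.
    return [(a + b) / 2 for a, b in zip(u1, u2)]

def enroll(capture, word, threshold=0.8):
    first = capture(word)                          # steps (a)-(b)
    second = capture(word)                         # steps (c)-(d)
    if similarity(first, second) >= threshold:     # step (e)
        return make_reference(first, second)       # reference created (364)
    third = capture(word)                          # steps (f)-(g)
    if similarity(first, third) >= threshold:      # steps (h)-(i)
        return make_reference(first, third)
    return None  # no consistent pair after three utterances
```

Note the asymmetry the abstract describes: the third utterance is compared against the *first*, not the second, so one bad utterance in the middle does not force a restart.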
Abstract:
A request to execute an interaction site associated with a custom grammars file is received from a user device and by a communications system. An interaction flow document to execute the interaction site is accessed by the communications system. The custom grammars file is accessed by the communications system, the custom grammars file being configured to enable the communications system to identify executable commands corresponding to utterances spoken by users of user devices. An utterance spoken by a user of the user device is received from the user device and by the communications system. The utterance is stored by the communications system. The custom grammars file is updated by a grammar generation system to include a representation of the stored utterance for processing utterances in subsequent communications with users.
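The flow above can be sketched as a communications system that stores each utterance and a grammar-generation step that folds stored utterances back into the custom grammars file for later requests. The data structures and class names are illustrative assumptions.

```python
# Hypothetical sketch: utterances that fail to resolve are stored, and a
# grammar-generation step later adds representations of them to the
# custom grammars so subsequent communications can resolve them.

class CommunicationsSystem:
    def __init__(self, grammars):
        self.grammars = grammars          # phrase -> executable command
        self.stored_utterances = []

    def handle_utterance(self, utterance):
        self.stored_utterances.append(utterance)  # store for later learning
        return self.grammars.get(utterance)       # known command, or None

    def update_grammars(self, utterance, command):
        # Grammar generation: include a representation of a stored
        # utterance for processing in subsequent communications.
        if utterance in self.stored_utterances:
            self.grammars[utterance] = command

system = CommunicationsSystem({"check balance": "CMD_BALANCE"})
print(system.handle_utterance("show my balance"))  # None: not yet in grammar
system.update_grammars("show my balance", "CMD_BALANCE")
print(system.handle_utterance("show my balance"))  # now resolves
```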
Abstract:
The method is performed at an electronic device with one or more processors and memory storing one or more programs for execution by the one or more processors. A first speech input including at least one word is received. A first phonetic representation of the at least one word is determined, the first phonetic representation comprising a first set of phonemes selected from a speech recognition phonetic alphabet. The first set of phonemes is mapped to a second set of phonemes to generate a second phonetic representation, where the second set of phonemes is selected from a speech synthesis phonetic alphabet. The second phonetic representation is stored in association with a text string corresponding to the at least one word.
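The mapping step can be sketched as a table lookup from a recognition phonetic alphabet to a synthesis phonetic alphabet. The ARPAbet-like and IPA-like symbols in the table are illustrative assumptions; the abstract does not name the two alphabets.

```python
# Hypothetical sketch: map a first phonetic representation (recognition
# alphabet) to a second one (synthesis alphabet) and store the result in
# association with the corresponding text string.

RECOGNITION_TO_SYNTHESIS = {
    "P": "p", "IY": "i", "AH": "ʌ", "L": "l",
}

def map_phonemes(recognition_phonemes):
    """Map a recognition-alphabet transcription to the synthesis alphabet."""
    return [RECOGNITION_TO_SYNTHESIS[p] for p in recognition_phonemes]

pronunciations = {}  # text string -> stored synthesis-alphabet representation

first = ["P", "IY"]              # first phonetic representation of "pea"
second = map_phonemes(first)     # second phonetic representation
pronunciations["pea"] = second   # stored in association with the text string
print(second)  # ['p', 'i']
```

Storing the synthesis-alphabet form means a later text-to-speech pass can pronounce the word exactly as the user said it, without re-deriving the pronunciation.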
Abstract:
A system and method includes a language processing module converting an electrical signal corresponding to an audible signal into a textual signal. The system further includes a command generation module converting the textual signal into a user receiving device control signal. A controller controls a function of a user receiving device in response to the user receiving device control signal.
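The three modules form a simple pipeline: audio to text, text to control signal, control signal to device action. The stand-in implementations below are illustrative assumptions; only the module boundaries come from the abstract.

```python
# Hypothetical sketch of the three-module pipeline: language processing,
# command generation, and a controller acting on the user receiving device.

def language_processing(audio_signal):
    # Stand-in for speech-to-text conversion of the electrical signal.
    return audio_signal["transcript"]

def command_generation(text):
    # Convert the textual signal into a device control signal.
    commands = {"volume up": "VOL_UP", "channel up": "CH_UP"}
    return commands.get(text.lower())

def controller(device_state, control_signal):
    # Control a function of the receiving device per the control signal.
    if control_signal == "VOL_UP":
        device_state["volume"] += 1
    elif control_signal == "CH_UP":
        device_state["channel"] += 1
    return device_state

state = {"volume": 5, "channel": 2}
signal = command_generation(language_processing({"transcript": "volume up"}))
print(controller(state, signal))  # {'volume': 6, 'channel': 2}
```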
Abstract:
An approach to improving the performance of a wordspotting system includes providing an interface for interactive improvement of a phonetic representation of a query based on an operator identifying true detections and false alarms in a data set.