SPEECH RECOGNITION METHOD
    1.
    Invention application
    SPEECH RECOGNITION METHOD (under examination, published)

    Publication No.: WO2004061822A1

    Publication date: 2004-07-22

    Application No.: PCT/US2003/041697

    Filing date: 2003-12-31

    Abstract: In accordance with the present invention, a speech recognition method (10) is disclosed. It uses a microphone to receive audible sounds input by a user into a first computing device (28) having a program with a database (16) comprising (i) digital representations of known audible sounds and associated alphanumeric representations of the known audible sounds and, for the first time, (ii) digital representations of known audible sounds corresponding to mispronunciations resulting from known classes of mispronounced words and phrases. The method is performed by receiving the audible sounds in the form of the electrical output of the microphone (28). A particular audible sound to be recognized is converted into a digital representation of the audible sound (30). The digital representation of the particular audible sound is then compared to the digital representations of the known audible sounds in the database to determine which of those known audible sounds is most likely to be the particular audible sound (30).
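
    The comparison step in this abstract can be illustrated with a minimal sketch. It assumes each "digital representation" is a short feature vector and that "most likely" means smallest Euclidean distance. Neither assumption comes from the patent, and the database entries and the `recognize` function are hypothetical.

```python
import math

# Hypothetical database: each entry pairs a feature vector (standing in for the
# "digital representation" of a known sound) with its alphanumeric text and a
# flag marking entries that model a known mispronunciation.
DATABASE = [
    ((1.0, 0.0, 0.0), "library", False),
    ((0.9, 0.3, 0.0), "library", True),   # assumed mispronounced variant
    ((0.0, 1.0, 0.0), "nuclear", False),
    ((0.0, 0.8, 0.4), "nuclear", True),   # assumed mispronounced variant
]


def recognize(features):
    """Return (text, is_mispronunciation) for the closest database entry."""
    best = min(DATABASE, key=lambda entry: math.dist(entry[0], features))
    return best[1], best[2]
```

    Because mispronunciations are stored next to the correct forms, a match on a mispronounced entry still yields the intended text, and the flag records that the input was a known mispronunciation.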

    TEXT TO SPEECH
    2.
    Invention application
    TEXT TO SPEECH (under examination, published)

    Publication No.: WO03065349A2

    Publication date: 2003-08-07

    Application No.: PCT/US0302561

    Filing date: 2003-01-28

    Abstract: A preferred embodiment of the method for converting text to speech using a computing device having a memory is disclosed. The inventive method comprises examining a text to be spoken to an audience for a specific communications purpose, followed by marking up the text according to a phonetic markup system such as the Lessac System pronunciation rules notations. A set of rules is used to control a text-to-speech generator based on speech principles, such as Lessac principles. Such rules are of the type normally implemented on prior art text-to-speech engines, and control the operation of the software and the characteristics of the speech generated by a computer using the software. A computer is used to speak the marked-up text expressively. The step of using a computer to speak the marked-up text expressively is repeated using alternative pronunciations of the selected style of expression, in which the tonal, structural, and consonant energies each have a different balance in the speech. These alternatives are also spoken to trained speech practitioners, who listen to the spoken speech generated by the computer. The spoken speech generated by the computer is then evaluated for consistency with style criteria and/or expressiveness. An audience is then assembled and the spoken speech generated by the computer is played back to the audience. Audience comprehension of the spoken speech generated by the computer is evaluated and correlated to a particular implemented rule or rules, and those rules which resulted in relatively high audience comprehension are selected.

    SPEECH RECOGNITION AND TRAINING METHODS AND SYSTEMS
    3.
    Invention application
    SPEECH RECOGNITION AND TRAINING METHODS AND SYSTEMS (under examination, published)

    Publication No.: WO01082291A1

    Publication date: 2001-11-01

    Application No.: PCT/US2001/012959

    Filing date: 2001-04-23

    CPC classification number: G10L15/063 G09B19/04 G10L2015/0638

    Abstract: In accordance with the present invention, speech recognition (10) and training (110) methods and systems are disclosed. A microphone receives audible sounds input (28) from a user into a first computing device having a program with a database (16). The database consists of digital representations of known audible sounds and associated alphanumeric representations of the known audible sounds and mispronunciations. The program compares the digital representation to the digital representations of known audible sounds in a database (30) to determine the likely desired output. If an error in recognition (32) occurs, then the user can indicate the proper alphanumeric representation of the particular audible sound (34). This allows the system to determine whether the error is a result of a known type or instance of mispronunciation (36). In response to a determination of the error's nature, the system presents an interactive training program from the computer to the user to enable the user to correct such mispronunciation (45). The present invention has the advantage of improving voice recognition and the speech patterns of the user by focusing on the user during error correction, thus improving the oral communication skills of the user.

    MULTITASKING INTERACTIVE VOICE USER INTERFACE
    4.
    Invention application
    MULTITASKING INTERACTIVE VOICE USER INTERFACE (under examination, published)

    Publication No.: WO01004872A1

    Publication date: 2001-01-18

    Application No.: PCT/US2000/017516

    Filing date: 2000-06-26

    CPC classification number: G10L15/22 G10L2015/0638

    Abstract: A dictation command voice multitasking interface is illustrated by a GUI computer training template that is displayed and implemented by creating a question and multiple-answer database. For example, a first box (10) labeled "print question" receives text for question A. The "RecQ" box (11) is selected with a mouse, in which case the trainer records the voice equivalent of the question. The system is thereby made responsive to recognized spoken words, such as for the alternative questions illustrated by box (10) and box (18) and the corresponding stored voice equivalents illustrated by boxes (11) and (19). A voice equivalent of the printed answer in box (12) is stored as "RecA" in box (13). Corresponding descriptive text is stored in box (14). The process is interactive in storing the voice equivalents, as shown by decision box (34), which queries the trainer for more questions to be stored in the database, wherein the computer interrupt handler (25) waits for further input from voice (24). A practical application of the system would enable a doctor, with hands and eyes occupied in performing a clinical procedure, to input voiced queries to the computer in order to create a report during the clinical procedure.

    SPEECH REFERENCE ENROLLMENT METHOD
    5.
    Invention application
    SPEECH REFERENCE ENROLLMENT METHOD (under examination, published)

    Publication No.: WO99013456A1

    Publication date: 1999-03-18

    Application No.: PCT/US1998/017095

    Filing date: 1998-08-17

    Abstract: A speech reference enrollment method involves the following steps: (a) requesting a user speak a vocabulary word; (b) detecting a first utterance (354); (c) requesting the user speak the vocabulary word; (d) detecting a second utterance (358); (e) determining a first similarity between the first utterance and the second utterance (362); (f) when the first similarity is less than a predetermined similarity, requesting the user speak the vocabulary word; (g) detecting a third utterance (366); (h) determining a second similarity between the first utterance and the third utterance (370); and (i) when the second similarity is greater than or equal to the predetermined similarity, creating a reference (364).
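
    Steps (a) through (i) can be sketched as a short control flow. This is a sketch under stated assumptions, none of which come from the patent: utterances arrive as equal-length feature vectors, similarity is cosine similarity, a reference is the element-wise mean of the two matching utterances, and a reference is also created when the first two utterances already match. The names `cosine_similarity`, `make_reference`, and `enroll` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Assumed similarity measure between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def make_reference(a, b):
    """Assumed reference: the element-wise mean of two matching utterances."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def enroll(utterances, threshold=0.9):
    """Walk through steps (a)-(i); return a reference, or None if none is made."""
    it = iter(utterances)
    first = next(it)                                   # steps (a)-(b)
    second = next(it)                                  # steps (c)-(d)
    if cosine_similarity(first, second) >= threshold:  # step (e)
        return make_reference(first, second)           # assumed early exit
    third = next(it)                                   # steps (f)-(g)
    if cosine_similarity(first, third) >= threshold:   # step (h)
        return make_reference(first, third)            # step (i)
    return None
```

    For instance, `enroll([[1, 0], [0, 1], [1, 0]])` rejects the first pair, requests a third utterance, and builds the reference from the first and third.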

    CUSTOM GRAMMARS BUILDER PLATFORM
    6.
    Invention application
    CUSTOM GRAMMARS BUILDER PLATFORM (under examination, published)

    Publication No.: WO2015184374A1

    Publication date: 2015-12-03

    Application No.: PCT/US2015/033358

    Filing date: 2015-05-29

    Abstract: A request to execute an interaction site associated with a custom grammars file is received from a user device and by a communications system. An interaction flow document to execute the interaction site is accessed by the communications system. The custom grammars file is accessed by the communications system, the custom grammars file being configured to enable the communications system to identify executable commands corresponding to utterances spoken by users of user devices. An utterance spoken by a user of the user device is received from the user device and by the communications system. The utterance is stored by the communications system. The custom grammars file is updated by a grammar generation system to include a representation of the stored utterance for processing utterances in subsequent communications with users.

VOICE INTERACTION METHOD AND APPARATUS
    9.
    Invention application

    Publication No.: WO2014079324A1

    Publication date: 2014-05-30

    Application No.: PCT/CN2013/086734

    Filing date: 2013-11-08

    Inventor: 周彬

    CPC classification number: G10L17/22 G10L15/06 G10L2015/0638 G10L2015/088

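    Abstract: A voice interaction method and apparatus. The method comprises setting a correspondence between picture material movement commands and interaction keywords. The method further comprises: displaying picture material; recording a user voice file and analyzing the user voice file to parse out an interaction keyword; and determining, according to the parsed interaction keyword, the picture material movement command corresponding to that keyword, and controlling the movement of the picture material based on the determined picture material movement command.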
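
    The keyword-to-command correspondence in this abstract can be sketched as a lookup table. The sketch assumes the recorded voice file has already been transcribed to text and that a movement command is a (dx, dy) offset applied to the picture material's position. The keywords, the `COMMANDS` table, and both functions are hypothetical and do not come from the patent.

```python
# Hypothetical correspondence between interaction keywords and picture
# material movement commands, expressed as (dx, dy) offsets.
COMMANDS = {
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, -1),
    "down": (0, 1),
}


def parse_keywords(transcript):
    """Pick out the interaction keywords present in a transcript, in order."""
    return [word for word in transcript.lower().split() if word in COMMANDS]


def move(position, transcript):
    """Apply the movement command for each parsed keyword to the position."""
    x, y = position
    for keyword in parse_keywords(transcript):
        dx, dy = COMMANDS[keyword]
        x, y = x + dx, y + dy
    return (x, y)
```

    Words outside the table are ignored, so only the configured interaction keywords move the picture material.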
