Word recognition using choice lists
    1.
    Granted invention patent
    Legal status: in force

    Publication number: US07809574B2

    Publication date: 2010-10-05

    Application number: US10950074

    Filing date: 2004-09-24

    IPC classification: G10L21/00 G06F3/16 G06F3/14

    CPC classification: G10L15/14

    Abstract: One aspect of the invention involves word recognition that uses scrollable choice lists in which choices are listed in character order. Another aspect relates to a scrollable, visually displayed word recognition choice list, where the recognition candidates on the choice list are each associated with a choice-selecting symbol the user can use to select a desired recognition candidate by pressing an associated button, and where the same choice-selecting symbol is used for different choices displayed on the display at different times as a result of scrolling. Another aspect of the invention relates to providing a choice list of best scoring characters for a particular character position in the spelling of a filter that is used to filter word recognition. Another aspect of the invention relates to a choice list used in word recognition in which the choice list can be scrolled horizontally.
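
The scrolling behaviour described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the class name `ChoiceList`, the page size of three rows, and the use of digits as choice-selecting symbols are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChoiceList:
    """Scrollable list of recognition candidates shown in character order."""
    candidates: list[str]
    page_size: int = 3   # hypothetical number of rows visible at once
    offset: int = 0      # index of the first visible candidate

    def __post_init__(self) -> None:
        # Present choices in character (alphabetical) order, not score order.
        self.candidates = sorted(self.candidates, key=str.lower)

    def visible(self) -> list[tuple[str, str]]:
        """Pair each visible candidate with a selection symbol ("1", "2", ...)."""
        page = self.candidates[self.offset:self.offset + self.page_size]
        return [(str(i + 1), word) for i, word in enumerate(page)]

    def scroll(self, rows: int) -> None:
        """Move the visible window, clamped to the ends of the list."""
        last_start = max(0, len(self.candidates) - self.page_size)
        self.offset = min(max(self.offset + rows, 0), last_start)

    def select(self, symbol: str) -> str:
        """Return the candidate currently bound to the pressed symbol."""
        return dict(self.visible())[symbol]


choices = ChoiceList(["their", "there", "the", "then", "they're", "these"])
first_page = choices.visible()
picked_before = choices.select("1")   # symbol "1" before scrolling
choices.scroll(3)
picked_after = choices.select("1")    # same symbol, different candidate
```

Pressing the same symbol before and after scrolling selects different candidates, which is the reuse of choice-selecting symbols the abstract refers to.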

    Combined speech recognition and text-to-speech generation
    2.
    Granted invention patent
    Legal status: in force

    Publication number: US07577569B2

    Publication date: 2009-08-18

    Application number: US10949991

    Filing date: 2004-09-24

    IPC classification: G10L13/08

    Abstract: Text-to-speech (TTS) generation is used in conjunction with large vocabulary speech recognition to say words selected by the speech recognition. The software for performing the large vocabulary speech recognition can share speech modeling data with the TTS software. TTS or recorded audio can be used to automatically say both recognized text and the names of recognized commands after their recognition. The TTS can automatically repeat text recognized by the speech recognition after each of a succession of end of utterance detections. A user can move a cursor back or forward in recognized text, and the TTS can speak one or more words at the cursor location after each such move. The speech recognition can be used to produce a choice list of possible recognition candidates, and the TTS can be used to provide spoken output of one or more of the candidates on the choice list.

    Combined speech recognition and sound recording
    3.
    Granted invention patent
    Legal status: in force

    Publication number: US07505911B2

    Publication date: 2009-03-17

    Application number: US11005568

    Filing date: 2004-12-05

    IPC classification: G01L21/06

    Abstract: A handheld device with both large-vocabulary speech recognition and audio recording allows users to switch between at least two of the following three modes: (1) recording audio without corresponding speech recognition; (2) recording with speech recognition; and (3) speech recognition without audio recording. A handheld device with both large-vocabulary speech recognition and audio recording enables a user to select a portion of previously recorded sound and have speech recognition performed upon it. A system enables a user to search for a text label associated with portions of unrecognized recorded sound by uttering the label's words. A large-vocabulary system allows users to switch between playing back recorded audio and speech recognition with a single input, with successive audio playbacks automatically starting slightly before the end of prior playback. Also described is a cell phone that allows both large-vocabulary speech recognition and audio recording and playback.

    Speech recognition using automatic recognition turn off
    4.
    Granted invention patent
    Legal status: in force

    Publication number: US07716058B2

    Publication date: 2010-05-11

    Application number: US10949972

    Filing date: 2004-09-24

    IPC classification: G10L15/28

    CPC classification: G10L15/22 G10L15/19

    Abstract: Large vocabulary speech recognition can automatically turn recognition off in one or more ways. A user command can turn on recognition that is automatically turned off after the next end of utterance. A plurality of buttons can each be associated with a different speech mode, and the touch of a given button can turn on, and then automatically turn off, the given button's associated speech recognition mode. These selectable modes can include large vocabulary and alphabetic entry modes, or continuous and discrete modes. A first user input can start recognition that allows a sequence of vocabulary words to be recognized, and a second user input can start recognition that turns off after one word has been recognized. A first user input can start recognition that allows a sequence of utterances to be recognized, and a second user input can start recognition that allows only a single utterance to be recognized.
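
The difference between recognition that stays on and recognition that turns itself off after one utterance can be sketched as a small state machine. This is only an illustration under assumptions: the names `Mode`, `Recognizer`, `press`, and `end_of_utterance` are hypothetical, and the recognized text is passed in directly rather than produced by a real recognizer.

```python
from enum import Enum


class Mode(Enum):
    OFF = "off"
    SINGLE_UTTERANCE = "single utterance"   # turns off after one utterance
    CONTINUOUS = "continuous"               # stays on until changed


class Recognizer:
    """Toy controller that tracks whether recognition is currently on."""

    def __init__(self) -> None:
        self.mode = Mode.OFF
        self.heard: list[str] = []

    def press(self, mode: Mode) -> None:
        """A button press turns on the mode associated with that button."""
        self.mode = mode

    def end_of_utterance(self, text: str) -> None:
        """Handle a detected end of utterance whose recognized text is `text`."""
        if self.mode is Mode.OFF:
            return  # recognition is off, so the utterance is ignored
        self.heard.append(text)
        if self.mode is Mode.SINGLE_UTTERANCE:
            self.mode = Mode.OFF  # automatic turn-off


rec = Recognizer()
rec.press(Mode.SINGLE_UTTERANCE)
rec.end_of_utterance("call home")
rec.end_of_utterance("this is ignored")
mode_after_single = rec.mode

rec.press(Mode.CONTINUOUS)
rec.end_of_utterance("first sentence")
rec.end_of_utterance("second sentence")
```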

    Word recognition using word transformation commands
    5.
    Granted invention patent
    Legal status: in force

    Publication number: US07634403B2

    Publication date: 2009-12-15

    Application number: US10949974

    Filing date: 2004-09-24

    IPC classification: G10L21/06

    CPC classification: G10L15/22 G10L15/19

    Abstract: Word recognition enables a user to have a selected transformation performed on a given word produced by word recognition. In one aspect of the invention, a selectable transformation changes the given word to a differently spelled word having the same word root. In another, a selectable transformation changes a given word to one or more of its homonyms. In yet another, a selectable transformation changes the given word between a representation that spells the word with letters and one that does not. In one aspect of the invention, a user can select to display a choice list of transformed words corresponding to a given recognized word and then select to have one of the listed transformed words replace the given word. In another aspect of the invention, word recognition favors recognition of words corresponding to a user-selected part of speech.
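
As a rough illustration of the three kinds of transformation named in this abstract, the sketch below builds a choice list of transformed words from small lookup tables and then substitutes the chosen one. The tables, the function names, and the `kind` labels are hypothetical; the abstract does not say how the transformations are computed.

```python
# Hypothetical lookup tables; a real system would use a full lexicon.
HOMONYMS = {"their": ["there", "they're"], "two": ["to", "too"]}
SAME_ROOT = {"run": ["runs", "running", "ran"]}
NON_LETTER = {"two": "2", "percent": "%"}


def transformation_choices(word: str, kind: str) -> list[str]:
    """Return a choice list of transformed words for a recognized word."""
    key = word.lower()
    if kind == "homonym":
        return list(HOMONYMS.get(key, []))
    if kind == "same_root":
        return list(SAME_ROOT.get(key, []))
    if kind == "non_letter":
        return [NON_LETTER[key]] if key in NON_LETTER else []
    raise ValueError(f"unknown transformation kind: {kind}")


def replace_word(words: list[str], index: int, replacement: str) -> list[str]:
    """Return a copy of the text with one word replaced by the chosen form."""
    return words[:index] + [replacement] + words[index + 1:]


text = ["I", "have", "two", "cats"]
digit_choices = transformation_choices(text[2], "non_letter")
homonym_choices = transformation_choices(text[2], "homonym")
updated = replace_word(text, 2, digit_choices[0])
```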

    Speech recognition using ambiguous or phone key spelling and/or filtering
    6.
    Granted invention patent
    Legal status: in force

    Publication number: US07526431B2

    Publication date: 2009-04-28

    Application number: US10950090

    Filing date: 2004-09-24

    CPC classification: G10L15/22 G10L15/19

    Abstract: Alphabetic filtering of the speech recognition of words uses a key press to indicate a desired character in an alphabetic filter string, where each key press represents two or more letters. The key presses can be disambiguated by recognizing a key-disambiguation utterance in association with a given key press. A user can select a desired recognition candidate from a choice list produced by such filtered word recognition. Ambiguous alphabetic filtering can be performed iteratively in response to the addition of successive ambiguous key presses. A user can select to re-recognize the utterance using filtering based on ambiguous key input after seeing the results of recognition without such filtering. Unambiguous alphabetic filtering can be performed by using multiple presses of an ambiguous key to disambiguate which letter is intended. A user can choose between entering text by large vocabulary speech recognition and spelling text by pressing phone keys.
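
Ambiguous phone-key filtering of recognition candidates can be sketched as follows. The standard telephone keypad letter groups are used; the candidate words, their scores, and the function names are made up for illustration, and a real recognizer would rescore the utterance rather than filter a fixed list.

```python
# Letter groups on a standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def matches_keys(word: str, keys: str) -> bool:
    """True if the word's leading letters are consistent with the key presses."""
    if len(word) < len(keys):
        return False
    return all(ch in KEYPAD.get(key, "") for ch, key in zip(word.lower(), keys))


def filter_candidates(scored: list[tuple[str, float]], keys: str) -> list[str]:
    """Keep candidates that match the ambiguous filter, best score first."""
    kept = [(w, s) for w, s in scored if matches_keys(w, keys)]
    return [w for w, _ in sorted(kept, key=lambda ws: ws[1], reverse=True)]


# Hypothetical recognizer output: (word, score), higher is better.
hypotheses = [("cat", 0.40), ("bat", 0.35), ("pat", 0.15), ("cut", 0.10)]
after_one_key = filter_candidates(hypotheses, "2")    # first letter in "abc"
after_two_keys = filter_candidates(hypotheses, "22")  # second letter also in "abc"
```

Each additional key press narrows the choice list further, which corresponds to the iterative filtering mentioned in the abstract.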

    Speech recognition using selectable recognition modes
    7.
    Granted invention patent
    Legal status: in force

    Publication number: US07313526B2

    Publication date: 2007-12-25

    Application number: US10950092

    Filing date: 2004-09-24

    CPC classification: G10L15/22 G10L15/19

    Abstract: The present invention relates to speech recognition using selectable recognition modes. This includes innovations such as: large vocabulary speech recognition programming that supplies recognized words to an external program as they are recognized, and allows a user to select between large vocabulary recognition of an utterance with and without language context from the prior utterance, independently of the state of the external program; allowing a user to select between continuous and discrete speech recognition that use substantially the same vocabulary; allowing a user to select between continuous and discrete large-vocabulary speech recognition modes; allowing a user to select between at least two different alphabetic entry speech recognition modes; and allowing a user to select from among four or more of the following recognition modes when creating text: a large-vocabulary mode, an alphabetic entry mode, a number entry mode, and a punctuation entry mode.

    Methods, systems, and programming for performing speech recognition
    8.
    Granted invention patent
    Legal status: in force

    Publication number: US07225130B2

    Publication date: 2007-05-29

    Application number: US10227653

    Filing date: 2002-09-06

    CPC classification: G10L15/19 G10L15/22

    Abstract: The present invention relates to: speech recognition using selectable recognition modes; using choice lists in large-vocabulary speech recognition; enabling users to select word transformations; speech recognition that automatically turns recognition off in one or more specified ways; phone key control of large-vocabulary speech recognition; speech recognition using phone key alphabetic filtering and spelling; speech recognition that enables a user to perform re-utterance recognition; the combination of speech recognition and text-to-speech (TTS) generation; the combination of speech recognition with handwriting and/or character recognition; and the combination of large-vocabulary speech recognition with audio recording and playback.

    Speech recognition using re-utterance recognition
    9.
    Granted invention patent
    Legal status: in force

    Publication number: US07444286B2

    Publication date: 2008-10-28

    Application number: US11005567

    Filing date: 2004-12-05

    IPC classification: G10L11/00

    CPC classification: G10L15/22

    Abstract: The present invention relates to speech recognition that enables a user to perform re-utterance recognition, in which speech recognition is performed upon both a second saying of a sequence of one or more words and an earlier saying of the same sequence, to help the speech recognition better select one or more best scoring text sequences for the utterances.
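
One simple way to use both sayings of the same word sequence is to add each candidate's log scores from the two utterances and rank by the total. The abstract does not state how the scores are combined, so the summation rule, the `floor` penalty for a candidate missing from one utterance, and all names and numbers below are assumptions for illustration only.

```python
def combine_utterances(
    first: dict[str, float],
    second: dict[str, float],
    floor: float = -20.0,
) -> list[tuple[str, float]]:
    """Rank text sequences by the sum of their log scores from both utterances.

    A candidate missing from one utterance receives the `floor` score there.
    """
    combined = {
        text: first.get(text, floor) + second.get(text, floor)
        for text in set(first) | set(second)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log scores (closer to zero is better) for each saying.
first_try = {"recognize speech": -4.0, "wreck a nice beach": -3.5}
second_try = {
    "recognize speech": -3.0,
    "wreck a nice beach": -6.0,
    "recognized speech": -5.0,
}

best_from_first_only = max(first_try, key=first_try.get)
ranked = combine_utterances(first_try, second_try)
best_combined = ranked[0][0]
```

In this made-up example the first saying alone favors the wrong sequence, while the combined ranking prefers the sequence that scores well in both sayings.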

    Methods and apparatus for formant-based voice systems
    10.
    Granted invention patent
    Legal status: in force

    Publication number: US08447592B2

    Publication date: 2013-05-21

    Application number: US11225524

    Filing date: 2005-09-13

    IPC classification: G10L11/04

    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
