Expanding an effective vocabulary of a speech recognition system
    1.
    Granted invention patent
    Status: Active

    Publication No.: US07120582B1

    Publication date: 2006-10-10

    Application No.: US09390370

    Filing date: 1999-09-07

    IPC classification: G10L15/00 G10L15/06

    Abstract: The invention provides techniques for creating and using fragmented word models to increase the effective size of an active vocabulary of a speech recognition system. The active vocabulary represents all words and word fragments that the speech recognition system is able to recognize. Each word may be represented by a combination of acoustic models. As such, the active vocabulary represents the combinations of acoustic models that the speech recognition system may compare to a user's speech to identify acoustic models that best match the user's speech. The effective size of the active vocabulary may be increased by dividing words into constituent components or fragments (for example, prefixes, suffixes, separators, infixes, and roots) and including each component as a separate entry in the active vocabulary. Thus, for example, a list of words and their plural forms (for example, “book, books, cook, cooks, hook, hooks, look and looks”) may be represented in the active vocabulary using the words (for example, “book, cook, hook and look”) and an entry representing the suffix that makes the words plural (for example, “+s”, where the “+” preceding the “s” indicates that “+s” is a suffix). For a large list of words, and ignoring the entry associated with the suffix, this technique may reduce the number of vocabulary entries needed to represent the list of words considerably.
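
    Illustrative sketch (not taken from the patent): the plural-suffix example in the abstract can be reproduced with a few lines of Python. The function name `build_fragment_vocabulary` and the rule "split a word only when its root is also in the list" are assumptions made for this example; the patent's actual fragmentation procedure is not described in the abstract.

```python
def build_fragment_vocabulary(words, suffix="s"):
    """Replace root+suffix pairs with the root plus one shared '+suffix' entry.

    Hypothetical helper: a word is split only if stripping the suffix
    yields another word that is already in the list.
    """
    word_set = set(words)
    vocab = set()
    for word in word_set:
        root = word[:-len(suffix)]
        if word.endswith(suffix) and root in word_set:
            vocab.add(root)
            vocab.add("+" + suffix)  # "+" marks the entry as a suffix
        else:
            vocab.add(word)
    return vocab


words = ["book", "books", "cook", "cooks", "hook", "hooks", "look", "looks"]
vocab = build_fragment_vocabulary(words)
print(sorted(vocab))
print(len(words), "entries reduced to", len(vocab))
```

    Ignoring the single suffix entry, the eight listed words are covered by four root entries, which matches the reduction the abstract describes.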

    Multilingual speech recognition
    2.
    Granted invention patent
    Status: Active

    Publication No.: US08065144B1

    Publication date: 2011-11-22

    Application No.: US12699172

    Filing date: 2010-02-03

    IPC classification: G10L15/06 G10L15/28

    CPC classification: G10L15/005

    Abstract: A method for speech recognition. The method uses a single pronunciation estimator to train acoustic phoneme models and recognize utterances from multiple languages. The method includes accepting text spellings of training words in a plurality of sets of training words, each set corresponding to a different one of a plurality of languages. The method also includes, for each of the sets of training words in the plurality, receiving pronunciations for the training words in the set, the pronunciations being characteristic of native speakers of the language of the set, the pronunciations also being in terms of subword units at least some of which are common to two or more of the languages. The method also includes training a single pronunciation estimator using data comprising the text spellings and the pronunciations of the training words.

    Systems and methods for word recognition
    3.
    Granted invention patent
    Status: Expired

    Publication No.: US5680511A

    Publication date: 1997-10-21

    Application No.: US477287

    Filing date: 1995-06-07

    IPC classification: G10L15/18 G10L9/00

    CPC classification: G10L15/1815

    Abstract: In one aspect, the invention provides word recognition systems that operate to recognize an unrecognized or ambiguous word that occurs within a passage of words. The system can offer several words as choice words for inserting into the passage to replace the unrecognized word. The system can select the best choice word by using the choice word to extract, from a reference source, sample passages of text that relate to the choice word. For example, the system can select the dictionary passage that defines the choice word. The system then compares the selected passage to the current passage, and generates a score that indicates the likelihood that the choice word would occur within that passage of text. The system can select the choice word with the best score to substitute into the passage. The passage of words being analyzed can be any word sequence including an utterance, a portion of handwritten text, a portion of typewritten text or other such sequence of words, numbers and characters. Alternative embodiments of the present invention are disclosed which function to retrieve documents from a library as a function of context.
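
    Illustrative sketch (not taken from the patent): the selection step in the abstract can be pictured as scoring each choice word by how well its reference passage matches the current passage. The word-overlap score below is a simple stand-in assumed for this example; the abstract does not say how the likelihood score is computed, and the `definitions` dictionary is hypothetical data.

```python
def overlap_score(reference_passage, current_passage):
    """Fraction of reference-passage words that also occur in the current passage."""
    ref = set(reference_passage.lower().split())
    cur = set(current_passage.lower().split())
    return len(ref & cur) / len(ref) if ref else 0.0


def pick_choice_word(candidates, definitions, passage):
    """Return the candidate whose reference passage best matches the passage."""
    return max(candidates,
               key=lambda w: overlap_score(definitions.get(w, ""), passage))


# Hypothetical reference source: one defining passage per choice word.
definitions = {
    "bank": "an institution where people deposit money and take out loans",
    "tank": "a large container that holds water or fuel",
}
passage = "she went to deposit money at the"
best = pick_choice_word(["bank", "tank"], definitions, passage)
print(best)
```

    Here the definition of "bank" shares two words with the passage and the definition of "tank" shares none, so "bank" is selected.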

    Training and using pronunciation guessers in speech recognition
    4.
    Granted invention patent
    Status: Active

    Publication No.: US07467087B1

    Publication date: 2008-12-16

    Application No.: US10684135

    Filing date: 2003-10-10

    IPC classification: G10L13/00 G10L15/00 G10L15/06

    Abstract: The error rate of a pronunciation guesser that guesses the phonetic spelling of words used in speech recognition is improved by causing its training to weigh letter-to-phoneme mappings used as data in such training as a function of the frequency of the words in which such mappings occur. Preferably the ratio of the weight to word frequency increases as word frequency decreases. Acoustic phoneme models for use in speech recognition with phonetic spellings generated by a pronunciation guesser that makes errors are trained against word models whose phonetic spellings have been generated by a pronunciation guesser that makes similar errors. As a result, the acoustic models represent blends of phoneme sounds that reflect the spelling errors made by the pronunciation guessers. Speech recognition enabled systems are made by storing in them both a pronunciation guesser and a corresponding set of such blended acoustic models.
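
    Illustrative sketch (not taken from the patent): one weighting that satisfies the stated preference is the square root of the word frequency, since weight/frequency = 1/sqrt(frequency) grows as the frequency falls. The square root is an assumption chosen for this example; the abstract does not name a specific function, and the tiny lexicon below is hypothetical.

```python
import math
from collections import Counter


def training_weight(word_frequency):
    """Assumed weighting: sublinear in frequency, so rare words count
    for relatively more per occurrence than common ones."""
    return math.sqrt(word_frequency)


def weighted_mapping_counts(lexicon):
    """Accumulate letter-to-phoneme mapping counts weighted by word frequency.

    lexicon: iterable of (word_frequency, [(letter, phoneme), ...]).
    """
    counts = Counter()
    for frequency, mappings in lexicon:
        weight = training_weight(frequency)
        for mapping in mappings:
            counts[mapping] += weight
    return counts


# Hypothetical data: a frequent word and a rare word.
lexicon = [
    (10000, [("c", "k"), ("a", "ae"), ("t", "t")]),
    (4, [("c", "s"), ("e", "eh"), ("l", "l")]),
]
counts = weighted_mapping_counts(lexicon)
print(counts[("c", "k")], counts[("c", "s")])
```

    With raw frequencies the c-to-k mapping would outweigh c-to-s by 2500 to 1; with the assumed square-root weighting the ratio drops to 50 to 1.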

    Multilingual speech recognition
    6.
    Granted invention patent
    Status: Active

    Publication No.: US07716050B2

    Publication date: 2010-05-11

    Application No.: US10716027

    Filing date: 2003-11-17

    IPC classification: G10L15/00

    CPC classification: G10L15/005

    Abstract: A method for speech recognition. The method uses a single pronunciation estimator to train acoustic phoneme models and recognize utterances from multiple languages. The method includes accepting text spellings of training words in a plurality of sets of training words, each set corresponding to a different one of a plurality of languages. The method also includes, for each of the sets of training words in the plurality, receiving pronunciations for the training words in the set, the pronunciations being characteristic of native speakers of the language of the set, the pronunciations also being in terms of subword units at least some of which are common to two or more of the languages. The method also includes training a single pronunciation estimator using data comprising the text spellings and the pronunciations of the training words.

    Method of producing alternate utterance hypotheses using auxiliary information on close competitors
    7.
    Granted invention patent
    Status: Active

    Publication No.: US07676367B2

    Publication date: 2010-03-09

    Application No.: US10783518

    Filing date: 2004-02-20

    IPC classification: G10L15/04 G10L15/00

    Abstract: A method of constructing a list of alternate transcripts from a recognized transcript includes generating a list of close call records, matching partial sub-histories from the recognized transcript with one of the history pairs stored in each of the records, and substituting the other of the history pairs for the partial sub-history of the recognized transcript. A close call record is generated each time a pair of partial hypotheses attempt to seed a common word. Each close call record includes history information and scoring information associated with a particular pair of partial hypotheses seeding a common word. Alternate transcripts are constructed by substituting close call histories for partial histories of the recognized transcripts, and also by substituting close call histories for partial histories of other alternate transcripts.
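
    Illustrative sketch (not taken from the patent): the substitution step can be pictured with transcripts as tuples of words and each close call record as a pair of competing histories plus the common word they tried to seed. The record layout, the function names, and the `limit` cap are assumptions for this example, and the scoring information mentioned in the abstract is left out.

```python
def substitute(transcript, record):
    """Swap one history of a close call record for the other wherever it
    immediately precedes the record's common word."""
    history_a, history_b, common_word = record
    results = []
    for old, new in ((history_a, history_b), (history_b, history_a)):
        n = len(old)
        for i in range(n, len(transcript)):
            if transcript[i] == common_word and transcript[i - n:i] == old:
                results.append(transcript[:i - n] + new + transcript[i:])
    return results


def alternate_transcripts(recognized, records, limit=50):
    """Expand the recognized transcript, then the alternates themselves."""
    seen = {recognized}
    queue = [recognized]
    alternates = []
    while queue and len(alternates) < limit:
        current = queue.pop(0)
        for record in records:
            for alt in substitute(current, record):
                if alt not in seen:
                    seen.add(alt)
                    alternates.append(alt)
                    queue.append(alt)
    return alternates


# Hypothetical close call records: (history, competing history, common word).
records = [
    (("recognize", "speech"), ("wreck", "a", "nice", "beach"), "today"),
    (("a", "nice"), ("an", "ice"), "beach"),
]
recognized = ("recognize", "speech", "today")
alts = alternate_transcripts(recognized, records)
for alt in alts:
    print(" ".join(alt))
```

    The second alternate is derived from the first rather than from the recognized transcript, which mirrors the abstract's last sentence.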

    Large-vocabulary continuous speech prefiltering and processing system
    8.
    Granted invention patent
    Status: Expired

    Publication No.: US5202952A

    Publication date: 1993-04-13

    Application No.: US542520

    Filing date: 1990-06-22

    IPC classification: G10L15/08 G10L15/10 G10L15/28

    CPC classification: G10L15/10

    Abstract: A continuous speech prefiltering system for use in continuous speech recognition computer systems. The speech to be recognized is converted from utterances to frame data sets, which frame data sets are smoothed to generate a smooth frame model over a predetermined number of frames. A resident vocabulary is stored within the computer as clusters of word models which are acoustically similar over a succession of frame periods. A cluster score is generated by the system, which score includes the likelihood of the smooth frames evaluated using a probability model for the cluster against which the smooth frame model is being compared. Cluster sets having cluster scores below a predetermined acoustic threshold are removed from further consideration. The remaining cluster sets are unpacked for determination of a word score for each unpacked word. These word scores are used to identify those words which are above a second predetermined threshold to define a word list which is sent to a recognizer for a more lengthy word match. A controller enables the system to initialize times corresponding to the frame start time for each frame data set, defining a sliding window.
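
    Illustrative sketch (not taken from the patent): the two-stage pruning can be outlined as smoothing, cluster scoring against a first threshold, then word scoring against a second threshold. Everything concrete here is an assumption for the example: frames are single numbers, smoothing is a moving average, and the score is a negative mean squared distance standing in for the probability model named in the abstract.

```python
def smooth(frames, window):
    """Moving average over `window` consecutive frames."""
    return [sum(frames[i:i + window]) / window
            for i in range(len(frames) - window + 1)]


def score(smooth_frames, model):
    """Toy score (higher is better): negative mean squared distance."""
    n = min(len(smooth_frames), len(model))
    return -sum((smooth_frames[i] - model[i]) ** 2 for i in range(n)) / n


def prefilter(frames, clusters, window, cluster_threshold, word_threshold):
    """Return the word list that would be passed on to the full recognizer."""
    smooth_frames = smooth(frames, window)
    word_list = []
    for cluster in clusters:
        # Stage 1: drop whole clusters that score below the first threshold.
        if score(smooth_frames, cluster["model"]) < cluster_threshold:
            continue
        # Stage 2: unpack the surviving cluster and score each word.
        for word, model in cluster["words"].items():
            if score(smooth_frames, model) >= word_threshold:
                word_list.append(word)
    return word_list


# Hypothetical vocabulary of two clusters.
clusters = [
    {"model": [1.0, 1.5, 2.0],
     "words": {"cat": [1.0, 1.5, 2.0], "cap": [1.0, 1.5, 3.0]}},
    {"model": [5.0, 5.0, 5.0],
     "words": {"dog": [5.0, 5.0, 5.0]}},
]
frames = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
word_list = prefilter(frames, clusters, window=2,
                      cluster_threshold=-0.5, word_threshold=-0.2)
print(word_list)
```

    The second cluster is discarded without scoring any of its words, which is the point of prefiltering: the per-word work is spent only on clusters that survive the first threshold.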

    Pronunciation discovery for spoken words
    9.
    Granted invention patent
    Status: Active

    Publication No.: US08577681B2

    Publication date: 2013-11-05

    Application No.: US10939942

    Filing date: 2004-09-13

    IPC classification: G10L15/06 G10L15/00

    Abstract: A method of generating an alternative pronunciation for a word or phrase, given an initial pronunciation and a spoken example of the word or phrase, includes providing the initial pronunciation of the word or phrase, and generating the alternative pronunciation by searching a neighborhood of pronunciations about the initial pronunciation via a constrained hypothesis, wherein the neighborhood includes pronunciations that differ from the initial pronunciation by at most one phoneme. The method further includes selecting a highest scoring pronunciation within the neighborhood of pronunciations.
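
    Illustrative sketch (not taken from the patent): the neighborhood search can be outlined by enumerating every pronunciation within one phoneme edit of the initial pronunciation and keeping the highest-scoring one. Reading "differ by at most one phoneme" as one substitution, insertion, or deletion is an assumption, and `toy_score` is a hypothetical stand-in for scoring against the spoken example, which in practice would be acoustic.

```python
def neighborhood(pronunciation, phonemes):
    """All pronunciations within one phoneme edit of the initial one."""
    pron = tuple(pronunciation)
    result = {pron}  # the initial pronunciation is its own neighbor
    for i in range(len(pron)):
        result.add(pron[:i] + pron[i + 1:])             # deletion
        for p in phonemes:
            result.add(pron[:i] + (p,) + pron[i + 1:])  # substitution
    for i in range(len(pron) + 1):
        for p in phonemes:
            result.add(pron[:i] + (p,) + pron[i:])      # insertion
    return result


def best_pronunciation(pronunciation, phonemes, score):
    """Pick the highest-scoring pronunciation in the neighborhood."""
    return max(neighborhood(pronunciation, phonemes), key=score)


# Hypothetical stand-in for the spoken example: agreement with what was heard.
heard = ("t", "ah", "m", "aa", "t", "ow")


def toy_score(candidate):
    matches = sum(a == b for a, b in zip(candidate, heard))
    return matches - abs(len(candidate) - len(heard))


initial = ("t", "ah", "m", "ey", "t", "ow")
phonemes = ["t", "ah", "m", "ey", "aa", "ow"]
best = best_pronunciation(initial, phonemes, toy_score)
print(best)
```

    Restricting the search to single-phoneme edits keeps the candidate set small: it grows roughly with the pronunciation length times the size of the phoneme inventory.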

    VOICE SEARCH-ENABLED MOBILE DEVICE
    10.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20080153465A1

    Publication date: 2008-06-26

    Application No.: US11673341

    Filing date: 2007-02-09

    IPC classification: G10L21/00 H04Q7/22

    Abstract: Methods and devices for providing a user of a mobile communications device with mobile voice-mediated search capability. The methods and devices involve receiving an utterance from a user of the mobile device, the utterance including a search request; using the speech recognition functionality to recognize that the utterance includes a search request; as a result of recognizing that the utterance includes a search request, establishing a wireless data connection to a remote server; sending a representation of the search request to the remote server over the wireless data connection; receiving search results that are responsive to the search request; and presenting the search results on the mobile device.
