Systems and methods for word recognition
    1.
    Granted invention patent
    Systems and methods for word recognition (expired)

    Publication number: US5680511A

    Publication date: 1997-10-21

    Application number: US477287

    Filing date: 1995-06-07

    IPC classification: G10L15/18 G10L9/00

    CPC classification: G10L15/1815

    Abstract: In one aspect, the invention provides word recognition systems that operate to recognize an unrecognized or ambiguous word that occurs within a passage of words. The system can offer several words as choice words for inserting into the passage to replace the unrecognized word. The system can select the best choice word by using the choice word to extract, from a reference source, sample passages of text that relate to the choice word. For example, the system can select the dictionary passage that defines the choice word. The system then compares the selected passage to the current passage, and generates a score that indicates the likelihood that the choice word would occur within that passage of text. The system can select the choice word with the best score to substitute into the passage. The passage of words being analyzed can be any word sequence including an utterance, a portion of handwritten text, a portion of typewritten text or other such sequence of words, numbers and characters. Alternative embodiments of the present invention are disclosed which function to retrieve documents from a library as a function of context.
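
    The selection step described in this abstract can be sketched as follows. This is a minimal illustration only: it assumes a toy in-memory dictionary as the reference source and a plain word-overlap count as the score, neither of which the abstract specifies. The names `REFERENCE`, `tokenize`, `score_choice` and `best_choice` are hypothetical.

```python
import re

# Hypothetical reference source: choice word -> passage that defines it.
REFERENCE = {
    "bank": "a financial institution that accepts deposits of money and makes loans",
    "bang": "a sudden loud noise such as an explosion or a gunshot",
}

def tokenize(text):
    """Lowercase the text and return the set of alphabetic words in it."""
    return set(re.findall(r"[a-z]+", text.lower()))

def score_choice(choice, passage):
    """Assumed score: number of words shared by the current passage
    and the reference passage retrieved for the choice word."""
    return len(tokenize(REFERENCE[choice]) & tokenize(passage))

def best_choice(choices, passage):
    """Return the choice word whose reference passage best matches the passage."""
    return max(choices, key=lambda c: score_choice(c, passage))

passage = "she went to deposit money at the ___ before taking out loans"
print(best_choice(["bank", "bang"], passage))
```

    A real system would likely weight words and use a richer reference source; the overlap count here only stands in for "a score that indicates the likelihood".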

    Method for representing word models for use in speech recognition
    2.
    Granted invention patent
    Method for representing word models for use in speech recognition (expired)

    Publication number: US4903305A

    Publication date: 1990-02-20

    Application number: US328738

    Filing date: 1989-03-23

    IPC classification: G10L15/06 G10L15/14

    Abstract: A method is provided for deriving acoustic word representations for use in speech recognition. Initial word models are created, each formed of a sequence of acoustic sub-models. The acoustic sub-models from a plurality of word models are clustered, so as to group acoustically similar sub-models from different words, using, for example, the Kullback-Leibler information as a metric of similarity. Then each word is represented by a cluster spelling representing the clusters into which its acoustic sub-models were placed by the clustering. Speech recognition is performed by comparing sequences of frames from speech to be recognized against sequences of acoustic models associated with the clusters of the cluster spelling of individual word models. The invention also provides a method for deriving a word representation which involves receiving a first set of frame sequences for a word, using dynamic programming to derive a corresponding initial sequence of probabilistic acoustic sub-models for the word independently of any previously derived acoustic model particular to the word, using dynamic programming to time align each of a second set of frame sequences for the word into a succession of new sub-sequences corresponding to the initial sequence of models, and using these new sub-sequences to calculate new probabilistic sub-models.
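
    The idea of a cluster spelling can be illustrated with a small sketch. It assumes each acoustic sub-model is a discrete probability distribution with strictly positive entries, uses a symmetrized Kullback-Leibler divergence as the similarity measure, and clusters greedily against a fixed threshold. The clustering procedure, the threshold and all function names are assumptions made for illustration, not details taken from the abstract.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions
    (assumes every entry of q is strictly positive)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def sym_kl(p, q):
    """Symmetrized divergence, used here as the dissimilarity between sub-models."""
    return kl(p, q) + kl(q, p)

def cluster_spellings(word_models, threshold=0.1):
    """Greedy clustering (an assumption): each sub-model joins the first
    cluster whose representative is closer than `threshold`, otherwise it
    starts a new cluster. Returns word -> list of cluster ids."""
    representatives = []
    spellings = {}
    for word, submodels in word_models.items():
        spelling = []
        for sm in submodels:
            for cid, rep in enumerate(representatives):
                if sym_kl(sm, rep) < threshold:
                    break
            else:
                representatives.append(sm)
                cid = len(representatives) - 1
            spelling.append(cid)
        spellings[word] = spelling
    return spellings

models = {
    "cat": [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
    "cot": [[0.79, 0.11, 0.10], [0.1, 0.1, 0.8]],
}
print(cluster_spellings(models))
```

    In this toy input the first sub-models of both words fall into the same cluster, so the two spellings share their first symbol and differ in the second.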

    Parallel pattern verifier with dynamic time warping
    3.
    Granted invention patent
    Parallel pattern verifier with dynamic time warping (expired)

    Publication number: US4348553A

    Publication date: 1982-09-07

    Application number: US165466

    Filing date: 1980-07-02

    Abstract: A speech recognition system is disclosed which employs a network of elementary local decision modules for matching an observed time-varying speech pattern against all possible time warpings of the stored prototype patterns. For each elementary speech segment, an elementary recognizer provides a score indicating the degree of correlation of the input speech segment with stored spectral patterns. Each local decision module receives the results of the elementary recognizer and, at the same time, receives an input from selected ones of the other local decision modules. Each local decision module specializes in a particular node in the network wherein each node matches the probability of how well the input segment of speech matches the particular sound segments in the sounds of the words spoken. Each local decision module takes the prior decisions of all preceding sound segments which are input from the other local decision modules and makes a selection of the locally optimum time warping to be permitted. By this selection technique, each speech segment is stretched or compressed by an arbitrary, nonlinear function based on the control of the interconnections of the other local decision modules to a particular local decision module. Each local decision module includes an accumulator memory which stores the logarithmic probabilities of the current observation which is conditional upon the internal event specified by a word to be matched or identifier of the particular pattern that corresponds to the subject node for that particular pattern. For each observation, these probabilities are computed and loaded into the accumulator memory of all the modules, and the result of the locally optimum time warping representing the accumulated score or network path to a node for the word with the highest probability is chosen.
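
    For readers unfamiliar with dynamic time warping, the following is a sketch of the standard sequential, textbook formulation. It is not the parallel network of local decision modules described in the abstract, and it uses an absolute-difference distance on scalar features rather than log probabilities. Each table cell plays a role loosely analogous to one local decision: it keeps the locally best of its predecessors. The names `dtw_cost` and `best_prototype` are hypothetical.

```python
def dtw_cost(observed, prototype):
    """Minimum accumulated distance over all monotonic time warpings
    that align `observed` with `prototype` (lists of numbers)."""
    n, m = len(observed), len(prototype)
    inf = float("inf")
    # cost[i][j]: best accumulated distance aligning the first i observed
    # frames with the first j prototype frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(observed[i - 1] - prototype[j - 1])
            # Locally optimal choice among stretch, compress, and step.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

def best_prototype(observed, prototypes):
    """Return the name of the stored prototype with the lowest warped cost."""
    return min(prototypes, key=lambda name: dtw_cost(observed, prototypes[name]))

stored = {"rising": [1, 2, 3], "falling": [3, 2, 1]}
print(best_prototype([1, 1, 2, 3], stored))
```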

    Robust pattern recognition system and method using Socratic agents
    4.

    Publication number: US08331656B2

    Publication date: 2012-12-11

    Application number: US13446942

    Filing date: 2012-04-13

    Applicant: James K. Baker

    Inventor: James K. Baker

    IPC classification: G06K9/62 G06F17/00 G06N5/00

    CPC classification: G06K9/6256 G06K9/6262

    Abstract: A computer-implemented pattern recognition method, system and program product, the method comprising in one embodiment: creating electronically a linkage between a plurality of models within a classifier module within a pattern recognition system such that any one of said plurality of models may be selected as an active model in a recognition process; creating electronically a null hypothesis between at least one model of said plurality of linked models and at least a second model among said plurality of linked models; accumulating electronically evidence to accept or reject said null hypothesis until sufficient evidence is accumulated to reject said null hypothesis in favor of one of said plurality of linked models or until a stopping criterion is met; and transmitting at least a portion of the electronically accumulated evidence or a summary thereof to accept or reject said null hypothesis to a pattern classifier module.

    Speech recognition method
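    5.
    Granted invention patent

    Publication number: US4803729A

    Publication date: 1989-02-07

    Application number: US34843

    Filing date: 1987-04-03

    Applicant: James K. Baker

    Inventor: James K. Baker

    IPC classification: G10L15/04 G10L5/00

    CPC classification: G10L15/04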

    Abstract: Smoothed frame labeling associates phonetic frame labels with a given speech frame as a function of (a) the closeness with which the given frame compares to each of a plurality of acoustic models, (b) which frame labels correspond with a neighboring frame, and (c) transition probabilities which indicate, for the frame labels associated with the neighboring frame, which frame labels are probably associated with the given frame. The smoothed frame labeling is used to divide the speech into segments of frames having the same class of labels. The invention represents words as a collection of known diphone models, each of which models the sound before and after a boundary between segments derived by the smoothed frame labeling. At recognition time, the speech is divided into segments by smoothed frame labeling; diphone models are derived for each boundary between the resulting segments; and the resulting diphone models are compared against the known diphone models to determine which of the known diphone models match the segment boundaries in the speech. Then a combined-displaced-evidence method is used to determine which words occur in the speech. This method detects which acoustic patterns, in the form of the known diphone models, match various portions of the speech. In response to each such match, it associates with the speech an evidence score for each vocabulary word in which that pattern is known to occur. It displaces each such score from the location of its associated matched pattern by the known distance between that pattern and the beginning of the score's word. Then all the evidence scores for a word located in a given portion of the speech are combined to produce a score which indicates the probability of that word starting in that portion of the speech. This score is combined with a score produced by comparing a histogram from a portion of the speech against a histogram of each word. The resulting combined score determines whether a given word should undergo a more detailed comparison against the speech to be recognized.
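
    The combined-displaced-evidence step can be pictured as a voting scheme. The sketch below is a loose illustration under several assumptions: patterns are named by strings, offsets are measured in frames, "a given portion of the speech" is modeled as a fixed-width bin of start frames, and scores are combined by simple addition. The table `PATTERN_OFFSETS`, the bin width and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical table: pattern -> list of (word, distance in frames from the
# start of that word to the place where the pattern occurs in it).
PATTERN_OFFSETS = {
    "k-ae": [("cat", 0)],
    "b-ae": [("bat", 0)],
    "ae-t": [("cat", 12), ("bat", 12)],
}

def accumulate_evidence(matches, bin_width=5):
    """matches: list of (pattern, frame, score) for detected patterns.
    Each score is displaced back to the implied word start and summed
    per (word, start bin). Returns {(word, bin): combined score}."""
    votes = defaultdict(float)
    for pattern, frame, score in matches:
        for word, offset in PATTERN_OFFSETS.get(pattern, []):
            start_frame = frame - offset
            votes[(word, start_frame // bin_width)] += score
    return dict(votes)

detected = [("k-ae", 100, 1.0), ("ae-t", 113, 0.5)]
print(accumulate_evidence(detected))
```

    Here both detections vote for "cat" starting near frame 100, so it collects more evidence than "bat", which is supported by only one of them.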

    Robust pattern recognition system and method using socratic agents
    6.
    Granted invention patent
    Robust pattern recognition system and method using socratic agents (in force)

    Publication number: US08014591B2

    Publication date: 2011-09-06

    Application number: US11898636

    Filing date: 2007-09-13

    Applicant: James K. Baker

    Inventor: James K. Baker

    IPC classification: G06K9/62 G06F17/00 G06N5/00

    CPC classification: G06K9/6256 G06K9/6262

    Abstract: A computer-implemented pattern recognition method, system and program product, the method comprising in one embodiment: creating electronically a linkage between a plurality of models within a classifier module within a pattern recognition system such that any one of said plurality of models may be selected as an active model in a recognition process; creating electronically a null hypothesis between at least one model of said plurality of linked models and at least a second model among said plurality of linked models; accumulating electronically evidence to accept or reject said null hypothesis until sufficient evidence is accumulated to reject said null hypothesis in favor of one of said plurality of linked models or until a stopping criterion is met; and transmitting at least a portion of the electronically accumulated evidence or a summary thereof to accept or reject said null hypothesis to a pattern classifier module.

    Assisted speech recognition by dual search acceleration technique
    7.
    Granted invention patent
    Assisted speech recognition by dual search acceleration technique (in force)

    Publication number: US07031915B2

    Publication date: 2006-04-18

    Application number: US10348966

    Filing date: 2003-01-23

    Applicant: James K. Baker

    Inventor: James K. Baker

    IPC classification: G10L15/00 G10L15/12 G10L15/28

    CPC classification: G10L15/08

    Abstract: A speech recognition method, system and program product, the method in one embodiment comprising: obtaining input speech data; initiating a first speech recognition search process with at least one hypothesis; initiating a second speech recognition search process with a plurality of hypotheses; obtaining partial results from the second speech recognition search process, where the partial results include an evaluation of at least one hypothesis that the first speech recognition search process has not evaluated at this point in time; and utilizing the partial results to alter the first speech recognition search process.

    Apparatus and methods for training speech recognition systems and their users and otherwise improving speech recognition performance
    8.
    Granted invention patent
    Apparatus and methods for training speech recognition systems and their users and otherwise improving speech recognition performance (expired)

    Publication number: US5428707A

    Publication date: 1995-06-27

    Application number: US976413

    Filing date: 1992-11-13

    Abstract: A tutorial instructs how to use a word recognition system, such as one for speech recognition. It specifies a set of allowed response words for each of a plurality of states. It sends messages on how to use the recognizer in certain states, and, in others, presents exercises in which the user is to enter signals representing expected words. It scores each such signal against word models to select which response word corresponds to it, and then advances to a state associated with that selected response. This scoring is performed against a large vocabulary even though only a small number of responses are allowed, and the signal is rejected if too many non-allowed words score better than any allowed word. The system comes with multiple sets of standard signal models; it scores each against a given user's signals, selects the set which scores best, and then performs adaptive and batch training upon that set. Preferably, the tutorial prompts users to enter the words used for training in an environment similar to that of the actual recognizer the tutorial is training them to use. The system will normally simulate the recognition of the prompted word, but sometimes it will simulate an error. When it does, it notifies the user if he fails to correct the error. The recognizer associated with the tutorial allows users to perform adaptive training either on all words, or only on those whose recognition has been corrected or confirmed. The recognizer also uses a context language model which indicates the probability that a given word will be used in the context of other words which precede it in a grouping of text.
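
    The rejection rule in this abstract (score against a large vocabulary, accept only allowed responses, reject when too many non-allowed words score better) can be sketched as below. The sketch assumes higher scores are better and that "too many" is a fixed count; the function name `select_response` and the parameter `max_better` are hypothetical.

```python
def select_response(scores, allowed, max_better=2):
    """scores: word -> match score over the whole vocabulary (higher is better).
    allowed: set of response words permitted in the current tutorial state.
    Returns the best allowed word, or None if the signal is rejected."""
    allowed_scores = {w: s for w, s in scores.items() if w in allowed}
    if not allowed_scores:
        return None
    best_word = max(allowed_scores, key=allowed_scores.get)
    best_score = allowed_scores[best_word]
    # Count non-allowed vocabulary words that beat every allowed word.
    better = sum(1 for w, s in scores.items()
                 if w not in allowed and s > best_score)
    return None if better > max_better else best_word

vocab_scores = {"yes": 0.90, "no": 0.40, "guess": 0.95, "yet": 0.50}
print(select_response(vocab_scores, {"yes", "no"}))
```

    Scoring against the full vocabulary is what makes rejection possible: a cough or an off-script word tends to match many non-allowed words better than any allowed one.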

    Speech recognition training method
    9.
    Granted invention patent
    Speech recognition training method (expired)

    Publication number: US4718088A

    Publication date: 1988-01-05

    Application number: US593891

    Filing date: 1984-03-27

    Abstract: A speech recognition method and apparatus employ speech processing circuitry for repetitively deriving from a speech input, at a frame repetition rate, a plurality of acoustic parameters. The acoustic parameters represent the speech input signal for a frame time. A plurality of template matching and cost processing circuitries are connected to a system bus, along with the speech processing circuitry, for determining, or identifying, the speech units in the input speech, by comparing the acoustic parameters with stored template patterns. The apparatus can be expanded by adding more template matching and cost processing circuitry to the bus thereby increasing the speech recognition capacity of the apparatus. Template pattern generation is advantageously aided by using a "joker" word to specify the time boundaries of utterances spoken in isolation, by finding the beginning and ending of an utterance surrounded by silence.

    Word recognition consistency check and error correction system and method
    10.
    Granted invention patent
    Word recognition consistency check and error correction system and method (expired)

    Publication number: US06823493B2

    Publication date: 2004-11-23

    Application number: US10348780

    Filing date: 2003-01-23

    Applicant: James K. Baker

    Inventor: James K. Baker

    IPC classification: G06F17/00

    CPC classification: G06F17/276

    Abstract: A word recognition method and system includes obtaining a first portion of a sentence and a second portion of the sentence. The first portion of the sentence is used to obtain a pointer to a respective list of second portions of sentences that are complementary to the first portion of the sentence. A match is determined to one of the second portions of sentences from the list based on information obtained from the second portion of the sentence. An error correction capability for dealing with one-word errors in the sentence for checking against the database is also attainable.
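
    The lookup described in this abstract can be sketched with an ordinary hash map. The sketch assumes the first portion is a fixed number of leading words, that the "pointer" is a dictionary key, and that a one-word error is tolerated only in the second portion; the abstract fixes none of these details. The names `build_index` and `match_sentence` are hypothetical.

```python
from collections import defaultdict

def build_index(sentences, split=2):
    """Map each first portion (leading `split` words) to the list of
    second portions that complete it in the known sentences."""
    index = defaultdict(list)
    for sentence in sentences:
        words = sentence.lower().split()
        index[tuple(words[:split])].append(tuple(words[split:]))
    return index

def match_sentence(index, recognized, split=2):
    """Return the known sentence matching `recognized`, allowing at most
    one substituted word in the second portion, or None if none matches."""
    words = recognized.lower().split()
    first, second = tuple(words[:split]), tuple(words[split:])
    candidates = index.get(first, [])
    if second in candidates:
        return " ".join(first + second)
    for cand in candidates:
        # Same length and exactly one differing word: treat as a one-word error.
        if len(cand) == len(second) and sum(a != b for a, b in zip(cand, second)) == 1:
            return " ".join(first + cand)
    return None

idx = build_index(["open the pod bay doors", "open the file menu"])
print(match_sentence(idx, "open the pot bay doors"))
```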
