Integrating feature extraction via local sequential embedding for automatic handwriting recognition
    21.
    Granted patent
    Status: Active

    Publication No.: US08977059B2

    Publication date: 2015-03-10

    Application No.: US13507119

    Filing date: 2012-06-04

    IPC classification: G06K9/48 G06K9/00

    CPC classification: G06K9/00416 G06K9/48

    Abstract: Integrating features is disclosed, including: determining a value associated with a temporal feature for a point; determining a value associated with a spatial feature associated with the temporal feature; including the value associated with a spatial feature and the value associated with the temporal feature into a feature vector; and using the feature vector to decode for a character. Determining a transform is also disclosed, including: determining, for a point associated with a sequence of points, a set of points including: the point, a first subset of points of the sequence preceding a sequence position associated with the point, and a second subset of points following the sequence position associated with the point; and determining the transform associated with the point based at least in part on the set of points.
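
The windowing idea in this abstract (a point plus some of its predecessors and successors) can be illustrated with a minimal sketch. The abstract does not say which temporal or spatial features are used, so the pen displacement and the local spread below are stand-ins, and the names `local_window`, `temporal_feature`, `spatial_feature`, `feature_vector`, and the window half-width `k` are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def local_window(points: Sequence[Point], i: int, k: int) -> List[Point]:
    """Return point i together with up to k preceding and k following points."""
    lo = max(0, i - k)
    hi = min(len(points), i + k + 1)
    return list(points[lo:hi])


def temporal_feature(points: Sequence[Point], i: int) -> Tuple[float, float]:
    """Assumed temporal feature: pen displacement from the previous sample."""
    if i == 0:
        return (0.0, 0.0)
    return (points[i][0] - points[i - 1][0], points[i][1] - points[i - 1][1])


def spatial_feature(window: Sequence[Point]) -> Tuple[float, float]:
    """Assumed spatial feature: mean absolute deviation of the window in x and y."""
    n = len(window)
    cx = sum(p[0] for p in window) / n
    cy = sum(p[1] for p in window) / n
    return (
        sum(abs(p[0] - cx) for p in window) / n,
        sum(abs(p[1] - cy) for p in window) / n,
    )


def feature_vector(points: Sequence[Point], i: int, k: int = 2) -> Tuple[float, ...]:
    """Concatenate the temporal and spatial values into one vector for point i."""
    return temporal_feature(points, i) + spatial_feature(local_window(points, i, k))
```

A decoder would consume one such vector per sampled point; how the patent derives its transform from the window is not stated in the abstract and is not modeled here.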

    DATA-DRIVEN GLOBAL BOUNDARY OPTIMIZATION
    22.
    Patent application
    Status: Active

    Publication No.: US20090048836A1

    Publication date: 2009-02-19

    Application No.: US12181259

    Filing date: 2008-07-28

    IPC classification: G10L15/04

    CPC classification: G10L13/06

    Abstract: Portions from segment boundary regions of a plurality of speech segments are extracted. Each segment boundary region is based on a corresponding initial unit boundary. Feature vectors that represent the portions in a vector space are created. For each of a plurality of potential unit boundaries within each segment boundary region, an average discontinuity based on distances between the feature vectors is determined. For each segment, the potential unit boundary associated with a minimum average discontinuity is selected as a new unit boundary.
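
The selection step described here (pick the candidate boundary with the smallest average discontinuity) can be sketched as follows. This is only an illustration under stated assumptions: Euclidean distance stands in for whatever distance the patent uses, each candidate boundary is assumed to already have a feature vector, and `average_discontinuity` and `select_boundary` are hypothetical names.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def average_discontinuity(candidate: Vector, others: Sequence[Vector]) -> float:
    """Mean Euclidean distance from one candidate's vector to the other segments' vectors."""
    return sum(math.dist(candidate, other) for other in others) / len(others)


def select_boundary(candidates: Dict[int, Vector], others: Sequence[Vector]) -> int:
    """Return the candidate boundary position with the minimum average discontinuity."""
    return min(candidates, key=lambda pos: average_discontinuity(candidates[pos], others))
```

For example, with candidates at sample offsets 10, 12, and 14, the function returns whichever offset has a feature vector closest on average to the vectors from the other segments.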

    Representation of orthography in a continuous vector space
    23.
    Granted patent
    Status: Expired

    Publication No.: US07353164B1

    Publication date: 2008-04-01

    Application No.: US10242849

    Filing date: 2002-09-13

    IPC classification: G06F17/20

    Abstract: An orthographic anchor for each word in a dictionary is created in an orthographic space by mapping the words and a set of letter patterns characteristic of the words into the orthographic space. In one aspect the orthographic anchors are row or column vectors resulting from a decomposition of a matrix of feature vectors created by the mapping. In another aspect, a pronunciation for an input word is modeled based on a set of candidate phoneme strings that have pronunciations close to the input word in the orthographic space.
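
As a rough illustration of the mapping step, the sketch below builds a matrix of letter-pattern counts per word. The abstract does not define the letter patterns, so letter trigrams with `#` as a word-boundary marker are an assumption, as are the names `letter_patterns` and `build_matrix`. The decomposition that yields the anchors is not shown.

```python
from typing import List, Sequence, Tuple


def letter_patterns(word: str, n: int = 3) -> List[str]:
    """Assumed letter patterns: all n-grams of the word padded with '#' at both ends."""
    padded = "#" + word + "#"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_matrix(words: Sequence[str], n: int = 3) -> Tuple[List[str], List[List[int]]]:
    """Return (patterns, matrix) where matrix[row][col] counts pattern row in word col."""
    patterns = sorted({p for w in words for p in letter_patterns(w, n)})
    row_of = {p: r for r, p in enumerate(patterns)}
    matrix = [[0] * len(words) for _ in patterns]
    for col, word in enumerate(words):
        for p in letter_patterns(word, n):
            matrix[row_of[p]][col] += 1
    return patterns, matrix
```

Each column of the resulting matrix is one word's feature vector; a matrix decomposition applied to it would then give the row or column vectors the abstract calls anchors.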

    Method for dynamic context scope selection in hybrid N-gram+LSA language modeling
    24.
    Granted patent
    Status: Active

    Publication No.: US07191118B2

    Publication date: 2007-03-13

    Application No.: US10917730

    Filing date: 2004-08-12

    IPC classification: G06F17/27

    Abstract: A method and system for dynamic language modeling of a document are described. In one embodiment, a number of local probabilities of a current document are computed and a vector representation of the current document in a latent semantic analysis (LSA) space is determined. In addition, a number of global probabilities based upon the vector representation of the current document in an LSA space is computed. Further, the local probabilities and the global probabilities are combined to produce the language modeling.
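
The abstract says only that the local and global probabilities are combined. One possible combination, shown here purely as an assumption, multiplies the local (n-gram) probability by the ratio of the global (LSA) probability to the unigram prior and renormalizes over the vocabulary. The function name `combine` and all three input tables are hypothetical.

```python
from typing import Dict


def combine(
    local: Dict[str, float],
    global_: Dict[str, float],
    unigram: Dict[str, float],
) -> Dict[str, float]:
    """Combine local n-gram and global LSA probabilities into one distribution.

    Assumed rule: score(w) = local(w) * global(w) / unigram(w), then normalize.
    """
    scores = {w: local[w] * global_[w] / unigram[w] for w in local}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}
```

With a uniform n-gram estimate, the combined distribution simply follows the LSA evidence; when both sources disagree, each word's score reflects both.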

    Method and apparatus for speech recognition using latent semantic adaptation
    25.
    Granted patent
    Status: Active

    Publication No.: US07124081B1

    Publication date: 2006-10-17

    Application No.: US09967072

    Filing date: 2001-09-28

    IPC classification: G10L15/00 G10L15/28 G10L15/06

    CPC classification: G10L15/1815 G10L15/183

    Abstract: A method and apparatus for speech recognition using latent semantic adaptation is described herein. According to one aspect of the present invention, a method for recognizing speech comprises using latent semantic analysis (LSA) to generate an LSA space for a collection of documents and to continually adapt the LSA space with new documents as they become available. Adaptation of the LSA space is optimally two-sided, taking into account the new words in the new documents. Alternatively, adaptation is one-sided, taking into account the new documents but discarding any new words appearing in those documents.

    Method for dynamic context scope selection in hybrid N-gram+LSA language modeling
    26.
    Granted patent
    Status: Active

    Publication No.: US06778952B2

    Publication date: 2004-08-17

    Application No.: US10243423

    Filing date: 2002-09-12

    IPC classification: G06F17/27

    Abstract: A method and system for dynamic language modeling of a document are described. In one embodiment, a number of local probabilities of a current document are computed and a vector representation of the current document in a latent semantic analysis (LSA) space is determined. In addition, a number of global probabilities based upon the vector representation of the current document in an LSA space is computed. Further, the local probabilities and the global probabilities are combined to produce the language modeling.

    Speaker adaptation based on lateral tying for large-vocabulary continuous speech recognition
    27.
    Granted patent
    Status: Expired

    Publication No.: US5737487A

    Publication date: 1998-04-07

    Application No.: US600859

    Filing date: 1996-02-13

    IPC classification: G10L15/06 G10L5/06

    CPC classification: G10L15/065

    Abstract: A system and method for performing speaker adaptation in a speech recognition system which includes a set of reference models corresponding to speech data from a plurality of speakers. The speech data is represented by a plurality of acoustic models and corresponding sub-events, and each sub-event includes one or more observations of speech data. A degree of lateral tying is computed between each pair of sub-events, wherein the degree of tying indicates the degree to which a first observation in a first sub-event contributes to the remaining sub-events. When adaptation data from a new speaker becomes available, a new observation from adaptation data is assigned to one of the sub-events. Each of the sub-events is then populated with the observations contained in the assigned sub-event based on the degree of lateral tying that was computed between each pair of sub-events. The reference models corresponding to the populated sub-events are then adapted to account for speech pattern idiosyncrasies of the new speaker, thereby reducing the error rate of the speech recognition system.
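
The populate-then-adapt flow in this abstract can be pictured with a toy sketch. Everything concrete below is an assumption: observations are reduced to single numbers, the degree of tying is a weight between 0 and 1 given in advance, adaptation is a simple interpolation of each reference mean toward the shared observation, and `populate_sub_events` and `adapt_means` are hypothetical names. How the patent computes the tying degrees is not shown.

```python
from typing import Dict, Tuple


def populate_sub_events(
    assigned: str,
    observation: float,
    tying: Dict[str, Dict[str, float]],
) -> Dict[str, Tuple[float, float]]:
    """Share a new observation with every sub-event tied to the assigned one.

    Returns sub-event name -> (observation, tying weight), skipping untied sub-events.
    """
    return {
        target: (observation, degree)
        for target, degree in tying[assigned].items()
        if degree > 0.0
    }


def adapt_means(
    means: Dict[str, float],
    shared: Dict[str, Tuple[float, float]],
) -> Dict[str, float]:
    """Assumed update: interpolate each reference mean toward the shared observation."""
    adapted = dict(means)
    for sub_event, (obs, weight) in shared.items():
        adapted[sub_event] = (1.0 - weight) * means[sub_event] + weight * obs
    return adapted
```

The point of the sketch is that one observation from the new speaker moves several reference models at once, each in proportion to how strongly its sub-event is tied to the assigned one.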

    Automatic handwriting recognition using both static and dynamic parameters

    Publication No.: US5544261A

    Publication date: 1996-08-06

    Application No.: US450556

    Filing date: 1995-05-25

    Abstract: Methods and apparatus are disclosed for recognizing handwritten characters in response to an input signal from a handwriting transducer. A feature extraction and reduction procedure is disclosed that relies on static or shape information, wherein the temporal order in which points are captured by an electronic tablet may be disregarded. A method of the invention generates and processes the tablet data with three independent sets of feature vectors which encode the shape information of the input character information. These feature vectors include horizontal (x-axis) and vertical (y-axis) slices of a bit-mapped image of the input character data, and an additional feature vector to encode an absolute y-axis displacement from a baseline of the bit-mapped image. It is shown that the recognition errors that result from the spatial or static processing are quite different from those resulting from temporal or dynamic processing. Furthermore, it is shown that these differences complement one another. As a result, a combination of these two sources of feature vector information provides a substantial reduction in an overall recognition error rate. Methods to combine probability scores from the dynamic and static character models are also disclosed.
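
The abstract mentions combining probability scores from the dynamic and static character models without giving the rule. A weighted sum of log-probabilities is one common way to merge two scorers and is used below only as an assumed stand-in; the weight `alpha` and the names `combine_scores` and `best_character` are hypothetical.

```python
import math
from typing import Dict


def combine_scores(
    dynamic: Dict[str, float],
    static: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Assumed rule: alpha * log P_dynamic + (1 - alpha) * log P_static per character."""
    return {
        ch: alpha * math.log(dynamic[ch]) + (1.0 - alpha) * math.log(static[ch])
        for ch in dynamic
        if ch in static
    }


def best_character(dynamic: Dict[str, float], static: Dict[str, float], alpha: float = 0.5) -> str:
    """Return the character with the highest combined score."""
    combined = combine_scores(dynamic, static, alpha)
    return max(combined, key=combined.get)
```

Because the two models tend to make different errors, a character that one model slightly prefers can be overruled when the other model strongly disagrees.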

    Automatic handwriting recognition using both static and dynamic parameters

    Publication No.: US5539839A

    Publication date: 1996-07-23

    Application No.: US450558

    Filing date: 1995-05-25

    Abstract: Methods and apparatus are disclosed for recognizing handwritten characters in response to an input signal from a handwriting transducer. A feature extraction and reduction procedure is disclosed that relies on static or shape information, wherein the temporal order in which points are captured by an electronic tablet may be disregarded. A method of the invention generates and processes the tablet data with three independent sets of feature vectors which encode the shape information of the input character information. These feature vectors include horizontal (x-axis) and vertical (y-axis) slices of a bit-mapped image of the input character data, and an additional feature vector to encode an absolute y-axis displacement from a baseline of the bit-mapped image. It is shown that the recognition errors that result from the spatial or static processing are quite different from those resulting from temporal or dynamic processing. Furthermore, it is shown that these differences complement one another. As a result, a combination of these two sources of feature vector information provides a substantial reduction in an overall recognition error rate. Methods to combine probability scores from the dynamic and static character models are also disclosed.

    Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer
    30.
    Granted patent
    Status: Expired

    Publication No.: US5280562A

    Publication date: 1994-01-18

    Application No.: US770495

    Filing date: 1991-10-03

    CPC classification: G10L19/038 H03M7/3082

    Abstract: In speech recognition and speech coding, the values of at least two features of an utterance are measured during a series of time intervals to produce a series of feature vector signals. A plurality of single-dimension prototype vector signals having only one parameter value are stored. At least two single-dimension prototype vector signals have parameter values representing first feature values, and at least two other single-dimension prototype vector signals have parameter values representing second feature values. A plurality of compound-dimension prototype vector signals have unique identification values and comprise one first-dimension and one second-dimension prototype vector signal. At least two compound-dimension prototype vector signals comprise the same first-dimension prototype vector signal. The feature values of each feature vector signal are compared to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores. The identification values of the compound-dimension prototype vector signals having the best prototype match scores for the feature vector signals are output as a sequence of coded representations of an utterance to be recognized. A match score, comprising an estimate of the closeness of a match between a speech unit and the sequence of coded representations of the utterance, is generated for each of a plurality of speech units. At least one speech subunit, of one or more best candidate speech units having the best match scores, is displayed.
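
The prototype structure in this abstract can be made concrete with a small sketch: single-dimension prototypes hold one parameter value each, and a compound prototype pairs one first-dimension prototype with one second-dimension prototype under a unique ID. Squared distance as the match score and all names and values below (`FIRST_DIM`, `SECOND_DIM`, `COMPOUNDS`, `encode`) are assumptions for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical single-dimension prototypes: name -> single parameter value.
FIRST_DIM: Dict[str, float] = {"p1": 0.0, "p2": 1.0}
SECOND_DIM: Dict[str, float] = {"q1": 0.0, "q2": 2.0}

# Compound prototypes: unique ID -> (first-dimension name, second-dimension name).
# IDs 0 and 1 share the same first-dimension prototype, as the abstract allows.
COMPOUNDS: Dict[int, Tuple[str, str]] = {0: ("p1", "q1"), 1: ("p1", "q2"), 2: ("p2", "q2")}


def match_score(feature: Tuple[float, float], compound_id: int) -> float:
    """Assumed score: squared distance to the compound prototype (lower is better)."""
    first, second = COMPOUNDS[compound_id]
    return (feature[0] - FIRST_DIM[first]) ** 2 + (feature[1] - SECOND_DIM[second]) ** 2


def encode(features: Sequence[Tuple[float, float]]) -> List[int]:
    """Map each feature vector to the ID of its best-matching compound prototype."""
    return [min(COMPOUNDS, key=lambda cid: match_score(f, cid)) for f in features]
```

Sharing single-dimension prototypes means only the distinct parameter values need to be stored, while each compound prototype is just a pair of references.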
