    Computer program product for automatic recognition of a consistent message using multiple complimentary sources of information
    1.
    Type: Granted invention patent
    Status: Expired

    Publication number: US5621809A

    Publication date: 1997-04-15

    Application number: US481150

    Filing date: 1995-06-07

    CPC classification number: G06K9/6293 G10L15/24 G10L15/10 G10L2015/0635

    Abstract: A general approach is provided for the combined use of several sources of information in the automatic recognition of a consistent message. For each message unit (e.g., word) the total likelihood score is assumed to be the weighted sum of the likelihood scores resulting from the separate evaluation of each information source. Emphasis is placed on the estimation of weighing factors used in forming this total likelihood. This method can be applied, for example, to the decoding of a consistent message using both handwriting and speech recognition. The present invention includes three procedures which provide the optimal weighing coefficients.
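The weighted-sum scoring described in this abstract can be illustrated with a minimal Python sketch. The function names, the example words, the log-likelihood values, and the weights are all hypothetical; the sketch assumes every source scores the same candidate words, and it does not reproduce the patent's three procedures for estimating the weights.

```python
def combine_scores(scores_by_source, weights):
    """Total score per candidate word: weighted sum of the per-source scores."""
    total = {}
    for source, scores in scores_by_source.items():
        w = weights[source]
        for word, score in scores.items():
            total[word] = total.get(word, 0.0) + w * score
    return total


def best_word(scores_by_source, weights):
    """Pick the candidate word with the highest total score."""
    total = combine_scores(scores_by_source, weights)
    return max(total, key=total.get)


# Hypothetical log-likelihood scores from two recognizers for one message unit.
scores = {
    "handwriting": {"cat": -2.0, "cut": -1.5, "cot": -4.0},
    "speech": {"cat": -1.0, "cut": -3.0, "cot": -2.5},
}
# Hypothetical weighting factors (assumed given, not estimated here).
weights = {"handwriting": 0.4, "speech": 0.6}

decoded = best_word(scores, weights)
```

With these made-up numbers the handwriting source alone would prefer "cut", while the combined score prefers "cat", which is the kind of disagreement the weighting is meant to resolve.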

    Automatic recognition of a consistent message using multiple complimentary sources of information
    2.
    Type: Granted invention patent
    Status: Expired

    Publication number: US5502774A

    Publication date: 1996-03-26

    Application number: US300232

    Filing date: 1994-09-06

    CPC classification number: G06K9/6293 G10L15/24 G10L15/10 G10L2015/0635

    Abstract: A general approach is provided for the combined use of several sources of information in the automatic recognition of a consistent message. For each message unit (e.g., word) the total likelihood score is assumed to be the weighted sum of the likelihood scores resulting from the separate evaluation of each information source. Emphasis is placed on the estimation of weighing factors used in forming this total likelihood. This method can be applied, for example, to the decoding of a consistent message using both handwriting and speech recognition. The present invention includes three procedures which provide the optimal weighing coefficients.

    Methods and apparatuses for automatic speech recognition
    3.
    Type: Granted invention patent
    Status: In force

    Publication number: US09431006B2

    Publication date: 2016-08-30

    Application number: US12497511

    Filing date: 2009-07-02

    CPC classification number: G10L15/063 G10L15/08 G10L15/14 G10L15/187 G10L15/32

    Abstract: Exemplary embodiments of methods and apparatuses for automatic speech recognition are described. First model parameters associated with a first representation of an input signal are generated. The first representation of the input signal is a discrete parameter representation. Second model parameters associated with a second representation of the input signal are generated. The second representation of the input signal includes a continuous parameter representation of residuals of the input signal. The first representation of the input signal includes discrete parameters representing first portions of the input signal. The second representation includes discrete parameters representing second portions of the input signal that are smaller than the first portions. Third model parameters are generated to couple the first representation of the input signal with the second representation of the input signal. The first representation and the second representation of the input signal are mapped into a vector space.

    Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
    4.
    Type: Granted invention patent
    Status: Expired

    Publication number: US08719006B2

    Publication date: 2014-05-06

    Application number: US12870542

    Filing date: 2010-08-27

    CPC classification number: G10L13/02 G10L13/10

    Abstract: In response to a word of a text sequence, a first part-of-speech (POS) tag is generated using a statistical part-of-speech (POS) tagger based on a corpus of trained text sequences, each representing a likely POS of a word for a given text sequence. A second POS tag is generated using a rule-based POS tagger based on a set of one or more rules associated with a type of an application associated with the text sequence. A final POS tag is assigned to the word of the text sequence for TTS synthesis based on the first POS tag and the second POS tag.
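As a rough illustration of combining the two taggers, the Python sketch below uses a most-frequent-tag lookup as a stand-in for the statistical tagger and a single hand-written rule as a stand-in for the application-specific rule set. The corpus, the rule, the tag names, and the policy of letting a firing rule override the statistical tag are all assumptions; the abstract does not specify how the final tag is chosen.

```python
from collections import Counter, defaultdict


def train_statistical_tagger(tagged_corpus):
    """Learn the most frequent tag for each word (a toy statistical tagger)."""
    counts = defaultdict(Counter)
    for sentence in tagged_corpus:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def statistical_tag(model, word, default="NOUN"):
    """First POS tag: lookup in the trained model, with a fallback tag."""
    return model.get(word.lower(), default)


def rule_based_tag(words, i, rules):
    """Second POS tag: the first rule that fires, or None if no rule applies."""
    for rule in rules:
        tag = rule(words, i)
        if tag is not None:
            return tag
    return None


def final_tag(model, words, i, rules):
    """Assumed policy: a rule that fires overrides the statistical tag."""
    first = statistical_tag(model, words[i])
    second = rule_based_tag(words, i, rules)
    return second if second is not None else first


def verb_after_to(words, i):
    """Hypothetical rule: a word directly after 'to' is tagged as a verb."""
    return "VERB" if i > 0 and words[i - 1].lower() == "to" else None


corpus = [
    [("the", "DET"), ("record", "NOUN")],
    [("a", "DET"), ("record", "NOUN")],
    [("to", "PART"), ("play", "VERB")],
]
model = train_statistical_tagger(corpus)
rules = [verb_after_to]
```

Here `final_tag(model, ["to", "record"], 1, rules)` yields the rule's tag, while `final_tag(model, ["the", "record"], 1, rules)` falls back to the statistical tag.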

    Unsupervised document clustering using latent semantic density analysis
    5.
    Type: Granted invention patent
    Status: In force

    Publication number: US08713021B2

    Publication date: 2014-04-29

    Application number: US12831909

    Filing date: 2010-07-07

    CPC classification number: G06F17/3071 G06K9/6223 G06K9/627

    Abstract: According to one embodiment, a latent semantic mapping (LSM) space is generated from a collection of a plurality of documents, where the LSM space includes a plurality of document vectors, each representing one of the documents in the collection. For each of the document vectors considered as a centroid document vector, a group of document vectors is identified in the LSM space that are within a predetermined hypersphere diameter from the centroid document vector. As a result, multiple groups of document vectors are formed. The predetermined hypersphere diameter represents a predetermined closeness measure among the document vectors in the LSM space. Thereafter, a group from the plurality of groups is designated as a cluster of document vectors, where the designated group contains a maximum number of document vectors among the plurality of groups.
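The group-selection step in this abstract can be sketched in a few lines of Python. This is only an illustration: the vectors are made-up 2-D points rather than real LSM document vectors, Euclidean distance stands in for whatever closeness measure the patent uses, and the function name `densest_group` is hypothetical.

```python
import math


def densest_group(vectors, diameter):
    """Treat each vector as a candidate centroid, collect the vectors within
    `diameter` of it, and return the centroid index and members of the
    largest group (the first one found in case of a tie)."""
    best_center, best_members = None, []
    for c, center in enumerate(vectors):
        members = [
            i for i, v in enumerate(vectors)
            if math.dist(center, v) <= diameter
        ]
        if len(members) > len(best_members):
            best_center, best_members = c, members
    return best_center, best_members


# Hypothetical document vectors: three close together, two far away.
vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
center, cluster = densest_group(vectors, diameter=0.5)
```

In a full clustering loop one would presumably remove the designated cluster and repeat on the remaining vectors, but the abstract only describes the single selection shown here.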

    Method for dynamic context scope selection in hybrid N-GRAM+LSA language modeling
    6.
    Type: Granted invention patent
    Status: In force

    Publication number: US07720673B2

    Publication date: 2010-05-18

    Application number: US11710098

    Filing date: 2007-02-23

    Abstract: A method and system for dynamic language modeling of a document are described. In one embodiment, a number of local probabilities of a current document are computed and a vector representation of the current document in a latent semantic analysis (LSA) space is determined. In addition, a number of global probabilities based upon the vector representation of the current document in an LSA space is computed. Further, the local probabilities and the global probabilities are combined to produce the language modeling.
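The abstract does not say how the local and global probabilities are combined. One possible scheme, shown below purely as an assumption, scales each local (n-gram) probability by the ratio of the global (LSA-based) probability to the word's unigram probability and then renormalizes over the vocabulary. All names and numbers are hypothetical.

```python
def combine_probabilities(local, global_, unigram):
    """Assumed combination rule: local * (global / unigram), renormalized
    so the result sums to 1 over the candidate words."""
    raw = {w: local[w] * global_[w] / unigram[w] for w in local}
    z = sum(raw.values())
    return {w: p / z for w, p in raw.items()}


# Hypothetical values for two acoustically confusable candidate words.
local = {"stock": 0.5, "sock": 0.5}        # n-gram model cannot decide
global_ = {"stock": 0.09, "sock": 0.01}    # document is about finance
unigram = {"stock": 0.05, "sock": 0.05}    # equal prior frequency

combined = combine_probabilities(local, global_, unigram)
```

In this toy case the n-gram model is indifferent between the two words, and the document-level evidence shifts most of the probability mass to "stock".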

    Unsupervised data-driven pronunciation modeling
    7.
    Type: Granted invention patent
    Status: Expired

    Publication number: US07702509B2

    Publication date: 2010-04-20

    Application number: US11603586

    Filing date: 2006-11-21

    CPC classification number: G10L15/187 G10L15/063

    Abstract: Pronunciation for an input word is modeled by generating a set of candidate phoneme strings having pronunciations close to the input word in an orthographic space. Phoneme sub-strings in the set are selected as the pronunciation. In one aspect, a first closeness measure between phoneme strings for words chosen from a dictionary and contexts within the input word is used to determine the candidate phoneme strings. The words are chosen from the dictionary based on a second closeness measure between a representation of the input word in the orthographic space and orthographic anchors corresponding to the words in the dictionary. In another aspect, the phoneme sub-strings are selected by aligning the candidate phoneme strings on common phoneme sub-strings to produce an occurrence count, which is used to choose the phoneme sub-strings for the pronunciation.

    Method and apparatus for assigning word prominence to new or previous information in speech synthesis
    8.
    Type: Granted invention patent
    Status: In force

    Publication number: US07313523B1

    Publication date: 2007-12-25

    Application number: US10439217

    Filing date: 2003-05-14

    CPC classification number: G10L13/033 G10L13/04

    Abstract: A method and apparatus is provided for generating speech that sounds more natural. In one embodiment, word prominence and latent semantic analysis are used to generate more natural sounding speech. A method for generating speech that sounds more natural may comprise generating synthesized speech having certain word prominence characteristics and applying a semantically-driven word prominence assignment model to specify word prominence consistent with the way humans assign word prominence. A speech representative of a current sentence is generated. The determination is made whether information in the current sentence is new or previously given in accordance with a semantic relationship between the current sentence and a number of preceding sentences. A word prominence is assigned to a word in the current sentence in accordance with the information determination.

    Method for dynamic context scope selection in hybrid n-gram+LSA language modeling
    9.
    Type: Granted invention patent
    Status: In force

    Publication number: US06477488B1

    Publication date: 2002-11-05

    Application number: US09523070

    Filing date: 2000-03-10

    Abstract: A method and system for dynamic language modeling of a document are described. In one embodiment, a number of local probabilities of a current document are computed and a vector representation of the current document in a latent semantic analysis (LSA) space is determined. In addition, a number of global probabilities based upon the vector representation of the current document in an LSA space is computed. Further, the local probabilities and the global probabilities are combined to produce the language modeling.

    Fast update implementation for efficient latent semantic language modeling
    10.
    Type: Granted invention patent
    Status: In force

    Publication number: US06374217B1

    Publication date: 2002-04-16

    Application number: US09267334

    Filing date: 1999-03-12

    CPC classification number: G10L15/1815 G10L15/197

    Abstract: Speech or acoustic signals are processed directly using a hybrid stochastic language model produced by integrating a latent semantic analysis language model into an n-gram probability language model. The latent semantic analysis language model probability is computed using a first pseudo-document vector that is derived from a second pseudo-document vector with the pseudo-document vectors representing pseudo-documents created from the signals received at different times. The first pseudo-document vector is derived from the second pseudo-document vector by updating the second pseudo-document vector directly in latent semantic analysis space in response to at least one addition of a candidate word of the received speech signals to the pseudo-document represented by the second pseudo-document vector. Updating precludes mapping a sparse representation for a pseudo-document into the latent semantic space to produce the first pseudo-document vector. A linguistic message representative of the received speech signals is generated.
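The idea of updating a pseudo-document vector directly in the latent space, rather than re-mapping the sparse word counts, can be illustrated under simplifying assumptions. The sketch below assumes the pseudo-document is represented by normalized raw word counts (no entropy weighting) and that the full mapping multiplies those counts by a word matrix `U` and divides by the singular values `S`. The matrix values and function names are hypothetical stand-ins, not the patent's actual formulation.

```python
def map_to_lsa(counts, U, S):
    """Full mapping: project normalized word counts into the latent space.
    Cost grows with the vocabulary size."""
    n = sum(counts)
    return [
        sum(counts[i] / n * U[i][k] for i in range(len(counts))) / S[k]
        for k in range(len(S))
    ]


def fast_update(v_old, n_old, word_index, U, S):
    """Direct update after appending one word to a pseudo-document of
    n_old words. Cost grows only with the latent dimension."""
    n_new = n_old + 1
    return [
        (n_old * v_old[k] + U[word_index][k] / S[k]) / n_new
        for k in range(len(S))
    ]


# Hypothetical 3-word vocabulary mapped into a 2-dimensional latent space.
U = [[0.8, 0.1], [0.5, -0.6], [0.3, 0.7]]
S = [2.0, 1.0]

counts = [2, 1, 0]                      # pseudo-document so far (3 words)
v_old = map_to_lsa(counts, U, S)
v_new = fast_update(v_old, n_old=3, word_index=2, U=U, S=S)
v_full = map_to_lsa([2, 1, 1], U, S)    # same result via the full mapping
```

Because the assumed mapping is linear in the counts, `v_new` and `v_full` agree up to floating-point rounding, while the update touches only one row of `U`.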
