Sampling training data for an automatic speech recognition system based on a benchmark classification distribution
    1.
    Granted patent (in force)

    Publication No.: US09202461B2

    Publication Date: 2015-12-01

    Application No.: US13745295

    Filing Date: 2013-01-18

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G10L15/183

    Abstract: A set of benchmark text strings may be classified to provide a set of benchmark classifications. The benchmark text strings in the set may correspond to a benchmark corpus of benchmark utterances in a particular language. A benchmark classification distribution of the set of benchmark classifications may be determined. A respective classification for each text string in a corpus of text strings may also be determined. Text strings from the corpus of text strings may be sampled to form a training corpus of training text strings such that the classifications of the training text strings have a training text string classification distribution that is based on the benchmark classification distribution. The training corpus of training text strings may be used to train an automatic speech recognition (ASR) system.

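The sampling step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the corpus, the toy first-word classifier, and the function name are all hypothetical.

```python
import random
from collections import Counter

def sample_to_distribution(corpus, classify, benchmark_dist, n, seed=0):
    """Sample about n text strings from the corpus so that the sampled
    classification distribution approximates the benchmark distribution."""
    rng = random.Random(seed)
    # Bucket corpus strings by their classification.
    buckets = {}
    for text in corpus:
        buckets.setdefault(classify(text), []).append(text)
    training = []
    for label, fraction in benchmark_dist.items():
        pool = buckets.get(label, [])
        k = min(len(pool), round(n * fraction))
        training.extend(rng.sample(pool, k))
    return training

# Hypothetical classifier: label a string by its first word.
classify = lambda s: s.split()[0]
corpus = ["weather today", "weather tomorrow", "music play", "music stop",
          "weather forecast", "music volume"]
benchmark_dist = {"weather": 0.5, "music": 0.5}
train = sample_to_distribution(corpus, classify, benchmark_dist, 4)
```

With a 50/50 benchmark distribution and n=4, the training corpus contains two strings of each class regardless of the raw corpus proportions.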

    Increasing semantic coverage with semantically irrelevant insertions
    2.
    Granted patent (in force)

    Publication No.: US09129598B1

    Publication Date: 2015-09-08

    Application No.: US14671353

    Filing Date: 2015-03-27

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G06F17/2785 G10L15/19 G10L2015/0631

    Abstract: A method includes accessing data specifying a set of actions, each action defining a user device operation and for each action: accessing a corresponding set of command sentences for the action, determining first n-grams in the set of command sentences that are semantically relevant for the action, determining second n-grams in the set of command sentences that are semantically irrelevant for the action, generating a training set of command sentences from the corresponding set of command sentences, the generating the training set of command sentences including removing each second n-gram from each sentence in the corresponding set of command sentences for the action, and generating a command model from the training set of command sentences configured to generate an action score for the action for an input sentence based on: first n-grams for the action, and second n-grams for the action that are also second n-grams for all other actions.

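The training-set generation step, removing each semantically irrelevant n-gram from every command sentence, can be sketched roughly as below. The word-level n-gram matching and all example data are assumptions of this sketch.

```python
def build_training_set(command_sentences, irrelevant_ngrams):
    """For one action, remove every semantically irrelevant n-gram from
    each command sentence to form the training set."""
    training = []
    for sentence in command_sentences:
        words = sentence.split()
        for ngram in irrelevant_ngrams:
            ngram_words = ngram.split()
            n = len(ngram_words)
            # Drop every occurrence of the irrelevant n-gram.
            cleaned, i = [], 0
            while i < len(words):
                if words[i:i + n] == ngram_words:
                    i += n
                else:
                    cleaned.append(words[i])
                    i += 1
            words = cleaned
        training.append(" ".join(words))
    return training

sentences = ["please play some music", "play music now please"]
irrelevant = ["please", "now"]
cleaned = build_training_set(sentences, irrelevant)
# cleaned == ["play some music", "play music"]
```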

    Mining data for natural language system
    3.
    Granted patent (in force)

    Publication No.: US09047271B1

    Publication Date: 2015-06-02

    Application No.: US13780757

    Filing Date: 2013-02-28

    Applicant: Google Inc.

    CPC classification number: G06F17/2765 G10L15/1815 G10L15/197 G10L2015/223

    Abstract: A method iteratively processes data for a set of actions, including: for each action: accessing a corresponding set of command sentences for the action, determining first n-grams that are semantically relevant for the action and second n-grams that are semantically irrelevant for the action, and identifying, from a log of command sentences that includes command sentences not included in the corresponding set of command sentences, candidate command sentences that include one first n-gram and a third n-gram that has not yet been determined to be a first n-gram or a second n-gram; for each candidate command sentence, determining each third n-gram that is semantically relevant for an action to be a first n-gram, and determining each third n-gram that is semantically irrelevant for an action to be a second n-gram, and adjusting the corresponding set of command sentences for each action based on the first n-grams and the second n-grams.

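The candidate-identification step from the abstract can be sketched as follows; word-level unigrams stand in for general n-grams, and all names and data are hypothetical.

```python
def find_candidates(log_sentences, first_ngrams, second_ngrams, known_set):
    """Identify candidate command sentences from a log: sentences outside
    the known command set that contain at least one relevant (first) n-gram
    plus a third n-gram not yet classified as relevant or irrelevant."""
    candidates = []
    for sentence in log_sentences:
        if sentence in known_set:
            continue
        words = set(sentence.split())
        has_first = bool(words & first_ngrams)
        # A "third" n-gram is one not yet labeled first or second.
        has_third = bool(words - first_ngrams - second_ngrams)
        if has_first and has_third:
            candidates.append(sentence)
    return candidates

log = ["play jazz", "play music", "hello there"]
known = {"play music"}
first = {"play"}
second = {"music"}
candidates = find_candidates(log, first, second, known)
# candidates == ["play jazz"]  ("jazz" is an unclassified third n-gram)
```

Each iteration would then label the discovered third n-grams as relevant or irrelevant and adjust the command set accordingly.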

    LANGUAGE MODELS USING DOMAIN-SPECIFIC MODEL COMPONENTS

    Publication No.: US20180053502A1

    Publication Date: 2018-02-22

    Application No.: US15682133

    Filing Date: 2017-08-21

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using domain-specific model components. In some implementations, context data for an utterance is obtained. A domain-specific model component is selected from among multiple domain-specific model components of a language model based on the non-linguistic context of the utterance. A score for a candidate transcription for the utterance is generated using the selected domain-specific model component and a baseline model component of the language model that is domain-independent. A transcription for the utterance is determined using the score, and the transcription is provided as output of an automated speech recognition system.
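The combination of a domain-independent baseline component with a context-selected domain component can be sketched as a simple interpolation. The interpolation weight, dictionary-based "models", and all data are assumptions of this sketch, not the patented scoring method.

```python
def score_transcription(candidate, context_domain, baseline_lm, domain_lms,
                        weight=0.5):
    """Score a candidate transcription by combining the domain-independent
    baseline score with the score from the domain-specific component
    selected by the utterance's non-linguistic context."""
    domain_lm = domain_lms.get(context_domain, {})
    base = baseline_lm.get(candidate, 0.0)
    dom = domain_lm.get(candidate, 0.0)
    # Linear interpolation is one plausible combination scheme.
    return (1 - weight) * base + weight * dom

baseline = {"play the song": 0.4, "play the sonnet": 0.3}
domain_lms = {"music": {"play the song": 0.9},
              "poetry": {"play the sonnet": 0.9}}
score = score_transcription("play the song", "music", baseline, domain_lms)
# score == 0.5 * 0.4 + 0.5 * 0.9 == 0.65
```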

    Enhanced maximum entropy models
    6.
    Granted patent (in force)

    Publication No.: US09412365B2

    Publication Date: 2016-08-09

    Application No.: US14667518

    Filing Date: 2015-03-24

    Applicant: Google Inc.

    CPC classification number: G10L15/197

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to enhanced maximum entropy models. In some implementations, data indicating a candidate transcription for an utterance and a particular context for the utterance are received. A maximum entropy language model is obtained. Feature values are determined for n-gram features and backoff features of the maximum entropy language model. The feature values are input to the maximum entropy language model, and an output is received from the maximum entropy language model. A transcription for the utterance is selected from among a plurality of candidate transcriptions based on the output from the maximum entropy language model. The selected transcription is provided to a client device.

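The interplay of n-gram features and backoff features can be sketched as below. The feature encoding, weights, and scoring function are simplifications assumed for illustration; the score is left unnormalized.

```python
import math

def maxent_score(history, word, weights):
    """Score a word given its history. For each context length, fire the
    matching n-gram feature if the model has one; otherwise fire a backoff
    feature for that context, as a simplified view of backoff features."""
    total = 0.0
    for start in range(len(history) + 1):
        feat = tuple(history[start:]) + (word,)
        if feat in weights:
            total += weights[feat]
        else:
            total += weights.get(("<backoff>",) + tuple(history[start:]), 0.0)
    return math.exp(total)  # unnormalized maximum-entropy score

weights = {("the", "cat"): 1.0, ("cat",): 0.5, ("<backoff>", "the"): -0.2}
s = maxent_score(["the"], "cat", weights)
# bigram feature (1.0) + unigram feature (0.5) fire; no backoff needed
```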

    Increasing semantic coverage with semantically irrelevant insertions

    Publication No.: US09020809B1

    Publication Date: 2015-04-28

    Application No.: US13780804

    Filing Date: 2013-02-28

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G06F17/2785 G10L15/19 G10L2015/0631

    Abstract: A method includes accessing data specifying a set of actions, each action defining a user device operation and for each action: accessing a corresponding set of command sentences for the action, determining first n-grams in the set of command sentences that are semantically relevant for the action, determining second n-grams in the set of command sentences that are semantically irrelevant for the action, generating a training set of command sentences from the corresponding set of command sentences, the generating the training set of command sentences including removing each second n-gram from each sentence in the corresponding set of command sentences for the action, and generating a command model from the training set of command sentences configured to generate an action score for the action for an input sentence based on: first n-grams for the action, and second n-grams for the action that are also second n-grams for all other actions.

    Addressing missing features in models

    Publication No.: US09805713B2

    Publication Date: 2017-10-31

    Application No.: US14681652

    Filing Date: 2015-04-08

    Applicant: Google Inc.

    CPC classification number: G10L15/08 G10L15/183 G10L15/26 G10L15/30

    Abstract: Systems and methods for addressing missing features in models are provided. In some implementations, a model configured to indicate likelihoods of different outcomes is accessed. The model includes a respective score for each of a plurality of features, and each feature corresponds to an outcome in an associated context. It is determined that the model does not include a score for a feature corresponding to a potential outcome in a particular context. A score is determined for the potential outcome in the particular context based on the scores for one or more features in the model that correspond to different outcomes in the particular context. The model and the score are used to determine a likelihood of occurrence of the potential outcome.
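Estimating a score for a missing (context, outcome) feature from other outcomes in the same context can be sketched as follows. Averaging is one plausible estimator, assumed here for illustration; the model layout and data are hypothetical.

```python
def backfill_score(model, context, outcome):
    """Return the model's score for (context, outcome), or, when that
    feature is missing, estimate one from the scores of other outcomes
    observed in the same context."""
    key = (context, outcome)
    if key in model:
        return model[key]
    peers = [s for (c, _), s in model.items() if c == context]
    if not peers:
        return 0.0  # no information about this context at all
    return sum(peers) / len(peers)

model = {("play", "music"): -1.0, ("play", "video"): -3.0}
score = backfill_score(model, "play", "podcast")
# score == -2.0, the mean of the two observed scores in context "play"
```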

    SPEECH RECOGNITION USING LOG-LINEAR MODEL
    9.
    Patent application (pending, published)

    Publication No.: US20160275946A1

    Publication Date: 2016-09-22

    Application No.: US14708465

    Filing Date: 2015-05-11

    Applicant: Google Inc.

    CPC classification number: G10L15/197 G06F17/2715

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to generating log-linear models. In some implementations, n-gram parameter values derived from an n-gram language model are obtained. N-gram features for a log-linear language model are determined based on the n-grams corresponding to the obtained n-gram parameter values. A weight for each of the determined n-gram features is determined, where the weight is determined based on (i) an n-gram parameter value that is derived from the n-gram language model and that corresponds to a particular n-gram, and (ii) an n-gram parameter value that is derived from the n-gram language model and that corresponds to an n-gram that is a sub-sequence within the particular n-gram. A log-linear language model having the determined n-gram features is generated, where the determined n-gram features in the log-linear language model have weights that are initialized based on the determined weights.

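The weight-initialization step, deriving each feature weight from an n-gram's parameter value and that of its sub-sequence, can be sketched as below. Representing the model as a dictionary of log-probabilities is an assumption of this sketch.

```python
import math

def init_loglinear_weights(ngram_logprobs):
    """Initialize log-linear feature weights from n-gram parameters: each
    n-gram feature weight is that n-gram's log-probability minus the
    log-probability of its sub-sequence (backoff) n-gram, so the summed
    fired features reproduce the n-gram model's score."""
    weights = {}
    for ngram, logp in ngram_logprobs.items():
        backoff = ngram[1:]  # drop the oldest history word
        if backoff in ngram_logprobs:
            weights[ngram] = logp - ngram_logprobs[backoff]
        else:
            weights[ngram] = logp
    return weights

lm = {("cat",): math.log(0.1), ("the", "cat"): math.log(0.4)}
w = init_loglinear_weights(lm)
# summing the bigram and unigram feature weights recovers log 0.4
```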

    Speech recognition using topic-specific language models
    10.
    Granted patent (in force)

    Publication No.: US09324323B1

    Publication Date: 2016-04-26

    Application No.: US13715139

    Filing Date: 2012-12-14

    Applicant: Google Inc.

    CPC classification number: G10L15/183 G10L15/197

    Abstract: Speech recognition techniques may include: receiving audio; identifying one or more topics associated with audio; identifying language models in a topic space that correspond to the one or more topics, where the language models are identified based on proximity of a representation of the audio to representations of other audio in the topic space; using the language models to generate recognition candidates for the audio, where the recognition candidates have scores associated therewith that are indicative of a likelihood of a recognition candidate matching the audio; and selecting a recognition candidate for the audio based on the scores.

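Selecting language models by proximity in a topic space can be sketched with a simple nearest-neighbor lookup. The 2-D topic vectors, Euclidean distance, and topic names are assumptions of this sketch.

```python
def nearest_topic_models(audio_vec, topic_vecs, k=1):
    """Select the k topic language models whose topic-space representations
    are closest (by Euclidean distance) to the audio's representation."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    ranked = sorted(topic_vecs, key=lambda t: dist(audio_vec, topic_vecs[t]))
    return ranked[:k]

topics = {"sports": (1.0, 0.0), "finance": (0.0, 1.0), "news": (0.5, 0.5)}
selected = nearest_topic_models((0.9, 0.1), topics, k=2)
# selected == ["sports", "news"]
```

The recognizer would then score candidates with the selected models and pick the highest-scoring recognition candidate.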
