Cooperatively training and/or using separate input and subsequent content neural networks for information retrieval

    Publication No.: US11188824B2

    Publication date: 2021-11-30

    Application No.: US15476280

    Application date: 2017-03-31

    Applicant: Google Inc.

    Abstract: Systems, methods, and computer readable media related to information retrieval. Some implementations are related to training and/or using a relevance model for information retrieval. The relevance model includes an input neural network model and a subsequent content neural network model. The input neural network model and the subsequent content neural network model can be separate, but trained and/or used cooperatively. The input neural network model and the subsequent content neural network model can be “separate” in that separate inputs are applied to the neural network models, and each of the neural network models is used to generate its own feature vector based on its applied input. A comparison of the feature vectors generated based on the separate network models can then be performed, where the comparison indicates relevance of the input applied to the input neural network model to the separate input applied to the subsequent content neural network model.
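The two-model arrangement in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: each "network" is reduced to a single tanh layer, the weights `INPUT_NET` and `CONTENT_NET` are hypothetical toy values, and cosine similarity is assumed as the comparison (the abstract does not name one).

```python
import math

def encode(weights, features):
    """Apply one tanh-activated linear layer (a stand-in for a full network)."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy weights. The two models take different inputs
# (3 and 2 features here) but emit feature vectors of the same size.
INPUT_NET = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
CONTENT_NET = [[0.4, 0.1], [-0.3, 0.9]]

def relevance(input_features, content_features):
    """Encode each input with its own model, then compare the two vectors."""
    input_vec = encode(INPUT_NET, input_features)
    content_vec = encode(CONTENT_NET, content_features)
    return cosine(input_vec, content_vec)

score = relevance([1.0, 0.0, 2.0], [0.5, 1.5])
```

Because the two encoders are separate, content vectors could be computed ahead of time and only the input vector computed at query time; that is a common motivation for this kind of design, stated here as an assumption rather than a claim of the patent.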

    Discovery of problematic pronunciations for automatic speech recognition systems
    12.
    Granted patent (in force)

    Publication No.: US08959020B1

    Publication date: 2015-02-17

    Application No.: US13853150

    Application date: 2013-03-29

    Applicant: Google Inc.

    CPC classification number: G10L15/187

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for discovery of problematic pronunciations for automatic speech recognition systems. One of the methods includes determining a frequency of occurrences of one or more n-grams in transcribed text and a frequency of occurrences of the n-grams in typed text and classifying a system pronunciation of a word included in the n-grams as correct or incorrect based on the frequencies. The n-grams may comprise one or more words and at least one of the words is classified as incorrect based on the frequencies. The frequencies of the specific n-grams may be determined across a domain using one or more n-grams that typically appear adjacent to the specific n-grams.
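A rough sketch of the frequency comparison described above, under stated assumptions: the ratio rule, the `min_ratio` threshold, and the "unknown" label are illustrative choices, since the abstract says only that classification is based on the two frequencies.

```python
from collections import Counter

def ngram_freqs(sentences, n=1):
    """Relative frequency of each n-gram (as a tuple of words) in a text sample."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}

def classify(ngram, transcribed, typed, min_ratio=0.5):
    """Label the system pronunciation behind `ngram` as correct or incorrect.

    Assumed rule: an n-gram that people type often but that rarely shows up
    in recognizer transcripts suggests the recognizer mispronounces it.
    """
    n = len(ngram)
    f_transcribed = ngram_freqs(transcribed, n).get(ngram, 0.0)
    f_typed = ngram_freqs(typed, n).get(ngram, 0.0)
    if f_typed == 0.0:
        return "unknown"  # no typed evidence to compare against
    return "correct" if f_transcribed / f_typed >= min_ratio else "incorrect"

typed = ["call tsipras now", "tsipras news"]
transcribed = ["call now", "news today"]
label = classify(("tsipras",), transcribed, typed)
```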

    COOPERATIVELY TRAINING AND/OR USING SEPARATE INPUT AND SUBSEQUENT CONTENT NEURAL NETWORKS FOR INFORMATION RETRIEVAL

    Publication No.: US20180240013A1

    Publication date: 2018-08-23

    Application No.: US15476280

    Application date: 2017-03-31

    Applicant: Google Inc.

    Abstract: Systems, methods, and computer readable media related to information retrieval. Some implementations are related to training and/or using a relevance model for information retrieval. The relevance model includes an input neural network model and a subsequent content neural network model. The input neural network model and the subsequent content neural network model can be separate, but trained and/or used cooperatively. The input neural network model and the subsequent content neural network model can be “separate” in that separate inputs are applied to the neural network models, and each of the neural network models is used to generate its own feature vector based on its applied input. A comparison of the feature vectors generated based on the separate network models can then be performed, where the comparison indicates relevance of the input applied to the input neural network model to the separate input applied to the subsequent content neural network model.

    Speech recognition with parallel recognition tasks

    Publication No.: US09373329B2

    Publication date: 2016-06-21

    Application No.: US14064755

    Application date: 2013-10-28

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/00 G10L15/01 G10L15/26

    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meet a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
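The control flow in this abstract can be approximated with a thread pool. This is a sketch under assumptions: `make_recognizer` builds hypothetical stand-in recognizers, the threshold value is arbitrary, and "aborting" is approximated by no longer waiting on unfinished tasks (Python threads that are already running cannot be forcibly stopped).

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def make_recognizer(text, confidence, delay):
    """Build a fake recognizer that returns a fixed result after `delay` seconds."""
    def recognize(audio):
        time.sleep(delay)
        return text, confidence
    return recognize

def recognize_parallel(audio, recognizers, threshold=0.8):
    """Run all recognizers at once; stop waiting once one result is confident enough."""
    pool = ThreadPoolExecutor(max_workers=len(recognizers))
    futures = [pool.submit(r, audio) for r in recognizers]
    results = []
    try:
        for future in as_completed(futures):
            results.append(future.result())
            if results[-1][1] >= threshold:
                break  # confidence threshold met: ignore the remaining tasks
    finally:
        # cancel_futures needs Python 3.9+; it drops tasks that have not started.
        pool.shutdown(wait=False, cancel_futures=True)
    # Final result: the most confident of the results gathered so far.
    return max(results, key=lambda r: r[1])

fast = make_recognizer("hello world", 0.9, 0.01)
slow = make_recognizer("hollow word", 0.95, 0.3)
final = recognize_parallel(b"audio-bytes", [fast, slow])
```

Note that the slow recognizer's higher confidence is never seen: the early exit trades a possibly better answer for lower latency.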

    Computing device with remote contact lists
    17.
    Granted patent (in force)

    Publication No.: US09210258B2

    Publication date: 2015-12-08

    Application No.: US13934993

    Application date: 2013-07-03

    Applicant: Google Inc.

    Abstract: In one implementation a computer-implemented method includes generating a group of telephone contacts for a first user, wherein the generating includes identifying a second user as a contact of the first user based upon a determination that the second user has at least a threshold email-based association with the first user; and adding the identified second user to the group of telephone contacts for the first user. The method further includes receiving a first request to connect a first telephone device associated with the first user to a second telephone device associated with the second user. The method also includes identifying a contact identifier of the second telephone device using the generated group of telephone contacts for the first user, and initiating a connection between the first telephone device and the second telephone device using the identified contact identifier.
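As an illustration of the steps in this abstract, the sketch below treats "threshold email-based association" as a minimum count of emails exchanged. That interpretation, the `directory` lookup table, and the function names are all assumptions made for the example.

```python
from collections import Counter

def build_phone_contacts(user, emails, directory, threshold=3):
    """Return {contact: phone identifier} for people `user` emails with often enough.

    `emails` is a list of (sender, recipient) pairs; `directory` maps a user
    to the contact identifier of their telephone device (hypothetical data).
    """
    counts = Counter()
    for sender, recipient in emails:
        if sender == user:
            counts[recipient] += 1
        elif recipient == user:
            counts[sender] += 1
    return {
        other: directory[other]
        for other, n in counts.items()
        if n >= threshold and other in directory
    }

def connect(first_user, second_user, contacts):
    """Resolve the second user's contact identifier and 'initiate' a connection."""
    identifier = contacts.get(second_user)
    if identifier is None:
        raise LookupError(f"{second_user} is not a telephone contact of {first_user}")
    return f"connecting {first_user} to {identifier}"

emails = [("ann", "bob"), ("bob", "ann"), ("ann", "bob"), ("ann", "cat")]
directory = {"bob": "+1-555-0100", "cat": "+1-555-0101"}
contacts = build_phone_contacts("ann", emails, directory)
message = connect("ann", "bob", contacts)
```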

    Updating phonetic dictionaries
    18.
    Granted patent (in force)

    Publication No.: US09135912B1

    Publication date: 2015-09-15

    Application No.: US13622547

    Application date: 2012-09-19

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for updating phonetic dictionaries. In one aspect, a method includes accessing a phonetic dictionary that identifies terms and one or more phonetic representations associated with each term, determining that a particular term that is identified in the phonetic dictionary is a spelling correction for another term that is identified in the phonetic dictionary, and storing, in the phonetic dictionary, one or more of the phonetic representations associated with the other term, with the particular term that is a spelling correction for the other term.
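The update step can be sketched with plain dictionaries. The data layout is an assumption: `phonetic_dict` maps each term to a set of phonetic representations, and `corrections` maps a term to the term that is its spelling correction. How the spelling-correction relationship is determined is outside this sketch.

```python
def update_dictionary(phonetic_dict, corrections):
    """Copy pronunciations of a term onto the term that is its spelling correction.

    Only pairs where both terms already appear in the dictionary are handled,
    matching the abstract's wording; the original term's entry is left as-is.
    """
    for other_term, particular_term in corrections.items():
        if other_term in phonetic_dict and particular_term in phonetic_dict:
            phonetic_dict[particular_term] |= phonetic_dict[other_term]
    return phonetic_dict

phonetic_dict = {
    "recieve": {"r ih s iy v"},
    "receive": {"r ax s iy v"},
}
update_dictionary(phonetic_dict, {"recieve": "receive"})
```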

    Training an automatic speech recognition system using compressed word frequencies
    19.
    Granted patent (in force)

    Publication No.: US09123331B1

    Publication date: 2015-09-01

    Application No.: US13967965

    Application date: 2013-08-15

    Applicant: Google Inc.

    CPC classification number: G10L15/063

    Abstract: Respective word frequencies may be determined from a corpus of utterance-to-text-string mappings that contain associations between audio utterances and a respective text string transcription of each audio utterance. Respective compressed word frequencies may be obtained based on the respective word frequencies such that the distribution of the respective compressed word frequencies has a lower variance than the distribution of the respective word frequencies. Sample utterance-to-text-string mappings may be selected from the corpus of utterance-to-text-string mappings based on the compressed word frequencies. An automatic speech recognition (ASR) system may be trained with the sample utterance-to-text-string mappings.
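A small sketch of the idea, with two assumptions called out: `log1p` is used as the compression function (the abstract requires only that the compressed distribution have lower variance), and the sampling weight per mapping is an illustrative choice rather than the patent's selection rule.

```python
import math
import random
from collections import Counter
from statistics import pvariance

def word_frequencies(corpus):
    """Count words across all (audio_id, transcript) mappings."""
    return Counter(word for _, text in corpus for word in text.split())

def compress(freqs):
    """Assumed compression: log(1 + count), which shrinks large counts the most."""
    return {word: math.log1p(count) for word, count in freqs.items()}

def select_samples(corpus, freqs, compressed, k, seed=0):
    """Sample k mappings, down-weighting transcripts dominated by very common words.

    Assumes every transcript contains at least one word.
    """
    weights = []
    for _, text in corpus:
        words = text.split()
        weights.append(sum(compressed[w] / freqs[w] for w in words) / len(words))
    return random.Random(seed).choices(corpus, weights=weights, k=k)

corpus = [
    ("a1.wav", "call mom"),
    ("a2.wav", "call dad"),
    ("a3.wav", "call home"),
    ("a4.wav", "play jazz"),
]
freqs = word_frequencies(corpus)
compressed = compress(freqs)
samples = select_samples(corpus, freqs, compressed, k=3)
```

The selected `samples` would then be passed to whatever ASR training routine is in use.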

    DATA DRIVEN PRONUNCIATION LEARNING WITH CROWD SOURCING
    20.
    Patent application (in force)

    Publication No.: US20150006178A1

    Publication date: 2015-01-01

    Application No.: US13930495

    Application date: 2013-06-28

    Applicant: Google Inc.

    CPC classification number: G10L15/18 G09B17/006 G10L13/08 G10L15/06

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining pronunciations for particular terms. The methods, systems, and apparatus include actions of obtaining audio samples of speech corresponding to a particular term and obtaining candidate pronunciations for the particular term. Further actions include generating, for each candidate pronunciation for the particular term and audio sample of speech corresponding to the particular term, a score reflecting a level of similarity between the candidate pronunciation and the audio sample. Additional actions include aggregating the scores for each candidate pronunciation and adding one or more candidate pronunciations for the particular term to a pronunciation lexicon based on the aggregated scores for the candidate pronunciations.
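The score-and-aggregate loop can be sketched as below. The similarity measure is a placeholder assumption: each audio sample is represented by a phoneme string (as if already decoded), and `difflib.SequenceMatcher` stands in for a real acoustic score. Summation is assumed as the aggregation.

```python
from difflib import SequenceMatcher

def similarity(candidate, sample_phones):
    """Placeholder score in [0, 1]: phoneme-sequence overlap between the two."""
    return SequenceMatcher(None, candidate.split(), sample_phones.split()).ratio()

def learn_pronunciations(term, candidates, samples, lexicon, top_n=1):
    """Score every candidate against every sample, aggregate, and keep the best."""
    totals = {c: sum(similarity(c, s) for s in samples) for c in candidates}
    best = sorted(totals, key=totals.get, reverse=True)[:top_n]
    lexicon.setdefault(term, []).extend(best)
    return totals

lexicon = {}
candidates = ["t ow m ey t ow", "t ow m aa t ow"]
samples = ["t ow m ey t ow", "t ow m ey t ow", "t ow m aa t ow"]  # crowd-sourced
totals = learn_pronunciations("tomato", candidates, samples, lexicon)
```

Aggregating over many samples means a single unusual recording does not decide the outcome on its own.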

