Recognizing speech in multiple languages
    1.
    Granted patent (status: in force)

    Publication No.: US09129591B2

    Publication date: 2015-09-08

    Application No.: US13726954

    Filing date: 2012-12-26

    Applicant: Google Inc.

    CPC classification number: G10L15/005 G10L15/183 G10L15/32

    Abstract: Speech recognition systems may perform the following operations: receiving audio; recognizing the audio using language models for different languages to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding recognition scores; identifying a candidate language for the audio; selecting a recognition candidate based on the recognition scores and the candidate language; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.
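
As a rough illustration of the selection step in this abstract, the sketch below combines per-language recognition scores with an identified candidate language. The abstract does not say how the two are combined; the additive `language_bonus`, the `Candidate` type, and the example scores are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str        # recognized text
    language: str    # language of the model that produced it
    score: float     # recognition score from that model


def select_candidate(candidates, candidate_language, language_bonus=0.2):
    """Pick the best candidate after boosting those in the candidate language.

    The additive bonus is a hypothetical combination rule, not the patent's.
    """
    def adjusted(c):
        bonus = language_bonus if c.language == candidate_language else 0.0
        return c.score + bonus

    return max(candidates, key=adjusted)


candidates = [
    Candidate("recognize speech", "en", 0.70),
    Candidate("reconnaître", "fr", 0.75),
]
best = select_candidate(candidates, candidate_language="en")
```

With these made-up scores, the English candidate is selected even though its raw score is lower, because the identified candidate language tips the balance.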

    Speech recognition models based on location indicia
    2.
    Granted patent (status: in force)

    Publication No.: US08831957B2

    Publication date: 2014-09-09

    Application No.: US13651566

    Filing date: 2012-10-15

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speech recognition using models that are based on where, within a building, a speaker makes an utterance are disclosed. The methods, systems, and apparatus include actions of receiving data corresponding to an utterance, and obtaining location indicia for an area within a building where the utterance was spoken. Further actions include selecting one or more models for speech recognition based on the location indicia, wherein each of the selected one or more models is associated with a weight based on the location indicia. Additionally, the actions include generating a composite model using the selected one or more models and the respective weights of the selected one or more models. And the actions also include generating a transcription of the utterance using the composite model.
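
One way to picture the composite model is as a weighted mixture of per-location models. The sketch below assumes each model is a simple word-probability dictionary and that the weights are combined by linear interpolation; the room names, probabilities, and the `composite_model` helper are hypothetical.

```python
def composite_model(models, weights):
    """Linearly interpolate word-probability models using normalized weights."""
    total = sum(weights[name] for name in models)
    combined = {}
    for name, model in models.items():
        w = weights[name] / total
        for word, p in model.items():
            combined[word] = combined.get(word, 0.0) + w * p
    return combined


# Hypothetical per-room models and weights derived from location indicia.
models = {
    "kitchen": {"oven": 0.6, "timer": 0.4},
    "living_room": {"tv": 0.7, "timer": 0.3},
}
weights = {"kitchen": 0.8, "living_room": 0.2}
mix = composite_model(models, weights)
```

Because the weights are normalized, the interpolated probabilities still sum to one, so the result can be used as a model in its own right.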

    Computing Device With Remote Contact Lists
    3.
    Patent application (status: in force)

    Publication No.: US20140079204A1

    Publication date: 2014-03-20

    Application No.: US13934993

    Filing date: 2013-07-03

    Applicant: Google Inc.

    Abstract: In one implementation a computer-implemented method includes generating a group of telephone contacts for a first user, wherein the generating includes identifying a second user as a contact of the first user based upon a determination that the second user has at least a threshold email-based association with the first user; and adding the identified second user to the group of telephone contacts for the first user. The method further includes receiving a first request to connect a first telephone device associated with the first user to a second telephone device associated with the second user. The method also includes identifying a contact identifier of the second telephone device using the generated group of telephone contacts for the first user, and initiating a connection between the first telephone device and the second telephone device using the identified contact identifier.
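
The contact-group step can be sketched as a threshold filter. The abstract does not define the "threshold email-based association"; the example below assumes it is a count of exchanged emails, and the function names and data are hypothetical.

```python
def build_telephone_contacts(email_counts, phone_numbers, threshold=3):
    """Return {user: phone} for users with at least `threshold` exchanged emails.

    Using an email count as the association measure is an assumption.
    """
    return {
        user: phone_numbers[user]
        for user, count in email_counts.items()
        if count >= threshold and user in phone_numbers
    }


def resolve_contact(contacts, name):
    """Look up the contact identifier used to initiate a connection."""
    return contacts.get(name)


email_counts = {"alice": 5, "bob": 1, "carol": 3}
phone_numbers = {"alice": "+1-555-0100", "bob": "+1-555-0101"}
contacts = build_telephone_contacts(email_counts, phone_numbers)
```

Here only `alice` ends up in the group: `bob` falls below the threshold and `carol` has no known number.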

    Speech recognition with parallel recognition tasks
    4.
    Granted patent (status: in force)

    Publication No.: US08571860B2

    Publication date: 2013-10-29

    Application No.: US13750807

    Filing date: 2013-01-25

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/00 G10L15/01 G10L15/26

    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRSs). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meet a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRSs that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
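
The start-in-parallel, abort-when-confident flow can be sketched with `asyncio`. The recognizers here are stand-ins that sleep and return fixed values; the `run_srs` and `recognize` names, the latencies, and the rule of returning the highest-confidence completed result are assumptions for illustration only.

```python
import asyncio


async def run_srs(name, delay, text, confidence):
    """Stand-in for one speech recognition system; `delay` simulates latency."""
    await asyncio.sleep(delay)
    return name, text, confidence


async def recognize(srs_specs, threshold=0.8):
    pending = {asyncio.create_task(run_srs(*spec)) for spec in srs_specs}
    results = []
    while pending:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        results.extend(task.result() for task in done)
        if any(conf >= threshold for _, _, conf in results):
            # Abort the systems that have not produced a result yet.
            for task in pending:
                task.cancel()
            await asyncio.gather(*pending, return_exceptions=True)
            break
    # Final result: highest confidence among the results actually generated.
    return max(results, key=lambda r: r[2]), len(results)


specs = [
    ("fast", 0.01, "call home", 0.90),
    ("slow", 0.50, "call Rome", 0.95),
]
final, completed = asyncio.run(recognize(specs))
```

In this run the fast system clears the threshold first, so the slow one is cancelled before it finishes, trading a possibly better result for lower latency.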

    Business listing search
    7.
    Granted patent (status: in force)

    Publication No.: US09460712B1

    Publication date: 2016-10-04

    Application No.: US14454198

    Filing date: 2014-08-07

    Applicant: Google Inc.

    Abstract: A method of operating a voice-enabled business directory search system includes receiving category-business pairs, each category-business pair including a business category and a specific business, and establishing a data structure having nodes based on the category-business pairs. Each node of the data structure is associated with one or more business categories and a speech recognition language model for recognizing specific businesses associated with the one or more business categories.
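
A minimal sketch of such a data structure follows, assuming one node per category and a unigram word count as a stand-in for each node's language model. The abstract allows a node to cover several categories and does not specify the model type, so both choices, like the `build_nodes` name, are simplifications.

```python
from collections import Counter, defaultdict


def build_nodes(category_business_pairs):
    """Map each category to a node holding word counts of its business names."""
    nodes = defaultdict(Counter)
    for category, business in category_business_pairs:
        nodes[category].update(business.lower().split())
    return dict(nodes)


pairs = [
    ("pizza", "Luigi's Pizza"),
    ("pizza", "Pizza Palace"),
    ("hardware", "Ace Hardware"),
]
nodes = build_nodes(pairs)
```

Each node's counts only see the business names in its own category, which is the point of keeping a separate model per node.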

    Recognizing different versions of a language
    8.
    Granted patent (status: in force)

    Publication No.: US09275635B1

    Publication date: 2016-03-01

    Application No.: US13672945

    Filing date: 2012-11-09

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/183

    Abstract: Speech recognition systems may perform the following operations: receiving audio at a computing device; identifying a language associated with the audio; recognizing the audio using recognition models for different versions of the language to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding information; comparing the information of the recognition candidates to identify agreement between at least two of the recognition models; selecting a recognition candidate based on information of the recognition candidate and agreement between the at least two of the recognition models; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.
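
The agreement step might look like the sketch below, which prefers text produced by at least two models and falls back to the best score otherwise. The tuple layout, the exact-match notion of agreement, and the tie-breaking by score are assumptions; the abstract only says that agreement and candidate information both inform the selection.

```python
from collections import Counter


def select_by_agreement(candidates):
    """candidates: list of (model_name, text, score) tuples.

    Prefer text that at least two models agree on; otherwise use all candidates.
    """
    counts = Counter(text for _, text, _ in candidates)
    agreed = [c for c in candidates if counts[c[1]] >= 2]
    pool = agreed or candidates
    return max(pool, key=lambda c: c[2])[1]


candidates = [
    ("en-US", "the lift is broken", 0.60),
    ("en-GB", "the lift is broken", 0.80),
    ("en-IN", "the left is broken", 0.90),
]
selected = select_by_agreement(candidates)
```

The single highest-scoring candidate loses here because two other models independently agree on a different transcription.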

    Acoustically informed pruning for language modeling
    9.
    Granted patent (status: in force)

    Publication No.: US09110880B1

    Publication date: 2015-08-18

    Application No.: US13832160

    Filing date: 2013-03-15

    Applicant: Google, Inc.

    CPC classification number: G10L15/183

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for pruning a language model are disclosed. The methods, systems, and apparatus include actions of selecting a candidate portion of the language model to evaluate for pruning, obtaining an entropy score representing information loss that would result from pruning the candidate portion of the language model, obtaining an acoustic score representing acoustic confusability of one or more words modeled by the candidate portion of the language model, and evaluating whether to prune the candidate portion of the language model using the entropy score and the acoustic score.
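
A pruning decision that uses both scores could be sketched as follows. The abstract does not give a formula; this example assumes that higher acoustic confusability makes a portion more worth keeping, and folds it into the entropy loss with a hypothetical `acoustic_weight` before comparing against a hypothetical threshold.

```python
def should_prune(entropy_score, acoustic_score,
                 entropy_threshold=0.01, acoustic_weight=0.05):
    """Return True if the candidate portion can be pruned.

    entropy_score: information loss that pruning would cause.
    acoustic_score: confusability of the modeled words (higher = more confusable).
    The linear combination and both constants are assumptions for illustration.
    """
    adjusted_loss = entropy_score + acoustic_weight * acoustic_score
    return adjusted_loss < entropy_threshold


prune_distinct = should_prune(entropy_score=0.005, acoustic_score=0.0)
prune_confusable = should_prune(entropy_score=0.005, acoustic_score=0.5)
```

Under these assumptions, two portions with the same entropy loss are treated differently: the one covering acoustically confusable words is kept because the language model is more likely to be needed to disambiguate them.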

    Training an automatic speech recognition system using compressed word frequencies

    Publication No.: US08543398B1

    Publication date: 2013-09-24

    Application No.: US13666223

    Filing date: 2012-11-01

    Applicant: Google Inc.

    CPC classification number: G10L15/063

    Abstract: Respective word frequencies may be determined from a corpus of utterance-to-text-string mappings that contain associations between audio utterances and a respective text string transcription of each audio utterance. Respective compressed word frequencies may be obtained based on the respective word frequencies such that the distribution of the respective compressed word frequencies has a lower variance than the distribution of the respective word frequencies. Sample utterance-to-text-string mappings may be selected from the corpus of utterance-to-text-string mappings based on the compressed word frequencies. An automatic speech recognition (ASR) system may be trained with the sample utterance-to-text-string mappings.
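
The sketch below illustrates one possible compression, a logarithm, which lowers the variance of the word-frequency distribution, and then samples mappings using the compressed values. The choice of `log1p`, the per-utterance weighting, sampling with replacement, and the toy corpus are all assumptions; the abstract does not name a specific compression function or sampling scheme.

```python
import math
import random
import statistics
from collections import Counter


def word_frequencies(corpus):
    """corpus: list of (audio_id, transcript) pairs."""
    return Counter(word for _, text in corpus for word in text.split())


def compress(freqs):
    """Log-compress counts so that frequent words dominate less."""
    return {word: math.log1p(count) for word, count in freqs.items()}


def sample_mappings(corpus, compressed, freqs, k, seed=0):
    """Sample k mappings (with replacement), weighting each transcript by the
    mean compressed-to-raw frequency ratio of its words (an assumed scheme)."""
    def weight(text):
        words = text.split()
        return sum(compressed[w] / freqs[w] for w in words) / len(words)

    weights = [weight(text) for _, text in corpus]
    return random.Random(seed).choices(corpus, weights=weights, k=k)


corpus = [
    ("a1", "play music"),
    ("a2", "play music"),
    ("a3", "play jazz"),
    ("a4", "call zoe"),
]
freqs = word_frequencies(corpus)
compressed = compress(freqs)
sample = sample_mappings(corpus, compressed, freqs, k=3)
```

With this weighting, utterances containing rarer words are drawn more often than their raw frequency would suggest, which is one way to keep very common phrases from dominating the training sample.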
