LEARNING PERSONALIZED ENTITY PRONUNCIATIONS
    21.
    Invention application

    Publication number: US20170221475A1

    Publication date: 2017-08-03

    Application number: US15014213

    Filing date: 2016-02-03

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for implementing a pronunciation dictionary that stores entity name pronunciations. In one aspect, a method includes actions of receiving audio data corresponding to an utterance that includes a command and an entity name. Additional actions may include generating, by an automated speech recognizer, an initial transcription for a portion of the audio data that is associated with the entity name, receiving a corrected transcription for the portion of the utterance that is associated with the entity name, obtaining a phonetic pronunciation that is associated with the portion of the audio data that is associated with the entity name, updating a pronunciation dictionary to associate the phonetic pronunciation with the entity name, receiving a subsequent utterance that includes the entity name, and transcribing the subsequent utterance based at least in part on the updated pronunciation dictionary.
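
    The correction loop in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the class `PronunciationDictionary`, the functions `learn_from_correction` and `transcribe`, and the space-separated phonetic strings are all hypothetical names and formats chosen for the example, and the speech recognizer itself is assumed to exist elsewhere.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PronunciationDictionary:
    """Maps each entity name to the phonetic pronunciations learned for it."""
    entries: dict = field(default_factory=dict)

    def update(self, entity_name: str, phonetic: str) -> None:
        # Associate the phonetic pronunciation with the entity name.
        pronunciations = self.entries.setdefault(entity_name, [])
        if phonetic not in pronunciations:
            pronunciations.append(phonetic)

    def lookup(self, phonetic: str) -> Optional[str]:
        # Return the entity name stored for this pronunciation, if any.
        for name, pronunciations in self.entries.items():
            if phonetic in pronunciations:
                return name
        return None


def learn_from_correction(dictionary, initial, corrected, phonetic):
    """Store the pronunciation only when the user corrected the transcription."""
    if corrected != initial:
        dictionary.update(corrected, phonetic)


def transcribe(dictionary, phonetic, fallback):
    """Prefer a learned entity name; otherwise keep the recognizer's output."""
    learned = dictionary.lookup(phonetic)
    return learned if learned is not None else fallback
```

    After one correction, a later utterance with the same pronunciation resolves to the corrected name, while unknown pronunciations fall back to the recognizer's own output.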

    CONTINUOUS KEYBOARD RECOGNITION
    22.
    Invention application

    Publication number: US20170185286A1

    Publication date: 2017-06-29

    Application number: US14982887

    Filing date: 2015-12-29

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus for receiving data indicating a location of a particular touchpoint representing a latest received touchpoint in a sequence of received touchpoints; identifying candidate characters associated with the particular touchpoint; generating, for each of the candidate characters, a confidence score; identifying different candidate sequences of characters each including for each received touchpoint, one candidate character associated with a location of the received touchpoint, and one of the candidate characters associated with the particular touchpoint; for each different candidate sequence of characters, determining a language model score and generating a transcription score based at least on the confidence score for one or more of the candidate characters in the candidate sequence of characters and the language model score for the candidate sequence of characters; selecting, and providing for output, a representative sequence of characters from among the candidate sequences of characters based at least on the transcription scores.

    NEURAL NETWORK FOR KEYBOARD INPUT DECODING
    23.
    Invention application
    Legal status: Granted

    Publication number: US20160299685A1

    Publication date: 2016-10-13

    Application number: US14683861

    Filing date: 2015-04-10

    Applicant: Google Inc.

    Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.

    GENERATING REPRESENTATIONS OF INPUT SEQUENCES USING NEURAL NETWORKS
    24.
    Invention application
    Legal status: Under examination (published)

    Publication number: US20150356075A1

    Publication date: 2015-12-10

    Application number: US14728875

    Filing date: 2015-06-02

    Applicant: Google Inc.

    CPC classification number: G06N3/0445

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating representations of input sequences. One of the methods includes receiving a grapheme sequence, the grapheme sequence comprising a plurality of graphemes arranged according to an input order; processing the sequence of graphemes using a long short-term memory (LSTM) neural network to generate an initial phoneme sequence from the grapheme sequence, the initial phoneme sequence comprising a plurality of phonemes arranged according to an output order; and generating a phoneme representation of the grapheme sequence from the initial phoneme sequence generated by the LSTM neural network, wherein generating the phoneme representation comprises removing, from the initial phoneme sequence, phonemes in one or more positions in the output order.
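
    The final step of this abstract, removing phonemes at certain output positions, can be sketched as follows. The abstract does not say which positions are removed, so the sketch assumes the caller supplies them. The LSTM is not modelled: `initial` is a hypothetical stand-in for its output, and the `<pad>` symbol is an invented placeholder.

```python
def remove_positions(initial_phonemes, positions_to_remove):
    """Return the phoneme sequence without the phonemes at the given positions."""
    drop = set(positions_to_remove)
    return [p for i, p in enumerate(initial_phonemes) if i not in drop]


# Hypothetical network output: two placeholder symbols followed by phonemes.
initial = ["<pad>", "<pad>", "g", "uw", "g", "ah", "l"]
phonemes = remove_positions(initial, [0, 1])
```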

    RECOGNIZING SPEECH USING NEURAL NETWORKS
    25.
    Invention application
    Legal status: Granted

    Publication number: US20150340034A1

    Publication date: 2015-11-26

    Application number: US14720113

    Filing date: 2015-05-22

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using neural networks. One of the methods includes receiving an audio input; processing the audio input using an acoustic model to generate a respective phoneme score for each of a plurality of phoneme labels; processing one or more of the phoneme scores using an inverse pronunciation model to generate a respective grapheme score for each of a plurality of grapheme labels; and processing one or more of the grapheme scores using a language model to generate a respective text label score for each of a plurality of text labels.

    Computing Device With Remote Contact Lists
    26.
    Invention application
    Legal status: Granted

    Publication number: US20140079204A1

    Publication date: 2014-03-20

    Application number: US13934993

    Filing date: 2013-07-03

    Applicant: Google Inc.

    Abstract: In one implementation a computer-implemented method includes generating a group of telephone contacts for a first user, wherein the generating includes identifying a second user as a contact of the first user based upon a determination that the second user has at least a threshold email-based association with the first user; and adding the identified second user to the group of telephone contacts for the first user. The method further includes receiving a first request to connect a first telephone device associated with the first user to a second telephone device associated with the second user. The method also includes identifying a contact identifier of the second telephone device using the generated group of telephone contacts for the first user, and initiating a connection between the first telephone device and the second telephone device using the identified contact identifier.
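
    The contact-generation step can be sketched as follows. The abstract speaks only of a "threshold email-based association"; counting exchanged emails is one possible reading and is an assumption here. The names `build_phone_contacts` and `connect`, the `directory` mapping, and the sample numbers are hypothetical.

```python
from collections import Counter


def build_phone_contacts(user, email_log, directory, threshold=3):
    """Build a user's telephone contacts from email activity.

    email_log: iterable of (sender, recipient) pairs.
    directory: maps a user name to a contact identifier (a phone number here).
    A person becomes a contact once at least `threshold` emails were exchanged.
    """
    counts = Counter()
    for sender, recipient in email_log:
        if sender == recipient:
            continue  # ignore mail a user sends to themselves
        if sender == user:
            counts[recipient] += 1
        elif recipient == user:
            counts[sender] += 1
    return {
        other: directory[other]
        for other, n in counts.items()
        if n >= threshold and other in directory
    }


def connect(callee, contacts):
    """Resolve the callee's contact identifier and start a (simulated) call."""
    number = contacts.get(callee)
    if number is None:
        raise LookupError(f"{callee} is not in the generated contact group")
    return f"dialing {number}"


log = [("alice", "bob"), ("bob", "alice"), ("alice", "bob"), ("carol", "alice")]
phone_book = {"bob": "+1-555-0100", "carol": "+1-555-0101"}
contacts = build_phone_contacts("alice", log, phone_book)
```

    With the sample log, only bob reaches the threshold of three exchanged emails, so a request to call carol fails the lookup.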

    Speech recognition with parallel recognition tasks
    27.
    Granted invention patent
    Legal status: Granted

    Publication number: US08571860B2

    Publication date: 2013-10-29

    Application number: US13750807

    Filing date: 2013-01-25

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/00 G10L15/01 G10L15/26

    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
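
    The early-exit behaviour in this abstract can be sketched with a thread pool. This is an illustration under stated assumptions: each recognizer is a hypothetical callable returning a `(text, confidence)` pair, and `recognize_parallel` is an invented name. Python threads cannot be stopped once running, so "aborting" here only cancels tasks that have not started and stops waiting for the rest. The `cancel_futures` argument requires Python 3.9 or later.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def recognize_parallel(audio, recognizers, threshold):
    """Run all recognizers at once and stop at the first confident result."""
    executor = ThreadPoolExecutor(max_workers=len(recognizers))
    futures = [executor.submit(recognizer, audio) for recognizer in recognizers]
    results = []
    try:
        for future in as_completed(futures):
            text, confidence = future.result()
            results.append((text, confidence))
            if confidence >= threshold:
                break  # confident enough: do not wait for slower recognizers
    finally:
        # Cancel queued tasks and return without waiting for running ones.
        executor.shutdown(wait=False, cancel_futures=True)
    # If no result met the threshold, fall back to the most confident one.
    return max(results, key=lambda result: result[1])


def fast_recognizer(audio):
    time.sleep(0.01)
    return ("call bob", 0.95)


def slow_recognizer(audio):
    time.sleep(0.3)
    return ("call rob", 0.99)


final = recognize_parallel(b"audio-bytes", [fast_recognizer, slow_recognizer], 0.9)
```

    Here the fast recognizer clears the threshold first, so its result is returned even though the slow recognizer would eventually have reported a higher confidence.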

    Neural network for keyboard input decoding

    Publication number: US10248313B2

    Publication date: 2019-04-02

    Application number: US15473010

    Filing date: 2017-03-29

    Applicant: Google Inc.

    Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.