LEARNING PERSONALIZED ENTITY PRONUNCIATIONS
    11.
    Invention application

    Publication number: US20170221475A1

    Publication date: 2017-08-03

    Application number: US15014213

    Filing date: 2016-02-03

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for implementing a pronunciation dictionary that stores entity name pronunciations. In one aspect, a method includes actions of receiving audio data corresponding to an utterance that includes a command and an entity name. Additional actions may include generating, by an automated speech recognizer, an initial transcription for a portion of the audio data that is associated with the entity name, receiving a corrected transcription for the portion of the utterance that is associated with the entity name, obtaining a phonetic pronunciation that is associated with the portion of the audio data that is associated with the entity name, updating a pronunciation dictionary to associate the phonetic pronunciation with the entity name, receiving a subsequent utterance that includes the entity name, and transcribing the subsequent utterance based at least in part on the updated pronunciation dictionary.
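    The correction-driven update described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: the class `PronunciationDictionary`, the helpers `learn_from_correction` and `transcribe_entity`, the phoneme symbols, and the example names are all hypothetical, and the speech recognizer itself is replaced by hand-written phoneme lists.

```python
from dataclasses import dataclass, field


@dataclass
class PronunciationDictionary:
    """Hypothetical store mapping a phoneme sequence to an entity name."""
    entries: dict = field(default_factory=dict)

    def update(self, entity_name, phonemes):
        # Associate the phonetic pronunciation with the entity name.
        self.entries[tuple(phonemes)] = entity_name

    def lookup(self, phonemes):
        # Return the learned entity name, or None if nothing was learned.
        return self.entries.get(tuple(phonemes))


def learn_from_correction(dictionary, initial, corrected, phonemes):
    """Update the dictionary when the user corrected the initial transcription."""
    if corrected != initial:
        dictionary.update(corrected, phonemes)
        return True
    return False


def transcribe_entity(dictionary, phonemes, fallback):
    """Prefer a learned pronunciation; otherwise keep the recognizer's guess."""
    learned = dictionary.lookup(phonemes)
    return learned if learned is not None else fallback


# Assumed example: the recognizer hears these phonemes and guesses "Fortune",
# and the user corrects the entity name to "Fuchun".
dictionary = PronunciationDictionary()
heard = ["f", "uh", "ch", "uh", "n"]
first_pass = transcribe_entity(dictionary, heard, fallback="Fortune")
learn_from_correction(dictionary, "Fortune", "Fuchun", heard)
second_pass = transcribe_entity(dictionary, heard, fallback="Fortune")
```

    In this sketch the first utterance keeps the recognizer's guess, while a subsequent utterance with the same phonemes resolves to the corrected name. Exact-match lookup on a phoneme tuple is a simplification chosen for brevity; the abstract does not say how the stored pronunciation is matched.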

    GENERATING REPRESENTATIONS OF INPUT SEQUENCES USING NEURAL NETWORKS
    12.
    Invention application
    Status: Under examination (published)

    Publication number: US20150356075A1

    Publication date: 2015-12-10

    Application number: US14728875

    Filing date: 2015-06-02

    Applicant: Google Inc.

    CPC classification number: G06N3/0445

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating representations of input sequences. One of the methods includes receiving a grapheme sequence, the grapheme sequence comprising a plurality of graphemes arranged according to an input order; processing the sequence of graphemes using a long short-term memory (LSTM) neural network to generate an initial phoneme sequence from the grapheme sequence, the initial phoneme sequence comprising a plurality of phonemes arranged according to an output order; and generating a phoneme representation of the grapheme sequence from the initial phoneme sequence generated by the LSTM neural network, wherein generating the phoneme representation comprises removing, from the initial phoneme sequence, phonemes in one or more positions in the output order.
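    The final step of this abstract, removing phonemes at certain positions of the initial sequence, can be illustrated with a short sketch. The abstract does not say which positions are removed or why, so the choice below (dropping leading placeholder outputs) is an assumption made only for illustration. The function name `remove_positions`, the `<pad>` token, and the phoneme symbols are hypothetical, and the LSTM output is written by hand rather than produced by a network.

```python
def remove_positions(initial_phonemes, positions_to_drop):
    """Return the phoneme sequence without the entries at the given positions."""
    drop = set(positions_to_drop)
    return [p for i, p in enumerate(initial_phonemes) if i not in drop]


# Hand-written stand-in for the initial phoneme sequence an LSTM might emit
# for the graphemes "g", "o", "o", "g", "l", "e" (assumed, not model output).
initial_sequence = ["<pad>", "<pad>", "g", "u", "g", "@", "l"]

# Assumption: the first two output positions carry no phoneme and are dropped.
phoneme_representation = remove_positions(initial_sequence, [0, 1])
```

    Positions are given relative to the output order, so the remaining phonemes keep their original relative order.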
