Training neural networks using posterior sharpening

    Publication No.: US10824946B2

    Publication date: 2020-11-03

    Application No.: US16511496

    Filing date: 2019-07-15

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network. In one aspect, a method includes maintaining data specifying, for each of the network parameters, current values of a respective set of distribution parameters that define a posterior distribution over possible values for the network parameter. A respective current training value for each of the network parameters is determined from a respective temporary gradient value for the network parameter. The current values of the respective sets of distribution parameters for the network parameters are updated in accordance with the respective current training values for the network parameters. The trained values of the network parameters are determined based on the updated current values of the respective sets of distribution parameters.
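The update loop described in this abstract can be illustrated with a toy, one-parameter sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the model `y = w * x`, a Gaussian posterior with a fixed standard deviation, the step sizes `eta` and `lr`, the first-order update of the mean, and the choice of the posterior mean as the trained value.

```python
import random


def loss_grad(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def sharpened_step(mu, sigma, batch, rng, eta=0.05, lr=0.05):
    """One update of the posterior mean (hypothetical helper, toy setting)."""
    # 1. Sample a temporary parameter value from the current posterior.
    temp = mu + sigma * rng.gauss(0.0, 1.0)
    # 2. Temporary gradient: loss gradient evaluated at the sample.
    temp_grad = loss_grad(temp, batch)
    # 3. Current training value: the sample moved along the temporary gradient.
    train_value = temp - eta * temp_grad
    # 4. Update the distribution parameter (only the mean here) using the
    #    loss gradient at the training value (first-order approximation).
    new_mu = mu - lr * loss_grad(train_value, batch)
    return new_mu, train_value


def train(batch, steps=200, seed=0):
    rng = random.Random(seed)
    mu, sigma = 0.0, 0.1  # assumed initial distribution parameters
    for _ in range(steps):
        mu, _ = sharpened_step(mu, sigma, batch, rng)
    return mu  # assumed: trained value is the final posterior mean
```

With data generated by `y = 3 * x`, for example `train([(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)])`, the returned mean settles close to 3.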

    Sample-efficient adaptive text-to-speech

    Publication No.: US10810993B2

    Publication date: 2020-10-20

    Application No.: US16666043

    Filing date: 2019-10-28

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers. The adaptive audio-generation model is adapted for a new individual speaker using adaptation data comprising second text and audio data representing the new individual speaker speaking portions of the second text, the new individual speaker being different from each of the plurality of individual speakers, wherein adapting the audio-generation model includes learning a new embedding vector for the new individual speaker.
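The adaptation step, learning a new embedding vector for an unseen speaker, can be sketched on a toy model. The following is only an illustration under stated assumptions: the "model" is a scalar weight plus an additive embedding rather than a real audio-generation network, the shared weight is held fixed during adaptation (the abstract does not say whether it is), and the names `predict` and `adapt_new_speaker` are hypothetical.

```python
def predict(weight, embedding, text_feature):
    """Toy stand-in for an audio-generation model: shared weight plus
    a per-speaker embedding added to every output dimension."""
    return [weight * text_feature + e for e in embedding]


def adapt_new_speaker(weight, adaptation_data, dim, steps=100, lr=0.1):
    """Fit a new speaker embedding by gradient descent on squared error.

    adaptation_data is a list of (text_feature, audio_vector) pairs for
    the new speaker. The shared weight is never modified (assumption).
    """
    embedding = [0.0] * dim
    n = len(adaptation_data)
    for _ in range(steps):
        grad = [0.0] * dim
        for text_feature, audio in adaptation_data:
            pred = predict(weight, embedding, text_feature)
            for i in range(dim):
                grad[i] += 2.0 * (pred[i] - audio[i]) / n
        # Only the embedding is updated; the network parameter stays fixed.
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding
```

Because only `dim` numbers are fitted, a handful of adaptation pairs is enough in this toy setting to recover the new speaker's embedding.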

    Speech recognition using convolutional neural networks

    Publication No.: US10586531B2

    Publication date: 2020-03-10

    Application No.: US16209661

    Filing date: 2018-12-04

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech recognition by generating a neural network output from an audio data input sequence, where the neural network output characterizes words spoken in the audio data input sequence. One of the methods includes, for each of the audio data inputs, providing a current audio data input sequence that comprises the audio data input and the audio data inputs preceding the audio data input in the audio data input sequence to a convolutional subnetwork comprising a plurality of dilated convolutional neural network layers, wherein the convolutional subnetwork is configured to, for each of the plurality of audio data inputs: receive the current audio data input sequence for the audio data input, and process the current audio data input sequence to generate an alternative representation for the audio data input.
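A stack of dilated convolutional layers that only sees the current input and the inputs preceding it can be sketched in a few lines. This is a minimal single-channel illustration, not the patented network: the kernel values, the dilation schedule `(1, 2, 4)`, and the function names are assumptions, and nonlinearities, multiple channels, and the output layers are left out.

```python
def dilated_causal_conv(inputs, kernel, dilation):
    """1-D convolution where output t depends only on inputs at
    t, t - dilation, t - 2 * dilation, ... (never on future inputs)."""
    out = []
    for t in range(len(inputs)):
        total = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start count as zero
                total += w * inputs[idx]
        out.append(total)
    return out


def conv_stack(inputs, kernel, dilations=(1, 2, 4)):
    """Apply several dilated layers in sequence; the output at each step
    serves as an alternative representation of the inputs seen so far."""
    hidden = list(inputs)
    for d in dilations:
        hidden = dilated_causal_conv(hidden, kernel, d)
    return hidden
```

With a two-tap kernel and dilations 1, 2, 4, each output position draws on the 8 most recent inputs, which is why doubling the dilation per layer widens the context quickly without adding many layers.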

    Generating discrete latent representations of input data items

    Publication No.: US20240354566A1

    Publication date: 2024-10-24

    Application No.: US18623952

    Filing date: 2024-04-01

    CPC classification number: G06N3/08 G06N3/04

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input data items. One of the methods includes receiving an input data item; providing the input data item as input to an encoder neural network to obtain an encoder output for the input data item; and generating a discrete latent representation of the input data item from the encoder output, comprising: for each of the latent variables, determining, from a set of latent embedding vectors in the memory, a latent embedding vector that is nearest to the encoded vector for the latent variable.
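The nearest-embedding lookup in this abstract can be sketched as follows. The abstract says only "nearest", so squared Euclidean distance is an assumption here, as are the names `nearest_index` and `quantize` and the representation of the encoder output as one encoded vector per latent variable.

```python
def nearest_index(encoded, codebook):
    """Index of the latent embedding vector closest to an encoded vector
    (squared Euclidean distance, assumed)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(encoded, codebook[i]))


def quantize(encoder_output, codebook):
    """Map each latent variable's encoded vector to its nearest embedding.

    Returns the discrete latent representation (a list of indices) and
    the corresponding embedding vectors.
    """
    indices = [nearest_index(v, codebook) for v in encoder_output]
    return indices, [codebook[i] for i in indices]
```

For instance, with the codebook `[[0, 0], [1, 1], [4, 4]]`, the encoded vectors `[[0.2, -0.1], [3.0, 3.5], [0.9, 1.2]]` map to the indices `[0, 2, 1]`.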

    Generating discrete latent representations of input data items

    Publication No.: US11948075B2

    Publication date: 2024-04-02

    Application No.: US16620815

    Filing date: 2018-06-11

    CPC classification number: G06N3/08 G06N3/04

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input data items. One of the methods includes receiving an input data item; providing the input data item as input to an encoder neural network to obtain an encoder output for the input data item; and generating a discrete latent representation of the input data item from the encoder output, comprising: for each of the latent variables, determining, from a set of latent embedding vectors in the memory, a latent embedding vector that is nearest to the encoded vector for the latent variable.

    Training neural networks using posterior sharpening

    Publication No.: US11836630B2

    Publication date: 2023-12-05

    Application No.: US17024217

    Filing date: 2020-09-17

    CPC classification number: G06N3/084 G06N3/044 G06N3/047

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network. In one aspect, a method includes maintaining data specifying, for each of the network parameters, current values of a respective set of distribution parameters that define a posterior distribution over possible values for the network parameter. A respective current training value for each of the network parameters is determined from a respective temporary gradient value for the network parameter. The current values of the respective sets of distribution parameters for the network parameters are updated in accordance with the respective current training values for the network parameters. The trained values of the network parameters are determined based on the updated current values of the respective sets of distribution parameters.
