TRAINING GIANT NEURAL NETWORKS USING PIPELINE PARALLELISM

    Publication Number: US20210042620A1

    Publication Date: 2021-02-11

    Application Number: US16989787

    Filing Date: 2020-08-10

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data specifying a partitioning of the neural network into N composite layers that form a sequence of composite layers, wherein each composite layer comprises a distinct plurality of layers from the multiple network layers of the neural network; obtaining data assigning each of the N composite layers to one or more computing devices from a set of N computing devices; partitioning a mini-batch of training examples into a plurality of micro-batches; and training the neural network, comprising: performing a forward pass through the neural network until output activations have been computed for each micro-batch for a final composite layer in the sequence, and performing a backward pass through the neural network until output gradients have been computed for each micro-batch for the first composite layer in the sequence.
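
    The schedule described by this abstract can be illustrated with a small sketch. The Python toy below, a minimal sketch assuming two composite layers implemented as hypothetical Stage objects on one machine, splits a mini-batch into micro-batches, runs every micro-batch forward through the stage sequence, then runs them all backward to the first stage and applies one synchronous weight update; the device placement, layer sizes, and loss are illustrative assumptions rather than the patented implementation.

        # Minimal sketch of micro-batch pipelining over a sequence of
        # composite layers ("stages"); device placement is not simulated.
        import numpy as np

        rng = np.random.default_rng(0)

        class Stage:
            """One composite layer: a linear map that caches inputs per micro-batch."""
            def __init__(self, d_in, d_out):
                self.w = rng.standard_normal((d_in, d_out)) * 0.1
                self.grad_w = np.zeros_like(self.w)
                self.cache = {}

            def forward(self, x, micro_id):
                self.cache[micro_id] = x               # keep activations for backward
                return x @ self.w

            def backward(self, grad_out, micro_id):
                x = self.cache.pop(micro_id)
                self.grad_w += x.T @ grad_out          # accumulate over micro-batches
                return grad_out @ self.w.T             # gradient w.r.t. stage input

        def train_step(stages, mini_batch, targets, n_micro, lr=1e-2):
            micro_x = np.array_split(mini_batch, n_micro)
            micro_y = np.array_split(targets, n_micro)

            # Forward pass: every micro-batch through every composite layer.
            outputs = []
            for m, xb in enumerate(micro_x):
                h = xb
                for stage in stages:
                    h = stage.forward(h, m)
                outputs.append(h)

            # Backward pass: propagate each micro-batch's gradient to the first stage.
            for m, (out, yb) in enumerate(zip(outputs, micro_y)):
                grad = 2.0 * (out - yb) / len(mini_batch)   # d(MSE)/d(output)
                for stage in reversed(stages):
                    grad = stage.backward(grad, m)

            # One synchronous update once all micro-batch gradients are accumulated.
            for stage in stages:
                stage.w -= lr * stage.grad_w
                stage.grad_w[:] = 0.0

        stages = [Stage(8, 16), Stage(16, 4)]
        x = rng.standard_normal((32, 8))
        y = rng.standard_normal((32, 4))
        train_step(stages, x, y, n_micro=4)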

    MULTILINGUAL SPEECH SYNTHESIS AND CROSS-LANGUAGE VOICE CLONING

    Publication Number: US20200380952A1

    Publication Date: 2020-12-03

    Application Number: US16855042

    Filing Date: 2020-04-22

    Applicant: Google LLC

    Abstract: A method includes receiving an input text sequence to be synthesized into speech in a first language and obtaining a speaker embedding, the speaker embedding specifying specific voice characteristics of a target speaker for synthesizing the input text sequence into speech that clones a voice of the target speaker. The target speaker includes a native speaker of a second language different than the first language. The method also includes generating, using a text-to-speech (TTS) model, an output audio feature representation of the input text by processing the input text sequence and the speaker embedding. The output audio feature representation includes the voice characteristics of the target speaker specified by the speaker embedding.
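
    As a rough illustration of conditioning synthesis on a speaker embedding, the Python sketch below concatenates a speaker vector onto a toy character encoding before projecting to mel-spectrogram-like frames. The encoder, decoder, and all dimensions are assumed placeholders standing in for the TTS model described in the abstract, not its actual architecture.

        # Toy conditioning of a text encoding on a speaker embedding.
        import numpy as np

        rng = np.random.default_rng(0)
        VOCAB, D_TXT, D_SPK, N_MELS = 64, 32, 16, 80

        W_embed = rng.standard_normal((VOCAB, D_TXT)) * 0.1
        W_dec = rng.standard_normal((D_TXT + D_SPK, N_MELS)) * 0.1

        def synthesize(char_ids, speaker_embedding):
            """Map text in one language plus a speaker embedding to mel-like frames."""
            text_enc = W_embed[char_ids]                          # (T, D_TXT)
            spk = np.tile(speaker_embedding, (len(char_ids), 1))  # repeat per time step
            conditioned = np.concatenate([text_enc, spk], axis=-1)
            return conditioned @ W_dec                            # (T, N_MELS)

        # Text in the first language, embedding of a speaker of a second language.
        text = rng.integers(0, VOCAB, size=20)
        speaker_embedding = rng.standard_normal(D_SPK)
        mel = synthesize(text, speaker_embedding)
        print(mel.shape)  # (20, 80)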

    REWARD AUGMENTED MODEL TRAINING

    Publication Number: US20190188566A1

    Publication Date: 2019-06-20

    Application Number: US16328207

    Filing Date: 2017-08-25

    Applicant: GOOGLE LLC

    CPC classification number: G06N3/08 G06N20/00

    Abstract: A method includes obtaining data identifying a machine learning model to be trained to perform a machine learning task, the machine learning model being configured to receive an input example and to process the input example in accordance with current values of a plurality of model parameters to generate a model output for the input example; obtaining initial training data for training the machine learning model, the initial training data comprising a plurality of training examples and, for each training example, a ground truth output that should be generated by the machine learning model by processing the training example; generating modified training data from the initial training data; and training the machine learning model on the modified training data.
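
    The abstract does not spell out how the modified training data is produced; one common reading (reward-augmented maximum likelihood) replaces each ground-truth output with a nearby output sampled in proportion to an exponentiated task reward. The Python sketch below follows that assumed reading with a toy Hamming-distance reward and a temperature tau; these specifics are assumptions, not details from the abstract.

        # Sketch of reward-weighted replacement of ground-truth targets.
        import math
        import random

        random.seed(0)
        VOCAB = list("abcdefgh")

        def reward(candidate, truth):
            # Toy reward: negative Hamming distance for equal-length strings.
            return -sum(c != t for c, t in zip(candidate, truth))

        def perturb(truth, n_swaps):
            cand = list(truth)
            for pos in random.sample(range(len(cand)), n_swaps):
                cand[pos] = random.choice(VOCAB)
            return "".join(cand)

        def modify_training_data(examples, tau=0.5, n_candidates=20):
            """Replace each ground-truth output with a reward-weighted sample."""
            modified = []
            for x, truth in examples:
                candidates = [perturb(truth, random.randint(0, 2))
                              for _ in range(n_candidates)]
                weights = [math.exp(reward(c, truth) / tau) for c in candidates]
                sampled = random.choices(candidates, weights=weights)[0]
                modified.append((x, sampled))
            return modified

        initial = [("input-1", "abcdabcd"), ("input-2", "hgfehgfe")]
        print(modify_training_data(initial))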

    Multi-dialect and multilingual speech recognition

    Publication Number: US12254865B2

    Publication Date: 2025-03-18

    Application Number: US18418246

    Filing Date: 2024-01-20

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
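
    Cluster adaptive training can be sketched as interpolating per-dialect weight matrices with per-utterance mixture weights before scoring linguistic units. The Python toy below assumes a single cluster-adaptive layer, a softmax over linguistic units, and a greedy readout of the highest-scoring unit per frame; the sizes and mixture weights are illustrative, not the trained model.

        # Toy cluster-adaptive scoring of linguistic units for one utterance.
        import numpy as np

        rng = np.random.default_rng(0)
        D_FEAT, D_HID, N_UNITS, N_CLUSTERS = 40, 64, 50, 3

        cluster_weights = rng.standard_normal((N_CLUSTERS, D_FEAT, D_HID)) * 0.1
        W_out = rng.standard_normal((D_HID, N_UNITS)) * 0.1

        def softmax(x):
            e = np.exp(x - x.max(axis=-1, keepdims=True))
            return e / e.sum(axis=-1, keepdims=True)

        def recognize(features, dialect_mixture):
            """Score linguistic units under a mixture over dialect clusters."""
            # Interpolate cluster matrices into one utterance-specific matrix.
            w = np.tensordot(dialect_mixture, cluster_weights, axes=1)  # (D_FEAT, D_HID)
            hidden = np.tanh(features @ w)                              # (T, D_HID)
            return softmax(hidden @ W_out)                              # (T, N_UNITS)

        features = rng.standard_normal((100, D_FEAT))    # frames of one utterance
        dialect_mixture = np.array([0.7, 0.2, 0.1])      # e.g. mostly dialect 0
        scores = recognize(features, dialect_mixture)
        unit_sequence = scores.argmax(axis=-1)           # greedy per-frame readout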

    LARGE LANGUAGE MODEL (LLM) QUANTIZATION

    Publication Number: US20240428006A1

    Publication Date: 2024-12-26

    Application Number: US18211967

    Filing Date: 2023-06-20

    Applicant: GOOGLE LLC

    Abstract: Implementations relate to asymmetric quantization of large language models (LLMs). Processor(s) of a system can: obtain a trained LLM, wherein the trained LLM includes a plurality of layers, each layer comprising a respective plurality of weights; for each layer of the plurality of layers: calculate an optimal clipping range for the respective plurality of weights, and clip one or more weights of the respective plurality of weights that lie outside of the optimal clipping range to produce a clipped layer; quantize the LLM to generate a quantized LLM, wherein quantizing the LLM includes mapping weights of the plurality of clipped layers from continuous values to discrete values; and provide the quantized LLM for downstream processing.
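
    A minimal Python sketch of per-layer asymmetric quantization with clipping follows. It assumes the "optimal" clipping range is found by searching a few percentile candidates for the lowest mean squared reconstruction error at 8 bits; that search criterion, and all sizes, are assumptions rather than details from the abstract.

        # Per-layer asymmetric int8 quantization with a searched clipping range.
        import numpy as np

        rng = np.random.default_rng(0)

        def quantize_asymmetric(w, lo, hi, n_bits=8):
            """Clip to [lo, hi], map to integers in [0, 2**n_bits - 1], and dequantize."""
            qmax = 2 ** n_bits - 1
            scale = (hi - lo) / qmax
            zero_point = int(np.round(-lo / scale))
            q = np.clip(np.round(np.clip(w, lo, hi) / scale) + zero_point, 0, qmax)
            return (q - zero_point) * scale, q.astype(np.uint8)

        def optimal_clip_range(w, candidates=(99.0, 99.5, 99.9, 100.0)):
            """Pick the percentile-based range with the smallest reconstruction error."""
            best = None
            for p in candidates:
                lo, hi = np.percentile(w, 100.0 - p), np.percentile(w, p)
                deq, _ = quantize_asymmetric(w, lo, hi)
                err = np.mean((w - deq) ** 2)
                if best is None or err < best[0]:
                    best = (err, lo, hi)
            return best[1], best[2]

        def quantize_llm(layers):
            quantized = []
            for w in layers:                       # one weight matrix per layer
                lo, hi = optimal_clip_range(w)
                _, q = quantize_asymmetric(w, lo, hi)
                quantized.append(q)
            return quantized

        layers = [rng.standard_normal((128, 128)) for _ in range(4)]
        quantized_layers = quantize_llm(layers)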
