-
Publication No.: US10482873B2
Publication date: 2019-11-19
Application No.: US15910720
Filing date: 2018-03-02
Applicant: Google LLC
Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
IPC: G10L15/06, G10L15/16, G10L15/183, G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
-
Publication No.: US20180068207A1
Publication date: 2018-03-08
Application No.: US15809200
Filing date: 2017-11-10
Applicant: Google LLC
Inventor: Christian Szegedy, Vincent O. Vanhoucke
CPC classification number: G06K9/66, G06N3/0454, G06N3/063, G06N3/084
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
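The flow described in this abstract (input data passed through an ordered sequence of subnetworks, then through an output layer) can be sketched roughly as follows. This is only an illustrative sketch: the function names and the toy subnetworks (`scale_by_two`, `relu`, `argmax_output_layer`) are hypothetical stand-ins and are not taken from the patent.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Subnetwork = Callable[[Vector], Vector]


def scale_by_two(values: Vector) -> Vector:
    """Hypothetical lowest subnetwork: doubles every value."""
    return [2.0 * v for v in values]


def relu(values: Vector) -> Vector:
    """Hypothetical higher subnetwork: clamps negative values to zero."""
    return [max(0.0, v) for v in values]


def argmax_output_layer(values: Vector) -> int:
    """Hypothetical output layer: returns the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def process_image(
    image_data: Sequence[float],
    subnetworks: Sequence[Subnetwork],
    output_layer: Callable[[Vector], int],
) -> int:
    """Run the data through each subnetwork, lowest to highest, then the output layer."""
    representation = list(image_data)
    for subnetwork in subnetworks:
        # Each subnetwork consumes the previous subnetwork's output.
        representation = subnetwork(representation)
    # The final representation plays the role of the "alternative representation".
    return output_layer(representation)


label = process_image([1.0, -2.0, 3.0], [scale_by_two, relu], argmax_output_layer)
```

Here the toy input `[1.0, -2.0, 3.0]` stands in for image data; a real system would use learned convolutional subnetworks instead of these fixed functions.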
-
Publication No.: US09911069B1
Publication date: 2018-03-06
Application No.: US15809200
Filing date: 2017-11-10
Applicant: Google LLC
Inventor: Christian Szegedy, Vincent O. Vanhoucke
CPC classification number: G06K9/66, G06N3/0454, G06N3/063, G06N3/084
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
-
Publication No.: US20240087559A1
Publication date: 2024-03-14
Application No.: US18506540
Filing date: 2023-11-10
Applicant: Google LLC
Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
IPC: G10L15/06, G06N3/045, G10L15/16, G10L15/183
CPC classification number: G10L15/063, G06N3/045, G10L15/16, G10L15/183
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
-
Publication No.: US11462035B2
Publication date: 2022-10-04
Application No.: US17199978
Filing date: 2021-03-12
Applicant: Google LLC
Inventor: Christian Szegedy, Vincent O. Vanhoucke
IPC: G06K9/46, G06V30/194, G06N3/063, G06N3/04, G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
-
Publication No.: US11341364B2
Publication date: 2022-05-24
Application No.: US16649599
Filing date: 2018-09-20
Applicant: Google LLC
Inventor: Konstantinos Bousmalis, Alexander Irpan, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Julian Ibarz, Sergey Vladimir Levine, Kurt Konolige, Vincent O. Vanhoucke, Matthew Laurance Kelcey
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network that is used to control a robotic agent interacting with a real-world environment.
-
Publication No.: US11062181B2
Publication date: 2021-07-13
Application No.: US16550731
Filing date: 2019-08-26
Applicant: Google LLC
Inventor: Vincent O. Vanhoucke, Christian Szegedy, Sergey Ioffe
Abstract: A neural network system that includes: multiple subnetworks that includes: a first subnetwork including multiple first modules, each first module including: a pass-through convolutional layer configured to process the subnetwork input for the first subnetwork to generate a pass-through output; an average pooling stack of neural network layers that collectively processes the subnetwork input for the first subnetwork to generate an average pooling output; a first stack of convolutional neural network layers configured to collectively process the subnetwork input for the first subnetwork to generate a first stack output; a second stack of convolutional neural network layers that are configured to collectively process the subnetwork input for the first subnetwork to generate a second stack output; and a concatenation layer configured to concatenate the pass-through output, the average pooling output, the first stack output, and the second stack output to generate a first module output for the first module.
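The module structure in this abstract (four parallel branches over the same subnetwork input, joined by a concatenation layer) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not stated in the abstract: feature maps are 1-D per channel for brevity, the pooling stack is a width-3 average pool followed by a 1x1 convolution, the two stacks use width-3 convolutions, non-linearities are omitted, and the weight values are arbitrary.

```python
from typing import List

FeatureMap = List[List[float]]  # channels x positions (1-D spatial axis, an assumption)


def conv1x1(x: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """Mix channels at each position; weights is out_channels x in_channels."""
    length = len(x[0])
    return [[sum(w * x[c][i] for c, w in enumerate(row)) for i in range(length)]
            for row in weights]


def avg_pool3(x: FeatureMap) -> FeatureMap:
    """Width-3, stride-1 average pooling; windows are clipped at the edges."""
    pooled = []
    for channel in x:
        n = len(channel)
        windows = (channel[max(0, i - 1):min(n, i + 2)] for i in range(n))
        pooled.append([sum(w) / len(w) for w in windows])
    return pooled


def conv3(x: FeatureMap, weights: List[List[List[float]]]) -> FeatureMap:
    """Width-3 convolution with zero padding; weights is out x in x 3."""
    n = len(x[0])
    out = []
    for filt in weights:
        row = []
        for i in range(n):
            acc = 0.0
            for c, kernel in enumerate(filt):
                for k, w in enumerate(kernel):
                    j = i + k - 1
                    if 0 <= j < n:
                        acc += w * x[c][j]
            row.append(acc)
        out.append(row)
    return out


IDENTITY3 = [[[0.0, 1.0, 0.0]]]  # 1-in, 1-out kernel that copies its input


def module(x: FeatureMap) -> FeatureMap:
    """One module: four branches over the same input, concatenated by channel."""
    pass_through = conv1x1(x, [[1.0, 0.0]])
    average_pooling = conv1x1(avg_pool3(x), [[1.0, 1.0]])
    first_stack = conv3(conv1x1(x, [[1.0, 0.0]]),
                        [[[0.0, 1.0, 0.0]], [[1.0, 0.0, -1.0]]])
    second_stack = conv3(conv3(conv1x1(x, [[0.0, 1.0]]), IDENTITY3), IDENTITY3)
    # Concatenation layer: stack the branch outputs along the channel axis.
    return pass_through + average_pooling + first_stack + second_stack


module_output = module([[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]])
```

With these toy weights the branches contribute 1, 1, 2, and 1 channels, so the concatenated module output has 5 channels, each with the same number of positions as the input.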
-
Publication No.: US10916238B2
Publication date: 2021-02-09
Application No.: US16863432
Filing date: 2020-04-30
Applicant: Google LLC
Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
IPC: G10L15/06, G10L15/16, G10L15/183, G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
-
Publication No.: US10073817B1
Publication date: 2018-09-11
Application No.: US15792077
Filing date: 2017-10-24
Applicant: Google LLC
Inventor: Nishant Patil, Matthew Sarett, Rama Krishna Govindaraju, Benoit Steiner, Vincent O. Vanhoucke
CPC classification number: G06F17/16
Abstract: The present disclosure relates to optimized matrix multiplication using vector multiplication of interleaved matrix values. Two matrices to be multiplied are organized into specially ordered vectors, which are multiplied together to produce a portion of a product matrix.
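The abstract does not say how the values are ordered, so the following is only one plausible illustration of the general idea, not the patented layout. It assumes a 2x2 output tile: for each inner index k, two values from the left matrix and two from the right matrix are interleaved into length-4 vectors whose element-wise product, accumulated over k, yields the four entries of the tile. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def interleave_left(a: Matrix, row: int) -> List[List[float]]:
    """For each k, build [a[row][k], a[row][k], a[row+1][k], a[row+1][k]]."""
    depth = len(a[0])
    return [[a[row][k], a[row][k], a[row + 1][k], a[row + 1][k]]
            for k in range(depth)]


def interleave_right(b: Matrix, col: int) -> List[List[float]]:
    """For each k, build [b[k][col], b[k][col+1], b[k][col], b[k][col+1]]."""
    return [[b_row[col], b_row[col + 1], b_row[col], b_row[col + 1]]
            for b_row in b]


def tile_product(left: List[List[float]], right: List[List[float]]) -> List[float]:
    """Accumulate element-wise vector products into [c00, c01, c10, c11]."""
    acc = [0.0, 0.0, 0.0, 0.0]
    for left_vec, right_vec in zip(left, right):
        # One vector multiply-accumulate per inner index k.
        acc = [s + l * r for s, l, r in zip(acc, left_vec, right_vec)]
    return acc


def matmul_interleaved(a: Matrix, b: Matrix) -> Matrix:
    """Multiply a (m x k) by b (k x n); this sketch assumes m and n are even."""
    rows, cols = len(a), len(b[0])
    if rows % 2 or cols % 2:
        raise ValueError("this sketch only handles even output dimensions")
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(0, rows, 2):
        left = interleave_left(a, i)
        for j in range(0, cols, 2):
            # Each tile product fills one 2x2 portion of the product matrix.
            c00, c01, c10, c11 = tile_product(left, interleave_right(b, j))
            c[i][j], c[i][j + 1] = c00, c01
            c[i + 1][j], c[i + 1][j + 1] = c10, c11
    return c


product = matmul_interleaved(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]],
)
```

On hardware with SIMD instructions, each length-4 multiply-accumulate in `tile_product` would map to a single vector operation, which is the motivation for ordering the values this way in the sketch.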
-
Publication No.: US20180137396A1
Publication date: 2018-05-17
Application No.: US15868587
Filing date: 2018-01-11
Applicant: Google LLC
Inventor: Christian Szegedy, Vincent O. Vanhoucke
CPC classification number: G06K9/66, G06N3/0454, G06N3/063, G06N3/084
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
-