-
公开(公告)号:US20220138991A1
公开(公告)日:2022-05-05
申请号:US17578794
申请日:2022-01-19
Applicant: Google LLC
Inventor: David Charles Minnen , Saurabh Singh , Johannes Balle , Troy Chinen , Sung Jin Hwang , Nicholas Johnston , George Dan Toderici
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing and decompressing data. In one aspect, a method comprises: processing data using an encoder neural network to generate a latent representation of the data; processing the latent representation of the data using a hyper-encoder neural network to generate a latent representation of an entropy model; generating an entropy encoded representation of the latent representation of the entropy model; generating an entropy encoded representation of the latent representation of the data using the latent representation of the entropy model; and determining a compressed representation of the data from the entropy encoded representations of: (i) the latent representation of the data and (ii) the latent representation of the entropy model used to entropy encode the latent representation of the data.
-
公开(公告)号:US20210335017A1
公开(公告)日:2021-10-28
申请号:US16610063
申请日:2018-05-16
Applicant: GOOGLE LLC
Inventor: Michele Covell , Damien Vincent , David Charles Minnen , Saurabh Singh , Sung Jin Hwang , Nicholas Johnston , Joel Eric Shor , George Dan Toderici
IPC: G06T9/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image compression and reconstruction. A request to generate an encoded representation of an input image is received. The encoded representation of the input image is then generated. The encoded representation includes a respective set of binary codes at each iteration. Generating the set of binary codes for the iteration from an initial set of binary includes: for any tiles that have already been masked off during any previous iteration, masking off the tile. For any tiles that have not yet been masked off during any of the previous iterations, a determination is made as to whether a reconstruction error of the tile when reconstructed from binary codes at the previous iterations satisfies an error threshold. When the reconstruction quality satisfies the error threshold, the tile is masked off.
-
公开(公告)号:US20200311548A1
公开(公告)日:2020-10-01
申请号:US16666689
申请日:2019-10-29
Applicant: Google LLC
Inventor: Abhinav Shrivastava , Saurabh Singh , Johannes Balle , Sami Ahmad Abu-El-Haija , Nicholas Johnston , George Dan Toderici
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving, by a neural network (NN), a dataset for generating features from the dataset. A first set of features is computed from the dataset using at least a feature layer of the NN. The first set of features i) is characterized by a measure of informativeness; and ii) is computed such that a size of the first set of features is compressible into a second set of features that is smaller in size than the first set of features and that has a same measure of informativeness as the measure of informativeness of the first set of features. The second set of features if generated from the first set of features using a compression method that compresses the first set of features to generate the second set of features.
-
公开(公告)号:US10713818B1
公开(公告)日:2020-07-14
申请号:US16259207
申请日:2019-01-28
Applicant: Google LLC
Inventor: George Dan Toderici , Sean O'Malley , Rahul Sukthankar , Sung Jin Hwang , Damien Vincent , Nicholas Johnston , David Charles Minnen , Joel Shor , Michele Covell
Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
-
公开(公告)号:US20200082173A1
公开(公告)日:2020-03-12
申请号:US16687118
申请日:2019-11-18
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan , George Dan Toderici , Apostol Natsev , Nitin Khandelwal , Sudheendra Vijayanarasimhan , Weilong Yang , Sanketh Shetty
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame having associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
-
公开(公告)号:US10289912B1
公开(公告)日:2019-05-14
申请号:US15143218
申请日:2016-04-29
Applicant: Google LLC
Inventor: Sudheendra Vijayanarasimhan , George Dan Toderici , Yue Hei Ng , Matthew John Hausknecht , Oriol Vinyals , Rajat Monga
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying videos using neural networks. One of the methods includes obtaining a temporal sequence of video frames, wherein the temporal sequence comprises a respective video frame from a particular video at each of a plurality time steps; for each time step of the plurality of time steps: processing the video frame at the time step using a convolutional neural network to generate features of the video frame; and processing the features of the video frame using an LSTM neural network to generate a set of label scores for the time step and classifying the video as relating to one or more of the topics represented by labels in the set of labels from the label scores for each of the plurality of time steps.
-
公开(公告)号:US12033077B2
公开(公告)日:2024-07-09
申请号:US18175125
申请日:2023-02-27
Applicant: GOOGLE LLC
Inventor: Abhinav Shrivastava , Saurabh Singh , Johannes Ballé , Sami Ahmad Abu-El-Haija , Nicholas Milo Johnston , George Dan Toderici
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving, by a neural network (NN), a dataset for generating features from the dataset. A first set of features is computed from the dataset using at least a feature layer of the NN. The first set of features i) is characterized by a measure of informativeness; and ii) is computed such that a size of the first set of features is compressible into a second set of features that is smaller in size than the first set of features and that has a same measure of informativeness as the measure of informativeness of the first set of features. The second set of features if generated from the first set of features using a compression method that compresses the first set of features to generate the second set of features.
-
公开(公告)号:US20240144583A1
公开(公告)日:2024-05-02
申请号:US18398009
申请日:2023-12-27
Applicant: Google LLC
Inventor: Philip Andrew Chou , Berivan Isik , Sung Jin Hwang , Nicholas Milo Johnston , George Dan Toderici
IPC: G06T15/08 , H04N19/176 , H04N19/46
CPC classification number: G06T15/08 , H04N19/176 , H04N19/46
Abstract: Example embodiments of the present disclosure relate to systems and methods for compressing attributes of volumetric and hypervolumetric datasets. An example system performs operations including obtaining a reference dataset comprising attributes indexed by a domain of multidimensional coordinates; subdividing the domain into a plurality of blocks respectively associated with a plurality of attribute subsets; inputting, to a local nonlinear operator, a latent representation for an attribute subset associated with at least one block of the plurality of blocks; obtaining, using the local nonlinear operator and based on the latent representation, an attribute representation of one or more attributes of the attribute subset; and updating the latent representation based on a comparison of the attribute representation and the reference dataset.
-
公开(公告)号:US20240078712A1
公开(公告)日:2024-03-07
申请号:US18306771
申请日:2023-04-25
Applicant: Google LLC
Inventor: David Charles Minnen , Saurabh Singh , Johannes Balle , Troy Chinen , Sung Jin Hwang , Nicholas Johnston , George Dan Toderici
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing and decompressing data. In one aspect, a method comprises: processing data using an encoder neural network to generate a latent representation of the data; processing the latent representation of the data using a hyper-encoder neural network to generate a latent representation of an entropy model; generating an entropy encoded representation of the latent representation of the entropy model; generating an entropy encoded representation of the latent representation of the data using the latent representation of the entropy model; and determining a compressed representation of the data from the entropy encoded representations of: (i) the latent representation of the data and (ii) the latent representation of the entropy model used to entropy encode the latent representation of the data.
-
公开(公告)号:US11750848B2
公开(公告)日:2023-09-05
申请号:US17107684
申请日:2020-11-30
Applicant: Google LLC
Inventor: George Dan Toderici , Fabian Julius Mentzer , Eirikur Thor Agustsson , Michael Tobias Tschannen
IPC: G06V10/00 , H04N19/91 , H04N19/124 , G06N3/088 , H04N19/154 , G06N3/045
CPC classification number: H04N19/91 , G06N3/045 , G06N3/088 , H04N19/124 , H04N19/154
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network configured to receive a data item and to process the data item to output a compressed representation of the data item. In one aspect, a method includes, for each training data item: processing the data item using the encoder neural network to generate a latent representation of the training data item; processing the latent representation using a hyper-encoder neural network to determine a conditional entropy model; generating a compressed representation of the training data item; processing the compressed representation using a decoder neural network to generate a reconstruction of the training data item; processing the reconstruction of the training data item using a discriminator neural network to generate a discriminator network output; evaluating a first loss function; and determining an update to the current values of the encoder network parameters.
-
-
-
-
-
-
-
-
-