-
Publication number: US10713818B1
Publication date: 2020-07-14
Application number: US16259207
Application date: 2019-01-28
Applicant: Google LLC
Inventor: George Dan Toderici, Sean O'Malley, Rahul Sukthankar, Sung Jin Hwang, Damien Vincent, Nicholas Johnston, David Charles Minnen, Joel Shor, Michele Covell
Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
-
Publication number: US12249030B2
Publication date: 2025-03-11
Application number: US17922160
Application date: 2020-04-30
Applicant: Google LLC
Inventor: Cristian Sminchisescu, Hongyi Xu, Eduard Gabriel Bazavan, Andrei Zanfir, William T. Freeman, Rahul Sukthankar
IPC: G06T17/20, G06N3/0455, G06N3/08, G06T19/20
Abstract: The present disclosure provides a statistical, articulated 3D human shape modeling pipeline within a fully trainable, modular, deep learning framework. In particular, aspects of the present disclosure are directed to a machine-learned 3D human shape model with at least facial and body shape components that are jointly trained end-to-end on a set of training data. Joint training of the model components (e.g., including both facial, hands, and rest of body components) enables improved consistency of synthesis between the generated face and body shapes.
-
Publication number: US12175740B2
Publication date: 2024-12-24
Application number: US17614929
Application date: 2019-05-28
Applicant: Google LLC
Inventor: Shumeet Baluja, Rahul Sukthankar
IPC: G06V10/94, G06V10/774, G06V10/82
Abstract: The present disclosure is directed to encoding images. In particular, one or more computing devices can receive data representing one or more machine learning (ML) models configured, at least in part, to encode images comprising objects of a particular type. The computing device(s) can receive data representing an image comprising one or more objects of the particular type. The computing device(s) can generate, based at least in part on the data representing the image and the data representing the ML model(s), data representing an encoded version of the image that alters at least a portion of the image comprising the object(s) such that when the encoded version of the image is decoded, the object(s) are unrecognizable as being of the particular type by one or more object-recognition ML models based at least in part upon which the ML model(s) configured to encode the images were trained.
-
Publication number: US11908071B2
Publication date: 2024-02-20
Application number: US17495960
Application date: 2021-10-07
Applicant: Google LLC
Inventor: Cristian Sminchisescu, Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William Tafel Freeman, Rahul Sukthankar
CPC classification number: G06T17/00, G06N3/08, G06N20/00, G06T11/003, G06T2207/20081
Abstract: The present disclosure is generally directed to reconstructing representations of bodies from images. An example method of the present disclosure includes inputting, into a machine-learned reconstruction model, input data descriptive of an image depicting a body; predicting, using a machine-learned marker prediction component of the reconstruction model, a set of surface marker locations on the body; and outputting, using a machine-learned marker poser component of the reconstruction model, an output representation of the body that corresponds to the set of surface marker locations. In the example method, one or more parameters of the reconstruction model were learned at least in part based on a consistency loss corresponding to a distance between relaxed-constraint representations generated from a prior set of surface marker locations predicted according to the one or more parameters and parametric representations generated from the prior set using kinematic constraints associated with the body.
-
Publication number: US20220237882A1
Publication date: 2022-07-28
Application number: US17614929
Application date: 2019-05-28
Applicant: Google LLC
Inventor: Shumeet Baluja, Rahul Sukthankar
IPC: G06V10/20, G06V10/774, G06V10/82, G06V10/94
Abstract: The present disclosure is directed to encoding images. In particular, one or more computing devices can receive data representing one or more machine learning (ML) models configured, at least in part, to encode images comprising objects of a particular type. The computing device(s) can receive data representing an image comprising one or more objects of the particular type. The computing device(s) can generate, based at least in part on the data representing the image and the data representing the ML model(s), data representing an encoded version of the image that alters at least a portion of the image comprising the object(s) such that when the encoded version of the image is decoded, the object(s) are unrecognizable as being of the particular type by one or more object-recognition ML models based at least in part upon which the ML model(s) configured to encode the images were trained.
-
Publication number: US10681388B2
Publication date: 2020-06-09
Application number: US15883639
Application date: 2018-01-30
Applicant: GOOGLE LLC
Inventor: Michele Covell, David Marwood, Shumeet Baluja, Rahul Sukthankar
IPC: H04N19/44, H04N19/463, H04N19/91, H04N19/176, H04N19/14, H04N19/13
Abstract: Encoding and decoding occupancy information is disclosed. A method includes determining row sums for the region, determining column sums for the region, encoding, in a compressed bitstream, at least one of the row sums and the column sums, and encoding, in the compressed bitstream and based on a coding order, at least one of the rows and the columns of the region. The coding order is based on the encoded at least one of the row sums and the column sums. The row sums include, for each row of the region, a respective count of a number of locations in the row having a specified value. The column sums include, for each column of the region, a respective count of a number of locations in the column having the specified value. A location having the specified value is indicative of the occupancy information at the location.
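Note: The row sums and column sums described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the coding order is derived from the sums, so the ordering below (rows with the most occupied locations first) is purely a hypothetical choice, and the names `row_and_column_sums` and `hypothetical_row_order` are illustrative and do not come from the patent.

```python
from typing import List, Tuple


def row_and_column_sums(
    region: List[List[int]], value: int = 1
) -> Tuple[List[int], List[int]]:
    """Count, per row and per column, the locations equal to `value`."""
    row_sums = [sum(1 for cell in row if cell == value) for row in region]
    num_cols = len(region[0]) if region else 0
    col_sums = [
        sum(1 for row in region if row[c] == value) for c in range(num_cols)
    ]
    return row_sums, col_sums


def hypothetical_row_order(row_sums: List[int]) -> List[int]:
    """Assumed ordering (not from the patent): densest rows first.

    Python's sort is stable, so rows with equal sums keep index order.
    """
    return sorted(range(len(row_sums)), key=lambda i: -row_sums[i])


# A 3x3 occupancy region; 1 marks an occupied location.
region = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
]
row_sums, col_sums = row_and_column_sums(region)
order = hypothetical_row_order(row_sums)
```

Here the sums are counts of occupied locations per row and per column; an actual encoder would additionally write the sums and the rows or columns into a compressed bitstream, which this sketch omits.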
-
Publication number: US11836221B2
Publication date: 2023-12-05
Application number: US17200643
Application date: 2021-03-12
Applicant: Google LLC
Inventor: Cristian Sminchisescu, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William Tafel Freeman, Rahul Sukthankar
CPC classification number: G06F18/217, G06N3/08, G06T7/73, G06V40/103, G06T2207/20081, G06T2207/20084, G06T2207/30196
Abstract: Systems and methods are directed to a method for estimation of an object state from image data. The method can include obtaining two-dimensional image data depicting an object. The method can include processing, with an estimation portion of a machine-learned object state estimation model, the two-dimensional image data to obtain an initial estimated state of the object. The method can include, for each of one or more refinement iterations, obtaining a previous loss value associated with a previous estimated state for the object, processing the previous loss value to obtain a current estimated state of the object, and evaluating a loss function to determine a loss value associated with the current estimated state of the object. The method can include providing a final estimated state for the object.
-
Publication number: US20230116884A1
Publication date: 2023-04-13
Application number: US17495960
Application date: 2021-10-07
Applicant: Google LLC
Inventor: Cristian Sminchisescu, Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William Tafel Freeman, Rahul Sukthankar
Abstract: The present disclosure is generally directed to reconstructing representations of bodies from images. An example method of the present disclosure includes inputting, into a machine-learned reconstruction model, input data descriptive of an image depicting a body; predicting, using a machine-learned marker prediction component of the reconstruction model, a set of surface marker locations on the body; and outputting, using a machine-learned marker poser component of the reconstruction model, an output representation of the body that corresponds to the set of surface marker locations. In the example method, one or more parameters of the reconstruction model were learned at least in part based on a consistency loss corresponding to a distance between relaxed-constraint representations generated from a prior set of surface marker locations predicted according to the one or more parameters and parametric representations generated from the prior set using kinematic constraints associated with the body.
-
Publication number: US10192327B1
Publication date: 2019-01-29
Application number: US15424711
Application date: 2017-02-03
Applicant: Google LLC
Inventor: George Dan Toderici, Sean O'Malley, Rahul Sukthankar, Sung Jin Hwang, Damien Vincent, Nicholas Johnston, David Charles Minnen, Joel Shor, Michele Covell
Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
-
Publication number: US20250078494A1
Publication date: 2025-03-06
Application number: US18953894
Application date: 2024-11-20
Applicant: Google LLC
Inventor: Shumeet Baluja, Rahul Sukthankar
IPC: G06V10/94, G06V10/774, G06V10/82
Abstract: The present disclosure is directed to encoding images. In particular, one or more computing devices can receive data representing one or more machine learning (ML) models configured, at least in part, to encode images comprising objects of a particular type. The computing device(s) can receive data representing an image comprising one or more objects of the particular type. The computing device(s) can generate, based at least in part on the data representing the image and the data representing the ML model(s), data representing an encoded version of the image that alters at least a portion of the image comprising the object(s) such that when the encoded version of the image is decoded, the object(s) are unrecognizable as being of the particular type by one or more object-recognition ML models based at least in part upon which the ML model(s) configured to encode the images were trained.