Patent search ap:("GOOGLE LLC") AND inv:"Honglak Lee" Page 1

1.

Invention application
Cross-Modal Contrastive Learning for Text-to-Image Generation based on Machine Learning Models [In force]

Publication (announcement) No.: US20230081171A1

Publication (announcement) date: 2023-03-16

Application No.: US17467628

Application date: 2021-09-07

Applicant: Google LLC

Inventor: Han Zhang, Jing Yu Koh, Jason Michael Baldridge, Yinfei Yang, Honglak Lee

IPC: G06T11/00, G06K9/62, G10L15/26, G06N3/08

Abstract: A computer-implemented method includes receiving, by a computing device, a particular textual description of a scene. The method also includes applying a neural network for text-to-image generation to generate an output image rendition of the scene, the neural network having been trained to cause two image renditions associated with a same textual description to attract each other and two image renditions associated with different textual descriptions to repel each other based on mutual information between a plurality of corresponding pairs, wherein the plurality of corresponding pairs comprise an image-to-image pair and a text-to-image pair. The method further includes predicting the output image rendition of the scene.
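As an illustration of the attract/repel objective described in the abstract above, the following is a minimal sketch of an InfoNCE-style contrastive loss, a standard lower-bound estimator of mutual information. The abstract does not name the loss or its hyperparameters, so the function name `info_nce`, the cosine-similarity scoring, and the `temperature` value are assumptions made for illustration only.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(anchors, candidates, temperature=0.1):
    """Mean InfoNCE loss, assuming anchors[i] is paired with candidates[i].

    Matching pairs are pulled together (high similarity lowers the loss);
    all other candidates in the batch act as negatives and are pushed apart.
    """
    anchors = [normalize(a) for a in anchors]
    candidates = [normalize(c) for c in candidates]
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)
```

Under these assumptions, a combined objective could sum a text-to-image term, `info_nce(text_embeddings, generated_image_embeddings)`, and an image-to-image term, `info_nce(real_image_embeddings, generated_image_embeddings)`, where the embedding lists are hypothetical encoder outputs.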

2.

Invention grant
Robotic grasping prediction using neural networks and geometry aware object representation [In force]

Publication (announcement) No.: US11554483B2

Publication (announcement) date: 2023-01-17

Application No.: US17094111

Application date: 2020-11-10

Applicant: Google LLC

Inventor: James Davidson, Xinchen Yan, Yunfei Bai, Honglak Lee, Abhinav Gupta, Seyed Mohammad Khansari Zadeh, Arkanath Pathak, Jasmine Hsu

IPC: B25J9/16

Abstract: Deep machine learning methods and apparatus, some of which are related to determining a grasp outcome prediction for a candidate grasp pose of an end effector of a robot. Some implementations are directed to training and utilization of both a geometry network and a grasp outcome prediction network. The trained geometry network can be utilized to generate, based on two-dimensional or two-and-a-half-dimensional image(s), geometry output(s) that are: geometry-aware, and that represent (e.g., high-dimensionally) three-dimensional features captured by the image(s). In some implementations, the geometry output(s) include at least an encoding that is generated based on a trained encoding neural network trained to generate encodings that represent three-dimensional features (e.g., shape). The trained grasp outcome prediction network can be utilized to generate, based on applying the geometry output(s) and additional data as input(s) to the network, a grasp outcome prediction for a candidate grasp pose.
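The two-network data flow in the abstract above can be sketched as follows. This is only an illustration: `geometry_network` and `outcome_network` are hypothetical callables standing in for the trained networks, the candidate pose is treated as the "additional data" input, and ranking several candidate poses by predicted outcome is an assumption that the abstract does not state.

```python
def best_grasp(image, candidate_poses, geometry_network, outcome_network):
    """Return (predicted_outcome, pose) for the highest-scoring candidate.

    geometry_network(image) -> encoding of three-dimensional features
    outcome_network(encoding, pose) -> predicted grasp outcome (higher is better)
    Both callables are placeholders for trained models.
    """
    # The geometry encoding depends only on the image, so compute it once
    # and reuse it for every candidate grasp pose.
    encoding = geometry_network(image)
    scored = [(outcome_network(encoding, pose), pose) for pose in candidate_poses]
    return max(scored, key=lambda pair: pair[0])
```

Passing the networks in as arguments keeps the sketch independent of any particular model framework.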

3.

Invention application
SAMPLE-EFFICIENT REINFORCEMENT LEARNING [In force]

Publication (announcement) No.: US20210201156A1

Publication (announcement) date: 2021-07-01

Application No.: US17056640

Application date: 2019-05-20

Applicant: GOOGLE LLC

Inventor: Danijar Hafner, Jacob Buckman, Honglak Lee, Eugene Brevdo, George Jay Tucker

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for sample-efficient reinforcement learning. One of the methods includes maintaining an ensemble of Q networks, an ensemble of transition models, and an ensemble of reward models; obtaining a transition; generating, using the ensemble of transition models, M trajectories; for each time step in each of the trajectories: generating, using the ensemble of reward models, N rewards for the time step, generating, using the ensemble of Q networks, L Q values for the time step, and determining, from the rewards, the Q values, and the training reward, L*N candidate target Q values for the trajectory and for the time step; for each of the time steps, combining the candidate target Q values; determining a final target Q value; and training at least one of the Q networks in the ensemble using the final target Q value.
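The target computation in the abstract above can be sketched as follows, with the model rollouts assumed to have been done already. The abstract does not give the discounting convention or the rule for combining candidates, so the indexing below (`rewards[t][n]` for reward model n at imagined step t, `q_values[t][l]` for Q network l at the state reached after step t), the per-step mean, and the inverse-variance weighting are all assumptions made for illustration.

```python
from statistics import fmean, pvariance


def candidate_targets(training_reward, rewards, q_values, gamma):
    """Candidate target Q values for one imagined trajectory.

    Returns one list per time step, each holding N * L candidates
    (N reward models times L Q networks).
    """
    discounted = [0.0] * len(rewards[0])  # running reward sum per reward model
    per_step = []
    for t, (step_rewards, step_qs) in enumerate(zip(rewards, q_values)):
        discounted = [
            acc + gamma ** (t + 1) * r for acc, r in zip(discounted, step_rewards)
        ]
        per_step.append([
            training_reward + acc + gamma ** (t + 2) * q
            for acc in discounted
            for q in step_qs
        ])
    return per_step


def pool(per_trajectory):
    """Merge the per-step candidate lists of M trajectories."""
    return [[c for step in steps for c in step] for steps in zip(*per_trajectory)]


def final_target(per_step_candidates, eps=1e-8):
    """Combine each step's candidates by their mean, then weight the step
    means by inverse variance so that uncertain horizons count less."""
    means = [fmean(c) for c in per_step_candidates]
    weights = [1.0 / (pvariance(c) + eps) for c in per_step_candidates]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)
```

With M trajectories, `final_target(pool([candidate_targets(...) for each trajectory]))` would yield a single target for training a Q network in the ensemble.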

4.

Invention grant
Cross-modal contrastive learning for text-to-image generation based on machine learning models [In force]

Publication (announcement) No.: US12067646B2

Publication (announcement) date: 2024-08-20

Application No.: US17467628

Application date: 2021-09-07

Applicant: Google LLC

Inventor: Han Zhang, Jing Yu Koh, Jason Michael Baldridge, Yinfei Yang, Honglak Lee

IPC: G06T11/00, G06F18/214, G06F18/22, G06N3/08, G10L15/26

CPC classification number: G06T11/00, G06F18/2148, G06F18/22, G06N3/08, G10L15/26

Abstract: A computer-implemented method includes receiving, by a computing device, a particular textual description of a scene. The method also includes applying a neural network for text-to-image generation to generate an output image rendition of the scene, the neural network having been trained to cause two image renditions associated with a same textual description to attract each other and two image renditions associated with different textual descriptions to repel each other based on mutual information between a plurality of corresponding pairs, wherein the plurality of corresponding pairs comprise an image-to-image pair and a text-to-image pair. The method further includes predicting the output image rendition of the scene.

5.

Invention publication
IMAGE MANIPULATION BY TEXT INSTRUCTION [Under examination, published]

Publication (announcement) No.: US20240212246A1

Publication (announcement) date: 2024-06-27

Application No.: US18400629

Application date: 2023-12-29

Applicant: Google LLC

Inventor: Tianhao Zhang, Weilong Yang, Honglak Lee, Hung-Yu Tseng, Irfan Aziz Essa, Lu Jiang

IPC: G06T11/60, G06N3/045, G06N3/088, G06T3/02, G06T3/40, G06T9/00

CPC classification number: G06T11/60, G06N3/045, G06N3/088, G06T3/02, G06T3/40, G06T9/002

Abstract: A method for generating an output image from an input image and an input text instruction that specifies a location and a modification of an edit applied to the input image using a neural network is described. The neural network includes an image encoder, an image decoder, and an instruction attention network. The method includes receiving the input image and the input text instruction; extracting, from the input image, an input image feature that represents features of the input image using the image encoder; generating a spatial feature and a modification feature from the input text instruction using the instruction attention network; generating an edited image feature from the input image feature, the spatial feature and the modification feature; and generating the output image from the edited image feature using the image decoder.
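The feature-editing step in the abstract above can be illustrated with a toy sketch. The abstract does not say how the three features are combined, so the rule used here (a per-location weight from the spatial feature scaling an additive modification vector) is purely an assumption, as are the function name `edit_feature` and the nested-list layout of the feature map.

```python
def edit_feature(image_feature, spatial_feature, modification_feature):
    """Apply a modification vector where the spatial weights are non-zero.

    image_feature: H x W grid of C-dimensional vectors (nested lists)
    spatial_feature: H x W grid of weights in [0, 1] marking where to edit
    modification_feature: C-dimensional vector describing the change
    """
    return [
        [
            [value + weight * delta for value, delta in zip(cell, modification_feature)]
            for cell, weight in zip(row, weight_row)
        ]
        for row, weight_row in zip(image_feature, spatial_feature)
    ]
```

In a full pipeline the image encoder would produce `image_feature`, the instruction attention network would produce the other two inputs, and the image decoder would consume the result.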

6.

Invention application
ROBOTIC GRASPING PREDICTION USING NEURAL NETWORKS AND GEOMETRY AWARE OBJECT REPRESENTATION [In force]

Publication (announcement) No.: US20210053217A1

Publication (announcement) date: 2021-02-25

Application No.: US17094111

Application date: 2020-11-10

Applicant: Google LLC

Inventor: James Davidson, Xinchen Yan, Yunfei Bai, Honglak Lee, Abhinav Gupta, Seyed Mohammad Khansari Zadeh, Arkanath Pathak, Jasmine Hsu

IPC: B25J9/16

Abstract: Deep machine learning methods and apparatus, some of which are related to determining a grasp outcome prediction for a candidate grasp pose of an end effector of a robot. Some implementations are directed to training and utilization of both a geometry network and a grasp outcome prediction network. The trained geometry network can be utilized to generate, based on two-dimensional or two-and-a-half-dimensional image(s), geometry output(s) that are: geometry-aware, and that represent (e.g., high-dimensionally) three-dimensional features captured by the image(s). In some implementations, the geometry output(s) include at least an encoding that is generated based on a trained encoding neural network trained to generate encodings that represent three-dimensional features (e.g., shape). The trained grasp outcome prediction network can be utilized to generate, based on applying the geometry output(s) and additional data as input(s) to the network, a grasp outcome prediction for a candidate grasp pose.

7.

Invention application
ROBOTIC GRASPING PREDICTION USING NEURAL NETWORKS AND GEOMETRY AWARE OBJECT REPRESENTATION [Under examination, published]

Publication (announcement) No.: US20200094405A1

Publication (announcement) date: 2020-03-26

Application No.: US16617169

Application date: 2018-06-18

Applicant: Google LLC

Inventor: James Davidson, Xinchen Yan, Yunfei Bai, Honglak Lee, Abhinav Gupta, Seyed Mohammad Khansari Zadeh, Arkanath Pathak, Jasmine Hsu

IPC: B25J9/16

Abstract: Deep machine learning methods and apparatus, some of which are related to determining a grasp outcome prediction for a candidate grasp pose of an end effector of a robot. Some implementations are directed to training and utilization of both a geometry network and a grasp outcome prediction network. The trained geometry network can be utilized to generate, based on two-dimensional or two-and-a-half-dimensional image(s), geometry output(s) that are: geometry-aware, and that represent (e.g., high-dimensionally) three-dimensional features captured by the image(s). In some implementations, the geometry output(s) include at least an encoding that is generated based on a trained encoding neural network trained to generate encodings that represent three-dimensional features (e.g., shape). The trained grasp outcome prediction network can be utilized to generate, based on applying the geometry output(s) and additional data as input(s) to the network, a grasp outcome prediction for a candidate grasp pose.

8.

Invention publication
Cross-Modal Contrastive Learning for Text-to-Image Generation based on Machine Learning Models [Under examination, published]

Publication (announcement) No.: US20240362830A1

Publication (announcement) date: 2024-10-31

Application No.: US18770154

Application date: 2024-07-11

Applicant: Google LLC

Inventor: Han Zhang, Jing Yu Koh, Jason Michael Baldridge, Yinfei Yang, Honglak Lee

IPC: G06T11/00, G06F18/214, G06F18/22, G06N3/08, G10L15/26

CPC classification number: G06T11/00, G06F18/2148, G06F18/22, G06N3/08, G10L15/26

Abstract: A computer-implemented method includes receiving, by a computing device, a particular textual description of a scene. The method also includes applying a neural network for text-to-image generation to generate an output image rendition of the scene, the neural network having been trained to cause two image renditions associated with a same textual description to attract each other and two image renditions associated with different textual descriptions to repel each other based on mutual information between a plurality of corresponding pairs, wherein the plurality of corresponding pairs comprise an image-to-image pair and a text-to-image pair. The method further includes predicting the output image rendition of the scene.

9.

Invention publication
DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING [Under examination, published]

Publication (announcement) No.: US20240308068A1

Publication (announcement) date: 2024-09-19

Application No.: US18673510

Application date: 2024-05-24

Applicant: GOOGLE LLC

Inventor: Honglak Lee, Shixiang Gu, Sergey Levine

IPC: B25J9/16

CPC classification number: B25J9/163

Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
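The off-policy correction described in the abstract above can be sketched as a re-labeling step. The abstract does not say how the modified higher-level action is chosen, so the criterion below (pick the candidate goal under which the current lower-level policy best reproduces the recorded lower-level actions, measured by squared error) is an assumption, as are the name `relabel_goal`, the fixed goal over the segment, and the deterministic `low_level_policy(state, goal)` signature.

```python
def relabel_goal(states, actions, candidate_goals, low_level_policy):
    """Pick the higher-level action (goal) that best explains old experience.

    states, actions: the lower-level states and actions recorded while the
        original goal was active
    candidate_goals: goals to consider, typically including the original one
    low_level_policy(state, goal) -> action predicted by the *current*
        lower-level policy (a placeholder for a trained model)
    """
    def negative_squared_error(goal):
        return -sum(
            (a - p) ** 2
            for state, action in zip(states, actions)
            for a, p in zip(action, low_level_policy(state, goal))
        )

    return max(candidate_goals, key=negative_squared_error)
```

The re-labeled goal would then replace the stored higher-level action before the transition is used to train the higher-level policy off-policy.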

10.

Invention grant
Image manipulation by text instruction [In force]

Publication (announcement) No.: US11900517B2

Publication (announcement) date: 2024-02-13

Application No.: US18085487

Application date: 2022-12-20

Applicant: Google LLC

Inventor: Tianhao Zhang, Weilong Yang, Honglak Lee, Hung-Yu Tseng, Irfan Aziz Essa, Lu Jiang

IPC: G06T11/60, G06T9/00, G06T3/00, G06N3/088, G06T3/40, G06N3/045

CPC classification number: G06T11/60, G06N3/045, G06N3/088, G06T3/0006, G06T3/40, G06T9/002

Abstract: A method for generating an output image from an input image and an input text instruction that specifies a location and a modification of an edit applied to the input image using a neural network is described. The neural network includes an image encoder, an image decoder, and an instruction attention network. The method includes receiving the input image and the input text instruction; extracting, from the input image, an input image feature that represents features of the input image using the image encoder; generating a spatial feature and a modification feature from the input text instruction using the instruction attention network; generating an edited image feature from the input image feature, the spatial feature and the modification feature; and generating the output image from the edited image feature using the image decoder.
