Patent search ap:("GOOGLE LLC") AND inv:"Sergey Vladimir Levine" Page 1

1.

发明公开
EFFICIENT HARDWARE ACCELERATOR CONFIGURATION EXPLORATION 审中-公开

公开(公告)号：US20240311267A1

公开(公告)日：2024-09-19

申请号：US18575621

申请日：2022-06-30

Applicant: Google LLC

Inventor： Amir Yazdanbakhsh , Sergey Vladimir Levine , Aviral Kumar

IPC: G06F11/34 , G06F11/30

CPC classification number: G06F11/3447 , G06F11/3024

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a surrogate neural network configured to determine a predicted performance measure of a hardware accelerator having a target hardware configuration on a target application. The trained instance of the surrogate neural network can be used. in addition to or in place of hardware simulation, during a search process for determining hardware configurations for application-specific hardware accelerators. i.e., hardware accelerators on which one or more neural networks can be deployed to perform one or more target machine learning tasks.

2.

发明授权
Off-policy control policy evaluation 有权

公开(公告)号：US11477243B2

公开(公告)日：2022-10-18

申请号：US16827596

申请日：2020-03-23

Applicant: Google LLC

Inventor： Kanury Kanishka Rao , Konstantinos Bousmalis , Christopher K. Harris , Alexander Irpan , Sergey Vladimir Levine , Julian Ibarz

IPC: G06N20/20 , H04L9/40 , G06K9/62 , G06N3/08 , G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for off-policy evaluation of a control policy. One of the methods includes obtaining policy data specifying a control policy for controlling a source agent interacting with a source environment to perform a particular task; obtaining a validation data set generated from interactions of a target agent in a target environment; determining a performance estimate that represents an estimate of a performance of the control policy in controlling the target agent to perform the particular task in the target environment; and determining, based on the performance estimate, whether to deploy the control policy for controlling the target agent to perform the particular task in the target environment.

3.

发明公开
CONTROL POLICIES FOR ROBOTIC AGENTS 审中-公开

公开(公告)号：US20240078429A1

公开(公告)日：2024-03-07

申请号：US18389022

申请日：2023-11-13

Applicant: Google LLC

Inventor： Chelsea Breanna Finn , Sergey Vladimir Levine

IPC: G06N3/08 , G06N3/008 , G06N3/04 , G06N3/044 , G06N3/045

CPC classification number: G06N3/08 , G06N3/008 , G06N3/04 , G06N3/044 , G06N3/045

Abstract: A method includes: receiving data identifying, for each of one or more objects, a respective target location to which a robotic agent interacting with a real-world environment should move the object; causing the robotic agent to move the one or more objects to the one or more target locations by repeatedly performing the following: receiving a current image of a current state of the real-world environment; determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current action and an action to be performed by the robotic agent; and directing the robotic agent to perform the next sequence of actions.

4.

发明授权
Control policies for robotic agents 有权

公开(公告)号：US11853876B2

公开(公告)日：2023-12-26

申请号：US16332961

申请日：2017-09-15

Applicant: Google LLC

Inventor： Chelsea Breanna Finn , Sergey Vladimir Levine

IPC: G06N3/08 , G06N3/008 , G06N3/044 , G06N3/045 , G06N3/04

CPC classification number: G06N3/08 , G06N3/008 , G06N3/04 , G06N3/044 , G06N3/045

Abstract: A method includes: receiving data identifying, for each of one or more objects, a respective target location to which a robotic agent interacting with a real-world environment should move the object; causing the robotic agent to move the one or more objects to the one or more target locations by repeatedly performing the following: receiving a current image of a current state of the real-world environment; determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current action and an action to be performed by the robotic agent; and directing the robotic agent to perform the next sequence of actions.

5.

发明申请
REINFORCEMENT LEARNING USING ADVANTAGE ESTIMATES 有权

公开(公告)号：US20220284266A1

公开(公告)日：2022-09-08

申请号：US17704721

申请日：2022-03-25

Applicant: Google LLC

Inventor： Shixiang Gu , Timothy Paul Lillicrap , Ilya Sutskever , Sergey Vladimir Levine

IPC: G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.

6.

发明授权
Agent navigation using visual inputs 有权

公开(公告)号：US11010948B2

公开(公告)日：2021-05-18

申请号：US16485140

申请日：2018-02-09

Applicant: GOOGLE LLC

Inventor： Rahul Sukthankar , Saurabh Gupta , James Christopher Davidson , Sergey Vladimir Levine , Jitendra Malik

IPC: G06T11/60 , G01C21/32 , G06K9/00 , G06T3/00 , G06T7/20

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for navigation using visual inputs. One of the systems includes a mapping subsystem configured to, at each time step of a plurality of time steps, generate a characterization of an environment from an image of the environment at the time step, wherein the characterization comprises an environment map identifying locations in the environment having a particular characteristic, and wherein generating the characterization comprises, for each time step: obtaining the image of the environment at the time step, processing the image to generate a first initial characterization for the time step, obtaining a final characterization for a previous time step, processing the characterization for the previous time step to generate a second initial characterization for the time step, and combining the first initial characterization and the second initial characterization to generate a final characterization for the time step.

7.

发明申请
REINFORCEMENT LEARNING ALGORITHM SEARCH 有权

公开(公告)号：US20220391687A1

公开(公告)日：2022-12-08

申请号：US17338093

申请日：2021-06-03

Applicant: Google LLC

Inventor： John Dalton Co-Reyes , Yingjie Miao , Daiyi Peng , Sergey Vladimir Levine , Quoc V. Le , Honglak Lee , Aleksandra Faust

IPC: G06N3/08 , G06F11/34 , G06F16/901

Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for generating and searching reinforcement learning algorithms. In some implementations, a computer-implemented system generates a sequence of candidate reinforcement learning algorithms. Each candidate reinforcement learning algorithm in the sequence is configured to receive an input environment state characterizing a state of an environment and to generate an output that specifies an action to be performed by an agent interacting with the environment. For each candidate reinforcement learning algorithm in the sequence, the system performs a performance evaluation for a set of a plurality of training environments. For each training environment, the system adjusts a set of environment-specific parameters of the candidate reinforcement learning algorithm by performing training of the candidate reinforcement learning algorithm to control a corresponding agent in the training environment. The system generates an environment-specific performance metric for the candidate reinforcement learning algorithm that measures a performance of the candidate reinforcement learning algorithm in controlling the corresponding agent in the training environment as a result of the training. After performing training in the set of training environments, the system generates a summary performance metric for the candidate reinforcement learning algorithm by combining the environment-specific performance metrics generated for the set of training environments. After evaluating each of the candidate reinforcement learning algorithms in the sequence, the system selects one or more output reinforcement learning algorithms from the sequence based on the summary performance metrics of the candidate reinforcement learning algorithms.

8.

发明授权
Using simulation and domain adaptation for robotic control 有权

公开(公告)号：US11341364B2

公开(公告)日：2022-05-24

申请号：US16649599

申请日：2018-09-20

Applicant: GOOGLE LLC

Inventor： Konstantinos Bousmalis , Alexander Irpan , Paul Wohlhart , Yunfei Bai , Mrinal Kalakrishnan , Julian Ibarz , Sergey Vladimir Levine , Kurt Konolige , Vincent O. Vanhoucke , Matthew Laurance Kelcey

IPC: G06K9/62 , B25J9/16 , G06N3/04 , G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network that is used to control a robotic agent interacting with a real-world environment.

9.

发明授权
Reinforcement learning using advantage estimates 有权

公开(公告)号：US11288568B2

公开(公告)日：2022-03-29

申请号：US15429088

申请日：2017-02-09

Applicant: Google LLC

Inventor： Shixiang Gu , Timothy Paul Lillicrap , Ilya Sutskever , Sergey Vladimir Levine

IPC: G06N3/04 , G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.

10.

发明公开
Pixel-Level Video Prediction with Improved Performance and Efficiency 审中-公开

公开(公告)号：US20230239499A1

公开(公告)日：2023-07-27

申请号：US18011922

申请日：2022-05-27

Applicant: Google LLC

Inventor： Mohammad Babaeizadeh , Chelsea Breanna Finn , Dumitru Erhan , Mohammad Taghi Saffar , Sergey Vladimir Levine , Suraj Nair

IPC: H04N19/59 , H04N19/117 , H04N19/176 , H04N19/42 , G06V10/70

CPC classification number: H04N19/59 , H04N19/117 , H04N19/176 , H04N19/42 , G06V10/70

Abstract: One aspect provides a machine-learned video prediction model configured to receive and process one or more previous video frames to generate one or more predicted subsequent video frames, wherein the machine-learned video prediction model comprises a convolutional variational auto encoder, and wherein the convolutional variational auto encoder comprises an encoder portion comprising one or more encoding cells and a decoder portion comprising one or more decoding cells.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification