Patent search ap:("Google LLC") AND inv:"Dmitry Kalashnikov" Page 1

1.

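Invention publication
REAL-WORLD ROBOT CONTROL USING TRANSFORMER NEURAL NETWORKS (Status: under examination, published)

Publication (announcement) number: US20240189994A1

Publication (announcement) date: 2024-06-13

Application number: US18539171

Application date: 2023-12-13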

Applicant: Google LLC

Inventor: Keerthana P G, Karol Hausman, Julian Ibarz, Brian Ichter, Alexander Irpan, Dmitry Kalashnikov, Yao Lu, Kanury Kanishka Rao, Michael Sahngwon Ryoo, Austin Charles Stone, Teddey Ming Xiao, Quan Ho Vuong, Sumedh Anand Sontakke

IPC: B25J9/16

CPC classification number: B25J9/163, B25J9/161

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling an agent interacting with an environment. In one aspect, a method comprises: receiving a natural language text sequence that characterizes a task to be performed by the agent in the environment; generating an encoded representation of the natural language text sequence; and at each of a plurality of time steps: obtaining an observation image characterizing a state of the environment at the time step; processing the observation image to generate an encoded representation of the observation image; generating a sequence of input tokens; processing the sequence of input tokens to generate a policy output that defines an action to be performed by the agent in response to the observation image; selecting an action to be performed by the agent using the policy output; and causing the agent to perform the selected action.
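The per-time-step loop described in this abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the patented implementation: `encode_text`, `encode_image`, `policy`, `select_action`, and `run_episode` are hypothetical names, the encoders and the policy are toy stand-ins for the neural networks, and building the input tokens by concatenating text and image tokens is an assumption (the abstract does not say how the sequence is formed).

```python
from typing import Callable, List, Sequence

Token = int
Image = Sequence[Sequence[int]]


def encode_text(instruction: str) -> List[Token]:
    # Hypothetical stand-in for the text encoder: one token per word.
    return [sum(map(ord, word)) % 1000 for word in instruction.split()]


def encode_image(image: Image) -> List[Token]:
    # Hypothetical stand-in for the image encoder: one token per pixel row.
    return [sum(row) % 1000 for row in image]


def policy(tokens: Sequence[Token], num_actions: int) -> List[float]:
    # Hypothetical stand-in for the policy network: one score per action.
    total = sum(tokens)
    return [float((total * (a + 1)) % 7) for a in range(num_actions)]


def select_action(policy_output: Sequence[float]) -> int:
    # Greedy selection: index of the highest-scoring action.
    return max(range(len(policy_output)), key=policy_output.__getitem__)


def run_episode(
    instruction: str,
    get_observation: Callable[[int], Image],
    perform_action: Callable[[int], None],
    num_steps: int,
    num_actions: int = 4,
) -> List[int]:
    # The task description is encoded once, before the time-step loop.
    text_tokens = encode_text(instruction)
    actions = []
    for t in range(num_steps):
        image = get_observation(t)                # observation image at step t
        image_tokens = encode_image(image)        # encoded observation
        input_tokens = text_tokens + image_tokens # assumption: concatenation
        output = policy(input_tokens, num_actions)
        action = select_action(output)
        perform_action(action)                    # cause the agent to act
        actions.append(action)
    return actions
```

Here `get_observation` and `perform_action` are placeholder callbacks for the camera and the robot interface; in a real system they would wrap hardware or simulator calls.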

2.

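Invention application
DEEP REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATION (Status: granted, in force)

Publication (announcement) number: US20210237266A1

Publication (announcement) date: 2021-08-05

Application number: US17052679

Application date: 2019-06-14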

Applicant: Google LLC

Inventor: Dmitry Kalashnikov, Alexander Irpan, Peter Pastor Sampedro, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Sergey Levine

IPC: B25J9/16, G06N3/08

Abstract: Using large-scale reinforcement learning to train a policy model that can be utilized by a robot in performing a robotic task in which the robot interacts with one or more environmental objects. In various implementations, off-policy deep reinforcement learning is used to train the policy model, and the off-policy deep reinforcement learning is based on self-supervised data collection. The policy model can be a neural network model. Implementations of the reinforcement learning utilized in training the neural network model utilize a continuous-action variant of Q-learning. Through techniques disclosed herein, implementations can learn policies that generalize effectively to previously unseen objects, previously unseen environments, etc.
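To illustrate what a continuous-action variant of Q-learning involves, the sketch below computes a one-step Q-learning target when the action space is continuous, so the maximum over actions cannot be enumerated. The abstract does not state how that maximum is approximated; using a small cross-entropy-method search here is an assumption, and `approx_max_q`, `td_target`, the action bounds of [-1, 1], and all hyperparameters are hypothetical choices for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Action = List[float]
QFunction = Callable[[Sequence[float], Action], float]


def approx_max_q(
    q_fn: QFunction,
    state: Sequence[float],
    action_dim: int,
    rng: random.Random,
    num_samples: int = 64,
    num_elites: int = 6,
    iterations: int = 3,
) -> Tuple[Action, float]:
    # Assumption: approximate max_a Q(state, a) by sampling actions from a
    # Gaussian, keeping the best ones, and refitting the Gaussian to them.
    mean = [0.0] * action_dim
    std = [1.0] * action_dim
    best_action: Action = list(mean)
    best_value = q_fn(state, best_action)
    for _ in range(iterations):
        samples = [
            [max(-1.0, min(1.0, rng.gauss(m, s))) for m, s in zip(mean, std)]
            for _ in range(num_samples)
        ]
        scored = sorted(
            ((q_fn(state, a), a) for a in samples),
            key=lambda pair: pair[0],
            reverse=True,
        )
        elites = [a for _, a in scored[:num_elites]]
        if scored[0][0] > best_value:
            best_value, best_action = scored[0]
        # Refit the sampling distribution to the elite actions.
        mean = [sum(a[i] for a in elites) / num_elites for i in range(action_dim)]
        std = [
            max(1e-3, (sum((a[i] - mean[i]) ** 2 for a in elites) / num_elites) ** 0.5)
            for i in range(action_dim)
        ]
    return best_action, best_value


def td_target(
    reward: float,
    next_state: Sequence[float],
    done: bool,
    q_fn: QFunction,
    action_dim: int,
    rng: random.Random,
    gamma: float = 0.9,
) -> float:
    # Standard Q-learning target: r + gamma * max_a' Q(s', a'),
    # with the max replaced by the sampled approximation above.
    if done:
        return reward
    _, next_value = approx_max_q(q_fn, next_state, action_dim, rng)
    return reward + gamma * next_value
```

Because the target is computed from stored transitions and any Q-function, it does not depend on which policy collected the data, which is what makes the update off-policy.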
