Mirror loss neural networks
    Granted patent

    Publication No.: US11853895B2

    Publication date: 2023-12-26

    Application No.: US17893454

    Filing date: 2022-08-23

    Applicant: Google LLC

    Inventor: Pierre Sermanet

    Abstract: This description relates to a neural network that has multiple network parameters and is configured to receive an input observation characterizing a state of an environment and to process the input observation to generate a numeric embedding of the state of the environment. The neural network can be used to control a robotic agent. The network can be trained using a method comprising: obtaining a first observation captured by a first modality; obtaining a second observation that is co-occurring with the first observation and that is captured by a second, different modality; obtaining a third observation captured by the first modality that is not co-occurring with the first observation; determining a gradient of a triplet loss that uses the first observation, the second observation, and the third observation; and updating current values of the network parameters using the gradient of the triplet loss.
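
The training step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the embedding network is reduced to a single linear map `W`, the loss is the common squared-Euclidean hinge form of a triplet loss, and the margin, learning rate, and input vectors are hypothetical. The first observation acts as the anchor, the co-occurring observation from the second modality as the positive, and the non-co-occurring observation from the first modality as the negative.

```python
import numpy as np


def embed(W, x):
    """Toy embedding network: a single linear map (an assumption, not the patent's architecture)."""
    return W @ x


def triplet_loss_and_grad(W, anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on squared Euclidean distances, with its gradient w.r.t. W.

    loss = max(0, ||f(a) - f(p)||^2 - ||f(a) - f(n)||^2 + margin)
    """
    d_pos = embed(W, anchor) - embed(W, positive)  # equals W @ (a - p)
    d_neg = embed(W, anchor) - embed(W, negative)  # equals W @ (a - n)
    loss = max(0.0, float(d_pos @ d_pos - d_neg @ d_neg + margin))
    if loss == 0.0:
        # The triplet already satisfies the margin, so it contributes no gradient.
        return 0.0, np.zeros_like(W)
    # d/dW ||W u||^2 = 2 (W u) u^T, applied to both distance terms.
    grad = 2.0 * (np.outer(d_pos, anchor - positive) - np.outer(d_neg, anchor - negative))
    return loss, grad


# Hypothetical observations, flattened to 3-dimensional feature vectors.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
first_obs = np.array([1.0, 0.0, 0.0])   # first modality (anchor)
second_obs = np.array([0.0, 1.0, 0.0])  # second modality, co-occurring (positive)
third_obs = np.array([1.0, 0.5, 0.0])   # first modality, not co-occurring (negative)

learning_rate = 0.1  # hypothetical value
loss_before, grad = triplet_loss_and_grad(W, first_obs, second_obs, third_obs)
W_updated = W - learning_rate * grad  # one gradient step on the network parameters
loss_after, _ = triplet_loss_and_grad(W_updated, first_obs, second_obs, third_obs)
print(loss_before, loss_after)
```

A single gradient step pulls the embeddings of the co-occurring pair together and pushes the non-co-occurring observation away, so the loss on this triplet decreases. A practical system would replace the linear map with a deep network and compute the gradient by automatic differentiation over batches of triplets.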

    TRAINING AND/OR UTILIZING MACHINE LEARNING MODEL(S) FOR USE IN NATURAL LANGUAGE BASED ROBOTIC CONTROL

    Publication No.: US20230182296A1

    Publication date: 2023-06-15

    Application No.: US17924891

    Filing date: 2021-05-14

    Applicant: GOOGLE LLC

    CPC classification number: B25J9/1664 B25J9/163 B25J9/1697

    Abstract: Techniques are disclosed that enable training a goal-conditioned policy based on multiple data sets, where each of the data sets describes a robot task in a different way. For example, the multiple data sets can include: a goal image data set, where the task is captured in the goal image; a natural language instruction data set, where the task is described in the natural language instruction; a task ID data set, where the task is described by the task ID, etc. In various implementations, each of the multiple data sets has a corresponding encoder, where the encoders are trained to generate a shared latent space representation of the corresponding task description. Additional or alternative techniques are disclosed that enable control of a robot using a goal-conditioned policy network. For example, the robot can be controlled, using the goal-conditioned policy network, based on free-form natural language input describing robot task(s).

    UNSUPERVISED DETECTION OF INTERMEDIATE REINFORCEMENT LEARNING GOALS

    Publication No.: US20190332920A1

    Publication date: 2019-10-31

    Application No.: US16347651

    Filing date: 2017-11-06

    Applicant: GOOGLE LLC

    Inventor: Pierre Sermanet

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting intermediate reinforcement learning goals. One of the methods includes obtaining a plurality of demonstration sequences, each of the demonstration sequences being a sequence of images of an environment while a respective instance of a reinforcement learning task is being performed; for each demonstration sequence, processing each image in the demonstration sequence through an image processing neural network to determine feature values for a respective set of features for the image; determining, from the demonstration sequences, a partitioning of the reinforcement learning task into a plurality of subtasks, wherein each image in each demonstration sequence is assigned to a respective subtask of the plurality of subtasks; and determining, from the feature values for the images in the demonstration sequences, a respective set of discriminative features for each of the plurality of subtasks.

    MIRROR LOSS NEURAL NETWORKS
    Patent application

    Publication No.: US20190314985A1

    Publication date: 2019-10-17

    Application No.: US16468987

    Filing date: 2018-03-19

    Applicant: Google LLC

    Inventor: Pierre Sermanet

    Abstract: This description relates to a neural network that has multiple network parameters and is configured to receive an input observation characterizing a state of an environment and to process the input observation to generate a numeric embedding of the state of the environment. The neural network can be used to control a robotic agent. The network can be trained using a method comprising: obtaining a first observation captured by a first modality; obtaining a second observation that is co-occurring with the first observation and that is captured by a second, different modality; obtaining a third observation captured by the first modality that is not co-occurring with the first observation; determining a gradient of a triplet loss that uses the first observation, the second observation, and the third observation; and updating current values of the network parameters using the gradient of the triplet loss.
