GATED LINEAR CONTEXTUAL BANDITS
    Invention application

    Publication number: US20230079338A1

    Publication date: 2023-03-16

    Application number: US17766854

    Application date: 2020-10-08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for training a neural network to control a real-world agent interacting with a real-world environment to cause the real-world agent to perform a particular task. One of the methods includes training the neural network to determine first values of the parameters by optimizing a first task-specific objective that measures a performance of the policy neural network in controlling a simulated version of the real-world agent; obtaining real-world data generated from interactions of the real-world agent with the real-world environment; and training the neural network to determine trained values of the parameters from the first values of the parameters by jointly optimizing (i) a self-supervised objective that measures at least a performance of internal representations generated by the neural network on a self-supervised task performed on the real-world data and (ii) a second task-specific objective.

    POPULATION-BASED TRAINING OF MACHINE LEARNING MODELS

    Publication number: US20210097443A1

    Publication date: 2021-04-01

    Application number: US16586236

    Application date: 2019-09-27

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. A method includes: maintaining a plurality of training sessions; assigning, to each worker of one or more workers, a respective training session of the plurality of training sessions; repeatedly performing operations until meeting one or more termination criteria, the operations comprising: receiving an updated training session from a respective worker of the one or more workers, selecting a second training session, selecting, based on comparing the updated training session and the second training session using a fitness evaluation function, either the updated training session or the second training session as a parent training session, generating a child training session from the selected parent training session, and assigning the child training session to an available worker, and selecting a candidate model to be a trained model for the machine learning model.
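The loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy fitness function, the single simulated worker, the hyperparameter perturbation used to generate a child, and the rule for dropping the weakest session are all assumptions made for illustration, and every name (`TrainingSession`, `worker_train`, `make_child`, `run_pbt`) is hypothetical.

```python
import random
from dataclasses import dataclass, replace


@dataclass
class TrainingSession:
    weight: float                   # stand-in for model parameters
    learning_rate: float            # hyperparameter carried by the session
    fitness: float = float("-inf")


def fitness_fn(session):
    # Toy fitness evaluation function (assumption): higher is better,
    # with a maximum of 0.0 at weight == 3.0.
    return -(session.weight - 3.0) ** 2


def worker_train(session, steps=5):
    # Hypothetical worker: a few gradient-ascent steps on the toy fitness.
    weight = session.weight
    for _ in range(steps):
        weight += session.learning_rate * (-2.0 * (weight - 3.0))
    updated = replace(session, weight=weight)
    updated.fitness = fitness_fn(updated)
    return updated


def make_child(parent, rng):
    # Assumed child generation: copy the parent, perturb its hyperparameter.
    factor = rng.choice([0.8, 1.2])
    return replace(parent, learning_rate=parent.learning_rate * factor)


def run_pbt(population, rounds=30, seed=0):
    rng = random.Random(seed)
    max_size = len(population)
    # Maintain a plurality of training sessions, each with a known fitness.
    population = [replace(s, fitness=fitness_fn(s)) for s in population]
    assigned = population[0]                  # session assigned to the one worker
    for _ in range(rounds):                   # termination criterion: fixed budget
        updated = worker_train(assigned)      # receive an updated session
        second = rng.choice(population)       # select a second session
        # Compare the two sessions by fitness and keep the better one as parent.
        parent = updated if updated.fitness >= second.fitness else second
        population.append(updated)
        if len(population) > max_size:        # assumed bookkeeping: drop the weakest
            population.remove(min(population, key=lambda s: s.fitness))
        assigned = make_child(parent, rng)    # child goes to the available worker
    # Select the fittest candidate as the trained model.
    return max(population, key=lambda s: s.fitness)
```

A call such as `run_pbt([TrainingSession(-4.0, 0.05), TrainingSession(0.0, 0.05), TrainingSession(8.0, 0.05)])` returns the fittest session left in the population. Because the weakest session is the one dropped each round, the returned fitness is never worse than the best starting fitness.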

    Reinforcement learning using distributed prioritized replay

    Publication number: US11625604B2

    Publication date: 2023-04-11

    Application number: US16641751

    Application date: 2018-10-29

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.

    REINFORCEMENT LEARNING USING DISTRIBUTED PRIORITIZED REPLAY

    Publication number: US20200265305A1

    Publication date: 2020-08-20

    Application number: US16641751

    Application date: 2018-10-29

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.

    Population-based training of machine learning models

    Publication number: US11907821B2

    Publication date: 2024-02-20

    Application number: US16586236

    Application date: 2019-09-27

    CPC classification number: G06N20/20 G06F16/9024 G06N5/04

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. A method includes: maintaining a plurality of training sessions; assigning, to each worker of one or more workers, a respective training session of the plurality of training sessions; repeatedly performing operations until meeting one or more termination criteria, the operations comprising: receiving an updated training session from a respective worker of the one or more workers, selecting a second training session, selecting, based on comparing the updated training session and the second training session using a fitness evaluation function, either the updated training session or the second training session as a parent training session, generating a child training session from the selected parent training session, and assigning the child training session to an available worker, and selecting a candidate model to be a trained model for the machine learning model.
