RECURRENT ENVIRONMENT PREDICTORS
    3.
    发明申请

    公开(公告)号:US20190266475A1

    公开(公告)日:2019-08-29

    申请号:US16403352

    申请日:2019-05-03

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

    Recurrent environment predictors
    4.
    发明授权

    公开(公告)号:US11200482B2

    公开(公告)日:2021-12-14

    申请号:US16893565

    申请日:2020-06-05

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

    Recurrent environment predictors
    5.
    发明授权

    公开(公告)号:US10713559B2

    公开(公告)日:2020-07-14

    申请号:US16403352

    申请日:2019-05-03

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

    RECURRENT ENVIRONMENT PREDICTORS
    9.
    发明申请

    公开(公告)号:US20200342289A1

    公开(公告)日:2020-10-29

    申请号:US16893565

    申请日:2020-06-05

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

Patent Agency Ranking