Efficient off-policy credit assignment

    Publication No.: US11580445B2

    Publication Date: 2023-02-14

    Application No.: US16653890

    Filing Date: 2019-10-15

    Abstract: Systems and methods are provided for efficient off-policy credit assignment (ECA) in reinforcement learning. ECA allows principled credit assignment for off-policy samples, and therefore improves sample efficiency and asymptotic performance. One aspect of ECA is to formulate the optimization of expected return as approximate inference, in which the policy approximates a learned prior distribution; this leads to a principled way of utilizing off-policy samples. Other features are also provided.
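
The abstract does not spell out how expected-return optimization becomes approximate inference. As a minimal sketch of one standard way to read that idea (an assumption for illustration, not the formulation claimed in the patent), the Python below checks numerically that, over a small discrete set of trajectories, the entropy-regularized expected return equals a constant minus a scaled KL divergence between the policy and a reward-shaped target distribution. The function names, the temperature parameter, and the fixed target proportional to exp(R / temperature) are all hypothetical; the patent describes a learned prior, which this sketch does not model.

```python
import math


def entropy_regularized_return(policy, rewards, temperature):
    """Return E_pi[R] + temperature * H(pi) for a discrete policy."""
    expected = sum(p * r for p, r in zip(policy, rewards))
    entropy = -sum(p * math.log(p) for p in policy if p > 0)
    return expected + temperature * entropy


def log_partition(rewards, temperature):
    """Return log sum_z exp(R(z) / temperature), computed stably."""
    m = max(rewards)
    total = sum(math.exp((r - m) / temperature) for r in rewards)
    return m / temperature + math.log(total)


def reward_shaped_target(rewards, temperature):
    """Hypothetical target distribution proportional to exp(R / temperature)."""
    log_z = log_partition(rewards, temperature)
    return [math.exp(r / temperature - log_z) for r in rewards]


def kl_divergence(p, q):
    """Return KL(p || q) for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def inference_form(policy, rewards, temperature):
    """Same objective rewritten as temperature * (log Z - KL(pi || target))."""
    target = reward_shaped_target(rewards, temperature)
    return temperature * (
        log_partition(rewards, temperature) - kl_divergence(policy, target)
    )


# Illustrative values only: three candidate trajectories, one of them rewarded.
rewards = [1.0, 0.0, 0.0]
policy = [0.5, 0.3, 0.2]
temperature = 0.5

lhs = entropy_regularized_return(policy, rewards, temperature)
rhs = inference_form(policy, rewards, temperature)
```

Under these assumptions `lhs` and `rhs` agree up to floating-point error, so maximizing the regularized return is the same as minimizing the KL divergence to the target. Because the target depends only on rewards and not on which policy generated a trajectory, this view suggests why off-policy samples can still contribute to the update.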
