ADAPTIVE LOOKAHEAD FOR PLANNING AND LEARNING

    Publication number: US20230237342A1

    Publication date: 2023-07-27

    Application number: US18158920

    Filing date: 2023-01-24

    CPC classification number: G06N3/092

    Abstract: A method is performed by an agent operating in an environment. The method comprises computing a first value associated with each state of a number of states in the environment; determining a lookahead horizon for each of those states based on the first value computed for that state; applying a first policy to compute a second value associated with at least one of the states based on the determined lookahead horizons; and determining a second policy based on the first policy and the second value for each state of the number of states in the environment.
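The four steps in this abstract can be illustrated on a small tabular problem. The following is a minimal sketch under stated assumptions, not the patented method: the function name `adaptive_lookahead_step`, the toy three-state chain, and in particular the horizon rule (use depth 1 where a one-step lookahead already improves on the first value, and depth `max_h` elsewhere) are hypothetical choices made for illustration. The abstract does not say how the horizon is derived from the first value. Here the first policy enters through its evaluated values, which serve as the leaf values of the lookahead.

```python
import numpy as np

def adaptive_lookahead_step(P, R, policy, gamma=0.9, max_h=3, tol=1e-6, sweeps=200):
    """One improvement step with a per-state lookahead horizon.

    P: (A, S, S) transition probabilities, R: (S, A) rewards,
    policy: (S,) integer actions (the "first policy").
    """
    n_actions, n_states, _ = P.shape
    idx = np.arange(n_states)

    # Step 1: first value = iterative evaluation of the first policy.
    P_pi = P[policy, idx]            # (S, S): row s is P[policy[s], s, :]
    r_pi = R[idx, policy]            # (S,)
    v = np.zeros(n_states)
    for _ in range(sweeps):
        v = r_pi + gamma * P_pi @ v

    # Step 2: horizon per state from the first value.
    # ASSUMED rule: shallow where one-step lookahead already helps, deep otherwise.
    q1 = R + gamma * np.einsum("ast,t->sa", P, v)
    gap = q1.max(axis=1) - v
    horizons = np.where(gap > tol, 1, max_h)

    # Step 3: second value = k-step lookahead backed by v at the leaves.
    w, q_by_depth = v.copy(), []
    for _ in range(max_h):
        q_k = R + gamma * np.einsum("ast,t->sa", P, w)
        q_by_depth.append(q_k)
        w = q_k.max(axis=1)
    q_sel = np.stack(q_by_depth)[horizons - 1, idx]   # (S, A) at each state's depth
    second_value = q_sel.max(axis=1)

    # Step 4: second policy = greedy with respect to the selected lookahead.
    new_policy = q_sel.argmax(axis=1)
    return v, horizons, second_value, new_policy

# Toy chain 0 -> 1 -> 2; action 0 stays, action 1 moves right; state 2 pays 1 per step.
P = np.zeros((2, 3, 3))
P[0] = np.eye(3)
P[1, 0, 1] = P[1, 1, 2] = P[1, 2, 2] = 1.0
R = np.zeros((3, 2))
R[2, :] = 1.0
first_policy = np.zeros(3, dtype=int)   # always stay
v, horizons, second_value, new_policy = adaptive_lookahead_step(P, R, first_policy)
```

In this toy chain, a one-step lookahead from state 0 sees no reward, so a uniform horizon of 1 would leave the policy there unchanged. Under the assumed rule, state 0 receives the deeper horizon and the second policy moves right.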

    METHOD FOR FAST AND BETTER TREE SEARCH FOR REINFORCEMENT LEARNING

    Publication number: US20220398283A1

    Publication date: 2022-12-15

    Application number: US17824680

    Filing date: 2022-05-25

    Abstract: A method for performing a Tree-Search (TS) on an environment is provided. The method comprises generating a tree for a current state of the environment based on a TS policy, determining a corrected TS policy, and determining an action to apply to the environment based on the corrected TS policy. The tree comprises a plurality of nodes, including a root node corresponding to the current state of the environment. Each node other than the root node corresponds to an estimated future state of the environment. The nodes in the tree are connected by a plurality of edges. Each edge is associated with an action causing a transition from a first state to a different state of the environment.
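The tree structure described here (a root for the current state, child nodes for estimated future states, one action per edge) can be sketched as below. This is an illustrative sketch only. The names `Node`, `build_tree`, `best_leaf_return`, `uncorrected_root_action`, and the toy model `toy_step` are hypothetical, and the model is assumed to be deterministic. The abstract does not specify how the corrected TS policy is computed, so the sketch stops at a plain, uncorrected action choice and does not implement the correction.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    state: int
    ret: float = 0.0                                # discounted reward from the root
    children: dict = field(default_factory=dict)    # action -> child Node (one edge each)

def build_tree(root_state, step, actions, depth, gamma=0.99):
    """Expand every action at every node, breadth-first, down to `depth`."""
    root = Node(root_state)
    frontier = [root]
    for d in range(depth):
        next_frontier = []
        for node in frontier:
            for a in actions:
                next_state, reward = step(node.state, a)   # estimated future state
                child = Node(next_state, node.ret + (gamma ** d) * reward)
                node.children[a] = child
                next_frontier.append(child)
        frontier = next_frontier
    return root

def best_leaf_return(node):
    """Largest accumulated return among the leaves below `node`."""
    if not node.children:
        return node.ret
    return max(best_leaf_return(c) for c in node.children.values())

def uncorrected_root_action(root):
    """Plain tree-search choice: the root edge leading to the best leaf."""
    return max(root.children, key=lambda a: best_leaf_return(root.children[a]))

def toy_step(state, action):
    """Hypothetical model: walk on the integers, reward 1 on arriving at 3."""
    next_state = state + action
    return next_state, 1.0 if next_state == 3 else 0.0

root = build_tree(0, toy_step, actions=(-1, 1), depth=3)
action = uncorrected_root_action(root)
```

A corrected policy, as the abstract names it, would replace `uncorrected_root_action` with a rule that adjusts the node scores before the root action is chosen.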
