Publication No.: US11562287B2
Publication Date: 2023-01-24
Application No.: US15885727
Application Date: 2018-01-31
Applicant: salesforce.com, inc.
Inventor: Caiming Xiong, Tianmin Shu, Richard Socher
Abstract: The disclosed technology provides a hierarchical policy network, for use by a software agent, to accomplish an objective that requires execution of multiple tasks. A terminal policy is learned by training the agent on a terminal task set, which serves as the base task set of the intermediate task set. An intermediate policy, learned by training the agent on the intermediate task set, serves as the base policy of the top policy. A top policy is learned by training the agent on a top task set, whose base task set is the intermediate task set. The agent is configurable to accomplish the objective by traversing the hierarchical policy network. A current task in a current task set is executed either by executing a previously learned task, selected from the corresponding base task set and governed by the corresponding base policy, or by performing a primitive action selected from a library of primitive actions.
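
The traversal described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the learned policies are replaced by fixed lookup tables, and every name in it (`Policy`, `plans`, `execute`, the task names, and the primitive actions) is a hypothetical chosen for illustration. Each policy either delegates a step to its base policy or emits a primitive action.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# A step is ("base", task_name) to delegate to the base policy,
# or ("primitive", action_name) to act directly.
Step = Tuple[str, str]


@dataclass
class Policy:
    # Hypothetical stand-in for a learned policy: a fixed plan per task.
    plans: Dict[str, List[Step]]
    # The policy one level down; None for the terminal policy.
    base: Optional["Policy"] = None


def execute(policy: Policy, task: str, trace: List[str]) -> List[str]:
    """Run `task` under `policy`, appending primitive actions to `trace`."""
    for kind, target in policy.plans[task]:
        if kind == "base":
            if policy.base is None:
                raise ValueError("terminal policy has no base policy")
            # Delegate to a previously learned task in the base task set.
            execute(policy.base, target, trace)
        else:
            # Perform a primitive action from the action library.
            trace.append(target)
    return trace


# Toy three-level hierarchy (assumed tasks and actions, not from the patent).
terminal = Policy(
    plans={"find block": [("primitive", "turn_left"),
                          ("primitive", "move_forward")]}
)
intermediate = Policy(
    plans={"get block": [("base", "find block"),
                         ("primitive", "pick_up")]},
    base=terminal,
)
top = Policy(
    plans={"stack block": [("base", "get block"),
                           ("primitive", "put_down")]},
    base=intermediate,
)

actions = execute(top, "stack block", [])
print(actions)
```

In this sketch the top task unrolls through the intermediate and terminal levels into a flat sequence of primitive actions. A learned system would replace each lookup table with a policy that chooses between delegation and a primitive action based on the agent's current state.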