Publication No.: US20250165793A1
Publication Date: 2025-05-22
Application No.: US18393464
Filing Date: 2023-12-21
Applicant: Industrial Technology Research Institute
Inventor: Chia-Hsiang Yang, Shih-Hao Chen, Chih-Wei Liu
IPC: G06N3/092
Abstract: An algorithm method for deep reinforcement learning includes initializing an environment and a model; executing an experience collection process and a network update process in parallel; determining whether the experience collection process and the network update process have reached a termination condition; continuing to execute the experience collection process and the network update process in parallel in response to neither process having met the termination condition; and stopping execution of the experience collection process and the network update process in response to either process having met the termination condition. The experience collection process includes obtaining a current state of the environment; determining a current action from the current observation values according to a current policy of the model; and returning the current action to the environment.
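The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation in the application: the names `ToyEnv`, `ToyModel`, `collect_experience`, `update_network`, and `run` are hypothetical, the environment and the "network" are toy stand-ins, and the step and update budgets used as termination conditions are assumptions. The sketch only shows two processes running in parallel and both stopping as soon as either one meets its termination condition.

```python
import queue
import random
import threading


class ToyEnv:
    """Hypothetical 1-D environment: the state is an integer, reward favors 0."""

    def __init__(self):
        self.state = 5

    def observe(self):
        return self.state

    def step(self, action):
        self.state += action
        return -abs(self.state)  # reward


class ToyModel:
    """Hypothetical model: one exploration rate stands in for network weights."""

    def __init__(self):
        self.epsilon = 0.5
        self.lock = threading.Lock()

    def act(self, observation):
        with self.lock:
            epsilon = self.epsilon
        if random.random() < epsilon:
            return random.choice((-1, 1))  # explore
        return -1 if observation > 0 else 1  # move toward 0

    def update(self, experience):
        # Stand-in for a gradient step: decay exploration per consumed sample.
        with self.lock:
            self.epsilon *= 0.99


def collect_experience(env, model, buffer, stop, stats, max_steps):
    while not stop.is_set():
        obs = env.observe()              # obtain the current state
        action = model.act(obs)          # current policy picks the action
        reward = env.step(action)        # return the action to the environment
        buffer.put((obs, action, reward))
        stats["steps"] += 1
        if stats["steps"] >= max_steps:  # assumed termination condition
            stop.set()


def update_network(model, buffer, stop, stats, max_updates):
    while not stop.is_set():
        try:
            experience = buffer.get(timeout=0.01)
        except queue.Empty:
            continue                     # no experience yet, check stop again
        model.update(experience)
        stats["updates"] += 1
        if stats["updates"] >= max_updates:  # assumed termination condition
            stop.set()


def run(max_steps=200, max_updates=50):
    env, model = ToyEnv(), ToyModel()    # initialize environment and model
    buffer, stop = queue.Queue(), threading.Event()
    stats = {"steps": 0, "updates": 0}
    workers = [
        threading.Thread(target=collect_experience,
                         args=(env, model, buffer, stop, stats, max_steps)),
        threading.Thread(target=update_network,
                         args=(model, buffer, stop, stats, max_updates)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()                         # both stop once either one terminates
    return stats


if __name__ == "__main__":
    print(run())
```

A shared `threading.Event` is one simple way to express "stop both when either finishes"; which budget is reached first depends on thread scheduling, so the final counts vary from run to run.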