Patent application
US20100094786A1 Smoothed Sarsa: Reinforcement Learning for Robot Delivery Tasks (in force)

Smoothed Sarsa: Reinforcement Learning for Robot Delivery Tasks
Abstract:
The present invention provides a method for learning a policy used by a computing system to perform a task, such as delivery of one or more objects by the computing system. During a first time interval, the computing system determines a first state, a first action and a first reward value. As the computing system determines different states, actions and reward values during subsequent time intervals, a state description identifying the current state, the current action, the current reward and a predicted action is stored. Responsive to a variance of a stored state description falling below a threshold value, the stored state description is used to modify one or more weights in the policy associated with the first state.
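
The deferred update described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the linear value function, the feature vectors, the `variance` field, the learning rate `alpha`, the discount `gamma`, and all names (`Description`, `apply_ready_updates`) are assumptions made for illustration. Each stored description waits in a backlog until its variance falls below the threshold, and only then drives a standard Sarsa weight update.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Description:
    """One stored state description (hypothetical layout, not from the patent)."""
    features: List[float]       # features of the (state, action) pair
    reward: float               # reward observed for that pair
    next_features: List[float]  # features of the (next state, predicted action) pair
    variance: float             # current uncertainty of the stored state estimate


def q_value(weights: List[float], features: List[float]) -> float:
    """Linear action-value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def apply_ready_updates(weights, backlog, threshold, alpha=0.1, gamma=0.9):
    """Apply a Sarsa update for each description whose variance is below
    the threshold; return the descriptions that are still too uncertain."""
    remaining = []
    for d in backlog:
        if d.variance < threshold:
            # Sarsa temporal-difference error: r + gamma * Q(s', a') - Q(s, a)
            td_error = (d.reward
                        + gamma * q_value(weights, d.next_features)
                        - q_value(weights, d.features))
            # Gradient step on the linear weights (modified in place)
            for i, f in enumerate(d.features):
                weights[i] += alpha * td_error * f
        else:
            remaining.append(d)
    return remaining


weights = [0.0, 0.0]
backlog = [
    Description([1.0, 0.0], 1.0, [0.0, 1.0], variance=0.01),  # certain enough
    Description([0.0, 1.0], 2.0, [1.0, 0.0], variance=0.5),   # still uncertain
]
backlog = apply_ready_updates(weights, backlog, threshold=0.1)
```

In this sketch only the first description is applied; the second stays in the backlog until some later estimate, assumed here to be refreshed by the caller, lowers its variance.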