Offline Training Method and System for an EADP Controller, and Online Control Method and System Thereof

Invention Publication

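CN105513380A Offline training method and system for an EADP controller, and online control method and system thereof. Legal status: in force; transferred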

Patent title: Offline training method and system for an EADP controller, and online control method and system thereof
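Application number: CN201511009719.2

Filing date: 2015-12-29
Publication number: CN105513380A

Publication date: 2016-04-20
Inventors: Wang Feiyue (王飞跃), Liu Yuliang (刘裕良), Lü Yisheng (吕宜生), Duan Yanjie (段艳杰), Chen Songhang (陈松航)
Applicant: Institute of Automation, Chinese Academy of Sciences
Applicant address: No. 95 Zhongguancun East Road, Haidian District, Beijing
Patentee: Institute of Automation, Chinese Academy of Sciences
Current patentee: Qingdao Huituo Intelligent Machine Co., Ltd. (青岛慧拓智能机器有限公司)
Current patentee address: No. 95 Zhongguancun East Road, Haidian District, Beijing
Agency: Beijing Bowei Intellectual Property Agency (北京博维知识产权代理事务所)
Agent: Guo Wenhao (郭文浩)
Main classification: G08G1/08
IPC classifications: G08G1/08; G08G1/01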

Abstract:

The invention discloses an offline training method and system for an EADP controller used for traffic signal control at intersections, together with an online control method and system for the EADP controller. The offline training method comprises: determining a reward function, system control parameters and a performance index from the obtained system state and from the constructed Action network and Critic network of each sub-ADP controller; alternately training the Critic network of each sub-ADP controller according to the performance index and the reward function, and the Action network of each sub-ADP controller according to the performance index and the system control parameters, so as to update the weights of the Critic networks and the Action networks; and, once training is determined to have reached the training target, recording the Action network weights and Critic network weights of each sub-ADP controller. Embodiments of the invention thereby address the technical problem that the stability of a conventional ADP controller is difficult to guarantee, and achieve adaptive control of traffic signals.
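The alternating Critic/Action training described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the scalar plant (`A`, `B`), the quadratic reward, the single-weight "networks", the learning rates, and the stopping tolerance are all assumptions made only so that the example runs. Here the Critic output stands in for the performance index and the Action output for the system control parameter.

```python
import random

# Hypothetical scalar plant and cost, chosen only to keep the sketch runnable.
A, B = 0.9, 0.5          # assumed plant: x_next = A*x + B*u
GAMMA, RHO = 0.9, 0.1    # assumed discount factor and control-effort weight


def reward(x, u):
    """Assumed reward (cost): penalise the state and the control effort."""
    return x * x + RHO * u * u


def train_sub_adp(seed, lr_c=0.1, lr_a=0.1, steps=20, tol=1e-3, max_epochs=2000):
    """Alternately train one sub-controller's Critic and Action weights."""
    rng = random.Random(seed)
    wc = rng.uniform(0.0, 1.0)    # Critic "network": J(x) = wc * x^2
    wa = rng.uniform(-0.5, 0.0)   # Action "network": u(x) = wa * x
    for epoch in range(max_epochs):
        # Critic phase: shrink the error between J(x) and reward + GAMMA*J(x_next).
        err_sum = 0.0
        for _ in range(steps):
            x = rng.uniform(-1.0, 1.0)
            u = wa * x
            x_next = A * x + B * u
            e_c = wc * x * x - (reward(x, u) + GAMMA * wc * x_next * x_next)
            wc -= lr_c * e_c * x * x
            err_sum += abs(e_c)
        # Action phase: descend the gradient of reward + GAMMA*J(x_next) w.r.t. wa.
        for _ in range(steps):
            x = rng.uniform(-1.0, 1.0)
            u = wa * x
            x_next = A * x + B * u
            grad = (2.0 * RHO * u + 2.0 * GAMMA * wc * x_next * B) * x
            wa -= lr_a * grad
        # Hypothetical training target: mean absolute Critic error below tol.
        if err_sum / steps < tol:
            break
    return {"critic": wc, "action": wa, "epochs": epoch + 1}


# Record the trained weights of every sub-controller once training stops.
recorded = [train_sub_adp(seed) for seed in range(3)]
for i, weights in enumerate(recorded):
    print(i, weights)
```

Each sub-controller here differs only in its random seed. How the sub-ADP controllers are combined during online control is not stated in the abstract, so the sketch stops at recording their weights.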

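Publication / grant documents

CN105513380B Offline training method and system for an EADP controller, and online control method and system thereof. Publication/grant date: 2018-07-31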

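IPC classification:

G	Physics
G08	Signalling
G08G	Traffic control systems (guiding railway traffic, ensuring the safety of railway traffic: see B61L; radar or analogous systems, sonar systems or lidar systems specially adapted for traffic control: see G01S13/91, G01S15/88, G01S17/88; radar or analogous systems, sonar systems or lidar systems specially adapted for anti-collision purposes: see G01S13/93, G01S15/93, G01S17/93; control of position, course, altitude or attitude of land, water, air or space vehicles, not specific to a traffic environment: see G05D1/00)
G08G1/00	Traffic control systems for road vehicles (road signs or traffic signal devices: see E01F9/00)
G08G1/07	. Controlling traffic signals
G08G1/08	. . according to detected number or speed of vehicles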