A Lateral Control Method for Unmanned Ships Based on a Reinforcement Learning Algorithm

Granted invention patent


Patent title: A Lateral Control Method for Unmanned Ships Based on a Reinforcement Learning Algorithm
Application No.: CN201710458496.0

Filing date: 2017-06-16
Publication (announcement) No.: CN107346138B

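Publication (announcement) date: 2020-05-05
Inventors: Zhao Dongming (赵东明), Zhou Hao (周浩), Zhu Kai (朱楷), Liu Xin (柳欣)
Applicant: Wuhan University of Technology (武汉理工大学)
Applicant address: No. 122 Luoshi Road, Hongshan District, Wuhan, Hubei Province
Patentee: Wuhan University of Technology (武汉理工大学)
Current patentee: Wuhan University of Technology (武汉理工大学)
Current patentee address: No. 122 Luoshi Road, Hongshan District, Wuhan, Hubei Province
Agency: Hubei Wuhan Yongjia Patent Agency Co., Ltd. (湖北武汉永嘉专利代理有限公司)
Agent: Zhang Huiling (张惠玲)
Main classification: G05D1/02
IPC classifications: G05D1/02; G05B13/04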

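Abstract:

The invention discloses a lateral control method for unmanned ships based on a reinforcement learning algorithm. A dynamic model of the unmanned ship's motion is established using a simplified integrated hull-path model. A design scheme for the lateral controller is selected according to the lateral control performance requirements. The reinforcement learning controller within the lateral controller adopts an Actor-Critic structure consisting of an actor network and a critic network. A reference model for the ship's lateral deviation is designed. By optimizing a performance index, the system state or output is made to follow the state of the reference model, which in turn ensures optimized system performance. The advantage of the invention is that reinforcement learning, based on the "trial and error" principle from animal learning psychology, lets the unmanned ship optimize its sequential decisions from evaluative feedback signals while interacting with the environment, so the method can address certain optimal control problems to which supervised learning is difficult to apply.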

Abstract (English, as published):

The invention discloses an unmanned ship lateral control method based on a reinforcement learning algorithm, including the following steps: a dynamic model of an unmanned ship is built, and a simplified hull-path integrated model is used; a design scheme of an unmanned ship lateral controller is selected according to the requirement for unmanned ship lateral control performance; a reinforcement learning controller in the unmanned ship lateral controller is of an Actor-Critic structure, and is divided into an actor network and a critic network; a lateral deviation reference model of the unmanned ship is designed; and through performance index optimization, the system state or output follows the state of the reference model, and thus the system performance is optimized. The advantage is as follows: reinforcement learning is based on the 'trial and error' principle of animal learning psychology, so sequential decision making can be optimized according to evaluative feedback signals in the process of interaction between an unmanned ship and the environment, and thus the method can be used to solve optimization control problems to which supervised learning can hardly be applied.
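The abstract names the building blocks (a simplified hull-path model, an actor and a critic, a lateral deviation reference model, and a performance index) but gives no equations. The following is a minimal Python sketch of how such a loop might be wired together, not the patented method. Everything concrete in it is an assumption made for illustration: the two-state kinematics in `plant_step`, the first-order decay in `reference_step`, the linear actor and critic (standing in for the networks), the squared tracking error used as the performance index, and all gains and learning rates.

```python
import math
import random

DT = 0.1        # control period in seconds (assumed)
SPEED = 1.0     # constant surge speed in m/s (assumed)
YAW_GAIN = 0.5  # yaw rate per unit rudder command (assumed)


def plant_step(e, psi, rudder):
    """Hypothetical simplified hull-path kinematics.

    e is the lateral deviation from the path, psi the heading error.
    """
    rudder = max(-1.0, min(1.0, rudder))  # saturate the rudder command
    psi_next = psi + DT * YAW_GAIN * rudder
    psi_next = max(-math.pi / 2, min(math.pi / 2, psi_next))
    e_next = e + DT * SPEED * math.sin(psi)
    return e_next, psi_next


def reference_step(e_ref, decay=0.95):
    """Assumed reference model: the desired deviation decays geometrically."""
    return decay * e_ref


def critic_features(e, psi, e_ref):
    """Quadratic features of the tracking error and heading error."""
    err = e - e_ref
    return [err * err, psi * psi, err * psi, 1.0]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train(episodes=50, steps=100, seed=0):
    """One-step actor-critic with a Gaussian exploration policy."""
    rng = random.Random(seed)
    actor_w = [0.0, 0.0]             # linear actor on [e, psi]
    critic_w = [0.0, 0.0, 0.0, 0.0]  # linear critic on critic_features
    gamma, sigma = 0.95, 0.2
    lr_actor, lr_critic = 1e-3, 1e-2
    for _ in range(episodes):
        e, psi, e_ref = 1.0, 0.0, 1.0
        for _ in range(steps):
            phi = [e, psi]
            mean = dot(actor_w, phi)
            action = rng.gauss(mean, sigma)  # explore around the actor output
            e2, psi2 = plant_step(e, psi, action)
            e_ref2 = reference_step(e_ref)
            # Performance index (assumed): penalize deviation from the reference.
            reward = -(e2 - e_ref2) ** 2
            x = critic_features(e, psi, e_ref)
            x2 = critic_features(e2, psi2, e_ref2)
            td = reward + gamma * dot(critic_w, x2) - dot(critic_w, x)
            td = max(-1.0, min(1.0, td))  # clip the TD error for stability
            critic_w = [w + lr_critic * td * xi for w, xi in zip(critic_w, x)]
            # Policy-gradient step using the TD error as the evaluative signal.
            grad = (action - mean) / (sigma * sigma)
            actor_w = [w + lr_actor * td * grad * p for w, p in zip(actor_w, phi)]
            e, psi, e_ref = e2, psi2, e_ref2
    return actor_w, critic_w
```

Calling `train()` returns the learned actor and critic weight lists. The critic's TD error plays the role of the evaluative feedback signal mentioned in the abstract: it is the only learning signal the actor receives, so no supervised target for the rudder command is needed. The sketch makes no claim about convergence or control quality under these assumed settings.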

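Publication/grant documents

CN107346138A: A Lateral Control Method for Unmanned Ships Based on a Reinforcement Learning Algorithm. Publication/grant date: 2017-11-14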


IPC classification:

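G	Physics
G05	Controlling; regulating
G05D	Systems for controlling or regulating non-electric variables (for continuous casting of metals, see B22D11/16; valves per se, see F16K; sensing non-electric variables, see the relevant subclasses of G01; regulating electric or magnetic variables, see G05F)
G05D1/00	Control of position, course, altitude, or attitude of land, water, air, or space vehicles, e.g. automatic pilot (radio navigation systems or analogous systems using other waves, see G01S)
G05D1/02	. Control of position or course in two dimensions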