Reinforcement learning based optimal tracking control method for unknown servo system

Invention Publication

CN109946975A Reinforcement learning based optimal tracking control method for unknown servo system (Lapsed - Rights Terminated)

Patent Title: 一种未知伺服系统的强化学习最优跟踪控制方法
Patent Title (English): Reinforcement learning based optimal tracking control method for unknown servo system
Application No.: CN201910295400.2

Application Date: 2019-04-12
Publication No.: CN109946975A

Publication Date: 2019-06-28
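Inventor: 任雪梅, 吕永峰, 李慧超, 李林伟
Applicant: Beijing Institute of Technology (北京理工大学)
Applicant Address: No. 5 Zhongguancun South Street, Haidian District, Beijing
Assignee: Beijing Institute of Technology
Current Assignee: Beijing Institute of Technology
Current Assignee Address: No. 5 Zhongguancun South Street, Haidian District, Beijing
Agency: 北京理工正阳知识产权代理事务所
Agent: 邬晓楠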
Main IPC: G05B13/04
IPC: G05B13/04 ; G05B13/02 ; H02P23/00

Abstract:

The invention relates to the design of a reinforcement learning based optimal tracking controller for a servo system whose model is unknown. Building on a simplified reinforcement learning critic-actor (evaluation-execution) structure and a high-order neural network approximation method, it presents a controller design that speeds up the solution of the optimal tracking control problem for a motor. For the servo system with an unknown model, a multilayer neural network is first used to identify the system model, and the steady-state control is solved from it. Given a performance index, a high-order neural network is then used to approximate the optimal performance index function. A Hamilton-Jacobi-Bellman (HJB) equation is established from the approximated performance index function and the identified system model, and the optimal feedback control of the servo system is obtained from it. The optimal tracking control is computed from the steady-state control and the optimal feedback control, so that the load angle and rotation speed rapidly track the given signals while the accumulated tracking error and the system energy consumption are both minimized.
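The way the abstract combines a steady-state control with an optimal feedback control can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: a two-state, single-input system with a constant input vector `g`, a second-order polynomial basis standing in for the high-order neural network critic, a cost integrand of the form e^T Q e + r*u_e^2, and the hypothetical function names `critic_gradient`, `optimal_feedback`, `steady_state_control` and `tracking_control`. The value `f_xd` stands for the identified model evaluated at the reference, which in the patent would come from the multilayer neural network identifier.

```python
# Illustrative sketch only: names, basis functions and numbers are assumptions,
# not the patent's actual formulation.

def critic_gradient(e, W):
    """Gradient of V(e) = W[0]*e1^2 + W[1]*e1*e2 + W[2]*e2^2 (assumed basis)."""
    e1, e2 = e
    return [2.0 * W[0] * e1 + W[1] * e2,
            W[1] * e1 + 2.0 * W[2] * e2]


def optimal_feedback(e, W, g, r):
    """u_e = -(1/2) * (1/r) * g^T * grad V(e).

    This is the minimizer of the HJB Hamiltonian when the cost integrand is
    e^T Q e + r * u_e^2 with a scalar weight r > 0 (assumed cost form).
    """
    dV = critic_gradient(e, W)
    return -0.5 / r * (g[0] * dV[0] + g[1] * dV[1])


def steady_state_control(f_xd, g, xd_dot):
    """Least-squares u_d solving xd_dot = f(xd) + g * u_d for a column vector g."""
    num = sum(gi * (xdi - fi) for gi, xdi, fi in zip(g, xd_dot, f_xd))
    den = sum(gi * gi for gi in g)
    return num / den


def tracking_control(x, xd, xd_dot, f_xd, g, W, r):
    """Total control: steady-state part plus optimal feedback on the tracking error."""
    e = [x[0] - xd[0], x[1] - xd[1]]
    return steady_state_control(f_xd, g, xd_dot) + optimal_feedback(e, W, g, r)


# Hypothetical numbers: critic weights W encode V(e) = 2*e1^2 + e1*e2 + e2^2.
u = tracking_control(x=[1.0, -2.0], xd=[0.0, 0.0], xd_dot=[0.0, 3.0],
                     f_xd=[0.0, 1.0], g=[0.0, 1.0], W=[2.0, 1.0, 1.0], r=1.0)
print(u)
```

In this sketch the critic weights `W` are simply given. In a reinforcement learning scheme of the kind the abstract describes, they would be adapted online until the HJB residual is driven toward zero, a step this example does not attempt.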

Public/Granted literature

CN109946975B Reinforcement learning based optimal tracking control method for unknown servo system Public/Granted day: 2020-04-24

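IPC Classification:

G	Physics
G05	Controlling; regulating
G05B	Control or regulating systems in general; functional elements of such systems; monitoring or testing arrangements for such systems or elements (fluid-pressure actuators or systems acting by means of fluids in general, see F15B; valves per se, see F16K; characterised by mechanical features only, see G05G; sensitive elements, see the appropriate subclasses, e.g. G12B, subclasses of G01, H01; correcting units, see the appropriate subclasses, e.g. H02K)
G05B13/00	Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion (G05B19/00 takes precedence; machine learning G06N 20/00)
G05B13/02	.electric
G05B13/04	..involving the use of models or simulators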