Network flow load balancing control method based on reinforcement learning

Invention publication

CN102571570A Network flow load balancing control method based on reinforcement learning (Invalid - Withdrawn)


Patent title: Network flow load balancing control method based on reinforcement learning
Application number: CN201110447514.8

Filing date: 2011-12-27
Publication number: CN102571570A

Publication date: 2012-07-11
Inventors: 胡朝辉, 梁智强, 梁志宏, 周强峰, 江泽鑫, 石炜君, 梁毅成
Applicant: 广东电网公司电力科学研究院
Applicant address: 广东省广州市越秀区东风东路水均岗8号粤电大厦
Patentee: 广东电网公司电力科学研究院
Current patentee: 广东电网公司电力科学研究院
Current patentee address: 广东省广州市越秀区东风东路水均岗8号粤电大厦
Patent agency: 广州知友专利商标代理有限公司
Agent: 周克佑
Main classification: H04L12/56
IPC classification: H04L12/56

Abstract:

The invention discloses a network flow load balancing control method based on reinforcement learning, comprising the following steps: 1) when a data packet is at router node R*, selecting the action ai with the largest return value from the next-hop action set, according to the packet's current state s and the policy π; 2) after the packet has been routed, updating the packet's state s according to its actual situation, and updating the packet's next-hop action set; 3) updating the packet's reward/penalty value r according to the current state of network traffic balance; 4) updating the policy π according to the reward/penalty value; and repeating steps 1) to 4) until the packet reaches its final destination address. Through continual interactive learning between the agent and the network environment, the method achieves optimal or near-optimal control of network traffic load balancing.
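The four steps in the abstract can be illustrated with a minimal sketch. The abstract does not name a specific learning algorithm, state encoding, or reward formula, so everything below is an assumption made for illustration: tabular Q-learning stands in for the policy π, the state s is taken to be the pair (current router, destination), the reward r is the mean link load minus the load of the link just used, and the topology, `ALPHA`, `GAMMA`, `EPSILON`, and all function names are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical topology: router -> list of next hops (the action set).
TOPOLOGY = {
    "R1": ["R2", "R3"],
    "R2": ["R4"],
    "R3": ["R4"],
    "R4": [],
}

ALPHA = 0.5    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration rate (assumed value)


def empty_link_load():
    """Return a packet counter of zero for every directed link."""
    return {(u, v): 0 for u, hops in TOPOLOGY.items() for v in hops}


def balance_reward(link_load, link):
    """Assumed reward r: negative when the used link is loaded above the mean."""
    mean_load = sum(link_load.values()) / len(link_load)
    return mean_load - link_load[link]


def choose_action(q, state, actions, rng):
    """Step 1: pick the next hop with the largest estimated return.

    Epsilon-greedy exploration is an addition of this sketch.
    """
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def route_packet(q, link_load, source, destination, rng, max_hops=16):
    """Route one packet, repeating steps 1-4 until it reaches its destination."""
    node, path = source, [source]
    while node != destination and len(path) <= max_hops:
        actions = TOPOLOGY[node]
        if not actions:  # dead end: no next hop available
            break
        state = (node, destination)
        next_hop = choose_action(q, state, actions, rng)      # step 1
        link = (node, next_hop)
        link_load[link] += 1                                  # packet is routed
        next_state = (next_hop, destination)                  # step 2: new state
        next_actions = TOPOLOGY[next_hop]                     # step 2: new action set
        reward = balance_reward(link_load, link)              # step 3
        best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
        td_error = reward + GAMMA * best_next - q[(state, next_hop)]
        q[(state, next_hop)] += ALPHA * td_error              # step 4: policy update
        node = next_hop
        path.append(node)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    q = defaultdict(float)
    link_load = empty_link_load()
    for _ in range(200):
        route_packet(q, link_load, "R1", "R4", rng)
    print(link_load)
```

In this sketch, a link that carries more than the mean load earns a negative reward, which lowers its Q-value and steers later packets toward the alternative next hop. The method described in the patent may define the state, reward, and policy update differently.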
