Adversarial automated reinforcement-learning-based application-manager training
Abstract:
The current document is directed to automated reinforcement-learning-based application managers that are trained using adversarial training. During adversarial training, potentially disadvantageous next actions are selected for issuance by the automated reinforcement-learning-based application manager, at a lower frequency than next actions selected according to a policy that is learned to provide optimal or near-optimal control over a computing environment that includes one or more applications controlled by the application manager. Because disadvantageous actions are occasionally selected, the application manager is forced to explore a much larger subset of the system-state space during training. Upon completion of training, the application manager has therefore learned a more robust and complete optimal or near-optimal control policy than it would have learned had it been trained by simulators or by using management actions and computing-environment responses recorded during previous controlled operation of a computing environment.
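The action-selection rule described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes a tabular value estimate per action, treats the lowest-valued action as the "potentially disadvantageous" one, and uses a hypothetical `adversarial_prob` parameter for the lower selection frequency. The names `select_action`, `q_values`, and the example action labels are all illustrative.

```python
import random


def select_action(q_values, adversarial_prob, rng):
    """Pick the next action for one training step.

    q_values:         dict mapping action name -> estimated value (assumed form).
    adversarial_prob: hypothetical probability of issuing a disadvantageous
                      action; kept small so policy actions dominate.
    rng:              a random.Random instance, passed in for reproducibility.

    Returns (action, was_adversarial).
    """
    if rng.random() < adversarial_prob:
        # Adversarial step: issue the action currently estimated to be worst,
        # pushing the environment into states the policy would rarely visit.
        return min(q_values, key=q_values.get), True
    # Normal step: follow the current policy (greedy on estimated value).
    return max(q_values, key=q_values.get), False


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative value estimates for three management actions.
    q = {"scale_up": 1.5, "no_op": 0.7, "scale_down": -2.0}
    picks = [select_action(q, 0.1, rng) for _ in range(1000)]
    adversarial = sum(1 for _, flag in picks if flag)
    print(f"adversarial actions issued: {adversarial} of {len(picks)}")
```

In this sketch the adversarial branch plays the role that purely random exploration plays in an epsilon-greedy scheme, except that it deliberately chooses a poor action instead of a uniformly random one. How the disadvantageous actions are actually identified is not stated in the abstract.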