TRAINING ACTION SELECTION NEURAL NETWORKS USING Q-LEARNING COMBINED WITH LOOK AHEAD SEARCH
Abstract:
A reinforcement learning system and method that selects actions to be performed by an agent interacting with an environment. The system uses a combination of reinforcement learning and a look ahead search: Reinforcement learning Q-values are used to guide the look ahead search and the search is used in turn to improve the Q-values. The system learns from a combination of real experience and simulated, model-based experience.
Information query
Patent Agency Ranking
0/0