Publication number: US20140344202A1
Publication date: 2014-11-20
Application number: US14293928
Application date: 2014-06-02
Applicant: HRL LABORATORIES LLC
Inventor: Corey M. THIBEAULT, Narayan Srinivasa
IPC: G06N3/08
CPC classification number: G06N3/08, G06N3/04, G06N3/049, G06N99/005
Abstract: A neural model for reinforcement learning and action selection includes a plurality of channels, a population of input neurons in each of the channels, a population of output neurons in each of the channels, and a population of reward neurons in each of the channels. Each population of input neurons is coupled to every population of output neurons across all channels. The population of reward neurons in each channel receives an environmental input and is coupled only to the output neurons in its own channel. If the environmental input for a channel is positive, the output neurons in that channel are rewarded and have their responses reinforced; otherwise, the output neurons in that channel are punished and have their responses attenuated.
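The channel structure described in the abstract can be illustrated with a small rate-based sketch. This is not the patented implementation: the abstract does not specify a neuron model or a learning rule, so the additive, clipped weight update, the population sizes, the learning rate, and all function names below (`init_weights`, `output_rates`, `select_channel`, `update`) are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
N_CHANNELS = 3   # number of channels (candidate actions)
N_IN = 4         # input neurons per channel
N_OUT = 4        # output neurons per channel


def init_weights(w0=0.5):
    """weights[i, j, a, b]: input neuron a of channel i -> output neuron b of channel j.

    Every input population is coupled to every output population.
    """
    return np.full((N_CHANNELS, N_CHANNELS, N_IN, N_OUT), w0)


def output_rates(weights, input_rates):
    """Weighted sum of input activity; returns an array of shape (N_CHANNELS, N_OUT)."""
    return np.einsum("ia,ijab->jb", input_rates, weights)


def select_channel(weights, input_rates):
    """Action selection: pick the channel whose output population is most active."""
    return int(np.argmax(output_rates(weights, input_rates).mean(axis=1)))


def update(weights, input_rates, env_input, lr=0.1, w_min=0.0, w_max=1.0):
    """Assumed learning rule: each channel's reward signal acts only on the
    synapses arriving at that channel's output neurons. A positive
    environmental input reinforces them; any other value attenuates them.
    """
    sign = np.where(env_input > 0, 1.0, -1.0)            # one sign per output channel
    delta = lr * sign[None, :, None, None] * input_rates[:, None, :, None]
    return np.clip(weights + delta, w_min, w_max)


# Usage: stimulate input channel 0 and reward only output channel 1.
input_rates = np.zeros((N_CHANNELS, N_IN))
input_rates[0] = 1.0
env_input = np.array([-1.0, 1.0, -1.0])

weights = update(init_weights(), input_rates, env_input)
chosen = select_channel(weights, input_rates)
```

After one update, the synapses from the active input population into the rewarded channel are stronger than those into the punished channels, so `select_channel` picks the rewarded channel for the same stimulus. Synapses from inactive input populations are left unchanged because the assumed update is proportional to input activity.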