Publication No.: US12265924B1
Publication Date: 2025-04-01
Application No.: US16908486
Filing Date: 2020-06-22
Applicant: Amazon Technologies, Inc.
Inventor: Tao Sun, Yunzhe Tao, Sahika Genc, Sunil Mallya Kasaragod, Kaiqing Zhang
Abstract: Techniques for robust multi-agent reinforcement learning (MARL) are described. An exemplary method includes initializing a plurality of parameters for a plurality of agents, including at least policy parameters and action-value (Q) parameters; performing robust multi-agent reinforcement learning to learn policies for the agents, wherein in the learned policies no agent has an incentive to deviate, and the agents include an implicit agent that is to select a worst case at any given time during the learning process; and at least one agent utilizing its learned policy.
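The "no agent has an incentive to deviate" condition, combined with an implicit agent that picks a worst case, can be illustrated with a toy example. The sketch below is not the patented learning procedure: it uses no policy or Q parameters and no learning at all. It assumes a one-shot, two-agent game with pure strategies and a small, hypothetical set of candidate reward tables (`MODEL_A`, `MODEL_B`). All function names and reward values are illustrative assumptions, not taken from the patent.

```python
from itertools import product

# Hypothetical uncertainty set: each model maps a joint action to (reward_0, reward_1).
MODEL_A = {(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}
MODEL_B = {(0, 0): (2, 2), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (2, 2)}


def worst_case_payoffs(reward_models, joint_action):
    """Implicit 'nature' agent: for each agent, take the lowest reward over all models."""
    n_agents = len(joint_action)
    return tuple(
        min(model[joint_action][i] for model in reward_models)
        for i in range(n_agents)
    )


def is_robust_equilibrium(reward_models, joint_action, n_actions):
    """True if no agent improves its worst-case payoff by deviating alone."""
    base = worst_case_payoffs(reward_models, joint_action)
    for i in range(len(joint_action)):
        for alt in range(n_actions):
            if alt == joint_action[i]:
                continue
            # Agent i switches to `alt`; every other agent keeps its action.
            deviated = joint_action[:i] + (alt,) + joint_action[i + 1:]
            if worst_case_payoffs(reward_models, deviated)[i] > base[i]:
                return False
    return True


def robust_equilibria(reward_models, n_agents, n_actions):
    """Enumerate all pure joint actions that satisfy the robust no-deviation test."""
    return [
        joint
        for joint in product(range(n_actions), repeat=n_agents)
        if is_robust_equilibrium(reward_models, joint, n_actions)
    ]


if __name__ == "__main__":
    print(robust_equilibria([MODEL_A, MODEL_B], n_agents=2, n_actions=2))
```

With these assumed reward tables, the only joint action that passes the check is `(1, 1)`: under worst-case rewards, either agent switching alone to action 0 would lower its own payoff from 1 to 0. A learning-based method such as the one the abstract describes would instead search for such a point over sequential, stochastic policies rather than by enumeration.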