Journal Article

Cooperative Learning of Multi-Agent Systems via Reinforcement Learning

Authors

Wang,  Xin
External Organizations;

Zhao,  Chen
External Organizations;

Huang,  Tingwen
External Organizations;

Chakrabarti,  Prasun
External Organizations;

Kurths,  Jürgen
Potsdam Institute for Climate Impact Research;

External Resource
No external resources are shared
Fulltext (public)
There are no public fulltexts stored in PIKpublic
Supplementary Material (public)
There is no public supplementary material available
Citation

Wang, X., Zhao, C., Huang, T., Chakrabarti, P., Kurths, J. (2023): Cooperative Learning of Multi-Agent Systems via Reinforcement Learning. - IEEE Transactions on Signal and Information Processing over Networks, 9, 13-23.
https://doi.org/10.1109/TSIPN.2023.3239654


Cite as: https://publications.pik-potsdam.de/pubman/item/item_28321
Abstract
In many specific scenarios, accurate and practical cooperative learning is a commonly encountered challenge in multi-agent systems. Thus, the current investigation focuses on cooperative learning algorithms for multi-agent systems and underpins an alternate data-based neural network reinforcement learning framework. To achieve the data-based learning optimization, the proposed cooperative learning framework, which comprises two layers, introduces a virtual learning objective. The followers learn the behaviors of the virtual objects in the first layer based on the adaptive neural networks (NNs). Specifically, the actor and critic NNs are applied to acquire cooperative behaviors and assess this layer's long-term utility function. The second layer then realizes the tracking performance between the virtual objects and the leader by introducing the local data-based performance index. We then formulate the resulting deterministic optimization problem and resolve it effectively with the policy iteration algorithm. This intuitive cooperative learning algorithm also preserves good robustness properties and eliminates the dependence on the prior knowledge of the multi-agent system model in the solution process. Finally, a multi-robot formation system demonstrates this promising development's practical appeal and highly effective outcome.
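
The abstract names policy iteration as the solver for the resulting deterministic optimization problem. As a rough illustration of that general technique only, and not the authors' algorithm, the sketch below runs tabular policy iteration on a toy one-dimensional tracking task. The state grid, the target position, the cost function, and the discount factor are all hypothetical choices made for this example; the paper's neural network actor and critic are replaced here by plain lookup tables.

```python
# Illustrative policy iteration on a toy deterministic tracking problem.
# All names and numbers here are hypothetical, not taken from the paper.

N_STATES = 7          # follower positions 0..6 on a line
TARGET = 4            # position of the (hypothetical) leader
ACTIONS = (-1, 0, 1)  # move left, stay, move right
GAMMA = 0.9           # discount factor


def step(state, action):
    """Deterministic transition: move, then clip to the valid range."""
    return min(max(state + action, 0), N_STATES - 1)


def cost(state, action):
    """Stage cost: squared tracking error plus a small control penalty."""
    return (state - TARGET) ** 2 + 0.1 * abs(action)


def evaluate(policy, tol=1e-10):
    """Policy evaluation: iterate V(s) = c(s, a) + gamma * V(s') to convergence."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            a = policy[s]
            new_v = cost(s, a) + GAMMA * values[step(s, a)]
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            return values


def improve(values, policy):
    """Policy improvement: choose the action with the lowest one-step lookahead cost."""
    new_policy = []
    for s in range(N_STATES):
        def q(a):
            return cost(s, a) + GAMMA * values[step(s, a)]
        best = min(ACTIONS, key=q)
        # Keep the current action on ties so the loop cannot oscillate.
        if q(policy[s]) <= q(best) + 1e-12:
            best = policy[s]
        new_policy.append(best)
    return new_policy


def policy_iteration():
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * N_STATES  # start with "stay" everywhere
    while True:
        values = evaluate(policy)
        new_policy = improve(values, policy)
        if new_policy == policy:
            return policy, values
        policy = new_policy


if __name__ == "__main__":
    final_policy, final_values = policy_iteration()
    print(final_policy)
    print([round(v, 3) for v in final_values])
```

In this toy setting the resulting policy moves every state toward position 4 and stays there. The model-free, data-based character claimed in the abstract is not captured by this sketch, since the transition function is known explicitly here.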