  Explaining Metastable Cooperation in Independent Multi-Agent Boltzmann Q-Learning—A Deterministic Approximation

Goll, D., Barfuss, W., Heitzig, J. (2026): Explaining Metastable Cooperation in Independent Multi-Agent Boltzmann Q-Learning—A Deterministic Approximation. - Applied Sciences, 16, 7, 3524.
https://doi.org/10.3390/app16073524

Files

Explaining_Metastable_Cooperation_in_Independent_Multi_Agent_Boltzmann_Q_Learning___A_Deterministic_Approximation.pdf (Publisher version), 10MB
Name: Explaining_Metastable_Cooperation_in_Independent_Multi_Agent_Boltzmann_Q_Learning___A_Deterministic_Approximation.pdf
Description: -
OA-Status: Gold
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -

Creators

Creators:
Goll, David (1), Author
Barfuss, Wolfram (1), Author
Heitzig, Jobst (2), Author
Affiliations:
(1) External Organizations, ou_persistent22
(2) Potsdam Institute for Climate Impact Research, ou_persistent13

Content

Free keywords: -
 Abstract: Multi-agent reinforcement learning involves interacting agents whose learning processes are coupled through a shared environment. This work introduces a discrete-time approximation model for multi-agent Boltzmann Q-learning that accounts for agents’ update frequencies. We demonstrate why previous models do not accurately represent the actual stochastic learning dynamics while our model can reproduce several complex emergent dynamic regimes, including transient cooperation and metastable states in social dilemmas like the Prisoner’s Dilemma. We show that increasing the discount factor can prevent convergence by inducing oscillations through a supercritical Neimark–Sacker bifurcation, which transforms the unique stable fixed point into a stable limit cycle. This analysis provides a deeper understanding of the complexities of multi-agent learning dynamics and the conditions under which convergence may not be achieved.

Details

Language(s): eng - English
 Dates: 2026-03-31 / 2026-04-03 / 2026-04-03
 Publication Status: Finally published
 Pages: 20
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: DOI: 10.3390/app16073524
MDB-ID: No data to archive
PIKDOMAIN: RD4 - Complexity Science
Organisational keyword: RD4 - Complexity Science
Working Group: Behavioural Game Theory and Interacting Agents
Research topic keyword: Nonlinear Dynamics
Regional keyword: Global
Model / method: Machine Learning
OATYPE: Gold Open Access
 Degree: -

Source 1

Title: Applied Sciences
Source Genre: Journal, SCI, Scopus, oa
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: 16 (7)
Sequence Number: 3524
Start / End Page: -
Identifier: CoNE: https://publications.pik-potsdam.de/cone/journals/resource/applied-sciences
Publisher: MDPI