Distributed On-Policy Actor-Critic Reinforcement Learning

Publication: Sinteza 2022 - International Scientific Conference on Information Technology and Data Related Research

DOI: 10.15308/Sinteza-2022-389-393

Area: DECIDE Project Session

Pages: 389-393

Link: https://portal.sinteza.singidunum.ac.rs/paper/889

Abstract:
In this paper, a novel distributed on-policy Actor-Critic algorithm for multi-agent reinforcement learning is proposed. The algorithm combines a temporal difference scheme with function approximation at the Critic stage and a policy gradient algorithm at the Actor stage, both derived from a global objective. At both stages, decentralized agreement among the agents is achieved using a linear dynamic consensus strategy. Compared to existing schemes, the algorithm offers an improved convergence rate and noise immunity, and makes it possible to achieve multi-task global optimization.
Keywords: Multi-Agent Systems, Reinforcement Learning, Actor-Critic, Distributed Consensus, Function Approximation
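
The Critic stage described in the abstract (a local temporal difference update with function approximation, followed by linear consensus among agents) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes plain TD(0), a linear value function V(s) = w . phi(s), and a fixed row-stochastic weight matrix, and all function and parameter names (`td0_update`, `consensus_step`, `critic_round`, `alpha`, `gamma`) are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td0_update(w, phi_s, phi_next, reward, alpha=0.1, gamma=0.9):
    """One local TD(0) step for a linear critic V(s) = w . phi(s).

    Assumption: plain TD(0); the paper's exact Critic scheme may differ.
    """
    delta = reward + gamma * dot(w, phi_next) - dot(w, phi_s)  # TD error
    return [wi + alpha * delta * fi for wi, fi in zip(w, phi_s)]


def consensus_step(params, weights):
    """Linear consensus: agent i replaces its vector with a weighted
    average of all agents' vectors.

    `weights` is assumed row-stochastic (each row sums to 1); a zero
    entry weights[i][j] means agent i does not listen to agent j.
    """
    n = len(params)
    dim = len(params[0])
    return [
        [sum(weights[i][j] * params[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]


def critic_round(params, transitions, weights, alpha=0.1, gamma=0.9):
    """Each agent takes a local TD(0) step on its own transition
    (phi_s, phi_next, reward); the agents then run one consensus step."""
    local = [
        td0_update(w, *t, alpha=alpha, gamma=gamma)
        for w, t in zip(params, transitions)
    ]
    return consensus_step(local, weights)
```

Under the same assumptions, an Actor stage could reuse `consensus_step` on the policy parameters after each local policy gradient step. With a doubly stochastic weight matrix, repeated consensus steps alone drive all agents toward the average of their initial vectors.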
Attached files:
  • 389-393 (size: 546.81 KB, views: 231)

Download citation:

BibTeX format
@inproceedings{article,
  author    = {M. Stanković and M. Beko and M. Pavlović and I. Popadić and S. Stanković},
  title     = {Distributed On-Policy Actor-Critic Reinforcement Learning},
  booktitle = {Sinteza 2022 - International Scientific Conference on Information Technology and Data Related Research},
  year      = {2022},
  pages     = {389--393},
  doi       = {10.15308/Sinteza-2022-389-393}
}
RefWorks Tagged format
RT Conference Proceedings
A1 Miloš Stanković
A1 Miloš Beko
A1 Miloš Pavlović
A1 Ilija Popadić
A1 Srđan Stanković
T1 Distributed On-Policy Actor-Critic Reinforcement Learning
AD Univerzitet Singidunum, Beograd, Beograd, Srbija
YR 2022
NO doi: 10.15308/Sinteza-2022-389-393
Preformatted citation
M. Stanković, M. Beko, M. Pavlović, I. Popadić and S. Stanković, Distributed On-Policy Actor-Critic Reinforcement Learning, Univerzitet Singidunum, Beograd, 2022, doi:10.15308/Sinteza-2022-389-393