Distributed On-Policy Actor-Critic Reinforcement Learning

Stanković, Srđan

Authors: Miloš Stanković, Miloš Beko, Miloš Pavlović, Ilija Popadić, Srđan Stanković

Publication: Sinteza 2022 - International Scientific Conference on Information Technology and Data Related Research

DOI: 10.15308/Sinteza-2022-389-393

Area: DECIDE Project Session

Pages: 389-393

Link: https://portal.sinteza.singidunum.ac.rs/paper/889

Abstract:

This paper proposes a novel distributed on-policy Actor-Critic algorithm for multi-agent reinforcement learning. The Critic stage uses a temporal difference scheme with function approximation, and the Actor stage uses a policy gradient algorithm derived from a global objective. At both stages, the agents reach decentralized agreement through a linear dynamic consensus strategy. Compared with existing schemes, the algorithm offers an improved convergence rate and noise immunity, and it makes multi-task global optimization possible.
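
The three ingredients named in the abstract (a temporal difference Critic, a policy gradient Actor, and a consensus step applied at both stages) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the mixing matrix `A`, the toy two-state MDP, the local rewards `R`, the one-hot features, the step sizes, and the plain weighted-averaging rule in `consensus_step` are all assumptions made for illustration, and the paper's linear dynamic consensus scheme may differ.

```python
import numpy as np


def consensus_step(X, A):
    """Mix per-agent parameters: agent i gets sum_j A[i, j] * X[j]."""
    return np.tensordot(A, X, axes=1)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def run(n_steps=2000, gamma=0.9, alpha=0.05, beta=0.01, seed=0):
    """Toy distributed actor-critic loop (illustrative assumptions throughout)."""
    rng = np.random.default_rng(seed)
    n_agents, n_states, n_actions = 3, 2, 2
    # Hypothetical doubly stochastic mixing matrix for three agents.
    A = np.array([[0.50, 0.25, 0.25],
                  [0.25, 0.50, 0.25],
                  [0.25, 0.25, 0.50]])
    # Hypothetical local rewards R[i, s, a] and shared transitions P[s, a, s'].
    R = rng.uniform(0.0, 1.0, size=(n_agents, n_states, n_actions))
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    W = np.zeros((n_agents, n_states))                  # critic weights, one-hot features
    Theta = np.zeros((n_agents, n_states, n_actions))   # softmax policy parameters
    states = rng.integers(n_states, size=n_agents)
    for _ in range(n_steps):
        for i in range(n_agents):
            s = states[i]
            pi = softmax(Theta[i, s])
            a = rng.choice(n_actions, p=pi)
            s_next = rng.choice(n_states, p=P[s, a])
            # Critic: local TD(0) update with linear function approximation.
            delta = R[i, s, a] + gamma * W[i, s_next] - W[i, s]
            W[i, s] += alpha * delta
            # Actor: local policy-gradient step; grad log softmax = onehot(a) - pi.
            score = -pi
            score[a] += 1.0
            Theta[i, s] += beta * delta * score
            states[i] = s_next
        # Consensus at both stages: average critic and actor parameters.
        W = consensus_step(W, A)
        Theta = consensus_step(Theta, A)
    return W, Theta
```

Calling `W, Theta = run()` returns one row of critic weights and one policy table per agent. Because `A` is doubly stochastic in this sketch, each consensus step preserves the network-wide average of the parameters while shrinking the disagreement between agents.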

Keywords: Multi-Agent Systems, Reinforcement Learning, Actor-Critic, Distributed Consensus, Function Approximation

Citation download:

BibTeX format

@article{article,
  author  = {M. Stanković and M. Beko and M. Pavlović and I. Popadić and S. Stanković},
  title   = {Distributed On-Policy Actor-Critic Reinforcement Learning},
  journal = {Sinteza 2022 - International Scientific Conference on Information Technology and Data Related Research},
  year    = 2022,
  pages   = {389--393},
  doi     = {10.15308/Sinteza-2022-389-393}
}

RefWorks Tagged format

RT Conference Proceedings
A1 Miloš Stanković
A1 Miloš Beko
A1 Miloš Pavlović
A1 Ilija Popadić
A1 Srđan Stanković
T1 Distributed On-Policy Actor-Critic Reinforcement Learning
AD Univerzitet Singidunum, Beograd, Beograd, Srbija
YR 2022
NO doi: 10.15308/Sinteza-2022-389-393

Preformatted citation

M. Stanković, M. Beko, M. Pavlović, I. Popadić and S. Stanković, Distributed On-Policy Actor-Critic Reinforcement Learning, Univerzitet Singidunum, Beograd, 2022, doi:10.15308/Sinteza-2022-389-393