Efficient Multiagent Policy Optimization Based on Weighted Estimators in Stochastic Cooperative Environments
Journal of Computer Science and Technology, pp. 268-280, 2020.
Keywords: deep reinforcement learning, multiagent system, weighted double estimator, lenient reinforcement learning, cooperative Markov game
Multiagent deep reinforcement learning (MA-DRL) has received increasingly wide attention. Most existing MA-DRL algorithms, however, remain inefficient when faced with the non-stationarity caused by agents constantly changing their behavior in stochastic environments. This paper extends the weighted double estimator to the multiagent domain...