Efficient Multiagent Policy Optimization Based on Weighted Estimators in Stochastic Cooperative Environments
Journal of Computer Science and Technology, pp. 268-280, 2020.
Multiagent deep reinforcement learning (MA-DRL) has received increasingly wide attention. However, most existing MA-DRL algorithms remain inefficient when faced with the non-stationarity caused by agents continually changing their behavior in stochastic environments. This paper extends the weighted double estimator to the multiagent domain...