Reinforcement learning based monotonic policy for online resource allocation

Future Generation Computer Systems (2023)

Abstract
This research aims to design an optimal and strategyproof mechanism for online resource allocation problems. In such problems, consumers arrive randomly with their resource requests, so future resource demand is uncertain. In addition, the allocation and payment decisions depend on the providers’ past experiences. To address these challenges, we propose a novel reinforcement learning algorithm for optimising the resource allocation policy. The proposed algorithm adopts a novel monotonic reward shaping function that uses a dominant-resource multi-label classification technique. Finally, a critical payment value is calculated to maintain strategyproofness in the online environment. Experimental evaluations show that the proposed mechanism achieves results within 96% of the optimal social welfare while outperforming the other mechanisms that use fixed pricing.
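The critical payment idea mentioned in the abstract can be illustrated with a small sketch. Under a monotonic allocation rule (a winning bid keeps winning if it is raised), the critical payment is the smallest bid at which the request would still be allocated, and charging that amount is the standard way to make truthful bidding a dominant strategy. The code below is not the paper's mechanism: the function names, the reserve-price threshold, and the dominant-share rule are all hypothetical, chosen only to give a monotone rule to search over.

```python
from typing import Callable, Dict


def make_dominant_share_rule(
    demand: Dict[str, float],
    capacity: Dict[str, float],
    reserve_price: float,
) -> Callable[[float], bool]:
    """Build a hypothetical monotone allocation rule (an assumption, not the paper's policy).

    A request is accepted if it fits in the remaining capacity and its bid
    is at least reserve_price times its dominant share, i.e. the largest
    fraction of any single resource that it asks for.
    """
    fits = all(demand[r] <= capacity[r] for r in demand)
    dominant_share = max(demand[r] / capacity[r] for r in demand)
    threshold = reserve_price * dominant_share

    def allocates(bid: float) -> bool:
        # Monotone in the bid: raising a winning bid never makes it lose.
        return fits and bid >= threshold

    return allocates


def critical_payment(
    allocates: Callable[[float], bool], bid: float, tol: float = 1e-6
) -> float:
    """Approximate the smallest winning bid by bisection on [0, bid].

    Relies only on monotonicity of `allocates`; losing bids pay nothing.
    """
    if not allocates(bid):
        return 0.0
    lo, hi = 0.0, bid
    if allocates(lo):
        return 0.0  # the request wins even with a zero bid
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if allocates(mid):
            hi = mid  # mid still wins, so the critical value is at or below mid
        else:
            lo = mid  # mid loses, so the critical value is above mid
    return hi


# Hypothetical request: dominant share is max(2/8, 4/8) = 0.5,
# so with a reserve price of 10 the winning threshold is 5.
rule = make_dominant_share_rule(
    demand={"cpu": 2, "mem": 4},
    capacity={"cpu": 8, "mem": 8},
    reserve_price=10.0,
)
winner_payment = critical_payment(rule, bid=8.0)
loser_payment = critical_payment(rule, bid=3.0)
```

Because the payment depends only on the rule and not on the winner's own bid, a consumer in this sketch cannot lower what they pay by misreporting their valuation. In a learned policy the threshold would come from the trained model rather than a fixed reserve price, which is why monotonicity of that policy matters.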
Keywords
Resource allocation, Reinforcement learning, Online mechanism design, Monotonic policy, Critical payment, Resource dominant clustering