The cost of learning fast with reinforcement learning for edge cache allocation

International Teletraffic Congress (2020)

Abstract
We study data-driven cache allocation in Multi-Tenant Edge Computing: a Network Operator (NO) owns storage at the Edge and dynamically allocates it to third-party application Content Providers (CPs). CPs can cache part of their catalog and satisfy users’ requests locally, thus reducing inter-domain traffic. The objective of the NO is to find the optimal cache allocation, i.e., the one that minimizes the total inter-domain traffic bandwidth, which constitutes an operational cost. Since CPs’ traffic is encrypted, the NO’s allocation strategy is based solely on the amount of traffic measured. In this exploratory work, we solve this problem via Reinforcement Learning (RL). RL has mainly been intended to be trained in simulation before being applied in real scenarios. We instead employ RL online, training it directly while the system is live and running. An important factor emerges in this case: in order to learn the optimal cache allocation, the NO needs to perturb the allocation several times and measure how the inter-domain traffic changes; when perturbing the allocation, the NO has to pay a perturbation cost. While this cost has no physical meaning in simulation, it cannot be ignored in a live, running system. In this work we explore the trade-off between perturbing the system heavily in order to learn a good allocation faster, and learning more slowly to reduce the perturbation cost. We show results from simulation and make the entire code available as open-source.
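The trade-off described in the abstract can be illustrated with a small toy model. The sketch below is a hypothetical perturb-and-measure loop, not the paper's actual RL algorithm: the NO splits cache between two CPs, controlling the fraction `x` given to CP 1, and can only observe the resulting inter-domain traffic (a hidden function of `x`, here an assumed quadratic). Each round it perturbs the allocation by `step_size`, pays a perturbation cost proportional to the amount of content moved, and keeps the better allocation. All function names, the traffic curve, and the cost values are illustrative assumptions.

```python
def toy_tradeoff(step_size, n_rounds=200, perturb_cost=1.0):
    """Toy model (illustrative, not the paper's model) of the trade-off
    between learning speed and perturbation cost."""

    def traffic(x):
        # Hidden inter-domain traffic curve, minimized at the
        # (unknown to the NO) optimal allocation x* = 0.7.
        return 100.0 * (x - 0.7) ** 2 + 10.0

    x = 0.2                 # initial allocation to CP 1
    total_traffic = 0.0
    total_perturb = 0.0
    for t in range(n_rounds):
        direction = 1 if t % 2 == 0 else -1      # deterministic probing
        x_new = min(1.0, max(0.0, x + direction * step_size))
        # Perturbation cost: paid for moving cached content around,
        # whether or not the new allocation is kept.
        total_perturb += perturb_cost * abs(x_new - x)
        if traffic(x_new) < traffic(x):          # keep the better allocation
            x = x_new
        total_traffic += traffic(x)
    # Total operational cost = inter-domain traffic + perturbation cost.
    return total_traffic + total_perturb

# Larger perturbations cost more per round but find a good
# allocation faster; tiny perturbations accumulate traffic cost
# while slowly converging.
cost_fast = toy_tradeoff(step_size=0.2)
cost_slow = toy_tradeoff(step_size=0.01)
```

In this particular toy setting the aggressive explorer wins, because the traffic saved by converging quickly outweighs its higher per-round perturbation cost; with a larger `perturb_cost` the balance shifts toward the cautious strategy, which is exactly the trade-off the paper studies.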