A novel deep reinforcement learning architecture for dynamic power and bandwidth allocation in multibeam satellites

Jing Xu, Zhongtian Zhao, Lei Wang, Yizhai Zhang

Acta Astronautica (2023)

Abstract
Because user demand is growing explosively and changing dynamically, an efficient power and bandwidth allocation algorithm is essential for multibeam satellites with flexible digital payloads. To suit the real-time operation of multibeam satellite communication, we build a novel deep reinforcement learning (DRL) architecture for dynamic power and bandwidth allocation. To minimize unmet system capacity (USC), the proposed DRL architecture adopts the proximal policy optimization algorithm to allocate resources directly in a continuous space. Under the proposed architecture, the DRL strategy for joint power and bandwidth allocation (named pbDRL) achieves the best USC performance compared with the separate power-only and bandwidth-only allocation methods, pDRL and bDRL. In addition, by inserting the allocation decision determined by pbDRL into the initial population of the existing genetic algorithm (GA), we develop an improved GA, namely drlGA. Numerical results verify that (i) pbDRL outperforms the existing GA and particle swarm optimization (PSO); (ii) pbDRL achieves comparable USC performance in significantly less computation time than the existing optimized GA method; (iii) compared with the other three related heuristic algorithms, drlGA achieves better USC performance within the same computation time. These results show that the developed pbDRL and drlGA approaches can meet the requirements of high timeliness and desirable demand satisfaction, respectively.
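The two ideas the abstract relies on, the USC objective and seeding a GA's initial population with a DRL decision, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration and is not taken from the paper: USC is taken to be the summed per-beam shortfall between demand and offered capacity, the capacity model is a simplified Shannon-style formula that ignores inter-beam interference, and the function names `beam_capacity`, `unmet_system_capacity`, and `seeded_population` are hypothetical.

```python
import math
import random


def beam_capacity(power_w, bandwidth_hz, gain_over_n0):
    """Toy per-beam capacity in bit/s (assumed model, no interference)."""
    snr = power_w * gain_over_n0 / bandwidth_hz
    return bandwidth_hz * math.log2(1.0 + snr)


def unmet_system_capacity(demands, capacities):
    """Assumed USC definition: total demand left unserved across beams."""
    return sum(max(d - c, 0.0) for d, c in zip(demands, capacities))


def seeded_population(seed, pop_size, low, high, rng):
    """Random GA population whose first individual is a given seed.

    In a drlGA-style scheme the seed would be the allocation proposed
    by the trained DRL policy; here it is just a list of floats.
    """
    population = [
        [rng.uniform(low, high) for _ in seed] for _ in range(pop_size)
    ]
    # Clip the seed to the search bounds before inserting it.
    population[0] = [min(max(x, low), high) for x in seed]
    return population


# Hypothetical numbers, chosen only to exercise the functions.
demands = [10.0, 5.0, 8.0]
capacities = [7.0, 6.0, 8.0]
usc = unmet_system_capacity(demands, capacities)

rng = random.Random(0)
drl_decision = [0.2, 0.5, 0.3]  # stand-in for a pbDRL output
population = seeded_population(drl_decision, 6, 0.0, 1.0, rng)
```

Seeding only one individual keeps the rest of the population random, so the GA can still explore while starting from at least one good candidate. How the paper actually injects the pbDRL decision is not specified in the abstract.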
Keywords
Deep reinforcement learning, Dynamic resource allocation, Genetic algorithm, Multibeam satellite system, Proximal policy optimization