Multi-Task Deep Reinforcement Learning for Terahertz NOMA Resource Allocation With Hybrid Discrete and Continuous Actions

IEEE Transactions on Vehicular Technology (2024)

Abstract
Terahertz (THz) non-orthogonal multiple access (NOMA) networks hold great potential for next-generation wireless communications, promising ultra-high data rates and improved user fairness. In THz-NOMA networks, efficient and effective long-term beamforming-bandwidth-power (BBP) allocation remains an open problem due to its non-deterministic polynomial-time hard (NP-hard) nature. In this paper, the continuous nature of power and sub-array ratio assignment and the discrete nature of sub-band allocation are treated explicitly. In light of these attributes, an offline hybrid discrete and continuous actions (DISCO) multi-task deep reinforcement learning (DRL) algorithm is proposed to maximize the long-term throughput. Specifically, multi-task learning enables the actor of DISCO to integrate two state-of-the-art DRL algorithms, namely actor-critic (AC), which selects only discrete actions, and deep deterministic policy gradient (DDPG), which generates only continuous actions. Rigorous theoretical derivations for the neural network design and backpropagation process are provided to tailor the proposed DISCO to the BBP problem. Compared with benchmark no-learning and conventional DRL algorithms, DISCO improves network throughput while maintaining good fairness among users. Furthermore, DISCO requires only hundreds of milliseconds of computation time, demonstrating its practicality.
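The hybrid action structure described above (a discrete sub-band choice plus continuous power and sub-array ratios produced by one actor) can be pictured as a shared encoder with two task-specific heads. The following is a minimal, hypothetical PyTorch sketch of such a hybrid-action actor; the layer sizes, state dimension, and head structure are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class HybridActor(nn.Module):
    """Hypothetical sketch of a hybrid discrete/continuous actor.

    A shared encoder feeds two task-specific heads:
      - a categorical head for the discrete sub-band assignment
        (actor-critic-style action selection), and
      - a deterministic head for continuous power and sub-array
        ratios (DDPG-style action generation).
    """

    def __init__(self, state_dim: int, num_subbands: int, num_continuous: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        # Discrete head: logits over candidate sub-bands.
        self.subband_head = nn.Linear(128, num_subbands)
        # Continuous head: power and sub-array ratios squashed into (0, 1).
        self.ratio_head = nn.Sequential(nn.Linear(128, num_continuous), nn.Sigmoid())

    def forward(self, state: torch.Tensor):
        h = self.encoder(state)
        subband_logits = self.subband_head(h)   # categorical over sub-bands
        ratios = self.ratio_head(h)             # continuous allocation ratios
        return subband_logits, ratios


# Example: produce one hybrid action for a single network state (dimensions are illustrative).
actor = HybridActor(state_dim=32, num_subbands=8, num_continuous=4)
logits, ratios = actor(torch.randn(1, 32))
subband = torch.distributions.Categorical(logits=logits).sample()
```

In this sketch the discrete head would be trained with a policy-gradient (AC-style) loss and the continuous head with a deterministic policy-gradient (DDPG-style) loss, sharing the encoder across both tasks; the paper's specific network design and backpropagation derivations are given in the full text.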
Keywords
Terahertz (THz) networks, Non-Orthogonal Multiple Access (NOMA), Deep Reinforcement Learning (DRL)