Chase or Wait: Dynamic UAV Deployment to Learn and Catch Time-Varying User Activities

IEEE Transactions on Mobile Computing (2021)

Abstract
Unmanned aerial vehicle (UAV) technology is a promising solution for rapidly providing wireless communication services to ground users. When users' demands change dynamically over time, the key challenge is how to adapt the UAV deployment strategy to partial and even outdated observations of the users' activities, given the UAV's flying speed limit. In this paper, we study dynamic UAV deployment to learn and adapt to time-varying user activities, where the activity pattern of a user (if out of the UAV's service coverage) is hidden from the UAV and follows a time-slotted Markov chain that switches between active and idle states. We formulate the learning-and-adaptation based UAV deployment problem as a partially observable Markov decision process (POMDP) to maximize the total discounted hit rate of active users. We show that there is a fundamental delay-reward tradeoff, and prove that the UAV optimally follows a threshold-based policy: it waits at an idle user for a time threshold before moving to another user. Furthermore, we extend our study to a more general scenario where the UAV does not even know the parameters of each user's temporal activity distribution, and apply Q-learning to develop another threshold-based deployment policy for a multi-user scenario.
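The threshold-based waiting behavior described above can be sketched in a few lines: when a user leaves coverage in the idle state, the UAV predicts the (unobserved) probability that the user has become active via the Markov chain, and waits until that belief crosses a threshold before relocating. This is only an illustrative sketch under assumed parameters; the transition probabilities, the threshold value, and the function names are not from the paper.

```python
# Hypothetical per-slot transition probabilities of a user's
# two-state (idle/active) Markov activity chain; illustrative only.
P_IDLE_TO_ACTIVE = 0.3   # Pr(idle -> active)
P_ACTIVE_TO_IDLE = 0.2   # Pr(active -> idle)

def belief_update(b_active):
    """One-slot Markov prediction of Pr(user is active) while the
    user is out of coverage and hence unobserved by the UAV."""
    return b_active * (1 - P_ACTIVE_TO_IDLE) + (1 - b_active) * P_IDLE_TO_ACTIVE

def wait_slots(threshold=0.5, horizon=50):
    """Illustrative threshold rule: starting from a user just observed
    idle (belief 0), count the slots until the predicted activity
    belief crosses `threshold` -- the time the UAV waits before moving."""
    b = 0.0
    for t in range(1, horizon + 1):
        b = belief_update(b)
        if b >= threshold:
            return t
    return horizon  # belief never crosses the threshold within the horizon

print(wait_slots())  # with the parameters above, the belief crosses 0.5 at slot 3
```

Note that the belief converges to the chain's stationary activity probability P_IDLE_TO_ACTIVE / (P_IDLE_TO_ACTIVE + P_ACTIVE_TO_IDLE) = 0.6 here, so a threshold above that value would never be crossed and the UAV would leave after the full horizon.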
Keywords
Unmanned aerial vehicle, dynamic deployment, partially observable MDP, reinforcement learning