Online learning for multi-channel opportunistic access over unknown Markovian channels

SECON (2014)

Abstract
A fundamental theoretical problem in opportunistic spectrum access is the following: a single secondary user must choose one channel to sense and access at each time, with the availability of each channel (governed by primary user activity) described by a Markov chain. The problem of maximizing the expected channel usage can be formulated as a restless multi-armed bandit. In this paper we present an online learning algorithm with the best known results to date for this problem in the case where the channels are homogeneous and their statistics are unknown a priori. Specifically, we show that this policy, which we refer to as CSE, achieves a regret (the gap between the reward accumulated by a model-aware genie and that accumulated by the policy) that is bounded in finite time by a function that scales as O(log t). By explicitly learning the underlying statistics over time, this novel policy outperforms a previously proposed scheme shown to provide near-logarithmic regret.
Keywords
unknown markovian channels, online learning, restless multi-armed bandit, single secondary user, model-aware genie, online learning algorithm, opportunistic spectrum access, markov chain, channel statistics, near-logarithmic regret, markov processes, logarithmic regret, signal detection, radiocommunication, expected channel usage, multi-channel opportunistic access
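
As a rough illustration of the problem setup described in the abstract (not the paper's CSE policy), the following Python sketch simulates a single secondary user who senses one of K homogeneous two-state Markov channels per slot, earning reward 1 when the sensed channel is free, and runs a simple epsilon-greedy learner. All parameters (K, T, p01, p10, epsilon) and the stationary-availability baseline are hypothetical choices for the simulation, not values or methods from the paper.

import random

# Toy simulation of opportunistic spectrum access (illustrative sketch only).
# Each channel is a two-state Markov chain: 1 = free, 0 = busy (occupied by
# the primary user). All numeric parameters below are assumed, not from the paper.

K, T = 4, 10_000              # number of channels and time horizon (assumed)
p01, p10 = 0.3, 0.2           # P(busy -> free) and P(free -> busy) (assumed, identical across channels)
epsilon = 0.05                # exploration rate of the toy learner (assumed)

state = [random.randint(0, 1) for _ in range(K)]   # hidden channel states

def step_channels():
    """Advance every channel one step of its two-state Markov chain."""
    for i in range(K):
        p_free = p01 if state[i] == 0 else 1.0 - p10
        state[i] = 1 if random.random() < p_free else 0

counts = [0] * K              # times each channel has been sensed
means = [0.0] * K             # empirical availability of each channel
learner_reward = 0.0

for t in range(T):
    step_channels()
    # epsilon-greedy choice: explore occasionally, otherwise sense the channel
    # with the highest empirical availability so far
    if random.random() < epsilon:
        arm = random.randrange(K)
    else:
        arm = max(range(K), key=lambda i: means[i])
    r = state[arm]            # reward 1 if the sensed channel is free
    counts[arm] += 1
    means[arm] += (r - means[arm]) / counts[arm]
    learner_reward += r

# Crude proxy baseline: the stationary availability p01 / (p01 + p10) is the
# long-run fraction of free slots on any single channel; a model-aware genie
# that also exploits channel memory would do at least this well.
baseline = T * p01 / (p01 + p10)
print(f"learner reward: {learner_reward:.0f}, stationary baseline: {baseline:.0f}")

A policy such as CSE would replace the epsilon-greedy rule above with one that explicitly learns the channels' transition statistics and exploits the Markovian structure, which is what allows it to achieve the O(log t) regret bound stated in the abstract.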