Secure protocols for cumulative reward maximization in stochastic multi-armed bandits

Journal of Computer Security(2022)

引用 0|浏览15
暂无评分
摘要
We consider the problem of cumulative reward maximization in multi-armed bandits. We address the security concerns that occur when data and computations are outsourced to an honest-but-curious cloud i.e., that executes tasks dutifully, but tries to gain as much information as possible. We consider situations where data used in bandit algorithms is sensitive and has to be protected e.g., commercial or personal data. We rely on cryptographic schemes and propose UCB - MS, a secure multi-party protocol based on the UCB algorithm. We prove that UCB - MS computes the same cumulative reward as UCB while satisfying desirable security properties. In particular, cloud nodes cannot learn the cumulative reward or the sum of rewards for more than one arm. Moreover, by analyzing messages exchanged among cloud nodes, an external observer cannot learn the cumulative reward or the sum of rewards produced by some arm. We show that the overhead due to cryptographic primitives is linear in the size of the input. Our implementation confirms the linear-time behavior and the practical feasibility of our protocol, on both synthetic and real-world data.
更多
查看译文
关键词
Security in machine learning,cumulative reward maximization,honest-but-curious cloud,UCB algorithm,AES-GCM symmetric encryption scheme,Paillier's additive homomorphic asymmetric encryption scheme
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要