Oblivious Markov Decision Processes: Planning and Policy Execution

2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC(2023)

引用 0|浏览0
暂无评分
摘要
We examine a novel setting in which two parties have partial knowledge of the elements that make up a Markov Decision Process (MDP) and must cooperate to compute and execute an optimal policy for the problem constructed from those elements. This situation arises when one party wants to give a robot some task, but does not wish to divulge those details to a second party-while the second party possesses sensitive data about the robot's dynamics (information needed for planning). Both parties want the robot to perform the task successfully, but neither is willing to disclose any more information than is absolutely necessary. We utilize techniques from secure multi-party computation, combining primitives and algorithms to construct protocols that can compute an optimal policy while ensuring that the policy remains opaque by being split across both parties. To execute a split policy, we also give a protocol that enables the robot to determine what actions to trigger, while the second party guards against attempts to probe for information inconsistent with the policy's prescribed execution. In order to improve scalability, we find that basis functions and constraint sampling methods are useful in forming effective approximate MDPs. We report simulation results examining performance and precision, and assess the scaling properties of our Python implementation. We also describe a hardware proof-of-feasibility implementation using inexpensive physical robots, which, being a small-scale instance, can be solved directly.
更多
查看译文
关键词
Linear programming,Markov Decision Process,Robotics,Control Systems Privacy
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要