How to spice up your planning under uncertainty research life

International Conference on Automated Planning and Scheduling (2008)

Abstract
Does planning under uncertainty have you down? Is your "state-of-the-art" uncertain planner sluggish in comparison to that peppy deterministic replanner that you wish would just go away? Do you feel like the world would just be better off without uncertain planners? Before you take drastic measures (and switch research topics), you should take a moment to read this short position paper. Planning under uncertainty is a field rife with unexplored possibilities. Current benchmarks and planning competitions have only begun to scratch the surface of the types of problems that can be solved and the level of excitement to be had by exploring the research issues that these problems pose. Here we discuss just a few of the myriad extensions of planning under uncertainty that promise to spice up your uncertain planning research life.

Spicing up your Uncertain Planning Research

Like any activity, research in planning under uncertainty can seem dull and boring if occasional efforts are not made to explore interesting alternatives. Furthermore, by lack of exploration, we risk creating the misconception that the problems currently being solved are the only problems of interest to the uncertain planning research community. In the following, we show how VIAGRA [1] may add some spice to your uncertain planning research life:

• EleVators: Aerosmith aside, never underestimate the level of sheer planning excitement that is possible when elevators and planning under uncertainty are combined. Of course, the key concept here is not the elevator itself, but rather the notion of multiple concurrent actions with uncertain outcomes. Such research has already been addressed in factored planning models (Guestrin, Koller, & Parr 2001b), where joint transition functions are factored according to individual concurrent actions. In such a model, the number of joint actions is generally exponential in the number of concurrent actions.
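To make the combinatorics concrete, here is a minimal sketch (all names, such as the elevator action set, are hypothetical illustrations, not from any benchmark): with |A| actions per agent and n concurrent agents, the joint action space is the Cartesian product of size |A|^n.

```python
from itertools import product

def joint_actions(per_agent_actions, n_agents):
    """Enumerate every joint action for n_agents acting concurrently.

    The joint space is the Cartesian product of the per-agent action
    sets, so its size is |per_agent_actions| ** n_agents.
    """
    return list(product(per_agent_actions, repeat=n_agents))

# A hypothetical elevator action set: 3 elevators already give 4**3 = 64
# joint actions; 10 elevators give 4**10 = 1,048,576.
elevator_actions = ["up", "down", "open", "close"]
print(len(joint_actions(elevator_actions, 3)))  # 64
print(4 ** 10)                                  # 1048576
```

Even this toy enumeration makes the point: exhaustive evaluation of joint actions (let alone their stochastic outcomes) is out of reach for modest numbers of concurrent agents.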
Thus, for large numbers of concurrent actions, there simply is not an option to explore all possible actions or outcomes in a deterministic replanning framework, e.g., (Yoon, Fern, & Givan 2007). (Copyright © 2008, Association for the Advancement of Artificial Intelligence, www.aaai.org. All rights reserved.) Even if all joint actions could be evaluated and only the most likely outcome [2] were used for replanning, it is not clear that this would lead to a useful model: even though an outcome may be most likely, it may still occur only a negligible fraction of the time given an exponential number of outcomes. Clearly, there may be some gain in exploiting the uncertainty directly in a true probabilistic planning approach by (efficiently) calculating the expected value of a course of action.

• ContInuous State and Action Spaces: Going outside of your comfort zone makes things more interesting. Sure, discrete state and action spaces are nice, but the world is generally not discrete. The Bellman equations still hold for uncertain planning problems that can be formalized as continuous state and action Markov decision processes (MDPs) (Puterman 1994). With this generalization, complex continuous transition distributions and states consisting of continuous time and resources can be encoded, leading to more accurate models of real-world problems, e.g., Mars rovers (Bresina et al. 2002). Of course, solving such continuous state and action MDPs is another story; it is not clear that deterministic replanning methods that rely on a most likely outcome (from an infinite set) would work well with multi-modal transition distributions. In this case, a proper expectation as computed by the optimal Bellman equations is likely to give a more robust solution.

• Multiple Agents: Sometimes it takes more than one agent to adequately spice up your planning research.
It is well known in the AI literature that multiagent problems can be formalized as a Markov game (Littman 1994) and that a simple minimax reformulation of the Bellman backup suffices to provide an optimal finite-horizon solution to this adversarial planning problem. With this generalization, one can model strict uncertainty in the transition model when it is affected by the actions of other self-interested agents, or when transition probabilities are not well specified and the agent must plan for the worst case. While stochastic strategies may be required for optimality in this setting, it is not immediately clear how to generalize current deterministic replanners to cope with this paradigm and produce (approximately) optimal stochastic policies.

[1] Your results may vary. Please discontinue use if VIAGRA leads to an overwhelming feeling of unease or discomfort.
[2] Using variable elimination (Zhang & Poole 1996) to efficiently compute the max or expectation in the factored model.

• No Goals: Sometimes it's good not to have a predefined conception of exactly what you intend to achieve during planning under uncertainty. A lot of uncertain planning research focuses on problems with clearly defined absorbing goal states. However, there is a more general class of problems that do not always have clearly defined goals but rather the more general task of optimizing expected (infinite-horizon) discounted or average reward (Boutilier, Dean, & Hanks 1999). Take, for instance, a mail-delivery robot: as different packages arrive due to exogenous events (see below), the robot must continuously optimize its delivery schedule to maximize reward over an infinite horizon; note that there are no absorbing goal states to be reached in this problem.
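The goal-free, infinite-horizon discounted objective can be sketched with standard value iteration, V(s) = max_a [R(s,a) + γ Σ_s' P(s'|s,a) V(s')]. The two-state "delivery robot" below is a made-up toy, not a benchmark domain; note that no state is absorbing, yet the computation converges to well-defined values.

```python
def value_iteration(states, actions, P, R, gamma=0.95, eps=1e-6):
    """Standard value iteration for a discounted MDP with no goal states.

    P[s][a] maps successor state -> probability; R[s][a] is the reward.
    Converges because the discounted Bellman backup is a contraction.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# Hypothetical two-state toy: the robot "waits" or "delivers"; a delivery
# attempt may fail and leave it holding the package. There is no goal,
# only an endless cycle of pickups and payoffs.
states = ["idle", "carrying"]
actions = ["wait", "deliver"]
P = {
    "idle":     {"wait": {"idle": 1.0},
                 "deliver": {"carrying": 0.8, "idle": 0.2}},
    "carrying": {"wait": {"carrying": 1.0},
                 "deliver": {"idle": 0.9, "carrying": 0.1}},
}
R = {
    "idle":     {"wait": 0.0, "deliver": 0.0},
    "carrying": {"wait": 0.0, "deliver": 5.0},
}
V = value_iteration(states, actions, P, R)
```

Both states earn positive value despite every immediate reward from "idle" being zero, precisely because the infinite-horizon objective credits the reward the robot will keep collecting forever.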
In goal-oriented problems, it is already known that deterministic replanners may have difficulties with domains with avoidable dead ends (Little & Thiebaux 2007) (although such problems may be partially resolved through dead-end analysis in the underlying domain). However, avoidable dead ends are just the tip of the iceberg w.r.t. the ways in which the performance of optimal deterministic replanners may differ from the performance of optimal uncertain planners. For the more general class of MDPs with expected utility maximization objectives, the problem for deterministic replanners may be generalized to that of avoidable low expected value states. While deterministic planning may be generalized to cope with reasoning in expectation, doing so will start to blur the distinction between deterministic replanners and (approximately) optimal uncertain planners.

• Real Problems: We cannot expect to maximize our planning under uncertainty experience if we play with toys instead of focusing on reality. From dialogue management in natural language processing to robotics to real-time program optimization, many real-world problems inherently involve making sequential decisions that should be optimized for best performance. Most of these problems are partially observable, which speaks to the need for such model expressivity in practical uncertain planning research.

• Exogenous Actions and Events: When unexpected things happen, uncertain planning can get interesting. Most usage of the planning domain description language PPDDL (Younes et al. 2005) makes a strong frame assumption that only allows relational fluents directly referenced by an action's parameterization to change as a result of that action. More realistic probabilistic planning problems may also include multiple exogenous actions that occur independently, e.g., independent random failures of computers in the SysAdmin domain (Guestrin, Koller, & Parr 2001a), or the arrival of packages in a mail-delivery robot domain.
Planning in these non-inertial models can be very difficult and adds an extra dimension over simple stochastic versions of standard deterministic planning models (which inherently make a strong frame assumption). In Appendix B, we present a PPDDL variant of the SysAdmin problem with exogenous events that cause computers not directly affected by an action to crash. Problems such as this one may benefit from direct reasoning about uncertain exogenous events.
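The flavor of such exogenous events can be sketched as a one-step transition sampler (the machine names and failure probability below are invented for illustration and are not the actual Appendix B encoding): the admin's action reboots one machine, while machines the action never mentions may still crash, in violation of the usual frame assumption.

```python
import random

def step(up, reboot, p_exo_fail=0.1, rng=random):
    """One transition of a SysAdmin-style domain with exogenous failures.

    up:     dict mapping machine name -> True if the machine is running
    reboot: the single machine the admin's action touches
    Machines other than `reboot` may crash independently with
    probability p_exo_fail, even though the action never mentions them.
    """
    nxt = dict(up)
    nxt[reboot] = True  # the action's direct (inertial) effect
    for m in up:
        if m != reboot and nxt[m] and rng.random() < p_exo_fail:
            nxt[m] = False  # exogenous crash, untouched by the action
    return nxt

state = {"c1": True, "c2": True, "c3": False}
state = step(state, reboot="c3")  # c3 comes back up; c1, c2 may crash
```

A deterministic replanner that assumes inertia for unmentioned fluents will systematically mispredict such trajectories, which is exactly the argument for reasoning about exogenous events directly.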
Keywords: uncertainty research life, planning