Vision approaches to solve the screen pose acquisition problem for PerspectiveCursor
msra(2006)
Abstract
PerspectiveCursor is a new interaction technique for multi-display environments that significantly improves human interaction with computer systems. Computer Vision has the potential to provide a solution for the implementation of this interaction technique that might be better than the current ones. A survey of the current state of the art in 2D and 3D vision shows that many techniques might be of use in finding the spatial relationships between the users' point of view and the displays in the environment, which is the main requirement for an implementation of PerspectiveCursor.

INTRODUCTION

With the advent of cheap cameras and the continuous increase in computing power of consumer microprocessor systems, Computer Vision and Image Processing have moved from being research fields with applications only in very specialized areas (e.g. medicine, robotics) to being pervasive in consumer products (e.g. optical mice, video cameras). Tracking is one of the technologies already being radically transformed, as computer vision replaces electromagnetic and ultrasound measurement methods. Numerous new applications in research (and, increasingly, in consumer products) make use of tracking in some way. The new paradigms of computing environments and human-computer interaction [23, 28, 34] make extensive use of the position coordinates of users and objects to provide meaningful and easy-to-use services. Computer vision tracking is getting to the point where it can provide unobtrusive, reasonably precise, reliable tracking at an affordable price.

The particular application that we are interested in is called PerspectiveCursor [22]. PerspectiveCursor is an interaction technique that provides access to a dynamic multi-screen environment from a single input device.
This technique, which is explained in more detail in the next section, requires knowledge of the position of the user and the positions of the different displays in order to create an intuitive input/control translation map. PerspectiveCursor has the potential to improve single-user and groupware systems, but the current implementation, based on electromagnetic tracking, has several important drawbacks: it is very unstable in the presence of metallic objects, it is extremely expensive (around $10,000 for the most basic tracking equipment), and it is tethered.

This paper explores the alternatives within the Computer Vision field that would allow a cheap and reliable implementation of PerspectiveCursor. To this end, we consider the minimal requirements of the PerspectiveCursor (PCursor) technique and our knowledge of the particular setting to try to find a match between the different Computer Vision techniques and an acceptable solution.

The rest of the report is organized as follows. First, we describe PerspectiveCursor and clearly state the particularities of the Computer Vision problem that we want to solve, including the constraints, requirements, and desirable characteristics of a solution. Then we explore the current literature, commenting on the benefits and drawbacks of each technology and on how it could help provide a full or partial solution to the problem. Finally, we present our conclusions and the techniques we consider most valuable for this particular endeavor.

MOTIVATION & PROBLEM STATEMENT

To find a suitable solution for our problem, we first need to explain how PCursor works, and then detail what exactly "suitable" means.

PerspectiveCursor

The PerspectiveCursor technique is motivated by the need to access several displays using a single cursor in complex multi-display scenarios.
In most current multi-display settings, users work with two displays that are connected to the same machine and positioned in a very simple layout, namely in the same plane and next to each other (see Figure 1).