RegionSpeak: Quick, Comprehensive Spatial Descriptions of Complex Images for Blind Users

CHI '15: CHI Conference on Human Factors in Computing Systems, Seoul, Republic of Korea, April 2015

Citations: 79
Abstract
Blind people often seek answers to their visual questions from remote sources; however, the commonly adopted single-image, single-response model does not always provide enough bandwidth between users and sources. This is especially true when questions concern large sets of information or spatial layout, e.g., where is there to sit in this area, what tools are on this workbench, or what do the buttons on this machine do? Our RegionSpeak system addresses this problem by providing an accessible way for blind users to (i) combine visual information across multiple photographs via image stitching, (ii) quickly collect labels from the crowd, in parallel, for all relevant objects contained within the resulting large visual area, and (iii) interactively explore the spatial layout of the labeled objects. The regions and descriptions are displayed on an accessible touchscreen interface, which allows blind users to interactively explore their spatial layout. We demonstrate that workers from Amazon Mechanical Turk are able to quickly and accurately identify relevant regions, and that asking them to describe only one region at a time results in more comprehensive descriptions of complex images. RegionSpeak can be used to explore the spatial layout of the identified regions, and it demonstrates broad potential for helping blind users answer difficult spatial layout questions.
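The exploration step described above (mapping a touch on the screen to the spoken description of the crowd-labeled region under the finger) can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the region coordinates, labels, and the `describe_touch` helper are all hypothetical.

```python
# Hypothetical sketch of RegionSpeak-style exploration: crowd workers have
# labeled regions of a stitched image; a touch point is mapped to the
# description that should be read aloud. All data and names are illustrative.

from typing import Optional

# Each region: (x, y, width, height, description) in stitched-image pixels.
REGIONS = [
    (10, 10, 120, 60, "red power button"),
    (150, 10, 100, 60, "volume dial"),
    (10, 90, 240, 80, "paper output tray"),
]

def describe_touch(x: int, y: int, regions=REGIONS) -> Optional[str]:
    """Return the description of the first labeled region containing (x, y)."""
    for rx, ry, rw, rh, label in regions:
        if rx <= x < rx + rw and ry <= y < ry + rh:
            return label
    return None  # no labeled region under the finger

print(describe_touch(20, 20))    # touch inside the first region
print(describe_touch(300, 300))  # touch outside every labeled region
```

In a real system the returned description would be passed to a screen reader or TTS engine; the point-in-rectangle lookup shown here is the simplest possible spatial index for such an interface.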
Keywords
H.5.2 [Information Interfaces and Presentation]: User Interfaces - Input devices and strategies; K.4.2 [Computers and Society]: Social Issues - Assistive technologies for persons with disabilities; Visual questions; crowdsourcing; stitching; accessibility