Visual Perception of 3D Space and Shape in Time - Part III: 2D Shape Recognition by Log-Scaling

Brian Ta, Maria E. M. M. Silva, Kelly Bartlett, Umaima Afifa, Annie Agazaryan, Ricardo Canela, Javier Carmona, Emmanuel John L. De Leon, Alyssa Drost, Diego Espino, Guadalupe Espinoza, Kyleigh Follis, Paul Gan, Lauren Ho, Christina Honoré, Emily Huang, Luis Ibarra, Tessa Jackson, Mira Khosla, Caominh Le, Victor Li, Trevor McCarthy, Elizabeth Mills, Sukanya Mohapatra, Yuuki Morishige, Nancy Nguyen, Ziyan Peng, Kimya Peyvan, Michael Phipps, Isabella Poschl, Jagannathan Rangarajan, Charÿsa Santos, Leonard Schummer, Sky Shi, Natalie Smale, April Smith, Divya Sood, Cindy Ta, Anna Tran, Michelle Tran, Rui Wang, Patrick Wilson, Nicole L. Yang, Megan Yu, Selena Yu, Aaron P. Blaisdell, Katsushi Arisaka

bioRxiv (2022)

Abstract
Human vision has a remarkable ability to recognize complex 3D objects, such as faces, at any size, in any 3D orientation, and at any 3D location. If a face is initially memorized only at a normalized size and viewed head-on, comparing that single-sized memory directly with a new incoming image would demand extensive mental frame translations in 7D. How can we perform such a demanding task as promptly and reliably as we do when experiencing objects in the world around us? Intriguingly, the primary visual cortex exhibits a 2D retinotopy with a log-polar coordinate system, in which scaling a shape up or down is converted to a linear frame translation. As a result, mental scaling can be performed by linearly translating the memory or the perceptual image until the two overlap. According to our new model, NHT (Neural Holography Tomography), alpha brainwaves traveling at a constant speed can carry out this linear translation. Under this scheme, each scaling up or down by a factor of two should add the same amount of mental time to recognizing a smaller or larger face. To test this hypothesis, we designed a reaction time (RT) experiment in which participants were first asked to memorize sets of unfamiliar faces at a specific size (4° or 8°). Following the memorization phase, similar stimuli with a wide range of sizes (from 1° to 32°) were presented, and RTs were recorded. As predicted, the increase in RT was proportional to the logarithm of the scaling factor. Furthermore, RTs were fastest for 8° faces even when the memorized face was 4°. This supports our hypothesis that faces are always memorized at a fixed size of approximately 8°. To our surprise, the increases in RT were also consistent with the mentally estimated depth sensation, which indicates that the apparent size of the recognized face can create a proper depth sensation.

Competing Interest Statement: The authors have declared no competing interest.
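The log-scaling argument in the abstract can be illustrated with a minimal sketch. It assumes a plain log-polar map (log r, θ) of retinal position and a linear RT model with an 8° reference size; the function names, `base_ms`, and `ms_per_octave` are hypothetical placeholders chosen for illustration, not values or methods reported by the authors.

```python
import math

def log_polar(x, y):
    """Map a retinal point (x, y), in degrees from fixation, to (log r, theta)."""
    r = math.hypot(x, y)
    return math.log(r), math.atan2(y, x)

def octave_shift(reference_deg, presented_deg):
    """Number of factor-of-two scaling steps between two stimulus sizes."""
    return abs(math.log2(presented_deg / reference_deg))

def predicted_rt(presented_deg, reference_deg=8.0, base_ms=500.0, ms_per_octave=50.0):
    """Linear-in-log-size RT model.

    base_ms and ms_per_octave are hypothetical placeholders, not fitted values.
    """
    return base_ms + ms_per_octave * octave_shift(reference_deg, presented_deg)

# Doubling the size of a shape moves every point outward by a factor of two.
# In log-polar coordinates this is a pure shift of log(2) along the log-r axis;
# the angle is unchanged.
u1, t1 = log_polar(3.0, 4.0)
u2, t2 = log_polar(6.0, 8.0)
shift = u2 - u1

# Under the assumed model, every doubling or halving adds the same RT increment.
rts = {size: predicted_rt(size) for size in (1, 2, 4, 8, 16, 32)}
```

In this sketch `shift` equals log 2 regardless of which point is chosen, which is why a uniform rescaling becomes a single translation, and `rts` grows by one fixed step per octave away from the 8° reference.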
Keywords
shape recognition, visual perception, 3D space, log-scaling