Self-supervised learning via cluster distance prediction for operating room context awareness

International Journal of Computer Assisted Radiology and Surgery (2022)

Abstract
Purpose: Semantic segmentation and activity classification are key components of intelligent surgical systems that can understand and assist clinical workflow. In the operating room (OR), semantic segmentation is central to building robots that are aware of their clinical surroundings, whereas activity classification aims to understand OR workflow at a higher level. State-of-the-art semantic segmentation and activity recognition approaches are fully supervised, which does not scale. Self-supervision can reduce the amount of annotated data needed.

Methods: We propose a new 3D self-supervised task for OR scene understanding that uses OR scene images captured with time-of-flight (ToF) cameras. In contrast to other self-supervised approaches, whose handcrafted pretext tasks focus on 2D image features, our proposed task consists of predicting the relative 3D distance between image patches by exploiting the depth maps. By learning 3D spatial context, it generates discriminative features for our downstream tasks.

Results: Our approach is evaluated on two tasks and datasets containing multiview data captured from clinical scenarios. We demonstrate a noteworthy improvement in performance on both tasks, specifically in the low-data regime, where the utility of self-supervised learning is highest.

Conclusion: We propose a novel privacy-preserving self-supervised approach that uses depth maps. Our proposed method performs on par with other self-supervised approaches and could be an interesting way to alleviate the burden of full supervision.
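To make the pretext task in the Methods more concrete, the following is a minimal sketch of how a relative 3D distance label for a pair of image patches could be derived from a depth map. It is not the authors' implementation. The pinhole camera intrinsics (`fx`, `fy`, `cx`, `cy`), the use of the median patch depth, the square patch size, and the discretization of distances into classes via `bin_edges` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with depth z to a 3D point."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z], dtype=np.float64)


def patch_center_3d(depth, top, left, size, intrinsics):
    """3D position of a patch centre, using the median valid depth (assumption)."""
    fx, fy, cx, cy = intrinsics
    patch = depth[top:top + size, left:left + size]
    valid = patch[patch > 0]  # treat zero depth as a missing ToF reading
    if valid.size == 0:
        return None
    z = float(np.median(valid))
    u = left + (size - 1) / 2.0  # patch centre, column coordinate
    v = top + (size - 1) / 2.0   # patch centre, row coordinate
    return backproject(u, v, z, fx, fy, cx, cy)


def relative_distance_label(depth, patch_a, patch_b, size, intrinsics, bin_edges):
    """Return (class_index, distance) for two patches, or None if depth is missing.

    patch_a and patch_b are hypothetical (top, left) pixel coordinates.
    bin_edges are hypothetical distance thresholds in the depth map's units.
    """
    pa = patch_center_3d(depth, patch_a[0], patch_a[1], size, intrinsics)
    pb = patch_center_3d(depth, patch_b[0], patch_b[1], size, intrinsics)
    if pa is None or pb is None:
        return None
    dist = float(np.linalg.norm(pa - pb))
    label = int(np.digitize(dist, bin_edges))  # index of the distance bin
    return label, dist


if __name__ == "__main__":
    # Synthetic flat depth map, 2 units from the camera (illustrative only).
    depth = np.full((64, 64), 2.0)
    intrinsics = (100.0, 100.0, 32.0, 32.0)  # fx, fy, cx, cy (made up)
    bin_edges = np.array([0.5, 1.0, 2.0])
    print(relative_distance_label(depth, (0, 0), (0, 40), 8, intrinsics, bin_edges))
```

In such a setup, a network would receive the two patches as input and be trained to predict the class index, so the supervision signal comes from the depth map alone and no manual annotation is required.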
Keywords
Self-supervision, Semantic segmentation, OR scene understanding, Activity classification, da Vinci surgical system