LARNet-STC: Spatio-temporal orthogonal region selection network for laryngeal closure detection in endoscopy videos

COMPUTERS IN BIOLOGY AND MEDICINE(2022)

引用 3|浏览18
暂无评分
摘要
The vocal folds (VFs) are a pair of muscles in the larynx that play a critical role in breathing, swallowing, and speaking. VF function can be adversely affected by various medical conditions including head or neck injuries, stroke, tumor, and neurological disorders. In this paper, we propose a deep learning system for automated detection of laryngeal adductor reflex (LAR) events in laryngeal endoscopy videos to enable objective, quantitative analysis of VF function. The proposed deep learning system incorporates our novel orthogonal region selection network and temporal context. This network learns to directly map its input to a VF open/close state without first segmenting or tracking the VF region. This one-step approach drastically reduces manual annotation needs from labor-intensive segmentation masks or VF motion tracks to frame-level class labels. The proposed spatio-temporal network with an orthogonal region selection subnetwork allows integration of local image features, global image features, and VF state information in time for robust LAR event detection. The proposed network is evaluated against several network variations that incorporate temporal context and is shown to lead to better performance. The experimental results show promising performance for automated, objective, and quantitative analysis of LAR events from laryngeal endoscopy videos with over 90% and 99% F1 scores for LAR and non-LAR frames respectively.
更多
查看译文
关键词
Deep learning, Vocal folds, Laryngeal endoscopy, Laryngeal closure detection, Laryngeal adductor reflex
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要