Information-enhanced Network for Noncontact Heart Rate Estimation from Facial Videos

IEEE Transactions on Circuits and Systems for Video Technology (2023)

Abstract
Remote photoplethysmography (rPPG) is an important technique for measuring heart rate (HR), an indicator of physical and mental health that is useful for diagnosing cardiovascular and neurological diseases. Many non-contact HR estimation methods have been proposed in recent years, but most rely on a single-modal HR information source, so their estimates suffer from noise and insufficient information. To address these problems, this paper proposes a novel information-enhanced network for HR estimation based on multimodal (e.g., RGB and NIR) sources. In the network, context and modal difference information are sequentially enhanced from spatiotemporal and modal views to describe HR-aware features accurately, while maximum frequency information is enhanced to suppress heartbeat noise. Specifically, a context-enhanced video Swin-Transformer (CET) module extracts useful rPPG signal features from facial visible-light and near-infrared videos. A novel modal difference enhanced fusion (MDEF) module then produces a fused rPPG signal, which is fed to the frequency-enhanced estimation (FEE) module to obtain the corresponding HR value. The three modules are integrated and jointly learned end to end, and the multimodal combination provides highly complementary information for estimating the HR value. Experimental results on three multimodal datasets show that the proposed model outperforms state-of-the-art methods.
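As a rough illustration of how an HR value can be read from an rPPG signal in the frequency domain, the sketch below picks the dominant spectral peak inside a plausible heart-rate band. This is a generic FFT baseline, not the paper's FEE module; the function name `estimate_hr_bpm`, the 0.7 to 4.0 Hz band, the 30 Hz sampling rate, and the synthetic signal are all assumptions made for the example.

```python
import numpy as np


def estimate_hr_bpm(signal, fs, lo_hz=0.7, hi_hz=4.0):
    """Return the heart rate (beats per minute) at the dominant spectral peak.

    Hypothetical helper: a plain FFT peak picker, not the paper's FEE module.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                          # remove the DC offset
    window = np.hanning(len(x))               # taper to reduce spectral leakage
    power = np.abs(np.fft.rfft(x * window)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)  # assumed band: 42 to 240 bpm
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


# Synthetic stand-in for a fused rPPG signal: a 1.2 Hz pulse plus noise,
# sampled at an assumed 30 frames per second for 20 seconds.
fs = 30.0
t = np.arange(600) / fs
rng = np.random.default_rng(0)
rppg = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)

hr = estimate_hr_bpm(rppg, fs)
print(f"Estimated HR: {hr:.1f} bpm")
```

With a 20-second window the frequency resolution is 0.05 Hz, or 3 bpm, so the estimate for this synthetic 1.2 Hz pulse should land close to 72 bpm. A learned estimator such as the one described above aims to be more robust to noise than this simple peak picker.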
Keywords
Heart rate, Context enhanced video Swin-Transformer, Modal difference enhanced fusion, Frequency enhanced estimation