Enhanced Spatial Stream of Two-Stream Network Using Optical Flow for Human Action Recognition

APPLIED SCIENCES-BASEL(2023)

引用 0|浏览0
暂无评分
摘要
Introduction: Convolutional neural networks (CNNs) have maintained their dominance in deep learning methods for human action recognition (HAR) and other computer vision tasks. However, the need for a large amount of training data always restricts the performance of CNNs. Method: This paper is inspired by the two-stream network, where a CNN is deployed to train the network by using the spatial and temporal aspects of an activity, thus exploiting the strengths of both networks to achieve better accuracy. Contributions: Our contribution is twofold: first, we deploy an enhanced spatial stream, and it is demonstrated that models pre-trained on a larger dataset, when used in the spatial stream, yield good performance instead of training the entire model from scratch. Second, a dataset augmentation technique is presented to minimize overfitting of CNNs, where we increase the dataset size by performing various transformations on the images such as rotation and flipping, etc. Results: UCF101 is a standard benchmark dataset for action videos, and our architecture has been trained and validated on it. Compared with the other two-stream networks, our results outperformed them in terms of accuracy.
更多
查看译文
关键词
human action recognition,optical flow,spatial stream,two-stream
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要