Neural Semantic Video Analysis

Advances in Information Quality and Management (2024)

Abstract
Videos are a rich form of data for capturing, storing, and communicating information. The availability of inexpensive and accessible video-capturing sensors in smartphones, handheld cameras, and consumer security cameras has exponentially increased global video footage generation over the past decade. Because video is so widely produced and consumed, it is essential to develop automated systems that analyze large bodies of video material and identify the relevant information within them. This chapter demonstrates how the emergence of neural networks, including convolutional neural networks (CNNs) and transformers, has revolutionized semantic video analysis. CNNs use convolutional filters to capture spatial patterns at the pixel level. More recently, the learning capability of CNN-based models has been exceeded by that of self-attention-based models. Both CNN-based and transformer-based semantic video analysis models rely on techniques such as transfer learning and self-supervised learning to compensate for the lack of large, supervised video datasets.
Keywords
video