Analysis on Exposition of Speech Type Video Using SSD and CNN Techniques for Face Detection

Yeeshu Manu, Chetana Prakash, Santhosh Kumar S, Shaik Shafi, K. Shruthi

EAI/Springer Innovations in Communication and Computing (2023)

Abstract
With the growth of the internet and multimedia, more and more videos are being generated, and storing, managing, and indexing large numbers of videos has become a major problem. To address this, a system is proposed that extracts the relevant content, including mixed emotions, from the original video. The main purpose of video exposition is to present a meaningful summary of a complete video, together with its emotions, in a short amount of time by using deep learning techniques. These limitations are overcome through a novel implementation of a single-shot face-related task analysis technique, named SSD, for detecting expressions and performing expression-related classification, including emotions. The proposed model uses a fully convolutional neural network (CNN) that continually processes videos uploaded from the internet, a local server, or the cloud; the application reads the stored data frames of the speech samples and finds the face in the running video. This application is well suited to current video surveillance and CCTV footage analysis. Face-SSD includes two parallel branches: one for expression recognition and the other for expression analysis, which is part of the low-level filters. The proposed model does not need the following steps: face identification, facial region extraction, size normalisation, and facial region processing, since the outputs of both modules are spatially aligned images produced in sequence. Existing models such as Random Forest optimisation (RFO), Genetic algorithm (GA), Decision tree (DT), and X-boosting techniques usually cannot solve the problem of face detection in dynamic video. Therefore, multi-face, multi-task face recognition models with measurable performance rates are needed. In this research, CNN-based speech type video extraction and face detection were performed to estimate storage and reduce the complexity of content indexing. Finally, performance measures were estimated, including an accuracy of 98.45%, a sensitivity of 97.34%, a recall of 94.23%, and throughput.
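
The figures reported above are standard classification metrics. As a minimal sketch of how such values are typically derived from confusion-matrix counts, the following may help; the function name `detection_metrics` and the counts used are hypothetical and are not taken from the paper. Note that sensitivity and recall are conventionally the same quantity, TP / (TP + FN); the abstract reports them as separate values, so the paper presumably defines one of them differently.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion-matrix counts.

    tp, fp, tn, fn are hypothetical per-frame counts of true positives,
    false positives, true negatives, and false negatives.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one count must be non-zero")
    return {
        # Fraction of all decisions that were correct.
        "accuracy": (tp + tn) / total,
        # Conventional sensitivity (also called recall): TP / (TP + FN).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Fraction of predicted faces that were real faces.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Illustrative counts only; they do not reproduce the paper's results.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```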
Keywords
speech type video, face detection, CNN techniques