Multilevel Deep Learning-Based Processing For Lifelog Image Retrieval Enhancement

Ghada Feki, Fatma Ben Abdallah, Anis Ben Ammar, Chokri Ben Amar

2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2018

Abstract

Remembering an event or a meeting, recalling the face or the name of a person, and keeping in mind what we ate or where we left a lost object are sometimes difficult tasks. Human memory has its limits. To go beyond these limits, researchers have developed sensors and wearable cameras that capture individuals' experiences. This trend, called lifelog, has recently been the subject of several panels, workshops and benchmarks. By analyzing the lifelog tasks of these events more closely, we notice that there are still challenges in managing, analyzing, indexing, retrieving, summarizing and visualizing the captured data. In this work, we present a multilevel deep learning-based processing approach for lifelog image retrieval enhancement. Our proposed approach consists of five phases in which we use deep learning at several levels. The first phase consists of data pre-processing based on low-level image features to filter out irrelevant, noisy and blurred images. In the second phase, we detect and cross high-level image features using pre-trained CNNs to enhance the image metadata description. In the third phase, we perform a semantic segmentation based on the Wu-Palmer similarity measure. This segmentation is performed to limit the search area and to better control the runtime and the complexity. The fourth phase consists of analyzing the query using an LSTM to match concepts with queries. The final phase, which is based on doc2sequence, aims at retrieving the images that answer the query.
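
The semantic segmentation phase relies on a Palmer-style concept similarity, read here as the standard Wu-Palmer measure, which scores two concepts by the depth of their lowest common ancestor in an is-a taxonomy. The abstract does not say how the authors compute or apply it, so the following is only a minimal sketch of the textbook definition. The `PARENT` taxonomy, the concept names and the helper functions are hypothetical and chosen purely for illustration.

```python
# Toy is-a taxonomy (hypothetical): child -> parent; the root maps to None.
PARENT = {
    "entity": None,
    "food": "entity",
    "place": "entity",
    "fruit": "food",
    "drink": "food",
    "apple": "fruit",
    "banana": "fruit",
    "coffee": "drink",
    "kitchen": "place",
}


def path_to_root(concept):
    """Return the chain [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node that is also an ancestor of a
    # (or a itself) is the lowest common subsumer (LCS).
    lcs = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    for pair in [("apple", "banana"), ("apple", "coffee"), ("apple", "kitchen")]:
        print(pair, round(wu_palmer(*pair), 3))
```

Concepts that share a deep ancestor ("apple" and "banana" under "fruit") score higher than concepts that only meet at the root ("apple" and "kitchen"), which is the property a similarity-based grouping of images would exploit. In practice the taxonomy would typically come from a lexical resource such as WordNet rather than a hand-written dictionary.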

Keywords

Lifelog, Retrieval, Convolutional Neural Network, Word Embedding, Semantic Similarity, Long Short-Term Memory
