Spam Filtering: an Active Learning Approach using Incremental Clustering

WIMS (2014)

Abstract
This paper introduces a method for filtering unwanted email messages that combines active learning with incremental clustering. The approach is motivated by the fact that a user cannot be expected to provide the correct category for every received message. The email messages are divided into chronological batches (e.g., one per day). The user is asked to give the correct categories (labels) for the messages of the first batch; from then on, the proposed algorithm decides when to ask for a new label, based on a clustering of the messages that is updated incrementally. We test different variants of the algorithm on a number of different datasets and show that it achieves very good results with only 2% of all email messages labelled by the user.
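The batch-wise loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the clustering method or the criterion for requesting a label, so everything below is an assumption made for illustration. This includes the bag-of-words representation, the cosine similarity, the `threshold` parameter, the rule "ask the user only when a message starts a new cluster", and all names (`ActiveSpamFilter`, `run`, `oracle`).

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a message as term counts (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class Cluster:
    def __init__(self, vec, label):
        self.centroid = Counter(vec)      # summed term counts of members
        self.labels = Counter([label])    # user-provided labels only

    def add(self, vec, label=None):
        self.centroid.update(vec)         # incremental centroid update
        if label is not None:
            self.labels[label] += 1

    def majority(self):
        return self.labels.most_common(1)[0][0]


class ActiveSpamFilter:
    """Hypothetical filter: queries the user only for 'novel' messages."""

    def __init__(self, oracle, threshold=0.3):
        self.oracle = oracle              # callable standing in for the user
        self.threshold = threshold        # assumed similarity cut-off
        self.clusters = []
        self.queries = 0

    def _ask(self, text):
        self.queries += 1
        return self.oracle(text)

    def process(self, text, force_query=False):
        vec = bag_of_words(text)
        best, best_sim = None, -1.0
        for c in self.clusters:
            s = cosine(vec, c.centroid)
            if s > best_sim:
                best, best_sim = c, s
        if best is None or best_sim < self.threshold:
            # No sufficiently similar cluster: ask for a label, open a new one.
            label = self._ask(text)
            self.clusters.append(Cluster(vec, label))
        elif force_query:
            # First batch: every message is labelled by the user.
            label = self._ask(text)
            best.add(vec, label)
        else:
            # Otherwise inherit the majority user label of the nearest cluster.
            label = best.majority()
            best.add(vec)
        return label


def run(batches, oracle, threshold=0.3):
    """Process chronological batches; only the first is fully labelled."""
    f = ActiveSpamFilter(oracle, threshold)
    predictions = []
    for i, batch in enumerate(batches):
        predictions.append([f.process(t, force_query=(i == 0)) for t in batch])
    return predictions, f.queries
```

In this sketch the number of label requests after the first batch equals the number of new clusters opened, so a higher `threshold` trades more user queries for purer clusters. The paper's variants may use a different query criterion.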
Keywords
incremental clustering, algorithms, experimentation, semi-supervised learning, spam filtering, active learning, information storage and retrieval, machine learning