End-User Feature Engineering in the Presence of Class Imbalance
msra(2013)
Abstract
Intelligent user interfaces, such as recommender systems and email classifiers, use machine learning algorithms to customize their behavior to the preferences of an end user. Although these learning systems are somewhat reliable, they are not perfectly accurate. Traditionally, end users who need to correct these learning systems can only provide more labeled training data. In this paper, we focus on incorporating new features suggested by the end user into machine learning systems. To investigate the effects of user-generated features on accuracy, we developed an auto-coding application that enables end users to assist a machine-learned program in coding a transcript by adding custom features. Our results show that adding user-generated features to the machine learning algorithm can result in modest improvements to its F1 score. Further improvements are possible if the algorithm accounts for class imbalance in the training data and deals with low-quality user-generated features that add noise to the learning algorithm. We show that addressing class imbalance improves performance to an extent, but improving the quality of features brings about the most beneficial change. Finally, we discuss changes to the user interface that can help end users avoid the creation of low-quality features.
Author Keywords: Feature engineering, class imbalance, end-user programming, machine learning.
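The abstract reports results as F1 scores and names class imbalance as a factor. As a rough illustration of why those two go together, the sketch below computes accuracy and F1 on a made-up, imbalanced label set and derives inverse-frequency class weights. The labels ("code", "other"), the toy data, and the weighting formula are assumptions for illustration only; the abstract does not say how the authors handled imbalance.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero/undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    This is one common heuristic (an assumption here, not the paper's
    stated method): rare classes get proportionally larger weights.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


# Hypothetical imbalanced data: 9 segments labeled "other", 1 labeled "code".
y_true = ["other"] * 9 + ["code"]
always_majority = ["other"] * 10  # a classifier that ignores the rare class

acc = accuracy(y_true, always_majority)
f1_rare = f1_score(y_true, always_majority, positive="code")
weights = inverse_frequency_weights(y_true)

print(f"accuracy={acc:.2f}, F1(code)={f1_rare:.2f}, weights={weights}")
```

On this toy data, the majority-only classifier scores high on accuracy while its F1 on the rare class is zero, which is why F1 is the more informative metric when classes are imbalanced.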
Keywords
user interface, artificial intelligence, recommender system, hci, machine, technical report, feature engineering, programming, machine learning