Manipulating Predictions over Discrete Inputs in Machine Teaching
CoRR (2024)
Abstract
Machine teaching often involves creating an optimal (typically minimal)
dataset that helps a model (referred to as the "student") achieve specific
goals set by a teacher. While such studies are abundant in the continuous
domain, work on the effectiveness of machine teaching in the discrete domain
is relatively limited. This paper focuses on machine teaching in the discrete
domain, specifically on manipulating a student model's predictions to match
the teacher's goals by efficiently changing the training data. We formulate
this task as a combinatorial optimization problem and solve it with a proposed
iterative searching algorithm. The algorithm demonstrates significant
numerical merit in scenarios where a teacher attempts to correct erroneous
predictions to improve the student model, or to maliciously manipulate the
model into misclassifying specific samples as a target class aligned with the
teacher's personal gain. Experimental results show that the proposed algorithm
manipulates the model's predictions effectively and efficiently, surpassing
conventional baselines.
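
To make the idea of an iterative search over discrete training-data edits concrete, here is a minimal sketch. It is not the paper's algorithm: the student (a k-nearest-neighbour classifier with Hamming distance), the edit type (relabelling one training example per step), the greedy scoring rule, and all names (`knn_predict`, `target_votes`, `greedy_teach`) are assumptions chosen purely for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions where two discrete feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_indices(train, x, k):
    """Indices of the k training points closest to x (ties broken by index)."""
    order = sorted(range(len(train)), key=lambda i: (hamming(train[i][0], x), i))
    return order[:k]


def knn_predict(train, x, k=3):
    """Toy student (assumed): majority vote among the k nearest neighbours."""
    votes = Counter(train[i][1] for i in nearest_indices(train, x, k))
    # Iterate labels in sorted order so vote ties resolve deterministically.
    return max(sorted(votes), key=lambda label: votes[label])


def target_votes(train, x, target, k=3):
    """Score used by the search: neighbours of x that carry the target label."""
    return sum(train[i][1] == target for i in nearest_indices(train, x, k))


def greedy_teach(train, x, target, budget, k=3):
    """Greedily relabel one training example per iteration (hypothetical edit
    type) until the student predicts `target` on x or the budget runs out."""
    train = list(train)  # work on a copy; the caller's list is left untouched
    edits = []
    while knn_predict(train, x, k) != target and len(edits) < budget:
        best_i, best_score = None, target_votes(train, x, target, k)
        for i, (features, label) in enumerate(train):
            if label == target:
                continue
            candidate = train[:i] + [(features, target)] + train[i + 1:]
            score = target_votes(candidate, x, target, k)
            if score > best_score:
                best_i, best_score = i, score
        if best_i is None:  # no single edit improves the score
            break
        train[best_i] = (train[best_i][0], target)
        edits.append(best_i)
    return train, edits


if __name__ == "__main__":
    data = [
        ((0, 0, 1), "neg"),
        ((0, 1, 1), "neg"),
        ((1, 1, 1), "pos"),
        ((1, 0, 0), "pos"),
        ((0, 0, 0), "neg"),
    ]
    query = (0, 0, 1)
    new_data, edits = greedy_teach(data, query, "pos", budget=3)
    print("edited indices:", edits)
    print("prediction after teaching:", knn_predict(new_data, query))
```

The budget caps the number of training-set changes, mirroring the abstract's emphasis on changing the training data efficiently. A real discrete-domain setting would likely use a retrained student and a richer edit space than this toy setup.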