AttributionScanner: A Visual Analytics System for Model Validation with Metadata-Free Slice Finding
arXiv (2024)
Abstract
Data slice finding is an emerging technique for validating machine learning
(ML) models by identifying and analyzing subgroups in a dataset that exhibit
poor performance, often characterized by distinct feature sets or descriptive
metadata. However, in the context of validating vision models involving
unstructured image data, this approach faces significant challenges, including
the laborious and costly requirement for additional metadata and the complex
task of interpreting the root causes of underperformance. To address these
challenges, we introduce AttributionScanner, an innovative human-in-the-loop
Visual Analytics (VA) system, designed for metadata-free data slice finding.
Our system identifies interpretable data slices that involve common model
behaviors and visualizes these patterns through an Attribution Mosaic design.
Our interactive interface provides straightforward guidance for users to
detect, interpret, and annotate predominant model issues, such as spurious
correlations (model biases) and mislabeled data, with minimal effort.
Additionally, it employs a cutting-edge model regularization technique to
mitigate the detected issues and enhance the model's performance. The efficacy
of AttributionScanner is demonstrated through use cases involving two benchmark
datasets, with qualitative and quantitative evaluations showcasing its
substantial effectiveness in vision model validation, ultimately leading to
more reliable and accurate models.