GenLens: A Systematic Evaluation of Visual GenAI Model Outputs
CoRR (2024)
Abstract
The rapid development of generative AI (GenAI) models in computer vision
necessitates effective evaluation methods to ensure their quality and fairness.
Existing tools primarily focus on dataset quality assurance and model
explainability, leaving a significant gap in GenAI output evaluation during
model development. Current practices often depend on developers' subjective
visual assessments, which may lack scalability and generalizability. This paper
bridges this gap by conducting a formative study with GenAI model developers in
an industrial setting. Our findings led to the development of GenLens, a visual
analytic interface designed for the systematic evaluation of GenAI model
outputs during the early stages of model development. GenLens offers a
quantifiable approach for overviewing and annotating failure cases, customizing
issue tags and classifications, and aggregating annotations from multiple users
to enhance collaboration. A user study with model developers reveals that
GenLens effectively enhances their workflow, evidenced by high satisfaction
rates and a strong intent to integrate it into their practices. This research
underscores the importance of robust early-stage evaluation tools in GenAI
development, contributing to the advancement of fair and high-quality GenAI
models.