SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark
CoRR (2024)
Abstract
The paper introduces SceMQA, a novel benchmark for scientific multimodal
question answering at the college entrance level. It addresses a critical
educational phase often overlooked in existing benchmarks, spanning high school
to pre-college levels. SceMQA focuses on core science subjects including
Mathematics, Physics, Chemistry, and Biology. It features a blend of
multiple-choice and free-response formats, ensuring a comprehensive evaluation
of AI models' abilities. Additionally, our benchmark provides specific
knowledge points for each problem and detailed explanations for each answer.
SceMQA also uniquely presents problems with identical contexts but varied
questions to facilitate a more thorough and accurate assessment of reasoning
capabilities. In our experiments, we evaluate both open-source and closed-source
state-of-the-art Multimodal Large Language Models (MLLMs) across various
experimental settings. The results show that further research and development
are needed to build more capable MLLMs, as highlighted by the strongest models
achieving only 50% accuracy. Our benchmark and analysis will be
available at https://scemqa.github.io/