
SWE-PolyBench: A Multi-Language Benchmark for Repository Level Evaluation of Coding Agents

Muhammad Shihab Rashid, Christian Bock, Yuan Zhuang, Alexander Buchholz, Tim Esler, Simon Valentin, Luca Franceschi, Martin Wistuba, Prabhu Teja Sivaprasad, Woo Jung Kim, Anoop Deoras, Giovanni Zappella, Laurent Callot

arXiv (2025)
