ITBench: Evaluating AI Agents Across Diverse Real-World IT Automation Tasks
Saurabh Jha, Rohan Arora, Yuji Watanabe, Takumi Yanagawa, Yinfang Chen, Jackson Clark, Bhavya, Mudit Verma, Harshit Kumar, Hirokuni Kitahara, Noah Zheutlin, Saki Takano, Divya Pathak, Felix George, Xinbo Wu, Bekir O. Turkkan, Gerard Vanloo, Michael Nidd, Ting Dai, Oishik Chatterjee, Pranjal Gupta, Suranjana Samanta, Pooja Aggarwal, Rong Lee, Pavankumar Murali, Jae-wook Ahn, Debanjana Kar, Ameet Rahane, Carlos Fonseca, Amit Paradkar, Yu Deng, Pratibha Moogi, Prateeti Mohapatra, Naoki Abe, Chandrasekhar Narayanaswami, Tianyin Xu, Lav R. Varshney, Ruchi Mahindru, Anca Sailer, Laura Shwartz, Daby Sow, Nicholas C. M. Fuller, Ruchir Puri
CoRR (2025)
