{"id":132801,"url":"https://github.com/euniai/awesome-code-agents","name":"awesome-code-agents","description":"A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding — they're redefining how software changes the world.","projects_count":471,"last_synced_at":"2026-06-16T19:00:21.094Z","repository":{"id":318518711,"uuid":"1070840924","full_name":"EuniAI/awesome-code-agents","owner":"EuniAI","description":"A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding — they're redefining how software changes the world.","archived":false,"fork":false,"pushed_at":"2026-06-06T15:12:19.000Z","size":1301,"stargazers_count":102,"open_issues_count":90,"forks_count":7,"subscribers_count":5,"default_branch":"main","last_synced_at":"2026-06-06T17:09:12.335Z","etag":null,"topics":["ai-agents","awesome-list","code-agents","large-language-models","llm","software-engineering"],"latest_commit_sha":null,"homepage":"https://euniai.github.io/awesome-code-agents/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EuniAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-06T14:03:18.000Z","updated_at":"2026-06-06T15:12:22.000Z","dependencies_parsed_at":"2025-10-07T18:17:19.327Z","dependency_job_id":"5913715d-7de6-4f05-80de-86b916fd815a","html_url":"https://github.com/EuniAI/awesome-code-agents","commit_stats":null,"previous_names":["euniai/awe
some-code-agents"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/EuniAI/awesome-code-agents","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EuniAI%2Fawesome-code-agents","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EuniAI%2Fawesome-code-agents/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EuniAI%2Fawesome-code-agents/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EuniAI%2Fawesome-code-agents/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EuniAI","download_url":"https://codeload.github.com/EuniAI/awesome-code-agents/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EuniAI%2Fawesome-code-agents/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34419350,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"created_at":"2026-05-31T03:30:13.900Z","updated_at":"2026-06-16T19:00:21.095Z","primary_language":null,"list_of_lists":false,"displayable":true,"categories":["🙏 Acknowledgements","📚 Papers","🌟 Star History"],"sub_categories":["🧪 Frontier Labs and Teams","🔧 Software General Engineering Agents","🌐 Website 
Engineering Agents","🔬 Research Engineering Agents"],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"./docs/static/images/icon.jpg\" alt=\"Prometheus Logo\" width=\"160\"\u003e\n  \u003ch1 style=\"border-bottom: none;\"\u003e\n    \u003cb\u003e\u003ca href=\"https://euni.ai/\" target=\"_blank\"\u003e🤖 Awesome Code Agents\u003c/a\u003e\u003c/b\u003e\u003cbr\u003e\n    Towards AI-Powered Software 3.0\n  \u003c/h1\u003e\n\n  \u003cp align=\"center\"\u003e\n    \u003cstrong\u003eA curated list of research papers on autonomous code agents.\u003c/strong\u003e\u003cbr\u003e\n    \u003cem\u003eBeyond coding — these agents are redefining how software changes the world.\u003c/em\u003e\n  \u003c/p\u003e\n\n  \u003c!-- 🌍 Project Links --\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://euni.ai/\"\u003e\u003cb\u003eWebsite\u003c/b\u003e\u003c/a\u003e •\n    \u003ca href=\"https://x.com/Euni_AI\"\u003e\u003cb\u003eX/Twitter\u003c/b\u003e\u003c/a\u003e •\n    \u003ca href=\"https://www.linkedin.com/company/euni-ai/\"\u003e\u003cb\u003eLinkedIn\u003c/b\u003e\u003c/a\u003e •\n    \u003ca href=\"https://discord.gg/jDG4wqkKZj\"\u003e\u003cb\u003eDiscord\u003c/b\u003e\u003c/a\u003e •\n    \u003ca href=\"https://www.reddit.com/r/EuniAI\"\u003e\u003cb\u003eReddit\u003c/b\u003e\u003c/a\u003e •\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents\"\u003e\u003cb\u003eGitHub\u003c/b\u003e\u003c/a\u003e\n  \u003c/p\u003e\n\n  \u003c!-- Badges --\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents/stargazers\"\u003e\n      \u003cimg src=\"https://img.shields.io/github/stars/EuniAI/awesome-code-agents?style=for-the-badge\u0026color=yellow\" alt=\"Stars\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents/forks\"\u003e\n      \u003cimg src=\"https://img.shields.io/github/forks/EuniAI/awesome-code-agents?style=for-the-badge\u0026color=blue\" 
alt=\"Forks\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://opensource.org/licenses/Apache-2.0\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/license-Apache--2.0-green?style=for-the-badge\" alt=\"License: Apache 2.0\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://www.arxiv.org/abs/2507.19942\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/Paper-arXiv-red?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white\" alt=\"arXiv Paper\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents/graphs/contributors\"\u003e\n      \u003cimg src=\"https://img.shields.io/github/contributors/EuniAI/awesome-code-agents?style=for-the-badge\u0026color=orange\" alt=\"Contributors\"\u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://awesome.re\"\u003e\n      \u003cimg src=\"https://awesome.re/badge.svg\" alt=\"Awesome\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"./docs/static/badges/papers.svg\"\u003e\n      \u003cimg src=\"./docs/static/badges/papers.svg\" alt=\"Papers Count\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents\" target=\"_blank\"\u003e\n      \u003cimg src=\"https://img.shields.io/github/commit-activity/m/EuniAI/awesome-code-agents?label=Commits\u0026color=brightgreen\u0026style=flat\" alt=\"Commit Activity\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents/forks\" target=\"_blank\"\u003e\n      \u003cimg src=\"https://img.shields.io/github/forks/EuniAI/awesome-code-agents.svg?style=flat\u0026color=blue\u0026label=Forks\" alt=\"Forks\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents/issues\" target=\"_blank\"\u003e\n      \u003cimg alt=\"Issues Closed\" 
src=\"https://img.shields.io/github/issues-search?query=repo%3AEuniAI%2Fawesome-code-agents%20is%3Aclosed\u0026label=Issues%20Closed\u0026labelColor=%237d89b0\u0026color=%235d6b98\u0026style=flat\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/EuniAI/awesome-code-agents/discussions\" target=\"_blank\"\u003e\n      \u003cimg alt=\"Discussion Posts\" src=\"https://img.shields.io/github/discussions/EuniAI/awesome-code-agents?label=Discussions\u0026labelColor=%239b8afb\u0026color=%237a5af8\u0026style=flat\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://img.shields.io/badge/PRs-Welcome-red\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/PRs-Welcome-red\" alt=\"PRs Welcome\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://img.shields.io/github/last-commit/EuniAI/awesome-code-agents?color=green\"\u003e\n      \u003cimg src=\"https://img.shields.io/github/last-commit/EuniAI/awesome-code-agents?color=green\" alt=\"Last Commit\"\u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n\n  \u003cp align=\"center\"\u003e\n    \u003cimg src=\"docs/static/images/main_v1-min.png\" alt=\"Awesome Code Agents\" width=\"100%\" style=\"border-radius: 15px; box-shadow: 0 4px 24px rgba(0,0,0,.1); margin: 5px 0;\"\u003e\n  \u003c/p\u003e\n\n  *Photo Credit: [Gemini-Nano-Banana-Pro🍌](https://deepmind.google/models/gemini-image/pro/)*.\n\n\u003c/div\u003e\n\n\u003c!-- Optional teaser --\u003e\n\u003c!--\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/teaser.png\" width=\"520px\"/\u003e\n\u003c/p\u003e\n--\u003e\n\u003c!-- \u003cp align=\"center\"\u003e\n  A curated list of \u003cb\u003eproducts, benchmarks, and research papers\u003c/b\u003e on \u003cb\u003eCode Agents\u003c/b\u003e.\n\u003c/p\u003e --\u003e\n\n---\n\n## Quick Navigation\n\n\u003c!-- START PAPERS SUMMARY --\u003e\n🔥 **We are actively tracking the frontier research of code agents.**\u003cbr\u003e\n🧹 *We periodically curate our collection, retaining only published papers and 
interesting arXiv preprints from the last six months.*\u003cbr\u003e\n📚 *Currently collected:* **`499` papers** — *(Last update: 2026-06-07)*\n\u003c!-- END PAPERS SUMMARY --\u003e\n\n\u003c!-- - [🚀 Products \u0026 Tools](#-products--tools)\n  * [🧰 IDEs \u0026 Editors](#-ides--editors)\n  * [💻 CLI Tools](#-cli-tools)\n  * [🧩 Extensions \u0026 Plugins](#-extensions--plugins)\n  * [🌐 Web-Based Development Platforms](#-web-based-development-platforms)\n  * [🖥 Desktop Applications](#-desktop-applicatio)\n  * [📱 Mobile Tools](#-mobile-tools) --\u003e\n- [📚 Papers](#-papers)\n  * [🌍 Foundation Models](#-foundation-models)\n  * [🔧 Software Engineering Agents](#-software-general-engineering-agents)\n    + [🛠 Issue Resolution](#-issue-resolution)\n    + [🖥️ Terminal Operating](#-terminal-operating)\n    + [🧑‍💻 Code Generation](#-code-generation)\n    + [🏗 Environment Building](#-environment-building)\n    + [🔁 Issue Reproduction](#-issue-reproduction)\n    + [🎯 Issue Localization](#-issue-localization)\n    + [❓ Question Answering](#-question-answering)\n    + [🔍 Pull Request Review](#-pull-request-review)\n    + [✨ Feature Development](#-feature-development)\n    + [🔄 Git Management](#-git-management)\n    + [⚡ Performance Optimization](#-performance-optimization)\n    + [🧪 Test Generation](#-test-generation)\n    + [🚚 Code Migration](#-code-migration)\n    + [🧹 Code Refactoring](#-code-refactoring)\n  * [🔒 Software Security Agents](#-software-security-engineering-agents)\n  * [🖥️ System Agents](#-system-engineering-agents)\n  * [🗃️ Database Agents](#-database-engineering-agents)\n  * [⚙️ Hardware Agents](#-hardware-engineering-agents)\n  * [🌐 Website Agents](#-website-engineering-agents)\n    + [🌐 Front-End UI Generation](#front-end-ui-generation)\n    + [🖥️ Backend Service Generation](#-backend-service-generation)\n    + [🧭 Code-Executing Web Agents](#-code-executing-web-agents)\n  * [🔬 Research Agents](#-research-engineering-agents)\n    + [👩‍💻 Machine Learning 
Engineering](#-machine-learning-engineering)\n    + [🤖 Automated Data Science](#-automated-data-science)\n    + [📊 Agentic Visualization](#-agentic-visualization)\n  * [🎨 Visual Agents](#-visual-engineering-agents)\n    + [🌀 Animation Generation](#-animation-generation)\n    + [🖼️ SVG Generation](#-svg-generation)\n  * [🎮 Game Agents](#-game-engineering-agents)\n    + [🎮 Game Generation](#-game-generation)\n    + [🕹️ Code-Executing Game Agents](#code-executing-game-agents)\n  * [🧊 3D Agents](#-3d-engineering-agents)\n    + [🧊 3D Object Design](#-3d-object-design)\n    + [🏞 Scene Generation](#-scene-generation)\n  * [🤖 Embodied Agents](#-embodied-engineering-agents)\n    + [🤖 Code-Executing Embodied Agents](#-code-executing-embodied-agents)\n- [🗺️ Research Landscape](#-research-landscape)\n- [🤝 Contributing](#-contributing)\n- [🌟 Star History](#-star-history)\n- [🙏 Acknowledgements](#-acknowledgements)\n\n   \u003c!-- tags: dataset \u0026 benchmark, survey, position paper, empirical study --\u003e\n\n---\n\n\u003c!-- ## 🚀 Products \u0026 Tools\n\u003e Leading agentic systems, frameworks, and platforms for automated software development. --\u003e\n\n\u003c!-- START PAPERS:products_closed --\u003e\n\n\u003c!-- END PAPERS:products_closed --\u003e\n\u003c!-- --- --\u003e\n\n## 📚 Papers\n\u003e Explore foundational, recent, and influential works advancing the code agent research landscape.\n\n### 🌍 Foundation Models\n\u003e Large Language Models designed or extended for advanced software engineering capabilities.\n\n\u003c!-- START PAPERS:foundation_models --\u003e\n- **CWM: An Open-Weights LLM for Research on Code Generation with World Models.**  \n  _FAIR CodeGen team, Jade Copet, Quentin Carbonneaux, Gal Cohen, Jonas Gehring, Jacob Kahn, Jannik Kossen, Felix Kreuk, Emily McMilin, Michel Meyer, et al._ arXiv 2025/09.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.02387) [![GitHub Stars](https://img.shields.io/github/stars/facebookresearch/cwm?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/facebookresearch/cwm)\n\n- **Introducing: Devstral 2 and Mistral Vibe CLI.**  \n  _Mistral._ 2025/12.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://mistral.ai/news/devstral-2-vibe-cli)\n\n- **Qwen3-Coder: Agentic Coding in the World.**  \n  _QwenTeam._ 2025/07.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://qwen.ai/blog?id=qwen3-coder) [![GitHub Stars](https://img.shields.io/github/stars/QwenLM/Qwen3-Coder?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/QwenLM/Qwen3-Coder)\n\n- **Kimi K2: Open Agentic Intelligence.**  \n  _Kimi Team: Yifan Bai, Yiping Bao, Guanduo Chen, Jiahao Chen, Ningxin Chen, Ruijue Chen, Yanru Chen, Yuankun Chen, Yutian Chen, Zhuofu Chen, et al._ arXiv 2025/07.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.20534)\n\u003c!-- END PAPERS:foundation_models --\u003e\n\n### 🔧 Software General Engineering Agents\n\n#### 🛠 Issue Resolution\n\u003e Automated bug fixing, patch generation, repair techniques.\n\n\u003c!-- START PAPERS:issue_resolution --\u003e\n- **Empowering Autonomous Debugging Agents with Efficient Dynamic Analysis.**  \n  _Jiahong Xiang, Xiaoyang Xu, Xiaopan Chu, Hongliang Tian, Yuqun Zhang._ arXiv 2026/04.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2604.24212) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Position: Future Research and Challenges Remain Towards AI for Software Engineering.**  \n  _Alex Gu, Naman Jain, Wen-Ding Li, Manish Shetty, Kevin Ellis, Koushik Sen, Armando Solar-Lezama._ ICML 2025 Position Paper Track.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://openreview.net/forum?id=RuLsq4LSZK) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge)\n\n- **How can we assess human-agent interactions? Case studies in software agent design.**  \n  _Valerie Chen, Rohit Malhotra, Xingyao Wang, Juan Michelini, Xuhui Zhou, Aditya Bharat Soni, Hoang H. Tran, Calvin Smith, Ameet Talwalkar, Graham Neubig._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.09801) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Assessing and Advancing Benchmarks for Evaluating Large Language Models in Software Engineering Tasks.**  \n  _Xing Hu, Feifei Niu, Junkai Chen, Xin Zhou, Junwei Zhang, Junda He, Xin Xia, David Lo._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.08903) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **A Comprehensive Empirical Evaluation of Agent Frameworks on Code-centric Software Engineering Tasks.**  \n  _Zhuowen Yin, Cuifeng Gao, Chunsong Fan, Wenzhang Yang, Yinxing Xue, Lijun Zhang._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.00872) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Large Language Model-Based Agents for Software Engineering: A Survey.**  \n  _Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, Yiling Lou._ arXiv 2024.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2409.02977) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge)\n\n- **A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System.**  \n  _Jiale Guo, Suizhi Huang, Mei Li, Dong Huang, Xingsheng Chen, Regina Zhang, Zhijiang Guo, Han Yu, Siu-Ming Yiu, Christian Jensen, et al._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.09721) [![GitHub Stars](https://img.shields.io/github/stars/lisaGuojl/LLM-Agent-SE-Survey?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/lisaGuojl/LLM-Agent-SE-Survey) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge)\n\n- **Agents in software engineering: survey, landscape, and vision.**  \n  _Yanlin Wang, Wanjun Zhong, Yanxian Huang, Ensheng Shi, Min Yang, Jiachi Chen, Hui Li, Yuchi Ma, Qianxiang Wang, Zibin Zheng._ Automated Software Engineering, Springer, 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://link.springer.com/article/10.1007/s10515-025-00544-2) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge)\n\n- **Agentic Software Engineering: Foundational Pillars and a Research Roadmap.**  \n  _Ahmed E. Hassan, Hao Li, Dayi Lin, Bram Adams, Tse-Hsun Chen, Yutaro Kashiwa, Dong Qiu._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.06216) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge)\n\n- **How Does LLM Reasoning Work for Code? A Survey and a Call to Action.**  \n  _Ira Ceka, Saurabh Pujar, Irene Manotas, Gail Kaiser, Baishakhi Ray, Shyam Ramji._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.13932) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge)\n\n- **SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering.**  \n  _Xuehang Guo, Xingyao Wang, Yangyi Chen, Sha Li, Chi Han, Manling Li, Heng Ji._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.06994) [![GitHub Stars](https://img.shields.io/github/stars/xhguo7/SyncMind?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/xhguo7/SyncMind) [![Website](https://img.shields.io/website?url=https://xhguo7.github.io/SyncMind/\u0026up_message=SYNCMIND\u0026up_color=blue\u0026down_message=SYNCMIND\u0026down_color=blue\u0026style=for-the-badge)](https://xhguo7.github.io/SyncMind/) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **SWE-Bench+: Enhanced Coding Benchmark for LLMs.**  \n  _Reem Aleithan, Haoran Xue, Mohammad Mahdi Mohajer, Elijah Nnorom, Gias Uddin, Song Wang._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://openreview.net/forum?id=R40rS2afQ3) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Prometheus: Unified Knowledge Graphs for Issue Resolution in Multilingual Codebases.**  \n  _Zimin Chen, Yue Pan, Siyu Lu, Jiayi Xu, Claire Le Goues, Martin Monperrus, He Ye._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.19942) [![GitHub Stars](https://img.shields.io/github/stars/EuniAI/Prometheus?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/EuniAI/Prometheus) [![Website](https://img.shields.io/website?url=https://euni.ai/\u0026up_message=EUNI.AI\u0026up_color=blue\u0026down_message=EUNI.AI\u0026down_color=blue\u0026style=for-the-badge)](https://euni.ai/)\n\n- **SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving.**  \n  _Chaofan Tao, Jierun Chen, Yuxin Jiang, Kaiqi Kou, Shaowei Wang, Ruoyu Wang, Xiaohui Li, Sidi Yang, Yiming Du, Jianbo Dai, et al._ arXiv 2026/01.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2601.01426) [![GitHub Stars](https://img.shields.io/github/stars/SWE-Lego/SWE-Lego?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/SWE-Lego/SWE-Lego)\n\n- **Are \"Solved Issues\" in SWE-bench Really Solved Correctly? An Empirical Study.**  \n  _You Wang, Michael Pradel, Zhongxin Liu._ ICSE 2026.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.15223)\n\n- **Unified Software Engineering Agent as AI Software Engineer.**  \n  _Leonhard Applis, Yuntong Zhang, Shanchao Liang, Nan Jiang, Lin Tan, Abhik Roychoudhury._ ICSE 2026.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.14683)\n\n- **Beyond Final Code: A Process-Oriented Error Analysis of Software Development Agents in Real-World GitHub Scenarios.**  \n  _Zhi Chen, Wei Ma, Lingxiao Jiang._ ICSE 2026.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.12374) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-808080?style=for-the-badge)\n\n- **LLM-based Agents for Automated Bug Fixing: How Far Are We?**  \n  _Xiangxin Meng, Zexiong Ma, Pengfei Gao, Chao Peng._ ICSE 2026.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2411.10213) [![GitHub Stars](https://img.shields.io/github/stars/ResearchOpenRepos/bug_fixing_agent_empirical_study?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/ResearchOpenRepos/bug_fixing_agent_empirical_study) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-808080?style=for-the-badge)\n\n- **Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem.**  \n  _Weixun Wang, XiaoXiao Xu, Wanhe An, Fangwen Dai, Wei Gao, Yancheng He, Ju Huang, Qiang Ji, Hanqi Jin, Xiaoyang Li, et al._ arXiv 2025/12.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2512.24873)\n\n- **Toward Training Superintelligent Software Agents through Self-Play SWE-RL.**  \n  _Yuxiang Wei, Zhiqing Sun, Emily McMilin, Jonas Gehring, David Zhang, Gabriel Synnaeve, Daniel Fried, Lingming Zhang, Sida Wang._ arXiv 2025/12.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2512.18552)\n\n- **Confucius Code Agent: Scalable Agent Scaffolding for Real-World Codebases.**  \n  _Zhaodong Wang, Zhenting Qi, Sherman Wong, Nathan Hu, Samuel Lin, Jun Ge, Erwin Gao, Wenlin Chen, Yilun Du, Minlan Yu, et al._ arXiv 2025/12.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2512.10398)\n\n- **Is Vibe Coding Safe? Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks.**  \n  _Songwen Zhao, Danqing Wang, Kexun Zhang, Jiaxuan Luo, Zhuo Li, Lei Li._ arXiv 2025/12.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2512.03262)\n\n- **Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?**  \n  _Chunqiu Steven Xia, Zhe Wang, Yan Yang, Yuxiang Wei, Lingming Zhang._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.13646) [![GitHub Stars](https://img.shields.io/github/stars/OpenAutoCoder/live-swe-agent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/OpenAutoCoder/live-swe-agent)\n\n- **The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents.**  \n  _Xingyao Wang, Simon Rosenberg, Juan Michelini, Calvin Smith, Hoang Tran, Engel Nyst, Rohit Malhotra, Xuhui Zhou, Valerie Chen, Robert Brennan, et al._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.03690) [![GitHub Stars](https://img.shields.io/github/stars/OpenHands/software-agent-sdk?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/OpenHands/software-agent-sdk)\n\n- **CodeClash: Benchmarking Goal-Oriented Software Engineering.**  \n  _John Yang, Kilian Lieret, Joyce Yang, Carlos E. Jimenez, Ofir Press, Ludwig Schmidt, Diyi Yang._ arXiv 2025/11.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.00839) [![Website](https://img.shields.io/website?url=https://codeclash.ai/\u0026up_message=CODECLASH.AI\u0026up_color=blue\u0026down_message=CODECLASH.AI\u0026down_color=blue\u0026style=for-the-badge)](https://codeclash.ai/)\n\n- **Introducing cline-bench: A Real-World, Open Source Benchmark for Agentic Coding.**  \n  _Cline._ 2025/11.  \n  [![GitHub Stars](https://img.shields.io/github/stars/cline/cline-bench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/cline/cline-bench) [![Website](https://img.shields.io/website?url=https://cline.bot/blog/cline-bench-initiative\u0026up_message=CLINE-BENCH-INITIATIVE\u0026up_color=blue\u0026down_message=CLINE-BENCH-INITIATIVE\u0026down_color=blue\u0026style=for-the-badge)](https://cline.bot/blog/cline-bench-initiative)\n\n- **InfCode: Adversarial Iterative Refinement of Tests and Patches for Reliable Software Issue Resolution.**  \n  _KeFan Li, Mengfei Wang, Hengzhi Zhang, Zhichao Li, Yuan Yuan, Mu Li, Xiang Gao, Hailong Sun, Chunming Hu, Weifeng Lv._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.16004)\n\n- **Agent READMEs: An Empirical Study of Context Files for Agentic Coding.**  \n  _Worawalan Chatlatanagulchai, Hao Li, Yutaro Kashiwa, Brittany Reid, Kundjanasith Thonglek, Pattara Leelaprute, Arnon Rungsawang, Bundit Manaskasemsak, Bram Adams, Ahmed E. Hassan, et al._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.12884)\n\n- **Understanding Code Agent Behaviour: An Empirical Study of Success and Failure Trajectories.**  \n  _Oorja Majgaonkar, Zhiwei Fei, Xiang Li, Federica Sarro, He Ye._ arXiv 2025/11.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.00197) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-808080?style=for-the-badge)\n\n- **SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models.**  \n  _Jingxuan Xu, Ken Deng, Weihao Li, Songwei Yu, Huaixi Tang, Haoyang Huang, Zhiyi Lai, Zizheng Zhan, Yanan Wu, Chenchen Zhang, et al._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.05459)\n\n- **More with Less: An Empirical Study of Turn-Control Strategies for Efficient Coding Agents.**  \n  _Pengfei Gao, Chao Peng._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.16786) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-808080?style=for-the-badge)\n\n- **SWE-Sharp-Bench: A Reproducible Benchmark for C# Software Engineering Tasks.**  \n  _Sanket Mhatre, Yasharth Bajpai, Sumit Gulwani, Emerson Murphy-Hill, Gustavo Soares._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.02352) [![GitHub Stars](https://img.shields.io/github/stars/microsoft/prose?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/microsoft/prose/tree/main/misc/SWE-Sharp-Bench)\n\n- **U2F: Encouraging SWE-Agent to Seize Novelty without Losing Feasibility.**  \n  _Wencheng Ye, Yan Liu._ arXiv 2025/11.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.03517)\n\n- **Programming with Pixels: Can Computer-Use Agents do Software Engineering?**  \n  _Pranjal Aggarwal, Sean Welleck._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.18525)\n\n- **Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents.**  \n  _Yueqi Song, Ketan Ramaneti, Zaid Sheikh, Ziru Chen, Boyu Gou, Tianbao Xie, Yiheng Xu, Danyang Zhang, Apurva Gandhi, Fan Yang, et al._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.24702) [![GitHub Stars](https://img.shields.io/github/stars/neulab/agent-data-protocol?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/neulab/agent-data-protocol) [![Website](https://img.shields.io/website?url=https://www.agentdataprotocol.com/\u0026up_message=AGENTDATAPROTOCOL\u0026up_color=blue\u0026down_message=AGENTDATAPROTOCOL\u0026down_color=blue\u0026style=for-the-badge)](https://www.agentdataprotocol.com/)\n\n- **Gistify! Codebase-Level Understanding via Runtime Execution.**  \n  _Hyunji Lee, Minseon Kim, Chinmay Singh, Matheus Pereira, Atharv Sonwane, Isadora White, Elias Stengel-Eskin, Mohit Bansal, Zhengyan Shi, Alessandro Sordoni, et al._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.26790)\n\n- **Abstain and Validate: A Dual-LLM Policy for Reducing Noise in Agentic Program Repair.**  \n  _José Cambronero, Michele Tufano, Sherry Shi, Renyao Wei, Grant Uy, Runxiang Cheng, Chin-Jung Liu, Shiying Pan, Satish Chandra, Pat Rondon._ arXiv 2025/10.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.03217)\n\n- **REFINE: Enhancing Program Repair Agents through Context-Aware Patch Refinement.**  \n  _Anvith Pabba, Simin Chen, Alex Mathai, Anindya Chakraborty, Baishakhi Ray._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.03588)\n\n- **Process-Level Trajectory Evaluation for Environment Configuration in Software Engineering Agents.**  \n  _Jiayi Kuang, Yinghui Li, Xin Zhang, Yangning Li, Di Yin, Xing Sun, Ying Shen, Philip S. Yu._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.25694)\n\n- **BugPilot: Complex Bug Generation for Efficient Learning of SWE Skills.**  \n  _Atharv Sonwane, Isadora White, Hyunji Lee, Matheus Pereira, Lucas Caccia, Minseon Kim, Zhengyan Shi, Chinmay Singh, Alessandro Sordoni, Marc-Alexandre Côté, et al._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.19898)\n\n- **When “Correct” Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?**  \n  _Yibo Peng, James Song, Lei Li, Xinyu Yang, Mihai Christodorescu, Ravi Mangal, Corina Pasareanu, Haizhong Zheng, Beidi Chen._ arXiv 2025/10.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.17862) [![GitHub Stars](https://img.shields.io/github/stars/Infini-AI-Lab/FCV?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/Infini-AI-Lab/FCV) [![Website](https://img.shields.io/website?url=https://infini-ai-lab.github.io/FCV/\u0026up_message=FCV\u0026up_color=blue\u0026down_message=FCV\u0026down_color=blue\u0026style=for-the-badge)](https://infini-ai-lab.github.io/FCV/)\n\n- **TDFlow: Agentic Workflows for Test Driven Software Engineering.**  \n  _Kevin Han, Siddharth Maddikayala, Tim Knappe, Om Patel, Austen Liao, Amir Barati Farimani._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.23761)\n\n- **Enhancing repository-level software repair via repository-aware knowledge graphs.**  \n  _Boyang Yang, Jiadong Ren, Shunfu Jin, Yang Liu, Feng Liu, Bach Le, Haoye Tian._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.21710) [![GitHub Stars](https://img.shields.io/github/stars/GLEAM-Lab/KGCompass?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/GLEAM-Lab/KGCompass)\n\n- **Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development.**  \n  _Xin Peng, Chong Wang._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.07967) ![Position Paper](https://img.shields.io/badge/Position_Paper-808080?style=for-the-badge)\n\n- **SIADAFIX: issue description response for adaptive program repair.**  \n  _Xin Cao, Nan Yu._ arXiv 2025/10.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://www.arxiv.org/abs/2510.16059) [![GitHub Stars](https://img.shields.io/github/stars/liauto-siada/siada-cli?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/liauto-siada/siada-cli)\n\n- **Kimi-Dev: Agentless Training as Skill Prior for SWE-Agents.**  \n  _Zonghan Yang, Shengjie Wang, Kelin Fu, Wenyang He, Weimin Xiong, Yibo Liu, Yibo Miao, Bofei Gao, Yejie Wang, Yingwei Ma, et al._ arXiv 2025/09.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.23045)\n\n- **An Empirical Study on Failures in Automated Issue Solving.**  \n  _Simiao Liu, Fang Liu, Liehao Li, Xin Tan, Yinghao Zhu, Xiaoli Lian, Li Zhang._ arXiv 2025/09.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.13941) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-808080?style=for-the-badge)\n\n- **GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging.**  \n  _Ziyi Ni, Huacan Wang, Shuo Zhang, Shuo Lu, Ziyang He, Wang You, Zhenheng Tang, Yuntao Du, Bill Sun, Hongzhang Liu, et al._ arXiv 2025/09.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.18993) [![GitHub Stars](https://img.shields.io/github/stars/QuantaAlpha/GitTaskBench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/QuantaAlpha/GitTaskBench) [![Website](https://img.shields.io/website?url=https://gittaskbench.github.io/\u0026up_message=GITTASKBENCH\u0026up_color=blue\u0026down_message=GITTASKBENCH\u0026down_color=blue\u0026style=for-the-badge)](https://gittaskbench.github.io/)\n\n- **RepoForge: Training a SOTA Fast-thinking SWE Agent with an End-to-End Data Curation Pipeline Synergizing SFT and RL at Scale.**  \n  _Zhilong Chen, Chengzong Zhao, Boyuan Chen, Dayi Lin, Yihao Chen, Arthur Leung, Gopi Krishnan Rajbahadur, Gustavo A. Oliva, Haoxiang Zhang, Aaditya Bhatia, et al._ arXiv 2025/08.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.01550)\n\n- **Understanding Software Engineering Agents Through the Lens of Traceability: An Empirical Study.**  \n  _Ira Ceka, Saurabh Pujar, Shyam Ramji, Luca Buratti, Gail Kaiser, Baishakhi Ray._ arXiv 2025/06.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.08311)\n\n- **Is Your Automated Software Engineer Trustworthy?**  \n  _Noble Saji Mathews, Meiyappan Nagappan._ arXiv 2025/06.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.17812) [![GitHub Stars](https://img.shields.io/github/stars/uw-swag/BouncerBench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/uw-swag/BouncerBench) [![Website](https://img.shields.io/website?url=https://bouncerbench.com/\u0026up_message=BOUNCERBENCH\u0026up_color=blue\u0026down_message=BOUNCERBENCH\u0026down_color=blue\u0026style=for-the-badge)](https://bouncerbench.com/)\n\n- **Interactive Agents to Overcome Ambiguity in Software Engineering.**  \n  _Sanidhya Vijayvargiya, Xuhui Zhou, Akhila Yerukola, Maarten Sap, Graham Neubig._ arXiv 2025/02.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.13069)\n\n- **SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?**  \n  _Xiang Deng, Jeff Da, Edwin Pan, Yannis Yiming He, Charles Ide, Kanak Garg, Niklas Lauffer, Andrew Park, Nitin Pasari, Chetan Rane, et al._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.16941) [![GitHub Stars](https://img.shields.io/github/stars/scaleapi/SWE-bench_Pro-os?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/scaleapi/SWE-bench_Pro-os) [![Website](https://img.shields.io/website?url=https://scale.com/leaderboard/swe_bench_pro_public\u0026up_message=SWE-BENCH-PRO-PUBLIC\u0026up_color=blue\u0026down_message=SWE-BENCH-PRO-PUBLIC\u0026down_color=blue\u0026style=for-the-badge)](https://scale.com/leaderboard/swe_bench_pro_public)\n\n- **SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents.**  \n  _Muhammad Shihab Rashid, Christian Bock, Yuan Zhuang, Alexander Buchholz, Tim Esler, Simon Valentin, Luca Franceschi, Martin Wistuba, Prabhu Teja Sivaprasad, Woo Jung Kim, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.08703) [![GitHub Stars](https://img.shields.io/github/stars/amazon-science/SWE-PolyBench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/amazon-science/SWE-PolyBench) [![Website](https://img.shields.io/website?url=https://amazon-science.github.io/SWE-PolyBench/\u0026up_message=SWE-POLYBENCH\u0026up_color=blue\u0026down_message=SWE-POLYBENCH\u0026down_color=blue\u0026style=for-the-badge)](https://amazon-science.github.io/SWE-PolyBench/)\n\n- **Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving.**  \n  _Daoguang Zan, Zhirong Huang, Wei Liu, Hanwu Chen, Linhao Zhang, Shulin Xin, Lu Chen, Qi Liu, Xiaojian Zhong, Aoyan Li, et al._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.02605) [![GitHub Stars](https://img.shields.io/github/stars/multi-swe-bench/multi-swe-bench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/multi-swe-bench/multi-swe-bench) [![Website](https://img.shields.io/website?url=https://multi-swe-bench.github.io/\u0026up_message=MULTI-SWE-BENCH\u0026up_color=blue\u0026down_message=MULTI-SWE-BENCH\u0026down_color=blue\u0026style=for-the-badge)](https://multi-swe-bench.github.io/)\n\n- **SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents.**  \n  _Ibragim Badertdinov, Alexander Golubev, Maksim Nekrashevich, Anton Shevtsov, Simon Karasik, Andrei Andriushchenko, Maria Trofimova, Daria Litvintseva, Boris Yangel._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.20411) [![Website](https://img.shields.io/website?url=https://swe-rebench.com\u0026up_message=SWE-REBENCH\u0026up_color=blue\u0026down_message=SWE-REBENCH\u0026down_color=blue\u0026style=for-the-badge)](https://swe-rebench.com)\n\n- **Trae Agent: An LLM-based Agent for Software Engineering with Test-time Scaling.**  \n  _Trae Research Team: Pengfei Gao, Zhao Tian, Xiangxin Meng, Xinchen Wang, Ruida Hu, Yuanan Xiao, Yizhou Liu, Zhao Zhang, Junjie Chen, Cuiyun Gao, et al._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.23370) [![GitHub Stars](https://img.shields.io/github/stars/bytedance/trae-agent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/bytedance/trae-agent) [![Website](https://img.shields.io/website?url=https://www.trae.ai/\u0026up_message=TRAE.AI\u0026up_color=blue\u0026down_message=TRAE.AI\u0026down_color=blue\u0026style=for-the-badge)](https://www.trae.ai/)\n\n- **ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory.**  \n  _Siru Ouyang, Jun Yan, I-Hung Hsu, Yanfei Chen, Ke Jiang, Zifeng Wang, Rujun Han, Long T. Le, Samira Daruki, Xiangru Tang, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.25140)\n\n- **EXPEREPAIR: Dual-Memory Enhanced LLM-based Repository-Level Program Repair.**  \n  _Fangwen Mu, Junjie Wang, Lin Shi, Song Wang, Shoubin Li, Qing Wang._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.10484) [![GitHub Stars](https://img.shields.io/github/stars/ExpeRepair/ExpeRepair?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/ExpeRepair/ExpeRepair)\n\n- **SWE-Exp: Experience-Driven Software Issue Resolution.**  \n  _Silin Chen, Shaoxin Lin, Xiaodong Gu, Yuling Shi, Heng Lian, Longfei Yun, Dong Chen, Weiguo Sun, Lin Cao, Qianxiang Wang._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.23361) [![GitHub Stars](https://img.shields.io/github/stars/YerbaPage/SWE-Exp?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/YerbaPage/SWE-Exp)\n\n- **Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute.**  \n  _Yingwei Ma, Yongbin Li, Yihong Dong, Xue Jiang, Rongyu Cao, Jue Chen, Fei Huang, Binhua Li._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.23803) [![GitHub Stars](https://img.shields.io/github/stars/yingweima2022/SWE-Reasoner?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/yingweima2022/SWE-Reasoner)\n\n- **AutoCodeSherpa: Symbolic Explanations in AI Coding Agents.**  \n  _Sungmin Kang, Haifeng Ruan, Abhik Roychoudhury._ arXiv 2025/07.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.22414)\n\n- **Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering.**  \n  _Guangtao Zeng, Maohao Shen, Delin Chen, Zhenting Qi, Subhro Das, Dan Gutfreund, David Cox, Gregory Wornell, Wei Lu, Zhang-Wei Hong, et al._ arXiv 2025/05.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.23604) [![GitHub Stars](https://img.shields.io/github/stars/satori-reasoning/Satori-SWE?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/satori-reasoning/Satori-SWE) [![Website](https://img.shields.io/website?url=https://satori-reasoning.github.io/\u0026up_message=SATORI-REASONING\u0026up_color=blue\u0026down_message=SATORI-REASONING\u0026down_color=blue\u0026style=for-the-badge)](https://satori-reasoning.github.io/)\n\n- **CrashFixer: A crash resolution agent for the Linux kernel.**  \n  _Alex Mathai, Chenxi Huang, Suwei Ma, Jihwan Kim, Hailie Mitchell, Aleksandr Nogikh, Petros Maniatis, Franjo Ivančić, Junfeng Yang, Baishakhi Ray._ arXiv 2025/04.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.20412)\n\n- **DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal.**  \n  _Vaibhav Aggarwal, Ojasv Kamal, Abhinav Japesh, Zhijing Jin, Bernhard Schölkopf._ arXiv 2025/03.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.14269)\n\n- **Large Language Model Critics for Execution-Free Evaluation of Code Changes.**  \n  _Aashish Yadavally, Hoan Nguyen, Laurent Callot, Gauthier Guinet._ arXiv 2025/01.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2501.16655)\n\n- **debug-gym: A Text-Based Environment for Interactive Debugging.**  \n  _Xingdi Yuan, Morgane M Moss, Charbel El Feghali, Chinmay Singh, Darya Moldavskaya, Drew MacPhee, Lucas Caccia, Matheus Pereira, Minseon Kim, Alessandro Sordoni, et al._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.21557) [![GitHub Stars](https://img.shields.io/github/stars/microsoft/debug-gym?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/microsoft/debug-gym) [![Website](https://img.shields.io/website?url=https://microsoft.github.io/debug-gym/\u0026up_message=DEBUG-GYM\u0026up_color=blue\u0026down_message=DEBUG-GYM\u0026down_color=blue\u0026style=for-the-badge)](https://microsoft.github.io/debug-gym/)\n\n- **R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents.**  \n  _Naman Jain, Jaskirat Singh, Manish Shetty, Liang Zheng, Koushik Sen, Ion Stoica._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.07164) [![GitHub Stars](https://img.shields.io/github/stars/R2E-Gym/R2E-Gym?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/R2E-Gym/R2E-Gym) [![Website](https://img.shields.io/website?url=https://r2e-gym.github.io/\u0026up_message=R2E-GYM\u0026up_color=blue\u0026down_message=R2E-GYM\u0026down_color=blue\u0026style=for-the-badge)](https://r2e-gym.github.io/)\n\n- **HAFixAgent: History-Aware Automated Program Repair Agent.**  \n  _Yu Shi, Hao Li, Bram Adams, Ahmed E. Hassan._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.01047)\n\n- **SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution.**  \n  _Han Li, Yuling Shi, Shaoxin Lin, Xiaodong Gu, Heng Lian, Xin Wang, Yantao Jia, Tao Huang, Qianxiang Wang._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.23348) [![GitHub Stars](https://img.shields.io/github/stars/YerbaPage/SWE-Debate?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/YerbaPage/SWE-Debate)\n\n- **SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks.**  \n  _Lianghong Guo, Yanlin Wang, Caihua Li, Pengyu Yang, Jiachi Chen, Wei Tao, Yingtian Zou, Duyu Tang, Zibin Zheng._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.10954) [![GitHub Stars](https://img.shields.io/github/stars/DeepSoftwareAnalytics/swe-factory?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/DeepSoftwareAnalytics/swe-factory)\n\n- **SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement.**  \n  _Antonis Antoniades, Albert Örwall, Kexun Zhang, Yuxi Xie, Anirudh Goyal, William Wang._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2410.20285) [![GitHub Stars](https://img.shields.io/github/stars/aorwall/moatless-tree-search?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/aorwall/moatless-tree-search)\n\n- **SEAlign: Alignment Training for Software Engineering Agent.**  \n  _Kechi Zhang, Huangzhao Zhang, Ge Li, Jinliang You, Jia Li, Yunfei Zhao, Zhi Jin._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.18455)\n\n- **Lingxi: Repository-Level Issue Resolution Framework Enhanced by Procedural Knowledge Guided Scaling.**  \n  _Xu Yang, Jiayuan Zhou, Michael Pacheco, Wenhan Zhu, Pengfei He, Shaowei Wang, Kui Liu, Ruiqi Pan._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.11838) [![GitHub Stars](https://img.shields.io/github/stars/lingxi-agent/Lingxi?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/lingxi-agent/Lingxi)\n\n- **Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks.**  \n  _Hongyuan Tao, Ying Zhang, Zhenhao Tang, Hongen Peng, Xukun Zhu, Bingchang Liu, Yingguang Yang, Ziyin Zhang, Zhaogui Xu, Haipeng Zhang, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.16901)\n\n- **SWE-Synth: Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs.**  \n  _Minh V.T. Pham, Huy N. Phan, Hoang N. Phan, Cuong Le Chi, Tien N. Nguyen, Nghi D. Q. Bui._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.14757) [![GitHub Stars](https://img.shields.io/github/stars/FSoft-AI4Code/SWE-Synth?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/FSoft-AI4Code/SWE-Synth)\n\n- **ComBench: Compilation Error Repair Benchmark Platform.**  \n  _Anonymous._ 2025.  \n  [![GitHub Stars](https://img.shields.io/github/stars/conference-submit/ComBench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/conference-submit/ComBench)\n\n- **SWE-Bench-CL: Continual Learning for Coding Agents.**  \n  _Thomas Joshi, Shayan Chowdhury, Fatih Uysal._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.00014) [![GitHub Stars](https://img.shields.io/github/stars/thomasjoshi/agents-never-forget?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/thomasjoshi/agents-never-forget/)\n\n- **A Self-Improving Coding Agent.**  \n  _Maxime Robeyns, Martin Szummer, Laurence Aitchison._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.15228) [![GitHub Stars](https://img.shields.io/github/stars/MaximeRobeyns/self_improving_coding_agent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/MaximeRobeyns/self_improving_coding_agent)\n\n- **Agent-RLVR: Training Software Engineering Agents via Guidance and Environment Rewards.**  \n  _Jeff Da, Clinton Wang, Xiang Deng, Yuntao Ma, Nikhil Barhate, Sean Hendryx._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.11425)\n\n- **Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning.**  \n  _Alexander Golubev, Maria Trofimova, Sergei Polezhaev, Ibragim Badertdinov, Maksim Nekrashevich, Anton Shevtsov, Simon Karasik, Sergey Abramov, Andrei Andriushchenko, Filipp Fisin, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.03501)\n\n- **SWE-Mirror: Scaling Issue-Resolving Datasets by Mirroring Issues Across Repositories.**  \n  _Junhao Wang, Daoguang Zan, Shulin Xin, Siyao Liu, Yurong Wu, Kai Shen._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.08724)\n\n- **SWE-Effi: Re-Evaluating Software AI Agent System Effectiveness Under Resource Constraints.**  \n  _Zhiyu Fan, Kirill Vasilevski, Dayi Lin, Boyuan Chen, Yihao Chen, Zhiqing Zhong, Jie M. Zhang, Pinjia He, Ahmed E. Hassan._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.09853) [![Website](https://img.shields.io/website?url=https://github.com/Centre-for-Software-Excellence/SWE-Effi\u0026up_message=SWE-EFFI\u0026up_color=blue\u0026down_message=SWE-EFFI\u0026down_color=blue\u0026style=for-the-badge)](https://github.com/Centre-for-Software-Excellence/SWE-Effi)\n\n- **Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments.**  \n  _Hongjin Su, Ruoxi Sun, Jinsung Yoon, Pengcheng Yin, Tao Yu, Sercan Ö. Arık._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2501.10893)\n\n- **SemAgent: A Semantics Aware Program Repair Agent.**  \n  _Anvith Pabba, Alex Mathai, Anindya Chakraborty, Baishakhi Ray._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.16650)\n\n- **HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale.**  \n  _Huy Nhat Phan, Tien N. Nguyen, Phong X. Nguyen, Nghi D. Q. Bui._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2409.16299) [![GitHub Stars](https://img.shields.io/github/stars/FSoft-AI4Code/HyperAgent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/FSoft-AI4Code/HyperAgent)\n\n- **Saving SWE-Bench: A Benchmark Mutation Approach for Realistic Agent Evaluation.**  \n  _Spandan Garg, Ben Steenhoek, Yufan Huang._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.08996)\n\n- **RepoForge: Training a SOTA Fast-thinking SWE Agent with an End-to-End Data Curation Pipeline Synergizing SFT and RL at Scale.**  \n  _Zhilong Chen, Chengzong Zhao, Boyuan Chen, Dayi Lin, Yihao Chen, Arthur Leung, Gopi Krishnan Rajbahadur, Gustavo Oliva, Haoxiang Zhang, Aadi Bhatia, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.01550) [![Website](https://img.shields.io/website?url=https://centre-for-software-excellence.github.io/docs/blog/repoforge\u0026up_message=REPOFORGE\u0026up_color=blue\u0026down_message=REPOFORGE\u0026down_color=blue\u0026style=for-the-badge)](https://centre-for-software-excellence.github.io/docs/blog/repoforge)\n\n- **MCTS-Refined CoT: High-Quality Fine-Tuning Data for LLM-Based Repository Issue Resolution.**  \n  _Yibo Wang, Zhihao Peng, Ying Wang, Zhao Wei, Hai Yu, Zhiliang Zhu._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.12728) [![Website](https://img.shields.io/website?url=https://mcts-refine.github.io/\u0026up_message=MCTS-REFINE\u0026up_color=blue\u0026down_message=MCTS-REFINE\u0026down_color=blue\u0026style=for-the-badge)](https://mcts-refine.github.io/)\n\n- **SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks.**  \n  _Pavel Adamenko, Mikhail Ivanov, Aidar Valeev, Rodion Levichev, Pavel Zadorozhny, Ivan Lopatin, Dmitry Babayev, Alena Fenogenova, Valentin Malykh._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.11059) [![GitHub Stars](https://img.shields.io/github/stars/MERA-Evaluation/repotest?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/MERA-Evaluation/repotest) [![Website](https://img.shields.io/website?url=https://mera-evaluation.github.io/demo-swe-mera/\u0026up_message=DEMO-SWE-MERA\u0026up_color=blue\u0026down_message=DEMO-SWE-MERA\u0026down_color=blue\u0026style=for-the-badge)](https://mera-evaluation.github.io/demo-swe-mera/)\n\n- **Auto-SWE-Bench: A Framework for the Scalable Generation of Software Engineering Benchmark from Open-Source Repositories.**  \n  _Anonymous Authors._ 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://openreview.net/forum?id=Gxw1EDSm9S)\n\n- **Can Agents Fix Agent Issues?**  \n  _Alfin Wijaya Rahardja, Junwei Liu, Weitong Chen, Zhenpeng Chen, Yiling Lou._ NeurIPS 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.20749) [![GitHub Stars](https://img.shields.io/github/stars/alfin06/AgentIssue-Bench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/alfin06/AgentIssue-Bench)\n\n- **Co-PatcheR: Collaborative Software Patching with Component(s)-specific Small Reasoning Models.**  \n  _Yuheng Tang, Hongwei Li, Kaijie Zhu, Michael Yang, Yangruibo Ding, Wenbo Guo._ NeurIPS 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.18955) [![GitHub Stars](https://img.shields.io/github/stars/ucsb-mlsec/Co-PatcheR?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/ucsb-mlsec/Co-PatcheR)\n\n- **RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving.**  \n  _Huacan Wang, Ziyi Ni, Shuo Zhang, Shuo Lu, Sen Hu, Ziyang He, Chen Hu, Jiaye Lin, Yifu Guo, Ronghao Chen, et al._ NeurIPS 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.21577) [![GitHub Stars](https://img.shields.io/github/stars/QuantaAlpha/RepoMaster?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/QuantaAlpha/RepoMaster) [![Website](https://img.shields.io/website?url=https://quantaalpha.com/\u0026up_message=QUANTAALPHA\u0026up_color=blue\u0026down_message=QUANTAALPHA\u0026down_color=blue\u0026style=for-the-badge)](https://quantaalpha.com/)\n\n- **SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents.**  \n  _Jiaye Lin, Yifu Guo, Yuzhen Han, Sen Hu, Ziyi Ni, Licheng Wang, Mingguang Chen, Hongzhang Liu, Ronghao Chen, Yangfan He, et al._ NeurIPS 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.02085) [![GitHub Stars](https://img.shields.io/github/stars/JARVIS-Xs/SE-Agent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/JARVIS-Xs/SE-Agent) [![Website](https://img.shields.io/website?url=https://quantaalpha.com/\u0026up_message=QUANTAALPHA\u0026up_color=blue\u0026down_message=QUANTAALPHA\u0026down_color=blue\u0026style=for-the-badge)](https://quantaalpha.com/)\n\n- **Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning.**  \n  _Yinjie Wang, Ling Yang, Ye Tian, Ke Shen, Mengdi Wang._ NeurIPS 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.03136) [![GitHub Stars](https://img.shields.io/github/stars/Gen-Verse/CURE?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/Gen-Verse/CURE)\n\n- **SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution.**  \n  _Yuxiang Wei, Olivier Duchenne, Jade Copet, Quentin Carbonneaux, Lingming Zhang, Daniel Fried, Gabriel Synnaeve, Rishabh Singh, Sida I. Wang._ NeurIPS 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.18449) [![GitHub Stars](https://img.shields.io/github/stars/facebookresearch/swe-rl?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/facebookresearch/swe-rl)\n\n- **SWE-smith: Scaling Data for Software Engineering Agents.**  \n  _John Yang, Kilian Lieret, Carlos E. Jimenez, Alexander Wettig, Kabir Khandpur, Yanzhe Zhang, Binyuan Hui, Ofir Press, Ludwig Schmidt, Diyi Yang._ NeurIPS 2025 Datasets \u0026 Benchmarks Track.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.21798) [![GitHub Stars](https://img.shields.io/github/stars/SWE-bench/SWE-smith?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/SWE-bench/SWE-smith) [![Website](https://img.shields.io/website?url=https://swesmith.com/\u0026up_message=SWESMITH\u0026up_color=blue\u0026down_message=SWESMITH\u0026down_color=blue\u0026style=for-the-badge)](https://swesmith.com/)\n\n- **SWE-bench Goes Live!**  \n  _Linghao Zhang, Shilin He, Chaoyun Zhang, Yu Kang, Bowen Li, Chengxing Xie, Junhao Wang, Maoquan Wang, Yufan Huang, Shengyu Fu, et al._ NeurIPS 2025 Datasets \u0026 Benchmarks Track.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.23419) [![GitHub Stars](https://img.shields.io/github/stars/microsoft/SWE-bench-Live?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/microsoft/SWE-bench-Live) [![Website](https://img.shields.io/website?url=https://swe-bench-live.github.io/\u0026up_message=SWE-BENCH-LIVE\u0026up_color=blue\u0026down_message=SWE-BENCH-LIVE\u0026down_color=blue\u0026style=for-the-badge)](https://swe-bench-live.github.io/)\n\n- **Training Software Engineering Agents and Verifiers with SWE-Gym.**  \n  _Jiayi Pan, Xingyao Wang, Graham Neubig, Navdeep Jaitly, Heng Ji, Alane Suhr, Yizhe Zhang._ ICML 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2412.21139) [![GitHub Stars](https://img.shields.io/github/stars/SWE-Gym/SWE-Gym?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/SWE-Gym/SWE-Gym)\n\n- **SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner.**  \n  _Lei Zhang, Jiaxi Yang, Min Yang, Jian Yang, Mouxiang Chen, Jiajun Zhang, Zeyu Cui, Binyuan Hui, Junyang Lin._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.09003) [![GitHub Stars](https://img.shields.io/github/stars/Hambaobao/SWE-Flow?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/Hambaobao/SWE-Flow)\n\n- **Nemotron-CORTEXA: Enhancing LLM Agents for Software Engineering Tasks via Improved Localization and Solution Diversity.**  \n  _Atefeh Sohrabizadeh, Jialin Song, Mingjie Liu, Rajarshi Roy, Chankyu Lee, Jonathan Raiman, Bryan Catanzaro._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://openreview.net/forum?id=k6p8UKRdH7)\n\n- **Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents.**  \n  _Karina Zainullina, Alexander Golubev, Maria Trofimova, Sergei Polezhaev, Ibragim Badertdinov, Daria Litvintseva, Simon Karasik, Filipp Fisin, Sergei Skvortsov, Maksim Nekrashevich, et al._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.13652)\n\n- **SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?**  \n  _Samuel Miserendino, Michele Wang, Tejal Patwardhan, Johannes Heidecke._ ICML 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.12115) [![GitHub Stars](https://img.shields.io/github/stars/openai/frontier-evals?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/openai/frontier-evals/tree/main/project/swelancer) [![Website](https://img.shields.io/website?url=https://openai.com/index/swe-lancer/\u0026up_message=SWE-LANCER\u0026up_color=blue\u0026down_message=SWE-LANCER\u0026down_color=blue\u0026style=for-the-badge)](https://openai.com/index/swe-lancer/)\n\n- **Automated Benchmark Generation for Repository-Level Coding Tasks.**  \n  _Konstantinos Vergopoulos, Mark Niklas Mueller, Martin Vechev._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/pdf/2503.07701)\n\n- **PatchPilot: A Cost-Efficient Software Engineering Agent with Early Attempts on Formal Verification.**  \n  _Hongwei Li, Yuheng Tang, Shiqi Wang, Wenbo Guo._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.02747) [![GitHub Stars](https://img.shields.io/github/stars/ucsb-mlsec/PatchPilot?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/ucsb-mlsec/PatchPilot)\n\n- **OpenHands: An Open Platform for AI Software Developers as Generalist Agents.**  \n  _Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, et al._ ICLR 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2407.16741) [![GitHub Stars](https://img.shields.io/github/stars/All-Hands-AI/OpenHands?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/All-Hands-AI/OpenHands) [![Website](https://img.shields.io/website?url=https://www.all-hands.dev/\u0026up_message=ALL-HANDS\u0026up_color=blue\u0026down_message=ALL-HANDS\u0026down_color=blue\u0026style=for-the-badge)](https://www.all-hands.dev/)\n\n- **RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph.**  \n  _Siru Ouyang, Wenhao Yu, Kaixin Ma, Zilin Xiao, Zhihan Zhang, Mengzhao Jia, Jiawei Han, Hongming Zhang, Dong Yu._ ICLR 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2410.14684) [![GitHub Stars](https://img.shields.io/github/stars/ozyyshr/RepoGraph?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/ozyyshr/RepoGraph)\n\n- **Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents.**  \n  _Kexun Zhang, Weiran Yao, Zuxin Liu, Yihao Feng, Zhiwei Liu, Rithesh R N, Tian Lan, Lei Li, Renze Lou, Jiacheng Xu, et al._ ICLR 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://openreview.net/forum?id=cKlzKs3Nnb) [![GitHub Stars](https://img.shields.io/github/stars/SalesforceAIResearch/swecomm?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/SalesforceAIResearch/swecomm) [![Website](https://img.shields.io/website?url=https://salesforce-research-dei-agents.github.io/\u0026up_message=SALESFORCE-RESEARCH-DEI-AGENTS\u0026up_color=blue\u0026down_message=SALESFORCE-RESEARCH-DEI-AGENTS\u0026down_color=blue\u0026style=for-the-badge)](https://salesforce-research-dei-agents.github.io/)\n\n- **SWE-GPT: A Process-Centric Language Model for Automated Software Improvement.**  \n  _Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, Yibo Liu, Yuchen Liu, Binhua Li, Fei Huang, Yongbin Li._ ISSTA 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2411.00622) [![GitHub Stars](https://img.shields.io/github/stars/LingmaTongyi/Lingma-SWE-GPT?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/LingmaTongyi/Lingma-SWE-GPT)\n\n- **SpecRover: Code Intent Extraction via LLMs.**  \n  _Haifeng Ruan, Yuntong Zhang, Abhik Roychoudhury._ ICSE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2408.02232)\n\n- **Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories.**  \n  _Islem Bouzenia, Michael Pradel._ ASE 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.18824)\n\n- **\"My productivity is boosted, but ...\" Demystifying Users' Perception on AI Coding Assistants.**  \n  _Yunbo Lyu, Zhou Yang, Jieke Shi, Jianming Chang, Yue Liu, David Lo._ ASE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.12285)\n\n- **SPICE: An Automated SWE-Bench Labeling Pipeline for Issue Clarity, Test Coverage, and Effort Estimation.**  \n  _Gustavo A. Oliva, Gopi Krishnan Rajbahadur, Aaditya Bhatia, Haoxiang Zhang, Yihao Chen, Zhilong Chen, Arthur Leung, Dayi Lin, Boyuan Chen, Ahmed E. Hassan._ ASE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.09108)\n\n- **DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal.**  \n  _Vaibhav Aggarwal, Ojasv Kamal, Abhinav Japesh, Zhijing Jin, Bernhard Schölkopf._ ACL 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.acl-long.973/) [![GitHub Stars](https://img.shields.io/github/stars/vaibhavagg303/DARS-Agent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/vaibhavagg303/DARS-Agent)\n\n- **CompileAgent: Automated Real-World Repo-Level Compilation with Tool-Integrated LLM-based Agent System.**  \n  _Li Hu, Guoqiang Chen, Xiuwei Shang, Shaoyin Cheng, Benlong Wu, Gangyang Li, Xu Zhu, Weiming Zhang, Nenghai Yu._ ACL 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.acl-long.103/) [![GitHub Stars](https://img.shields.io/github/stars/Ch3nYe/AutoCompiler?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/Ch3nYe/AutoCompiler)\n\n- **SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning.**  \n  _Zexiong Ma, Chao Peng, Pengfei Gao, Xiangxin Meng, Yanzhen Zou, Bing Xie._ ACL 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.acl-long.559/)\n\n- **SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution.**  \n  _Chengxing Xie, Bowen Li, Chang Gao, He Du, Wai Lam, Difan Zou, Kai Chen._ ACL 2025 Findings.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2501.05040) [![GitHub Stars](https://img.shields.io/github/stars/InternLM/SWE-Fixer?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/InternLM/SWE-Fixer)\n\n- **SynFix: Dependency-Aware Program Repair via RelationGraph Analysis.**  \n  _Xunzhu Tang, Jiechao Gao, Jin Xu, Tiezhu Sun, Yewei Song, Saad Ezzini, Wendkûuni C. Ouédraogo, Jacques Klein, Tegawendé F. Bissyandé._ ACL 2025 Findings.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.findings-acl.252/)\n\n- **UniDebugger: Hierarchical Multi-Agent Framework for Unified Software Debugging.**  \n  _Cheryl Lee, Chunqiu Steven Xia, Longji Yang, Jen-tse Huang, Zhouruixing Zhu, Lingming Zhang, Michael R. Lyu._ EMNLP 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.emnlp-main.921/)\n\n- **Agentless: Demystifying LLM-based Software Engineering Agents.**  \n  _Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, Lingming Zhang._ FSE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://dl.acm.org/doi/10.1145/3715754) [![GitHub Stars](https://img.shields.io/github/stars/OpenAutoCoder/Agentless?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/OpenAutoCoder/Agentless)\n\n- **OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution.**  \n  _Lianghong Guo, Wei Tao, Runhan Jiang, Yanlin Wang, Jiachi Chen, Xilin Liu, Yuchi Ma, Mingzhi Mao, Hongyu Zhang, Zibin Zheng._ ISSTA 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.04606) [![GitHub Stars](https://img.shields.io/github/stars/DeepSoftwareAnalytics/OmniGIRL?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/DeepSoftwareAnalytics/OmniGIRL) [![Website](https://img.shields.io/website?url=https://deepsoftwareanalytics.github.io/omnigirl_leaderboard.html\u0026up_message=OMNIGIRL-LEADERBOARD\u0026up_color=blue\u0026down_message=OMNIGIRL-LEADERBOARD\u0026down_color=blue\u0026style=for-the-badge)](https://deepsoftwareanalytics.github.io/omnigirl_leaderboard.html)\n\n- **Boosting Open-Source LLMs for Program Repair via Reasoning Transfer and LLM-Guided Reinforcement Learning.**  \n  _Xunzhu Tang, Jacques Klein, Tegawendé F. Bissyandé._ TOSEM 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2506.03921)\n\n- **Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair.**  \n  _Qiong Feng, Xiaotian Ma, Jiayi Sheng, Ziyuan Feng, Wei Song, Peng Liang._ TOSEM 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://dl.acm.org/doi/abs/10.1145/3770581)\n\n- **Agentic Program Repair from Test Failures at Scale: A Neuro-symbolic approach with static analysis and test execution feedback.**  \n  _Chandra Maddila, Adam Tait, Claire Chang, Daniel Cheng, Nauman Ahmad, Vijayaraghavan Murali, Marshall Roch, Arnaud Avondet, Aaron Meltzer, Victor Montalvao, et al._ TSE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.18755)\n\n- **AutoCodeRover: Autonomous Program Improvement.**  \n  _Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury._ ISSTA 2024.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2404.05427) [![GitHub Stars](https://img.shields.io/github/stars/AutoCodeRoverSG/auto-code-rover?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/AutoCodeRoverSG/auto-code-rover) [![Website](https://img.shields.io/website?url=https://autocoderover.dev/\u0026up_message=AUTOCODEROVER\u0026up_color=blue\u0026down_message=AUTOCODEROVER\u0026down_color=blue\u0026style=for-the-badge)](https://autocoderover.dev/)\n\n- **SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering.**  \n  _John Yang, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik R. Narasimhan, Ofir Press._ NeurIPS 2024.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2405.15793) [![GitHub Stars](https://img.shields.io/github/stars/SWE-agent/SWE-agent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/SWE-agent/SWE-agent) [![Website](https://img.shields.io/website?url=https://swe-agent.com/\u0026up_message=SWE-AGENT\u0026up_color=blue\u0026down_message=SWE-AGENT\u0026down_color=blue\u0026style=for-the-badge)](https://swe-agent.com/)\n\n- **MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution.**  \n  _Wei Tao, Yucheng Zhou, Yanlin Wang, Wenqiang Zhang, Hongyu Zhang, Yu Cheng._ NeurIPS 2024.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://proceedings.neurips.cc/paper_files/paper/2024/hash/5d1f02132ef51602adf07000ca5b6138-Abstract-Conference.html) [![GitHub Stars](https://img.shields.io/github/stars/co-evolve-lab/magis?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/co-evolve-lab/magis)\n\n- **MASAI: Modular Architecture for Software-engineering AI Agents.**  \n  _Nalin Wadhwa, Atharv Sonwane, Daman Arora, Abhav Mehrotra, Saiteja Utpala, Ramakrishna B. Bairi, Aditya Kanade, Nagarajan Natarajan._ NeurIPS 2024 Workshop.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://openreview.net/forum?id=NSINt8lLYB)\n\n- **SWE-bench: Can Language Models Resolve Real-World GitHub Issues?**  \n  _Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan._ ICLR 2024.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2310.06770) [![GitHub Stars](https://img.shields.io/github/stars/SWE-bench/SWE-bench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/SWE-bench/SWE-bench) [![Website](https://img.shields.io/website?url=https://www.swebench.com/\u0026up_message=SWEBENCH\u0026up_color=blue\u0026down_message=SWEBENCH\u0026down_color=blue\u0026style=for-the-badge)](https://www.swebench.com/)\n\n- **InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback.**  \n  _John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao._ NeurIPS 2023 Datasets \u0026 Benchmarks Track.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2306.14898) [![GitHub Stars](https://img.shields.io/github/stars/princeton-nlp/intercode?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/princeton-nlp/intercode) [![Website](https://img.shields.io/website?url=https://intercode-benchmark.github.io/\u0026up_message=INTERCODE-BENCHMARK\u0026up_color=blue\u0026down_message=INTERCODE-BENCHMARK\u0026down_color=blue\u0026style=for-the-badge)](https://intercode-benchmark.github.io/)\n\u003c!-- END PAPERS:issue_resolution --\u003e\n\n---\n\n#### 🖥️ Terminal Operating\n\u003e AI agents that operate within terminal environments, executing shell commands, managing system operations, and automating command-line workflows through natural language interfaces and autonomous task execution.\n\n\u003c!-- START PAPERS:terminal --\u003e\n- **Terminal-Bench: A Benchmark for AI Agents in Terminal Environments.**  \n  _The Terminal-Bench Team._ 2025.  
\n  [![GitHub Stars](https://img.shields.io/github/stars/laude-institute/terminal-bench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/laude-institute/terminal-bench) [![Website](https://img.shields.io/website?url=https://www.tbench.ai/\u0026up_message=TBENCH.AI\u0026up_color=blue\u0026down_message=TBENCH.AI\u0026down_color=blue\u0026style=for-the-badge)](https://www.tbench.ai/) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces.**  \n  _Mike A. Merrill, Alexander G. Shaw, Nicholas Carlini, Boxuan Li, Harsh Raj, Ivan Bercovich, Lin Shi, Jeong Yeon Shin, Thomas Walshe, E. Kelly Buchanan, et al._ 2026.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2601.11868) [![GitHub Stars](https://img.shields.io/github/stars/harbor-framework/terminal-bench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/harbor-framework/terminal-bench) [![Website](https://img.shields.io/website?url=https://www.tbench.ai/\u0026up_message=TBENCH.AI\u0026up_color=blue\u0026down_message=TBENCH.AI\u0026down_color=blue\u0026style=for-the-badge)](https://www.tbench.ai/)\n\u003c!-- END PAPERS:terminal --\u003e\n\n---\n\n#### 🧑‍💻 Code Generation\n\u003e AI agents that autonomously generate, scaffold, and synthesize code at the repository level, leveraging external tools and APIs to create new modules, build complete projects, and construct large-scale codebases.\n\n\u003c!-- START PAPERS:code_generation --\u003e\n- **Code as Agent Harness.**  \n  _Xuying Ning, Katherine Tieu, Dongqi Fu, Tianxin Wei, Zihao Li, Yuanchen Bei, Jiaru Zou, Mengting Ai, Zhining Liu, Ting-Wei Li, et al._ arXiv 
2026/05.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2605.18747) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge)\n\n- **Is Multi-Agent Debate (MAD) the Silver Bullet? An Empirical Analysis of MAD in Code Summarization and Translation.**  \n  _Jina Chun, Qihong Chen, Jiawei Li, Iftekhar Ahmed._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2503.12029) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.**  \n  _Joel Becker, Nate Rush, Elizabeth Barnes, David Rein._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.09089) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows.**  \n  _Valerie Chen, Ameet Talwalkar, Robert Brennan, Graham Neubig._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.08149) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Assessing and Advancing Benchmarks for Evaluating Large Language Models in Software Engineering Tasks.**  \n  _Xing Hu, Feifei Niu, Junkai Chen, Xin Zhou, Junwei Zhang, Junda He, Xin Xia, David Lo._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.08903) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Vibe Checker: Aligning Code Evaluation with Human Preference.**  \n  _Ming Zhong, Xiang Zhou, Ting-Yun Chang, Qingze Wang, Nan Xu, Xiance Si, Dan Garrette, Shyam Upadhyay, Jeremiah Liu, Jiawei Han, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.07315) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI.**  \n  _Ranjan Sapkota, Konstantinos I. Roumeliotis, Manoj Karkee._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.19443) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge)\n\n- **Position: Vibe Coding Needs Vibe Reasoning: Improving Vibe Coding with Formal Verification.**  \n  _Jacqueline Mitchell, Yasser Shaaban._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.00202) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge)\n\n- **A Survey on Code Generation with LLM-based Agents.**  \n  _Yihong Dong, Xue Jiang, Jiaru Qian, Tian Wang, Kechi Zhang, Zhi Jin, Ge Li._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.00083) [![GitHub Stars](https://img.shields.io/github/stars/JiaruQian/awesome-llm-based-agent4code?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/JiaruQian/awesome-llm-based-agent4code) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge)\n\n- **A Survey of Vibe Coding with Large Language Models.**  \n  _Yuyao Ge, Lingrui Mei, Zenghao Duan, Tianhao Li, Yujia Zheng, Yiwei Wang, Lexin Wang, Jiayu Yao, Tianyu Liu, Yujun Cai, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.12399) [![GitHub Stars](https://img.shields.io/github/stars/YuyaoGe/Awesome-Vibe-Coding?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/YuyaoGe/Awesome-Vibe-Coding) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge)\n\n- **Does AI-Assisted Coding Deliver? A Difference-in-Differences Study of Cursor's Impact on Software Projects.**  \n  _Hao He, Courtney Miller, Shyam Agarwal, Christian Kästner, Bogdan Vasilescu._ arXiv 2025/11.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.04427) ![Empirical Study](https://img.shields.io/badge/Empirical_Study-4A90D9?style=for-the-badge)\n\n- **Lost in Code Generation: Reimagining the Role of Software Models in AI-driven Software Engineering.**  \n  _Jürgen Cito, Dominik Bork._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.02475) ![Position Paper](https://img.shields.io/badge/Position_Paper-9B59B6?style=for-the-badge)\n\n- **SlopCodeBench: Community driven benchmark for measuring code erosion under iterative specification refinement.**  \n  _Sprocket Lab._ 2025.  \n  [![GitHub Stars](https://img.shields.io/github/stars/SprocketLab/slop-code-bench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/SprocketLab/slop-code-bench) [![Website](https://img.shields.io/website?url=https://www.scbench.ai/\u0026up_message=SCBENCH.AI\u0026up_color=blue\u0026down_message=SCBENCH.AI\u0026down_color=blue\u0026style=for-the-badge)](https://www.scbench.ai/) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge)\n\n- **Towards Realistic Project-Level Code Generation via Multi-Agent Collaboration and Semantic Architecture Modeling.**  \n  _Qianhui Zhao, Li Zhang, Fang Liu, Junhang Cheng, Chengru Wu, Junchen Ai, Qiaoyuanhe Meng, Lichen Zhang, Xiaoli Lian, Shubin Song, et al._ arXiv 2025/11.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.03404) [![GitHub Stars](https://img.shields.io/github/stars/whisperzqh/ProjectGen?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/whisperzqh/ProjectGen) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge)\n\n- **Smarter Together: Creating Agentic Communities of Practice through Shared Experiential Learning.**  \n  _Valentin Tablan, Scott Taylor, Gabriel Hurtado, Kristoffer Bernhem, Anders Uhrenholt, Gabriele Farei, Karo Moilanen._ arXiv 2025/11.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2511.08301)\n\n- **Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches.**  \n  _Yicheng Tao, Yao Qin, Yepang Liu._ arXiv 2025/10.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2510.04905) ![Survey](https://img.shields.io/badge/Survey-2A9D8F?style=for-the-badge)\n\n- **GRACE: Graph-Guided Repository-Aware Code Completion through Hierarchical Code Fusion.**  \n  _Xingliang Wang, Baoyi Wang, Chen Zhi, Junxiao Han, Xinkui Zhao, Jianwei Yin, Shuiguang Deng._ arXiv 2025/09.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.05980)\n\n- **Next Edit Prediction: Learning to Predict Code Edits from Context and Interaction History.**  \n  _Ruofan Lu, Yintong Huo, Meng Zhang, Yichen Li, Michael R. Lyu._ arXiv 2025/09.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2508.10074) [![GitHub Stars](https://img.shields.io/github/stars/lurf21/NextEditPrediction?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/lurf21/NextEditPrediction)\n\n- **FullStack Bench: Evaluating LLMs as Full Stack Coders.**  \n  _Bytedance-Seed-Foundation-Code-Team: Yao Cheng, Jianfeng Chen, Jie Chen, Li Chen, Liyu Chen, Wentao Chen, Zhengyu Chen, Shijie Geng, Aoyan Li, Bo Li, et al._ arXiv 2025/05.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2412.00535) [![GitHub Stars](https://img.shields.io/github/stars/bytedance/FullStackBench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/bytedance/FullStackBench) ![Benchmark \u0026 Dataset](https://img.shields.io/badge/Benchmark_%26_Dataset-F4A261?style=for-the-badge)\n\n- **RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation.**  \n  _Jane Luo, Xin Zhang, Steven Liu, Jie Wu, Jianfeng Liu, Yiming Huang, Yangyu Huang, Chengyu Yin, Ying Xin, Yuefeng Zhan, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2509.16198)\n\n- **SimdBench: Benchmarking Large Language Models for SIMD-Intrinsic Code Generation.**  \n  _Yibo He, Shuoran Zhao, Jiaming Huang, Yingjie Fu, Hao Yu, Cunjian Huang, Tao Xie._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.15224)\n\n- **MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use.**  \n  _Zaid Khan, Ali Farhadi, Ranjay Krishna, Luca Weihs, Mohit Bansal, Tanmay Gupta._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.15872)\n\n- **Improving Cursor Tab with online RL.**  \n  _Jacob Jackson, Phillip Kravtsov, Shomil Jain._ 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://cursor.com/blog/tab-rl)\n\n- **EvoAgentX: An Automated Framework for Evolving Agentic Workflows.**  \n  _Yingxu Wang, Siwei Liu, Jinyuan Fang, Zaiqiao Meng._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.03616) [![GitHub Stars](https://img.shields.io/github/stars/EvoAgentX/EvoAgentX?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/EvoAgentX/EvoAgentX) [![Website](https://img.shields.io/website?url=https://evoagentx.github.io/EvoAgentX/\u0026up_message=EVOAGENTX\u0026up_color=blue\u0026down_message=EVOAGENTX\u0026down_color=blue\u0026style=for-the-badge)](https://evoagentx.github.io/EvoAgentX/)\n\n- **SEW: Self-Evolving Agentic Workflows for Automated Code Generation.**  \n  _Siwei Liu, Jinyuan Fang, Han Zhou, Yingxu Wang, Zaiqiao Meng._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.18646) [![GitHub Stars](https://img.shields.io/github/stars/EvoAgentX/EvoAgentX?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/EvoAgentX/EvoAgentX) [![Website](https://img.shields.io/website?url=https://evoagentx.github.io/EvoAgentX/\u0026up_message=EVOAGENTX\u0026up_color=blue\u0026down_message=EVOAGENTX\u0026down_color=blue\u0026style=for-the-badge)](https://evoagentx.github.io/EvoAgentX/)\n\n- **Repository-level Code Search with Neural Retrieval Methods.**  \n  _Siddharth Gandhi, Luyu Gao, Jamie Callan._ arXiv 2025/02.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.07067)\n\n- **Co-Saving: Resource Aware Multi-Agent Collaboration for Software Development.**  \n  _Rennai Qiu, Chen Qian, Ran Li, Yufan Dang, Weize Chen, Cheng Yang, Yingli Zhang, Ye Tian, Xuantang Xiong, Lei Han, et al._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.21898)\n\n- **Think Like an Engineer: A Neuro-Symbolic Collaboration Agent for Generative Software Requirements Elicitation and Self-Review.**  \n  _Sai Zhang, Zhenchang Xing, Jieshan Chen, Dehai Zhao, Zizhong Zhu, Xiaowang Zhang, Zhiyong Feng, Xiaohong Li._ arXiv 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2507.14969)\n\n- **HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale.**  \n  _Huy Nhat Phan, Tien N. Nguyen, Phong X. Nguyen, Nghi D. Q. Bui._ arXiv 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2409.16299) [![GitHub Stars](https://img.shields.io/github/stars/FSoft-AI4Code/HyperAgent?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/FSoft-AI4Code/HyperAgent)\n\n- **KernelBench: Can LLMs Write Efficient GPU Kernels?**  \n  _Anne Ouyang, Simon Guo, Simran Arora, Alex L. Zhang, William Hu, Christopher Ré, Azalia Mirhoseini._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2502.10517) [![GitHub Stars](https://img.shields.io/github/stars/ScalingIntelligence/KernelBench?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/ScalingIntelligence/KernelBench) [![Website](https://img.shields.io/website?url=https://scalingintelligence.stanford.edu/blogs/kernelbench/\u0026up_message=KERNELBENCH\u0026up_color=blue\u0026down_message=KERNELBENCH\u0026down_color=blue\u0026style=for-the-badge)](https://scalingintelligence.stanford.edu/blogs/kernelbench/)\n\n- **EpiCoder: Encompassing Diversity and Complexity in Code Generation.**  \n  _Yaoxiang Wang, Haoling Li, Xin Zhang, Jie Wu, Xiao Liu, Wenxiang Hu, Zhongxin Guo, Yangyu Huang, Ying Xin, Yujiu Yang, et al._ ICML 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2501.04694)\n\n- **On the Impacts of Contexts on Repository-Level Code Generation.**  \n  _Nam Le Hai, Dung Manh Nguyen, Nghi D. Q. Bui._ NAACL 2025 Findings.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.findings-naacl.82/) [![GitHub Stars](https://img.shields.io/github/stars/FSoft-AI4Code/RepoExec?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/FSoft-AI4Code/RepoExec) [![Website](https://img.shields.io/website?url=https://fsoft-ai4code.github.io/repoexec/\u0026up_message=REPOEXEC\u0026up_color=blue\u0026down_message=REPOEXEC\u0026down_color=blue\u0026style=for-the-badge)](https://fsoft-ai4code.github.io/repoexec/)\n\n- **CodeSIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging.**  \n  _Md. Ashraful Islam, Mohammed Eunus Ali, Md Rizwan Parvez._ NAACL 2025 Findings.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.findings-naacl.82/) [![GitHub Stars](https://img.shields.io/github/stars/kagnlp/CodeGenerator?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/kagnlp/CodeGenerator) [![Website](https://img.shields.io/website?url=https://kagnlp.github.io/codesim.github.io/\u0026up_message=CODESIM.GITHUB.IO\u0026up_color=blue\u0026down_message=CODESIM.GITHUB.IO\u0026down_color=blue\u0026style=for-the-badge)](https://kagnlp.github.io/codesim.github.io/)\n\n- **ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation.**  \n  _Kaiyuan Liu, Youcheng Pan, Yang Xiang, Daojing He, Jing Li, Yexing Du, Tianrun Gao._ ACL 2025 Findings.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://aclanthology.org/2025.findings-acl.1036/) [![GitHub Stars](https://img.shields.io/github/stars/RyanLoil/ProjectEval?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/RyanLoil/ProjectEval/)\n\n- **AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology.**  \n  _Minh Huynh Nguyen, Thang Chau Phan, Phong X. Nguyen, Nghi D. Q. Bui._ FORGE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2406.11912) [![GitHub Stars](https://img.shields.io/github/stars/FSoft-AI4Code/AgileCoder?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/FSoft-AI4Code/AgileCoder) [![Website](https://img.shields.io/website?url=https://fsoft-ai4code.github.io/agilecoder/\u0026up_message=AGILECODER\u0026up_color=blue\u0026down_message=AGILECODER\u0026down_color=blue\u0026style=for-the-badge)](https://fsoft-ai4code.github.io/agilecoder/)\n\n- **CodeVisionary: An Agent-based Framework for Evaluating Large Language Models in Code Generation.**  \n  _Xinchen Wang, Pengfei Gao, Chao Peng, Ruida Hu, Cuiyun Gao._ ASE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2504.13472) [![GitHub Stars](https://img.shields.io/github/stars/Eshe0922/CodeVisionary?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/Eshe0922/CodeVisionary)\n\n- **Multi-Agent Collaboration via Evolving Orchestration.**  \n  _Yufan Dang, Chen Qian, Xueheng Luo, Jingru Fan, Zihao Xie, Ruijie Shi, Weize Chen, Cheng Yang, Xiaoyin Che, Ye Tian, et al._ NeurIPS 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2505.19591) [![GitHub Stars](https://img.shields.io/github/stars/OpenBMB/ChatDev?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/OpenBMB/ChatDev/tree/puppeteer)\n\n- **Multi-Agent Collaboration via Cross-Team Orchestration.**  \n  _Zhuoyun Du, Chen Qian, Wei Liu, Zihao Xie, YiFei Wang, Rennai Qiu, Yufan Dang, Weize Chen, Cheng Yang, Ye Tian, et al._ ACL 2025 Findings.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2406.08979) [![GitHub Stars](https://img.shields.io/github/stars/OpenBMB/ChatDev?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/OpenBMB/ChatDev/tree/macnet)\n\n- **Scaling Large Language Model-based Multi-Agent Collaboration.**  \n  _Chen Qian, Zihao Xie, YiFei Wang, Wei Liu, Kunlun Zhu, Hanchen Xia, Yufan Dang, Zhuoyun Du, Weize Chen, Cheng Yang, et al._ ICLR 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2406.07155) [![GitHub Stars](https://img.shields.io/github/stars/OpenBMB/ChatDev?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/OpenBMB/ChatDev/tree/macnet)\n\n- **Commit0: Library Generation from Scratch.**  \n  _Wenting Zhao, Nan Jiang, Celine Lee, Justin T Chiu, Claire Cardie, Matthias Gallé, Alexander M Rush._ ICLR 2025.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://openreview.net/forum?id=MMwaQEVsAg) [![GitHub Stars](https://img.shields.io/github/stars/commit-0/commit0?style=for-the-badge\u0026logo=github\u0026label=GitHub\u0026color=black)](https://github.com/commit-0/commit0) [![Website](https://img.shields.io/website?url=https://commit-0.github.io/\u0026up_message=COMMIT-0\u0026up_color=blue\u0026down_message=COMMIT-0\u0026down_color=blue\u0026style=for-the-badge)](https://commit-0.github.io/)\n\n- **RLCoder: Reinforcement Learning for Repository-Level Code Completion.**  \n  _Yanlin Wang, Yanli Wang, Daya Guo, Jiachi Chen, Ruikai Zhang, Yuchi Ma, Zibin Zheng._ ICSE 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2407.19487)\n\n- **CATCODER: Repository-Level Code Generation with Relevant Code and Type Context.**  \n  _Zhiyuan Pan, Xing Hu, Xin Xia, Xiaohu Yang._ TOSEM 2025.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2406.03283)\n\n- **Iterative Experience Refinement of Software-Developing Agents.**  \n  _Chen Qian, Jiahao Li, Yufan Dang, Wei Liu, YiFei Wang, Zihao Xie, Weize Chen, Cheng Yang, Yingli Zhang, Zhiyuan Liu, et al._ arXiv 2024.  \n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2405.04219)\n\n- **CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models.**  \n  _Jierui Li, Hung Le, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo._ arXiv 2024.  
\n  [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/abs/2411.04329) [![","projects_url":"https://awesome.ecosyste.ms/api/v1/lists/euniai%2Fawesome-code-agents/projects"}