{"id":29166067,"url":"https://github.com/zjunlp/recode","last_synced_at":"2025-07-01T08:09:24.698Z","repository":{"id":302199623,"uuid":"981397155","full_name":"zjunlp/ReCode","owner":"zjunlp","description":"ReCode: Reinforced Code Knowledge Editing for API Updates","archived":false,"fork":false,"pushed_at":"2025-07-01T04:07:41.000Z","size":729,"stargazers_count":10,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-07-01T05:24:56.577Z","etag":null,"topics":["api-update","artificial-intelligence","code-language-model","knowledge-editing","large-language-models","model-editing","natural-language-processing","recode","software-engineering"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zjunlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-11T02:29:29.000Z","updated_at":"2025-07-01T04:07:44.000Z","dependencies_parsed_at":"2025-07-01T05:25:05.697Z","dependency_job_id":"d31f4235-192b-457d-bcd6-9432d93da460","html_url":"https://github.com/zjunlp/ReCode","commit_stats":null,"previous_names":["zjunlp/recode"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zjunlp/ReCode","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FReCode","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FReCode/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FReCode/releases","manifests_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FReCode/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zjunlp","download_url":"https://codeload.github.com/zjunlp/ReCode/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FReCode/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262925005,"owners_count":23385463,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api-update","artificial-intelligence","code-language-model","knowledge-editing","large-language-models","model-editing","natural-language-processing","recode","software-engineering"],"created_at":"2025-07-01T08:09:23.580Z","updated_at":"2025-07-01T08:09:24.678Z","avatar_url":"https://github.com/zjunlp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\u003ch1 align=\"center\"\u003eReCode\u003c/h1\u003e\n\u003ch3 align=\"center\"\u003eUpdating Code API Knowledge with Reinforcement Learning\u003c/h3\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://www.arxiv.org/abs/2506.20495\"\u003e📄arXiv\u003c/a\u003e •\n  \u003ca href=\"https://huggingface.co/collections/zjunlp/recode-68634da23d3b7bcc416c1007\"\u003e🤗HuggingFace\u003c/a\u003e •\n  \u003ca href=\"https://huggingface.co/datasets/zjunlp/ReCode-Train-Data\"\u003e📖Datasets\u003c/a\u003e\n\u003c/p\u003e\n\n[![Awesome](https://awesome.re/badge.svg)](https://github.com/zjunlp/ReCode) [![License: 
MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT) ![](https://img.shields.io/github/last-commit/zjunlp/ReCode?color=blue)\n\n\u003c/div\u003e\n\n## Table of Contents\n\n- [🌟Overview](#overview)\n- [🔧Installation](#installation)\n- [📚Dataset Preparation](#dataset-preparation)\n- [📉Training](#training)\n- [🚩Citation](#citation)\n- [🌻Acknowledgement](#acknowledgement)\n\n## 🌟Overview\n\nLarge Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This limitation stems from reliance on outdated API knowledge in their training data and persists even when current documentation is available, which impedes reliable code generation in dynamic environments. To tackle this issue, we propose ReCode (rule-based **Re**inforcement learning for **Code** Update), a novel framework that mimics how human programmers adapt to API changes.\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"./assets/overview.png\" width=\"50%\"\u003e\n\u003c/div\u003e\n\n## 🔧Installation\n\nWe recommend creating a new conda environment to run our project:\n\n```bash\n# create a conda environment\nconda create -n recode python=3.10\nconda activate recode\n\n# clone our project\ngit clone https://github.com/zjunlp/ReCode.git\ncd ReCode\n\n# install dependencies\npip install -r requirements.txt\n```\n\n## 📚Dataset Preparation\n\nWe have uploaded our collected data to Hugging Face; you can download it [here](https://huggingface.co/datasets/zjunlp/ReCode-Train-Data). 
Each record contains the following six fields:\n\n- **dependency** specifies the library;\n- **new_version** indicates the required version of the library;\n- **description** provides an explanation of the code's functionality;\n- **update_info** notes the details of the update;\n- **old_code** contains the original code snippet;\n- **new_code** contains the updated code snippet.\n\nThe following is an example record:\n\n```json\n{\n    \"dependency\": \"Numpy\", \n    \"new_version\": \"==2.2.0\", \n    \"description\": \"The code demonstrates how to assign a custom docstring to a NumPy ufunc to enhance its documentation and improve code readability.\", \n    \"update_info\": \"_add_newdoc_ufunc is now deprecated. ufunc.__doc__ = newdoc should be used instead.\", \n    \"old_code\": \"import numpy as np\\n\\nmy_ufunc = np.frompyfunc(lambda x: x**2, 1, 1)\\n\\nnp._add_newdoc_ufunc(my_ufunc, \\\"This is a custom ufunc that squares the input.\\\")\", \n    \"new_code\": \"import numpy as np\\n\\nmy_ufunc = np.frompyfunc(lambda x: x**2, 1, 1)\\n\\nmy_ufunc.__doc__ = \\\"This is a custom ufunc that squares the input.\\\"\"\n}\n```\n\n## 📉Training\n\n### GRPO Training\n\nFor GRPO training, we provide the script `scripts/grpo.sh`. 
Before running it, fill in the required values within the script.\n\n```bash\nbash scripts/grpo.sh\n```\n\n### DAPO Training\n\nBecause the verl library requires a specific data format, the data must be preprocessed before training.\n\n```bash\npython3 src/DAPO/data_process.py\n```\n\nAfter that, you can directly execute the script:\n\n```bash\nbash scripts/dapo.sh\n```\n\n## 🚩Citation\n\nIf you find our work helpful, please cite our paper:\n\n```bibtex\n@misc{wu2025recodeupdatingcodeapi,\n      title={ReCode: Updating Code API Knowledge with Reinforcement Learning}, \n      author={Haoze Wu and Yunzhi Yao and Wenhao Yu and Huajun Chen and Ningyu Zhang},\n      year={2025},\n      eprint={2506.20495},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2506.20495}, \n}\n```\n\n## 🌻Acknowledgement\n\nOur code is built on [trl](https://github.com/huggingface/trl) and [verl](https://github.com/volcengine/verl). Thanks to the authors for their great work!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Frecode","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzjunlp%2Frecode","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Frecode/lists"}