{"id":41947831,"url":"https://github.com/sjtu-sai-agents/ML-Master","last_synced_at":"2026-02-04T17:01:26.784Z","repository":{"id":299605239,"uuid":"1002831452","full_name":"sjtu-sai-agents/ML-Master","owner":"sjtu-sai-agents","description":"The official implementation of \"ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning\"","archived":false,"fork":false,"pushed_at":"2026-01-16T11:14:47.000Z","size":49313,"stargazers_count":333,"open_issues_count":3,"forks_count":41,"subscribers_count":12,"default_branch":"main","last_synced_at":"2026-01-17T01:01:26.402Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sjtu-sai-agents.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-16T08:01:16.000Z","updated_at":"2026-01-16T20:39:05.000Z","dependencies_parsed_at":"2025-06-27T05:19:22.187Z","dependency_job_id":"82428ab7-19c8-4125-bce2-504f98b349c2","html_url":"https://github.com/sjtu-sai-agents/ML-Master","commit_stats":null,"previous_names":["zeroxleo/ml-master","sjtu-sai-agents/ml-master"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sjtu-sai-agents/ML-Master","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-sai-agents%2FML-Master","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-sai-agents%2FML-Master/tags","releases_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/sjtu-sai-agents%2FML-Master/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-sai-agents%2FML-Master/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sjtu-sai-agents","download_url":"https://codeload.github.com/sjtu-sai-agents/ML-Master/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-sai-agents%2FML-Master/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29091317,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-04T03:31:03.593Z","status":"ssl_error","status_checked_at":"2026-02-04T03:29:50.742Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-25T20:00:26.085Z","updated_at":"2026-02-04T17:01:26.778Z","avatar_url":"https://github.com/sjtu-sai-agents.png","language":"Python","funding_links":[],"categories":["AutoML Agents","📈 Papers - Memory for Agent Evolution"],"sub_categories":["🧭 Reinforcement Learning \u0026 Continual Learning"],"readme":"\u003cdiv align=\"center\"\u003e\n\n![ML-Master Logo](./assets/logo.gif)\n\n\u003c/div\u003e\n\n## 📰 What's New\n- [2026/01/16] Release the preprint version of ML-Master 2.0! 
See the [ArXiv](https://arxiv.org/abs/2601.10402).\n- [2025/12/16] 🎉 **ML-Master 2.0 reaches new heights!**  Achieving #1 on [MLE-Bench](https://github.com/openai/mle-bench) Leaderboard with 56.44% overall performance (92.7% improvement over 1.0). Thanks to [EigenAI](https://www.eigenai.com/) for their high-performance AI infrastructure support.\n- [2025/10/30] We upload a new branch `feature-dev` with improved readability and maintainability. If you need to continue developing on ML-Master or apply ML-Master to downstream tasks, please switch the branch to `feature-dev`. \n- [2025/10/29] We now provide a Docker image for environment setup! Check it out [here](https://hub.docker.com/r/sjtuagents/ml-master).\n- [2025/10/27] Add support for gpt-5.\n- [2025/08/08] Initial code release is now available on GitHub!\n- [2025/06/19] Release the preprint version! See the [ArXiv](https://arxiv.org/abs/2506.16499).\n- [2025/06/17] Release the initial version! See the initial manuscript [here](./assets/ML-Master_github.pdf).\n\n# ML-Master 2.0: Cognitive Accumulation for Ultra-Long-Horizon Agentic Science in Machine Learning Engineering\n[![project](https://img.shields.io/badge/project-Page-blue)](https://sjtu-sai-agents.github.io/ML-Master)\n[![arXiv](https://img.shields.io/badge/arXiv-2601.10402-b31b1b.svg)](https://arxiv.org/abs/2601.10402)\n[![WeChat](https://img.shields.io/badge/WeChat-新智元-lightgreen)](https://mp.weixin.qq.com/s/dv1MD5S2vr3MB-skV4Thrw)\n\n\n## 🚀 Overview\n\n**ML-Master 2.0** is a pioneering agentic science framework that tackles the challenge of ultra-long-horizon autonomy through cognitive accumulation, facilitated by a Hierarchical Cognitive Caching (HCC) architecture that dynamically distills transient execution traces into stable long-term knowledge, ensuring that tactical execution and strategic planning remain decoupled yet co-evolve throughout complex, long-horizon scientific explorations. 
\n\n![ML-Master 2.0](./assets/ML-Master2.0-figure.png)\n\n## 📊 Performance Highlights\n\n![ML-Master 2.0 Score](./assets/ML-Master2.0_score.png)\n\n**ML-Master 2.0** achieves **#1 on [MLE-Bench](https://github.com/openai/mle-bench) Leaderboard** with massive performance gains:\n\n| Metric (%)                  | ML-Master 1.0 | ML-Master 2.0 | Relative Improvement |\n|----------------------------|---------------|---------------|---------------------|\n| 🥇 Overall (All)           | 29.33         | **56.44**     | **+92.7% ↑**       |\n| 🟢 Low Complexity          | 48.48         | **75.76**     | **+56.2% ↑**       |\n| 🟡 Medium Complexity       | 20.18         | **50.88**     | **+152.2% ↑**      |\n| 🔴 High Complexity         | 24.44         | **42.22**     | **+72.8% ↑**       |\n\n## 📆 Coming Soon\n\n- [x] Grading report release\n- [x] Paper release of ML-Master 2.0\n- [ ] Initial code release of ML-Master 2.0\n\n## 🙏 Acknowledgements\n\u003ctable align=\"center\"\u003e\n  \u003ctr\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://sai.sjtu.edu.cn/\"\u003e\n        \u003cimg src=\"./assets/sai_logo.png\" height=\"80\" alt=\"SJTU SAI\"\u003e\n      \u003c/a\u003e\n      \u003cbr\u003e\n      \u003ca href=\"https://sai.sjtu.edu.cn/\"\u003eSJTU SAI\u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://www.eigenai.com/\" style=\"text-decoration: none;\"\u003e\n        \u003cimg src=\"./assets/eigenai_logo.png\" height=\"80\" style=\"vertical-align: top;\" alt=\"EigenAI Logo\"\u003e\n        \u003cimg src=\"./assets/eigenai_name.png\" height=\"80\" style=\"vertical-align: top;\" alt=\"EigenAI Name\"\u003e\n      \u003c/a\u003e\n      \u003cbr\u003e\n      \u003ca href=\"https://www.eigenai.com/\" style=\"text-decoration: none;\"\u003eEigenAI\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n## ✍️ Citation\n\nIf you find our work helpful, please use the following 
citations.\n\n```bibtex\n@misc{zhu2026ultralonghorizonagenticsciencecognitive,\n      title={Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering}, \n      author={Xinyu Zhu and Yuzhu Cai and Zexi Liu and Bingyang Zheng and Cheng Wang and Rui Ye and Jiaao Chen and Hanrui Wang and Wei-Chen Wang and Yuzhi Zhang and Linfeng Zhang and Weinan E and Di Jin and Siheng Chen},\n      year={2026},\n      eprint={2601.10402},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2601.10402}, \n}\n```\n\n```bibtex\n@misc{liu2025mlmasteraiforaiintegrationexploration,\n      title={ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning}, \n      author={Zexi Liu and Yuzhu Cai and Xinyu Zhu and Yujie Zheng and Runkun Chen and Ying Wen and Yanfeng Wang and Weinan E and Siheng Chen},\n      year={2025},\n      eprint={2506.16499},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2506.16499}, \n}\n```\n\n---\n\n# ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning\n\n[![project](https://img.shields.io/badge/project-Page-blue)](https://sjtu-sai-agents.github.io/ML-Master/1.0.html)\n[![arXiv](https://img.shields.io/badge/arXiv-2506.16499-b31b1b.svg)](https://arxiv.org/abs/2506.16499)\n[![WeChat](https://img.shields.io/badge/WeChat-新智元-lightgreen)](https://mp.weixin.qq.com/s/8Dn7Hvpmp59-0xDD28nQkw)\n[![DockerHub](https://img.shields.io/badge/DockerHub-repository-blue.svg)](https://hub.docker.com/r/sjtuagents/ml-master)\n\n\u003e **Status**: ⌛ Initial code release is now available!\n\n## 🚀 Overview\n\n**ML-Master** is a novel AI4AI (AI-for-AI) agent that integrates exploration and reasoning into a coherent iterative methodology, facilitated by an adaptive memory mechanism that selectively captures and summarizes relevant insights and outcomes, ensuring each component mutually reinforces the other without compromising 
either. \n\n![ML-Master](./assets/ML-Master_figure.png)\n\n## 📊 Performance Highlights\n\nML-Master outperforms prior baselines on the **[MLE-Bench](https://github.com/openai/mle-bench)**:\n\n| Metric                      | Result                |\n|----------------------------|-----------------------|\n| 🥇 Average Medal Rate       | **29.3%**             |\n| 🧠 Medium Task Medal Rate   | **20.2%**, more than doubling the previous SOTA            | \n| 🕒 Runtime Efficiency        | **12 hours**, 50% budget |\n\n![ML-Master](./assets/ML-Master_score.png)\n\n## 📆 Coming Soon\n\n- [x] Grading report release\n- [x] Paper release of ML-Master\n- [x] Initial code release of ML-Master (expected early August)\n- [x] Code refactoring for improved readability and maintainability\n\n## 🚀 Quick Start\n\n### 🛠️ Environment Setup\n\n#### Pull and Start Docker Container\nPlease execute the following commands to pull the latest image and start an interactive container:\n\n```bash\n# Pull the latest image\ndocker pull sjtuagents/ml-master:latest\n\n# Start the container\ndocker run --rm --gpus all --ipc=host --shm-size=64g \\\n  --runtime=nvidia --ulimit memlock=-1 --ulimit stack=67108864 \\\n  -it sjtuagents/ml-master:latest /bin/bash\n\n# Clone the repository\ngit clone https://github.com/sjtu-sai-agents/ML-Master.git\ncd ML-Master\nconda activate ml-master\n```\n\n#### Install ml-master\nTo get started, make sure to first install the environment of **[MLE-Bench](https://github.com/openai/mle-bench)**. After that, install additional packages based on `requirements.txt`.\n\n```bash\ngit clone https://github.com/sjtu-sai-agents/ML-Master.git\ncd ML-Master\nconda create -n ml-master python=3.12\nconda activate ml-master\n\n# 🔧 Install MLE-Bench environment here\n# (Follow the instructions in its README)\n\npip install -r requirements.txt\n```\n\n---\n\n### 📦 Download MLE-Bench Data\n\nThe full MLE-Bench dataset is over **2TB**. 
We recommend downloading and preparing the dataset using the scripts and instructions provided by **[MLE-Bench](https://github.com/openai/mle-bench)**.\n\nOnce prepared, the expected dataset structure looks like this:\n\n```\n/path/to/mle-bench/plant-pathology-2020-fgvc7/\n└── prepared\n    ├── private\n    │   └── test.csv\n    └── public\n        ├── description.md\n        ├── images/\n        ├── sample_submission.csv\n        ├── test.csv\n        └── train.csv\n```\n\n\u003e 🪄 ML-Master uses symbolic links to access the dataset. You can download the data to your preferred location and ML-Master will link it accordingly.\n\n---\n\n### 🧠 Configure DeepSeek and GPT\n\nML-Master requires LLMs to return custom `\u003cthink\u003e\u003c/think\u003e` tags in the response. Ensure your **DeepSeek** API supports this and follows the `OpenAI` client interface below:\n\n```python\nself.client = OpenAI(\n    api_key=self.api_key,\n    base_url=self.base_url\n)\nresponse = self.client.completions.create(**params)\n```\nIf your API does not support this interface, or if you are using a closed-source model (e.g. gpt-5) as the coding model, please add `agent.steerable_reasoning=false` to `run.sh`. 
This may result in some performance loss.\n\nSet your `base_url` and `api_key` in the `run.sh` script.\n**GPT-4o** is used *only* for evaluation and feedback, consistent with **[MLE-Bench](https://github.com/openai/mle-bench)**.\n\n```bash\n# Basic configuration\nAGENT_DIR=./\nEXP_ID=plant-pathology-2020-fgvc7   # Competition name\ndataset_dir=/path/to/mle-bench      # Path to prepared dataset\nMEMORY_INDEX=0                      # GPU device ID\n\n# DeepSeek config\ncode_model=deepseek-r1\ncode_temp=0.5\ncode_base_url=\"your_base_url\"\ncode_api_key=\"your_api_key\"\n\n# GPT config (used for feedback \u0026 metrics)\nfeedback_model=gpt-4o-2024-08-06\nfeedback_temp=0.5\nfeedback_base_url=\"your_base_url\"\nfeedback_api_key=\"your_api_key\"\n\n# CPU allocation\nstart_cpu=0\nCPUS_PER_TASK=36\nend_cpu=$((start_cpu + CPUS_PER_TASK - 1))\n\n# Time limit (in seconds)\nTIME_LIMIT_SECS=43200\n```\n\n---\n\n### ▶️ Start Running\nBefore running ML-Master, launch the server that tells the agent whether a submission is valid. MLE-Bench allows and uses this validity check.\n```bash\nbash launch_server.sh\n```\n\nAfter that, simply run the following command:\n\n```bash\nbash run.sh\n```\n\n📝 Logs and solutions will be saved in:\n\n* `./logs` (for logs)\n* `./workspaces` (for generated solutions)\n\n---\n### 📊 Evaluation\n\nFor evaluation details, please refer to the official **[MLE-Bench evaluation guide](https://github.com/openai/mle-bench)**.\n\n\n## 🙏 Acknowledgements\n\nWe would like to express our sincere thanks to the following open-source projects that made this work possible:\n\n* 💡 **[MLE-Bench](https://github.com/openai/mle-bench)** — for providing a comprehensive and professional AutoML benchmarking platform.\n* 🌲 **[AIDE](https://github.com/WecoAI/aideml)** — for offering a powerful tree-search-based AutoML code framework that inspired parts of our implementation.\n\n\n## 💬 Contact Us\n\nWe welcome discussions, questions, and feedback! 
Join our WeChat group:\n\n\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"./assets/Wechat_Group.png\" alt=\"WeChat Group\" height=\"400\"\u003e\n\n\u003c/div\u003e\n\n## ✍️ Citation\n\nIf you find our work helpful, please use the following citations.\n\n```bibtex\n@misc{liu2025mlmasteraiforaiintegrationexploration,\n      title={ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning}, \n      author={Zexi Liu and Yuzhu Cai and Xinyu Zhu and Yujie Zheng and Runkun Chen and Ying Wen and Yanfeng Wang and Weinan E and Siheng Chen},\n      year={2025},\n      eprint={2506.16499},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2506.16499}, \n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsjtu-sai-agents%2FML-Master","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsjtu-sai-agents%2FML-Master","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsjtu-sai-agents%2FML-Master/lists"}