{"id":28404574,"url":"https://github.com/hpai-bsc/turtle","last_synced_at":"2025-07-18T23:06:42.838Z","repository":{"id":294455519,"uuid":"951227741","full_name":"HPAI-BSC/TuRTLe","owner":"HPAI-BSC","description":"A Unified Evaluation of LLMs for RTL Generation 🐢 (MLCAD 2025)","archived":false,"fork":false,"pushed_at":"2025-07-09T15:18:34.000Z","size":494,"stargazers_count":19,"open_issues_count":0,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-07-10T14:14:32.176Z","etag":null,"topics":["evaluation-framework","rtl"],"latest_commit_sha":null,"homepage":"https://huggingface.co/spaces/HPAI-BSC/TuRTLe-Leaderboard","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HPAI-BSC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-19T11:00:34.000Z","updated_at":"2025-07-10T10:22:28.000Z","dependencies_parsed_at":"2025-06-12T08:30:59.802Z","dependency_job_id":"e49e4325-da05-4d7c-a7a9-fd03b0587975","html_url":"https://github.com/HPAI-BSC/TuRTLe","commit_stats":null,"previous_names":["hpai-bsc/turtle"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/HPAI-BSC/TuRTLe","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HPAI-BSC%2FTuRTLe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HPAI-BSC%2FTuRTLe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HPAI-BSC%2FTuRTLe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HPAI-BSC%2FTuRTLe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HPAI-BSC","download_url":"https://codeload.github.com/HPAI-BSC/TuRTLe/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HPAI-BSC%2FTuRTLe/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265849011,"owners_count":23838195,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["evaluation-framework","rtl"],"created_at":"2025-06-01T20:09:43.428Z","updated_at":"2025-07-18T23:06:42.829Z","avatar_url":"https://github.com/HPAI-BSC.png","language":"Python","readme":"\u003cdiv align=\"center\" style=\"line-height: 1;\"\u003e\n\u003cimg src=\"images/TuRTLe_logo.png\" width=\"250\" alt=\"HPAI\"/\u003e\n\u003c/div\u003e\n\u003cbr/\u003e\n\u003cdiv align=\"center\" style=\"line-height: 1;\"\u003e\n  \u003ca href=\"https://hpai.bsc.es/\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"Web\" src=\"https://img.shields.io/badge/Website-HPAI-8A2BE2\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://huggingface.co/HPAI-BSC\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"Hugging Face\" src=\"https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-HPAI-ffc107?color=ffc107\u0026logoColor=white\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/HPAI-BSC\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"GitHub\" src=\"https://img.shields.io/badge/GitHub-HPAI-%23121011.svg?logo=github\u0026logoColor=white\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/HPAI-BSC/turtle/stargazers\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"GitHub Repo stars\" src=\"https://img.shields.io/github/stars/HPAI-BSC/turtle\"\n    style=\"display: inline-block; vertical-align: middle;\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/orgs/HPAI-BSC/followers\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"HPAI followers\" src=\"https://img.shields.io/github/followers/HPAI-BSC\"\n    style=\"display: inline-block; vertical-align: middle;\"\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\u003cdiv align=\"center\" style=\"line-height: 1;\"\u003e\n  \u003ca href=\"https://www.linkedin.com/company/hpai\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"Linkedin\" src=\"https://img.shields.io/badge/Linkedin-HPAI-blue\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://bsky.app/profile/hpai.bsky.social\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"BlueSky\" src=\"https://img.shields.io/badge/Bluesky-HPAI-0285FF?logo=bluesky\u0026logoColor=fff\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://linktr.ee/hpai_bsc\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"LinkTree\" src=\"https://img.shields.io/badge/Linktree-HPAI-43E55E?style=flat\u0026logo=linktree\u0026logoColor=white\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\u003cdiv align=\"center\" style=\"line-height: 1;\"\u003e\n  \u003ca href=\"https://arxiv.org/abs/2504.01986\" target=\"_blank\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"Arxiv\" src=\"https://img.shields.io/badge/arXiv-2409.15127-b31b1b.svg\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"LICENSE\" style=\"margin: 1px;\"\u003e\n    \u003cimg alt=\"License\" src=\"https://img.shields.io/github/license/HPAI-BSC/turtle\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\nTuRTLe is a framework to assess LLMs across\nkey RTL generation tasks systematically. It integrates multiple existing benchmarks and automates the evaluation process, enabling a comprehensive assessment of LLM performance in syntax correctness,\nfunctional correctness, synthesis, PPA optimization, and exact line\ncompletion.\n\nThis work extends the functionality and flexibility of [bigcode-evaluation-harness](https://github.com/bigcode-project/bigcode-evaluation-harness) with the use of open-source EDA tools to run Specification-to-RTL and RTL Code Completion benchmarks. Furthermore, it is inspired from [vllm-code-harness](https://github.com/iNeil77/vllm-code-harness) to allow an efficient inference with vLLM.\n\nBenchmarks implemented so far are:\n\n- [VerilogEval v2.0](https://github.com/NVlabs/verilog-eval): Specification-to-RTL and Module Completion\n- [RTLLM v1.1 and v2.0](https://github.com/hkust-zhiyao/RTLLM): Specification-to-RTL\n- [VGen](https://github.com/shailja-thakur/VGen): Module Completion\n- [RTL-Repo](https://github.com/AUCOHL/RTL-Repo): Single Line Completion\n\nOpen-source EDA tools integrated:\n\n- [Icarus Verilog](https://github.com/steveicarus/iverilog): syntax and functionality\n- [Verilator](https://www.veripool.org/verilator/): syntax and functionality\n- [Yosys](https://github.com/YosysHQ/yosys): synthesis\n- [OpenROAD](https://github.com/The-OpenROAD-Project/OpenROAD): PPA\n- [OpenLane](https://github.com/The-OpenROAD-Project/OpenLane): to integrate YoSys and OpenROAD\n\nFor more details about our work, refer to our [ArXiv paper](https://arxiv.org/abs/2504.01986). Here you have a diagram of the high-level structure of the framework:\n![TuRTLe diagram](images/TuRTLe_diagram.png)\n\n## News\n\n- **[2025-07-03]** TuRTLe now supports Verilator as a simulator to check for Syntax and Functionality\n- **[2025-06-12]** We add support for multi-node inference with Ray and the configurations for bigger models\n- **[2025-05-19]** The project’s source code is now publicly released. We’d love to hear your feedback, so give it a try!\n- **[2025-03-31]** Our paper *\"TuRTLe: A Unified Evaluation of LLMs for RTL Generation\"* is now available on [ArXiv](https://arxiv.org/abs/2504.01986)!\n- **[2025-03-20]** The leaderboard is now live! Check it out on our [Huggingface Space](https://huggingface.co/spaces/HPAI-BSC/TuRTLe-Leaderboard)\n\n## Road Map\n\n- **[In progress]** Release repo compatible with local execution\n\n## Leaderboard 🥇 \n\nCheck the [TuRTLe Leaderboard](https://huggingface.co/spaces/HPAI-BSC/TuRTLe-Leaderboard) to know the best open-source models for each task.\n![Leaderboard screenshot](images/Leaderboard_screenshot.png)\n\n## Usage  \n\n\u003e [!WARNING]\n\u003e **Dependencies Notice**  \n\u003e **vLLM** currently supports up to **Python 3.12**. Ensure that your Python version does not exceed this limit to avoid compatibility issues.\n\n### HPC Environment Requirements\n\nMost of the modes require to be executed in HPC environments. For this reason, TuRTLe currently relies on **Slurm** and **Singularity** for its execution.\n\n### Installation\n\n1. **Clone the repository**:\n\n   ```bash\n   git clone --recursive https://github.com/HPAI-BSC/TuRTLe.git\n   ```\n\n2. **(Optional) Create and activate a virtual environment**:\n\n   ```bash\n   python3 -m venv venv\n   source venv/bin/activate\n   ```\n\n3. **Install Python dependencies**:\n\n    ```bash\n    pip install -r requirements.txt\n    ```\n    On non-Linux devices the above command will raise:\n    ```\n    AssertionError: vLLM only supports Linux platform (including WSL).​\n    ```\n    In this case, vLLM has to be installed from source (see their [installation page](https://docs.vllm.ai/en/stable/getting_started/installation.html) for details).\n\n4. **Install bigcode-evaluation-harness as a pypi package**:\n    \n    ```bash\n    cd TuRTLe/bigcode-evaluation-harness/​\n    pip install -e .\n    ```\n\n5. **Intall EDA Tools (not required for single line completion benchmarks)**\n\n    To install **OpenLane**, follow the instructions provided in the [OpenLane Installation Guide](https://openlane2.readthedocs.io/en/latest/getting_started/installation_overview.html).\n    \n    To install **ICARUS Verilog** on Windows check the [Icarus Verilog Windows download page](https://bleyer.org/icarus/). To install it on Linux execute:\n    ```bash\n    sudo apt-get update\n    sudo apt-get install iverilog\n    ```\n\nFinally, we recommend using Singularity for containerization on HPC environments. TuRTLe can dynamically create and submit Slurm job script. To enable this, include the following settings in your benchmark configuration file:\n- **singularity_image**: path to your singularity image.\n- For each model, specify a **slurm_config** from `turtle/configs/slurm.yml` with the slurm directives to run the benchmark.\n\n### Quick Demo\n\nComing soon.\n\n### Running the Project\n\nTo execute the project, use the `turtle/run.py` script with the appropriate arguments. Below are the details of the available parameters:\n\n```bash\npython turtle/run.py [--benchmark \u003cconfig_file\u003e] [--model \u003cmodel_name\u003e] [--run_all]\n```\n\nIf the configuration file includes both `singularity_image` and `slurm_config`, TuRTLe will automatically generate and execute a Slurm script to run the benchmark using the specified Singularity image.\n\n#### Core Parameters\n\n- `--benchmark`: Name of the .yml file in `turtle/configs/` with the configurations of the benchmark to run (e.g., `rtlrepo`, `rtllm_v2.0`, `verilog_eval_cc`, `verilog_eval_rtl`, `verigen`).\n- `--model`: Specify a particular model to run. If not provided, all models in the configuration file will be executed.\n- `--run_all`: Use this flag to run all benchmarks against all models.\n\n#### Additional Parameters\n\nDue to the dual-image setup, one for inference and another including EDA tools (e.g., Icarus Verilog, Verilator, Yosys, OpenLane), you can control each phase of the pipeline separately:\n\n- `--generation_only`: Use this flag to only perform inference.\n- `--evaluation_only`: Use this flag to only perform evaluation. We load the generations automatically from the YAML `metric_output_path` variable\n\n#### Examples\n\n1. Run all models specified in the configuration file for the RTL-Repo benchmark:\n   ```bash\n   python turtle/run.py --benchmark rtlrepo \n   ```\n\n2. Test Qwen2.5-32B against the benchmark VerilogEval Code Completion:\n   ```bash\n   python turtle/run.py --benchmark verilog_eval_cc --model Qwen2.5-32B\n   ```\n\n3. Run all benchmarks against all models:\n   ```bash\n   python turtle/run.py --run_all\n   ```\n\n### Add your benchmark   \n\nThe process to implement a benchmark is very similar to the one described by [bigcode-evaluation-harness guide](https://github.com/bigcode-project/bigcode-evaluation-harness/blob/main/docs/guide.md). Follow these steps:\n\n1. Copy the `turtle/tasks/template/new_task.py` into `turtle/tasks/` and rename it to the name of your benchmark `\u003cbenchmark_name\u003e.py`.\n3. Complete all the TODO comments in the template file.\n3. Define a configuration file named `turtle/configs/\u003cbenchmark_name\u003e.yml` and list the models you want to evaluate along with their required parameters.\n4. Update the `_load_new_modules()` and `_create_extended_registry()` methods within `turtle/src/utils/task_updater.py`.\n\n## Citation\n\n```\n@misc{garciagasulla2025turtleunifiedevaluationllms,\n      title={TuRTLe: A Unified Evaluation of LLMs for RTL Generation}, \n      author={Dario Garcia-Gasulla and Gokcen Kestor and Emanuele Parisi and Miquel Albert\\'i-Binimelis and Cristian Gutierrez and Razine Moundir Ghorab and Orlando Montenegro and Bernat Homs and Miquel Moreto},\n      year={2025},\n      eprint={2504.01986},\n      archivePrefix={arXiv},\n      primaryClass={cs.AR},\n      url={https://arxiv.org/abs/2504.01986}, \n}\n```\n\n## How to contribute 🤝  \n\nAny contribution is more than welcome! If you've found a bug or have an idea for an improvement, don't hesitate to [open a new issue](https://github.com/HPAI-BSC/TuRTLe/issues) using our issue forms. We also encourage people to do pull requests with new benchmarks of any task relevant for chip design.\n\n## Contact\n\nIf you have any questions or feedback, feel free to email us at hpai@bsc.es. You can also support the project by following or starring the repository.\n\n---\n\n**Made with ❤️ by [HPAI](https://hpai.bsc.es/) at the [Barcelona Supercomputing Center (BSC)](https://www.bsc.es/)**  \n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhpai-bsc%2Fturtle","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhpai-bsc%2Fturtle","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhpai-bsc%2Fturtle/lists"}