{"id":16135210,"url":"https://github.com/k4black/codebleu","last_synced_at":"2025-04-04T20:06:39.293Z","repository":{"id":176366444,"uuid":"657552432","full_name":"k4black/codebleu","owner":"k4black","description":"Pip compatible CodeBLEU metric implementation available for linux/macos/win","archived":false,"fork":false,"pushed_at":"2024-10-21T07:12:33.000Z","size":1320,"stargazers_count":61,"open_issues_count":8,"forks_count":11,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-21T10:09:32.761Z","etag":null,"topics":["code","code-evaluation","code-generation","codebleu","evaluation","evaluation-metrics"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/codebleu/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/k4black.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-23T10:09:02.000Z","updated_at":"2024-10-21T07:12:35.000Z","dependencies_parsed_at":"2024-02-19T10:31:12.681Z","dependency_job_id":"6da2a062-e16f-4206-9536-e0506514255e","html_url":"https://github.com/k4black/codebleu","commit_stats":{"total_commits":123,"total_committers":7,"mean_commits":"17.571428571428573","dds":0.5203252032520325,"last_synced_commit":"038eb1683471c2fe6b2c0f28e055e3de76e9be58"},"previous_names":["k4black/codebleu"],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k4black%2Fcodebleu","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k4black%2Fcodebleu/tags","releases_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k4black%2Fcodebleu/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k4black%2Fcodebleu/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/k4black","download_url":"https://codeload.github.com/k4black/codebleu/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247242669,"owners_count":20907133,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["code","code-evaluation","code-generation","codebleu","evaluation","evaluation-metrics"],"created_at":"2024-10-09T23:06:11.956Z","updated_at":"2025-04-04T20:06:39.245Z","avatar_url":"https://github.com/k4black.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CodeBLEU\n[![Publish](https://github.com/k4black/codebleu/actions/workflows/publish.yml/badge.svg)](https://github.com/k4black/codebleu/actions/workflows/publish.yml)\n[![Test](https://github.com/k4black/codebleu/actions/workflows/test.yml/badge.svg?event=push)](https://github.com/k4black/codebleu/actions/workflows/test.yml)\n[![codecov](https://codecov.io/gh/k4black/codebleu/branch/main/graph/badge.svg?token=60BIFPWRCE)](https://codecov.io/gh/k4black/codebleu)\n[![PyPI version](https://badge.fury.io/py/codebleu.svg)](https://badge.fury.io/py/codebleu)\n\n\nThis repository contains an unofficial `CodeBLEU` implementation that supports `Linux`, `MacOS` (incl. M-series) and `Windows`. 
It is available through `PyPI` and the `evaluate` library.\n\nAvailable for: `Python`, `C`, `C#`, `C++`, `Java`, `JavaScript`, `PHP`, `Go`, `Ruby`, `Rust`.\n\n---\n\nThe code is based on the original [CodeXGLUE/CodeBLEU](https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/code-to-code-trans/evaluator/CodeBLEU) and the updated version by [XLCoST/CodeBLEU](https://github.com/reddy-lab-code-research/XLCoST/tree/main/code/translation/evaluator/CodeBLEU).  It has been refactored, tested, built for macOS and Windows, and multiple improvements have been made to enhance usability.\n\n## Metric Description\n\n\u003e An ideal evaluation metric should consider the grammatical correctness and the logic correctness.\n\u003e We propose weighted n-gram match and syntactic AST match to measure grammatical correctness, and introduce semantic data-flow match to calculate logic correctness.\n\u003e ![CodeBLEU](CodeBLEU.jpg)  \n[from [CodeXGLUE](https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/code-to-code-trans/evaluator/CodeBLEU) repo]\n\nIn a nutshell, `CodeBLEU` is a weighted combination of `n-gram match (BLEU)`, `weighted n-gram match (BLEU-weighted)`, `AST match` and `data-flow match` scores.\n\nThe metric has shown a higher correlation with human evaluation than the `BLEU` and `accuracy` metrics.\n\n\n## Installation\n\nThis library requires `.so` file compilation with tree-sitter, so it is platform dependent.  \nCurrently available for `Linux` (manylinux), `MacOS` and `Windows` with Python 3.8+.\n\nThe metric is available as a [pip package](https://pypi.org/project/codebleu/) and can be installed as follows:\n```bash\npip install codebleu\n```\nor directly from the git repo (requires an internet connection to download tree-sitter):\n```bash\npip install git+https://github.com/k4black/codebleu.git\n```\n\nYou also have to install the tree-sitter language you need (e.g. 
python, rust, etc):\n```bash\npip install tree-sitter-python\n```\nOr you can install all languages:\n```bash\npip install codebleu[all]\n```\n\nNote: At the moment (May 2024) precompiled languages are NOT available for arm64 (M1) MacOS, so you have to install and build tree-sitter languages manually, for example:\n```bash\npip install git+https://github.com/tree-sitter/tree-sitter-python.git\n```\n\n\n## Usage \n\n```python\nfrom codebleu import calc_codebleu\n\nprediction = \"def add ( a , b ) :\\n return a + b\"\nreference = \"def sum ( first , second ) :\\n return second + first\"\n\nresult = calc_codebleu([reference], [prediction], lang=\"python\", weights=(0.25, 0.25, 0.25, 0.25), tokenizer=None)\nprint(result)\n# {\n#   'codebleu': 0.5537, \n#   'ngram_match_score': 0.1041, \n#   'weighted_ngram_match_score': 0.1109, \n#   'syntax_match_score': 1.0, \n#   'dataflow_match_score': 1.0\n# }\n```\nwhere `calc_codebleu` takes the following arguments:\n- `references` (`list[str]` or `list[list[str]]`): reference code\n- `predictions` (`list[str]`): predicted code\n- `lang` (`str`): code language, see `codebleu.AVAILABLE_LANGS` for available languages (python, c_sharp, c, cpp, javascript, java, php, go and ruby at the moment)\n- `weights` (`tuple[float,float,float,float]`): weights of the `ngram_match`, `weighted_ngram_match`, `syntax_match`, and `dataflow_match` respectively, defaults to `(0.25, 0.25, 0.25, 0.25)`\n- `tokenizer` (`callable`): used to split a code string into tokens, defaults to `s.split()`\n\nand outputs the `dict[str, float]` with the following fields:\n- `codebleu`: the final `CodeBLEU` score\n- `ngram_match_score`: `ngram_match` score (BLEU)\n- `weighted_ngram_match_score`: `weighted_ngram_match` score (BLEU-weighted)\n- `syntax_match_score`: `syntax_match` score (AST match)\n- `dataflow_match_score`: `dataflow_match` score\n\nAlternatively, you can use the metric from HuggingFace Spaces via the `evaluate` library (`codebleu` package required):\n```python\nimport 
evaluate\nmetric = evaluate.load(\"dvitel/codebleu\")\n\nprediction = \"def add ( a , b ) :\\n return a + b\"\nreference = \"def sum ( first , second ) :\\n return second + first\"\n\nresult = metric.compute(references=[reference], predictions=[prediction], lang=\"python\", weights=(0.25, 0.25, 0.25, 0.25))\n```\n\nFeel free to check the HF Space with an online example: [k4black/codebleu](https://huggingface.co/spaces/k4black/codebleu) \n\n\n## Contributing\n\nContributions are welcome!  \nIf you have any questions, suggestions, or bug reports, please open an issue on GitHub.\n\nMake your own fork and clone it:\n```bash\ngit clone https://github.com/k4black/codebleu\n```\n\nFor development, you need to install the library with `all` precompiled languages and the `test` extra:  \n(requires an internet connection to download tree-sitter)\n```bash\npython -m pip install -e .[all,test]\npython -m pip install -e .\\[all,test\\]  # for macos\n```\n\nFor testing, just run pytest:\n```bash\npython -m pytest\n```\n\nTo perform a style check, run:\n```bash\npython -m isort codebleu --check\npython -m black codebleu --check\npython -m ruff codebleu\npython -m mypy codebleu\n```\n\n\n## License\n\nThis project is licensed under the terms of the MIT license.\n\n\n## Citation\n\nThe official [CodeBLEU paper](https://arxiv.org/abs/2009.10297) can be cited as follows:\n```bibtex\n@misc{ren2020codebleu,\n      title={CodeBLEU: a Method for Automatic Evaluation of Code Synthesis}, \n      author={Shuo Ren and Daya Guo and Shuai Lu and Long Zhou and Shujie Liu and Duyu Tang and Neel Sundaresan and Ming Zhou and Ambrosio Blanco and Shuai Ma},\n      year={2020},\n      eprint={2009.10297},\n      archivePrefix={arXiv},\n      primaryClass={cs.SE}\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fk4black%2Fcodebleu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fk4black%2Fcodebleu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fk4black%2Fcodebleu/lists"}