{"id":13705439,"url":"https://github.com/obss/jury","last_synced_at":"2025-10-07T06:18:53.935Z","repository":{"id":40295212,"uuid":"385946633","full_name":"obss/jury","owner":"obss","description":"Comprehensive NLP Evaluation System","archived":false,"fork":false,"pushed_at":"2024-08-08T12:15:39.000Z","size":298,"stargazers_count":188,"open_issues_count":5,"forks_count":19,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-10-03T13:35:54.533Z","etag":null,"topics":["datasets","evaluate","evaluation","huggingface","machine-learning","metrics","natural-language-processing","nlp","nlp-evaluation","python","pytorch","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/obss.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-07-14T13:18:11.000Z","updated_at":"2025-08-07T12:50:47.000Z","dependencies_parsed_at":"2024-04-18T16:44:42.085Z","dependency_job_id":"f6d137c7-1424-453a-974b-f315773c48f0","html_url":"https://github.com/obss/jury","commit_stats":{"total_commits":96,"total_committers":9,"mean_commits":"10.666666666666666","dds":"0.35416666666666663","last_synced_commit":"2fecd9edfe2119d51875fa0db1d4522bdea74600"},"previous_names":[],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/obss/jury","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obss%2Fjury","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obss%2Fjury/tags","releases_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/obss%2Fjury/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obss%2Fjury/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/obss","download_url":"https://codeload.github.com/obss/jury/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obss%2Fjury/sbom","scorecard":{"id":701257,"data":{"date":"2025-08-11","repo":{"name":"github.com/obss/jury","commit":"c8d32259d0683ef8f43088ac9b754d98e62dfa79"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Code-Review","score":1,"reason":"Found 3/30 approved changesets -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous 
patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/ci.yml:1","Warn: no topLevel permission defined: .github/workflows/publish_pypi.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:20: update your workflow using https://app.stepsecurity.io/secureworkflow/obss/jury/ci.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/obss/jury/ci.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:28: update your workflow using 
https://app.stepsecurity.io/secureworkflow/obss/jury/ci.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:36: update your workflow using https://app.stepsecurity.io/secureworkflow/obss/jury/ci.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:44: update your workflow using https://app.stepsecurity.io/secureworkflow/obss/jury/ci.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/publish_pypi.yml:12: update your workflow using https://app.stepsecurity.io/secureworkflow/obss/jury/publish_pypi.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/publish_pypi.yml:14: update your workflow using https://app.stepsecurity.io/secureworkflow/obss/jury/publish_pypi.yml/main?enable=pin","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:53","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:56","Warn: pipCommand not pinned by hash: .github/workflows/publish_pypi.yml:19","Warn: pipCommand not pinned by hash: .github/workflows/publish_pypi.yml:20","Info:   0 out of   7 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   4 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Vulnerabilities","score":1,"reason":"9 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2021-356 / GHSA-2ww3-fxvq-293j","Warn: Project is vulnerable to: PYSEC-2024-167 / GHSA-cgvx-9447-vcch","Warn: Project is vulnerable to: PYSEC-2021-859 / GHSA-f8m6-h2c7-8h9x","Warn: Project is vulnerable to: PYSEC-2019-106 / GHSA-mr7p-25v2-35wr","Warn: Project is vulnerable to: PYSEC-2022-5 / GHSA-rqjh-jp2r-59cj","Warn: Project is 
vulnerable to: PYSEC-2020-107 / GHSA-jjw5-xxj6-pcv5","Warn: Project is vulnerable to: PYSEC-2024-110 / GHSA-jw8x-6495-233v","Warn: Project is vulnerable to: PYSEC-2020-108","Warn: Project is vulnerable to: PYSEC-2017-74"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 20 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-22T05:15:42.072Z","repository_id":40295212,"created_at":"2025-08-22T05:15:42.072Z","updated_at":"2025-08-22T05:15:42.072Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278322144,"owners_count":25967874,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-04T02:00:05.491Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datasets","evaluate","evaluation","huggingface","machine-learning","metrics","natural-language-processing","nlp","nlp-evaluation","python","pytorch","transformers"],"created_at":"2024-08-02T22:00:41.099Z","updated_at":
"2025-10-07T06:18:53.919Z","avatar_url":"https://github.com/obss.png","language":"Python","funding_links":[],"categories":["⚖️ Evaluation","Libraries"],"sub_categories":["Books"],"readme":"\u003ch1 align=\"center\"\u003eJury\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://pypi.org/project/jury\"\u003e\u003cimg src=\"https://img.shields.io/pypi/pyversions/jury\" alt=\"Python versions\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pepy.tech/project/jury\"\u003e\u003cimg src=\"https://pepy.tech/badge/jury\" alt=\"downloads\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pypi.org/project/jury\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/jury?color=blue\" alt=\"PyPI version\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/obss/jury/releases/latest\"\u003e\u003cimg alt=\"Latest Release\" src=\"https://img.shields.io/github/release-date/obss/jury\"\u003e\u003c/a\u003e\n\u003ca href=\"https://colab.research.google.com/github/obss/jury/blob/main/examples/jury_evaluate.ipynb\" target=\"_blank\"\u003e\u003cimg alt=\"Open in Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"\u003e\u003c/a\u003e\n\u003cbr\u003e\n\u003ca href=\"https://github.com/obss/jury/actions\"\u003e\u003cimg alt=\"Build status\" src=\"https://github.com/obss/jury/actions/workflows/ci.yml/badge.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://libraries.io/pypi/jury\"\u003e\u003cimg alt=\"Dependencies\" src=\"https://img.shields.io/librariesio/github/obss/jury\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/psf/black\"\u003e\u003cimg alt=\"Code style: black\" src=\"https://img.shields.io/badge/code%20style-black-000000.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/obss/jury/blob/main/LICENSE\"\u003e\u003cimg alt=\"License: MIT\" src=\"https://img.shields.io/pypi/l/jury\"\u003e\u003c/a\u003e\n\u003cbr\u003e\n\u003ca href=\"https://doi.org/10.48550/arXiv.2310.02040\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/DOI-10.48550%2FarXiv.2310.02040-blue\" alt=\"DOI\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\nA comprehensive toolkit for evaluating NLP experiments, offering various automated metrics. Jury offers a smooth and easy-to-use interface. It uses a more advanced version of the [evaluate](https://github.com/huggingface/evaluate/) design for underlying metric computation, so that adding a custom metric is as easy as extending the proper class.\n\nThe main advantages that Jury offers are:\n\n- Easy to use for any NLP project.\n- Unified structure for computation input across all metrics.\n- Calculate many metrics at once.\n- Metrics calculations can be handled concurrently to save processing time.\n- It seamlessly supports evaluation for multiple predictions/multiple references.\n\nTo see more, check the [official Jury blog post](https://medium.com/codable/jury-evaluating-performance-of-nlg-models-730eb9c9999f).\n\n## 🔥 News\n\n* (2024.05.29) A [Retraction Watch Post](https://retractionwatch.com/2024/05/29/caught-by-a-reviewer-a-plagiarizing-deep-learning-paper-lingers/) regarding the retraction of a paper has been posted. [The plagiarised paper](https://aclanthology.org/2022.coling-1.306.pdf) has been retracted.\n* (2023.10.03) The Jury paper is out and is currently on [arXiv](https://arxiv.org/abs/2310.02040). Please cite this paper if your work uses Jury and your publication will be submitted to venues after this date.  
\n* (2023.07.30) **Public notice:** You can reach our official [Public Notice](https://docs.google.com/document/d/1mFFT0cR8BUHKJki8mAg6b36QhmsRxvKR3pwOlcxbnss/edit?usp=sharing) document that poses a claim about plagiarism of the work, *jury*, presented in this codebase.\n\n## Available Metrics\n\nThe table below shows the current support status for available metrics.\n\n| Metric                                                                        | Jury Support       | HF/evaluate Support |\n|-------------------------------------------------------------------------------|--------------------|---------------------|\n| Accuracy-Numeric                                                              | :heavy_check_mark: | :white_check_mark:  |\n| Accuracy-Text                                                                 | :heavy_check_mark: | :x:                 |\n| Bartscore                                                                     | :heavy_check_mark: | :x:                 |\n| Bertscore                                                                     | :heavy_check_mark: | :white_check_mark:  |\n| Bleu                                                                          | :heavy_check_mark: | :white_check_mark:  |\n| Bleurt                                                                        | :heavy_check_mark: | :white_check_mark:  |\n| CER                                                                           | :heavy_check_mark: | :white_check_mark:  |\n| CHRF                                                                          | :heavy_check_mark: | :white_check_mark:  |\n| COMET                                                                         | :heavy_check_mark: | :white_check_mark:  |\n| F1-Numeric                                                                    | :heavy_check_mark: | :white_check_mark:  |\n| F1-Text                                                                       | :heavy_check_mark: | :x:         
        |\n| METEOR                                                                        | :heavy_check_mark: | :white_check_mark:  |\n| Precision-Numeric                                                             | :heavy_check_mark: | :white_check_mark:  |\n| Precision-Text                                                                | :heavy_check_mark: | :x:                 |\n| Prism                                                                         | :heavy_check_mark: | :x:                 |\n| Recall-Numeric                                                                | :heavy_check_mark: | :white_check_mark:  |\n| Recall-Text                                                                   | :heavy_check_mark: | :x:                 |\n| ROUGE                                                                         | :heavy_check_mark: | :white_check_mark:  |\n| SacreBleu                                                                     | :heavy_check_mark: | :white_check_mark:  |\n| Seqeval                                                                       | :heavy_check_mark: | :white_check_mark:  |\n| Squad                                                                         | :heavy_check_mark: | :white_check_mark:  |\n| TER                                                                           | :heavy_check_mark: | :white_check_mark:  |\n| WER                                                                           | :heavy_check_mark: | :white_check_mark:  |\n| [Other metrics](https://github.com/huggingface/evaluate/tree/master/metrics)* | :white_check_mark: | :white_check_mark:  |\n\n_*_ Placeholder for the rest of the metrics available in `evaluate` package apart from those which are present in the \ntable. 
\n\n**Notes**\n\n* The entry :heavy_check_mark: indicates that full Jury support is available, meaning that all combinations of input \ntypes (single prediction \u0026 single reference, single prediction \u0026 multiple references, multiple predictions \u0026 multiple \nreferences) are supported.\n\n* The entry :white_check_mark: means that the metric is supported (for Jury, through `evaluate`), so it \ncan (and should) be used just like the `evaluate` metric, as instructed in the `evaluate` implementation, although \nfull Jury support for those metrics is not yet available.\n\n## Request for a New Metric\n\nTo request a new metric, please [open an issue](https://github.com/obss/jury/issues/new?assignees=\u0026labels=\u0026template=new-metric.md\u0026title=) providing the minimum information. PRs adding support for new \nmetrics are also welcome :).\n\n## \u003cdiv align=\"center\"\u003e Installation \u003c/div\u003e\n\nThrough pip,\n\n    pip install jury\n\nor build from source,\n\n    git clone https://github.com/obss/jury.git\n    cd jury\n    python setup.py install\n\n**NOTE:** Some metrics that depend on the `sacrebleu` package may malfunction on Windows machines, \nmainly because of the `pywin32` package. To address this, we pinned the pywin32 version in our setup config for Windows platforms. \nHowever, if pywin32 causes trouble in your environment, we strongly recommend installing the package with the `conda` manager \nas `conda install pywin32`.\n\n## \u003cdiv align=\"center\"\u003e Usage \u003c/div\u003e\n\n### API Usage\n\nIt is only two lines of code to evaluate generated outputs.\n\n```python\nfrom jury import Jury\n\nscorer = Jury()\npredictions = [\n    [\"the cat is on the mat\", \"There is cat playing on the mat\"], \n    [\"Look!    
a wonderful day.\"]\n]\nreferences = [\n    [\"the cat is playing on the mat.\", \"The cat plays on the mat.\"], \n    [\"Today is a wonderful day\", \"The weather outside is wonderful.\"]\n]\nscores = scorer(predictions=predictions, references=references)\n```\n\nSpecify the metrics you want to use on instantiation.\n\n```python\nscorer = Jury(metrics=[\"bleu\", \"meteor\"])\nscores = scorer(predictions, references)\n```\n\n#### Use of Metrics standalone\n\nYou can directly import metrics from `jury.metrics` as classes, and then instantiate and use them as desired.\n\n```python\nfrom jury.metrics import Bleu\n\nbleu = Bleu.construct()\nscore = bleu.compute(predictions=predictions, references=references)\n```\n\nAdditional parameters can be specified either on `compute()`:\n\n```python\nfrom jury.metrics import Bleu\n\nbleu = Bleu.construct()\nscore = bleu.compute(predictions=predictions, references=references, max_order=4)\n```\n\nor, alternatively, on instantiation:\n\n```python\nfrom jury.metrics import Bleu\n\nbleu = Bleu.construct(compute_kwargs={\"max_order\": 1})\nscore = bleu.compute(predictions=predictions, references=references)\n```\n\nNote that you can seamlessly access both `jury` and `evaluate` metrics through `jury.load_metric`. \n\n```python\nimport jury\n\nbleu = jury.load_metric(\"bleu\")\nbleu_1 = jury.load_metric(\"bleu\", resulting_name=\"bleu_1\", compute_kwargs={\"max_order\": 1})\n# metrics not available in `jury` but in `evaluate`\ncompetition_math = jury.load_metric(\"competition_math\")  # Falls back to the `evaluate` package with a warning\n```\n\n### CLI Usage\n\nYou can specify the paths of a predictions file and a references file and get the resulting scores. The two files are paired line by line, so each line in one file must correspond to the same line in the other. 
You can optionally provide a reduce function and an export path to which the results are written.\n\n    jury eval --predictions /path/to/predictions.txt --references /path/to/references.txt --reduce_fn max --export /path/to/export.txt\n\nYou can also provide prediction folders and reference folders to evaluate multiple experiments. In this setup, however, the prediction and reference files to be evaluated as a pair must have the same file name; files are paired by these common names.\n\n    jury eval --predictions /path/to/predictions_folder --references /path/to/references_folder --reduce_fn max --export /path/to/export.txt\n\nIf you want to specify metrics rather than use the defaults, list them in a config file (JSON) under the `metrics` key.\n\n```json\n{\n  \"predictions\": \"/path/to/predictions.txt\",\n  \"references\": \"/path/to/references.txt\",\n  \"reduce_fn\": \"max\",\n  \"metrics\": [\n    \"bleu\",\n    \"meteor\"\n  ]\n}\n```\n\nThen, you can call `jury eval` with the `config` argument.\n\n    jury eval --config path/to/config.json\n\n### Custom Metrics\n\nYou can define custom metrics by inheriting from `jury.metrics.Metric`; the metrics currently implemented in Jury are in [jury/metrics](https://github.com/obss/jury/tree/master/jury/metrics). Jury falls back to the `evaluate` implementation for metrics that it does not currently support; the metrics available in `evaluate` are listed at [evaluate/metrics](https://github.com/huggingface/evaluate/tree/master/metrics). \n\nJury itself uses `evaluate.Metric` as a base class to derive its own base class, `jury.metrics.Metric`. 
The interface is similar; however, Jury makes the metrics take a unified input type by handling the inputs for each metric, and supports several input types:\n\n- single prediction \u0026 single reference\n- single prediction \u0026 multiple references\n- multiple predictions \u0026 multiple references\n\nEither base class can be used for a custom metric; however, we strongly recommend `jury.metrics.Metric`, as it has several advantages, such as supporting computations for the input types above and unifying the input type.\n\n```python\nfrom jury.metrics import MetricForTask\n\nclass CustomMetric(MetricForTask):\n    def _compute_single_pred_single_ref(\n        self, predictions, references, reduce_fn=None, **kwargs\n    ):\n        raise NotImplementedError\n\n    def _compute_single_pred_multi_ref(\n        self, predictions, references, reduce_fn=None, **kwargs\n    ):\n        raise NotImplementedError\n\n    def _compute_multi_pred_multi_ref(\n        self, predictions, references, reduce_fn=None, **kwargs\n    ):\n        raise NotImplementedError\n```\n\nFor more details, have a look at the base metric implementation [jury.metrics.Metric](./jury/metrics/_base.py).\n\n## \u003cdiv align=\"center\"\u003e Contributing \u003c/div\u003e\n\nPRs are welcome as always :)\n\n### Installation\n\n    git clone https://github.com/obss/jury.git\n    cd jury\n    pip install -e \".[dev]\"\n\nAlso, you need to separately install the packages that are available only through a git source, using the command below. \nFor those curious about why: the short explanation is that PyPI does not allow indexing a package that \ndepends directly on non-PyPI packages, for security reasons. 
The file `requirements-dev.txt` includes packages \nthat are currently available only through a git source, or PyPI packages that have no recent release or are \nincompatible with Jury, so they are added as git sources or pinned to specific commits.\n\n    pip install -r requirements-dev.txt\n\n### Tests\n\nTo run the tests, simply run:\n\n    python tests/run_tests.py\n\n### Code Style\n\nTo check the code style,\n\n    python tests/run_code_style.py check\n\nTo format the codebase,\n\n    python tests/run_code_style.py format\n\n\n## \u003cdiv align=\"center\"\u003e Citation \u003c/div\u003e\n\nIf you use this package in your work, please cite it as:\n\n    @misc{cavusoglu2023jury,\n      title={Jury: A Comprehensive Evaluation Toolkit}, \n      author={Devrim Cavusoglu and Ulas Sert and Secil Sen and Sinan Altinuc},\n      year={2023},\n      eprint={2310.02040},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      doi={10.48550/arXiv.2310.02040}\n    }\n\n## Community Interaction\n\nWe use the GitHub Issue Tracker to track issues in general. Issues can be bug reports, feature requests, or requests to implement a new metric type. 
Please refer to the related issue template for opening new issues.\n\n|                                | Location                                                                                           |\n|--------------------------------|----------------------------------------------------------------------------------------------------|\n| Bug Report                     | [Bug Report Template](https://github.com/obss/jury/issues/new?assignees=\u0026labels=\u0026projects=\u0026template=bug_report.md\u0026title=) |\n| New Metric Request             | [Request Metric Implementation](https://github.com/obss/jury/issues/new?assignees=\u0026labels=\u0026projects=\u0026template=new-metric.md\u0026title=) |\n| All other issues and questions | [General Issues](https://github.com/obss/jury/issues/new)                                                            |\n\n## \u003cdiv align=\"center\"\u003e License \u003c/div\u003e\n\nLicensed under the [MIT](LICENSE) License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobss%2Fjury","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fobss%2Fjury","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobss%2Fjury/lists"}