{"id":18517713,"url":"https://github.com/ruanchaves/napolab","last_synced_at":"2025-04-09T07:06:54.413Z","repository":{"id":148962948,"uuid":"620806122","full_name":"ruanchaves/napolab","owner":"ruanchaves","description":"A Natural Portuguese Language Benchmark (Napolab) for the evaluation of language models.","archived":false,"fork":false,"pushed_at":"2025-03-04T13:21:22.000Z","size":232,"stargazers_count":67,"open_issues_count":0,"forks_count":3,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-04-02T03:43:12.949Z","etag":null,"topics":["benchmarks","catalan","datasets","english","galician","hate-speech","huggingface","huggingface-transformers","large-language-models","nlp","portuguese","python","question-answering","semantic-similarity","spanish","text-simplification","textual-entailment","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ruanchaves.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-29T12:10:14.000Z","updated_at":"2025-03-07T04:05:46.000Z","dependencies_parsed_at":"2023-09-06T15:40:48.292Z","dependency_job_id":"d3d4c5a0-8e33-433f-82cd-613fcfed3415","html_url":"https://github.com/ruanchaves/napolab","commit_stats":{"total_commits":67,"total_committers":1,"mean_commits":67.0,"dds":0.0,"last_synced_commit":"5420e872f0bcd009692fbf5bd6103c4b6ff9aa4b"},"previous_names":["ruanchaves/napolab"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/ruanchaves%2Fnapolab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ruanchaves%2Fnapolab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ruanchaves%2Fnapolab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ruanchaves%2Fnapolab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ruanchaves","download_url":"https://codeload.github.com/ruanchaves/napolab/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247994121,"owners_count":21030050,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarks","catalan","datasets","english","galician","hate-speech","huggingface","huggingface-transformers","large-language-models","nlp","portuguese","python","question-answering","semantic-similarity","spanish","text-simplification","textual-entailment","transformers"],"created_at":"2024-11-06T17:07:45.748Z","updated_at":"2025-04-09T07:06:54.406Z","avatar_url":"https://github.com/ruanchaves.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🌎 Natural Portuguese Language Benchmark (Napolab)\n\nThe [**Napolab**](https://huggingface.co/datasets/ruanchaves/napolab) is your go-to collection of Portuguese datasets for the evaluation of Large Language Models.\n\n## 📊 Napolab for Large Language Models (LLMs)\n\nA format of Napolab specifically designed for researchers experimenting with Large Language Models (LLMs) is now available. 
This format includes two main fields:\n\n* **Prompt**: The input prompt to be fed into the LLM.\n* **Answer**: The expected classification output label from the LLM, which is always a number between 0 and 5.\n\nThe dataset in this format can be accessed at [https://huggingface.co/datasets/ruanchaves/napolab](https://huggingface.co/datasets/ruanchaves/napolab). If you’ve used Napolab for LLM evaluations, please share your findings with us!\n\n## Leaderboards \n\nThe [Open PT LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard) incorporates datasets from Napolab. \n\nThe Master's thesis [Lessons Learned from the Evaluation of Portuguese Language Models](https://www.um.edu.mt/library/oar/handle/123456789/120557) features an extensive evaluation of Transformer models on Napolab.\n\n## Guidelines\n\nNapolab adopts the following guidelines for the inclusion of datasets:\n\n* 🌿 **Natural**: As much as possible, datasets consist of natural Portuguese text or professionally translated text.\n* ✅ **Reliable**: Metrics correlate reliably with human judgments (accuracy, F1 score, Pearson correlation, etc.).\n* 🌐 **Public**: Every dataset is available through a public link.\n* 👩‍🔧 **Human**: Expert human annotations only. 
No automatic or unreliable annotations.\n* 🎓 **General**: No domain-specific knowledge or advanced preparation is needed to solve dataset tasks.\n\n[Napolab](https://huggingface.co/datasets/ruanchaves/napolab) currently includes the following datasets:\n\n| | | |\n| :---: |  :---:  |  :---: |\n|[assin](https://huggingface.co/datasets/assin) | [assin2](https://huggingface.co/datasets/assin2) | [rerelem](https://huggingface.co/datasets/ruanchaves/rerelem)|\n|[hatebr](https://huggingface.co/datasets/ruanchaves/hatebr)| [reli-sa](https://huggingface.co/datasets/ruanchaves/reli-sa) | [faquad-nli](https://huggingface.co/datasets/ruanchaves/faquad-nli) |\n|[porsimplessent](https://huggingface.co/datasets/ruanchaves/porsimplessent) | | |\n\n**💡 Contribute**: We're open to expanding Napolab! Suggest additions in the issues. For more information, read our [CONTRIBUTING.md](CONTRIBUTING.md).\n\n🌍 For broader accessibility, all datasets have been machine-translated into **Catalan, English, Galician and Spanish** using the `facebook/nllb-200-1.3B` model via [Easy-Translate](https://github.com/ikergarcia1996/Easy-Translate).\n\n## 🤖 Models\n\nWe've made several models, fine-tuned on this benchmark, available on Hugging Face Hub:\n\n| Datasets                     | mDeBERTa v3                                                                                                    | BERT Large                                                                                                    | BERT Base                                                                                                     |\n|:----------------------------:|:--------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------:|:--------------------------------------------------------------------------------------------------------------:|\n| **ASSIN 2 - STS**            
| [Link](https://huggingface.co/ruanchaves/mdeberta-v3-base-assin2-similarity)                                   | [Link](https://huggingface.co/ruanchaves/bert-large-portuguese-cased-assin2-similarity)                       | [Link](https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin2-similarity)                       |\n| **ASSIN 2 - RTE**            | [Link](https://huggingface.co/ruanchaves/mdeberta-v3-base-assin2-entailment)                                  | [Link](https://huggingface.co/ruanchaves/bert-large-portuguese-cased-assin2-entailment)                       | [Link](https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin2-entailment)                       |\n| **ASSIN - STS**              | [Link](https://huggingface.co/ruanchaves/mdeberta-v3-base-assin-similarity)                                   | [Link](https://huggingface.co/ruanchaves/bert-large-portuguese-cased-assin-similarity)                        | [Link](https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin-similarity)                        |\n| **ASSIN - RTE**              | [Link](https://huggingface.co/ruanchaves/mdeberta-v3-base-assin-entailment)                                   | [Link](https://huggingface.co/ruanchaves/bert-large-portuguese-cased-assin-entailment)                        | [Link](https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin-entailment)                        |\n| **HateBR**                   | [Link](https://huggingface.co/ruanchaves/mdeberta-v3-base-hatebr)                                             | [Link](https://huggingface.co/ruanchaves/bert-large-portuguese-cased-hatebr)                                 | [Link](https://huggingface.co/ruanchaves/bert-base-portuguese-cased-hatebr)                                  |\n| **FaQUaD-NLI**               | [Link](https://huggingface.co/ruanchaves/mdeberta-v3-base-faquad-nli)                                         | 
[Link](https://huggingface.co/ruanchaves/bert-large-portuguese-cased-faquad-nli)                             | [Link](https://huggingface.co/ruanchaves/bert-base-portuguese-cased-faquad-nli)                              |\n| **PorSimplesSent**           | [Link](https://huggingface.co/ruanchaves/mdeberta-v3-base-porsimplessent)                                     | [Link](https://huggingface.co/ruanchaves/bert-large-portuguese-cased-porsimplessent)                         | [Link](https://huggingface.co/ruanchaves/bert-base-portuguese-cased-porsimplessent)                          |\n\n\nFor model fine-tuning details and benchmark results, visit [EVALUATION.md](EVALUATION.md). \n\n## Usage\n\nTo reproduce the Napolab benchmark available on the Hugging Face Hub locally, follow these steps:\n\n1. Clone the repository and install the library:\n\n```bash\ngit clone https://github.com/ruanchaves/napolab.git\ncd napolab\npip install -e .\n```\n\n2. Generate the benchmark file:\n   \n```python\nfrom napolab import export_napolab_benchmark, convert_to_completions_format\ninput_df = export_napolab_benchmark()\noutput_df = convert_to_completions_format(input_df)\noutput_df.reset_index().to_csv(\"test.csv\", index=False)\n```\n\n## Citation\n\nIf you would like to cite our work or models, please reference the Master's thesis [Lessons Learned from the Evaluation of Portuguese Language Models](https://www.um.edu.mt/library/oar/handle/123456789/120557).\n\n```\n@mastersthesis{chaves2023lessons,\n  title={Lessons learned from the evaluation of Portuguese language models},\n  author={Chaves Rodrigues, Ruan},\n  year={2023},\n  school={University of Malta},\n  url={https://www.um.edu.mt/library/oar/handle/123456789/120557}\n}\n```\n\n## Disclaimer\n\nThe HateBR dataset, including all its components, is provided strictly for academic and research purposes. 
The use of the HateBR dataset for any commercial or non-academic purpose is expressly prohibited without the prior written consent of [SINCH](https://www.sinch.com/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fruanchaves%2Fnapolab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fruanchaves%2Fnapolab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fruanchaves%2Fnapolab/lists"}