{"id":15051516,"url":"https://github.com/hitz-zentroa/odesia-struct","last_synced_at":"2026-01-01T22:38:39.171Z","repository":{"id":273439676,"uuid":"853375092","full_name":"hitz-zentroa/Odesia-Struct","owner":"hitz-zentroa","description":"IXA Submission for the 2024 ODESIA Challenge","archived":false,"fork":false,"pushed_at":"2025-02-02T20:08:16.000Z","size":27694,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-02-02T21:19:28.843Z","etag":null,"topics":["evaluation","huggingface","llama","llama3","outlines","python","pytorch","spanish","torch","transformers"],"latest_commit_sha":null,"homepage":"https://leaderboard.odesia.uned.es/leaderboard/challenge","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hitz-zentroa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-06T14:29:21.000Z","updated_at":"2025-02-02T20:08:21.000Z","dependencies_parsed_at":"2025-01-20T23:36:01.693Z","dependency_job_id":null,"html_url":"https://github.com/hitz-zentroa/Odesia-Struct","commit_stats":null,"previous_names":["hitz-zentroa/odesia-struct"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hitz-zentroa%2FOdesia-Struct","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hitz-zentroa%2FOdesia-Struct/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hitz-zentroa%2FOdesia-Struct/releases","manifests_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hitz-zentroa%2FOdesia-Struct/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hitz-zentroa","download_url":"https://codeload.github.com/hitz-zentroa/Odesia-Struct/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243523001,"owners_count":20304512,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["evaluation","huggingface","llama","llama3","outlines","python","pytorch","spanish","torch","transformers"],"created_at":"2024-09-24T21:36:30.500Z","updated_at":"2026-01-01T22:38:39.164Z","avatar_url":"https://github.com/hitz-zentroa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n    \u003cbr\u003e\n    \u003cimg src=\"ODESIA.png\" style=\"height: 250px;\"\u003e\n    \u003cbr\u003e\n    \u003ch3 align=\"center\"\u003eEvaluation of NLP models in Spanish\u003c/h3\u003e\n    \u003ch1 align=\"center\"\u003eIXA Submission for the 2024 ODESIA Challenge\u003c/h1\u003e\n    \n\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://twitter.com/intent/tweet?text=The+IXA+Code+for+Odesia:\u0026url=https%3A%2F%2Fgithub.com%2Fhitz-zentroa%2FOdesia-Struct\"\u003e\u003cimg alt=\"Twitter\" src=\"https://img.shields.io/twitter/url?style=social\u0026url=https%3A%2F%2Fgithub.com%2Fhitz-zentroa%2FOdesia-Struct\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://github.com/hitz-zentroa/Odesia-Struct/blob/main/LICENSE.md\"\u003e\u003cimg alt=\"GitHub license\" 
src=\"https://img.shields.io/github/license/hitz-zentroa/Odesia-Struct\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://huggingface.co/collections/HiTZ/odesia-challenge-2024-66ea8e1731b52eaa8f3d8cd8\"\u003e\u003cimg alt=\"Pretrained Models\" src=\"https://img.shields.io/badge/🤗HuggingFace-Pretrained Models-green\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://upload.wikimedia.org/wikipedia/commons/8/80/Comingsoon.png\"\u003e\u003cimg alt=\"Paper\" src=\"https://img.shields.io/badge/📖-Paper-orange\"\u003e\u003c/a\u003e\n\u003cbr\u003e\n     \u003ca href=\"http://www.hitz.eus/\"\u003e\u003cimg src=\"https://img.shields.io/badge/HiTZ-Basque%20Center%20for%20Language%20Technology-blueviolet\"\u003e\u003c/a\u003e\n    \u003ca href=\"http://www.ixa.eus/?language=en\"\u003e\u003cimg src=\"https://img.shields.io/badge/IXA-%20NLP%20Group-ff3333\"\u003e\u003c/a\u003e\n    \u003cbr\u003e\n     \u003cbr\u003e\n\u003c/p\u003e\n\n\nThis repository contains the IXA submission for the 2024 ODESIA Challenge.\n- 📈 ODESIA Leaderboard: https://leaderboard.odesia.uned.es/leaderboard/challenge\n- 📒 System Description Paper: Coming Soon\n\n\n# Explanation of the approach\n\nEvery task is converted into a text-to-text task in a format suitable for current state-of-the-art decoder-only models. \n\n## ⤵️ Model Input\nWe format every task following the same prompt schema. The prompt includes the guidelines for the task, up to 20 few-shot examples randomly sampled from the train split, and the input to analyze. \n\n```jinja\n{{ Guidelines }}\n\nExamples\n--------\n\n{% for example in examples %}\nInput: {{ example.question }}\nOutput: {{ example.answer.model_dump_json() }}\n\n{% endfor %}\n\n--------\n\nNow, analyze the following input:\n\nInput: {{ question }}\n```\n\n## ➡️ Model output\n\nEvery task output is defined as a JSON schema using Pydantic. 
For example, for `DIPROMATS_2023`, which is a multi-label classification task, the output is defined as follows:\n\n\n```python\nclass LabelEnum(str, Enum):\n    ad_populum = \"ad-populum\"\n    flag_waving = \"flag-waving\"\n    absurdity_appeal = \"absurdity-appeal\"\n    demonization = \"demonization\"\n    doubt = \"doubt\"\n    fear_appeals_destructive = \"fear-appeals-destructive\"\n    name_calling = \"name-calling\"\n    propaganda_slinging = \"propaganda-slinging\"\n    scapegoating = \"scapegoating\"\n    undiplomatic_assertiveness_whataboutism = (\n        \"undiplomatic-assertiveness-whataboutism\"\n    )\n    loaded_language = \"loaded-language\"\n    appeal_to_false_authority = \"appeal-to-false-authority\"\n    bandwagoning = \"bandwagoning\"\n    non_propaganda = \"non-propaganda\"\n\nclass Identification(BaseModel):\n    label: List[LabelEnum]\n```\n\n\nUsing [🗒️ Outlines](https://github.com/dottxt-ai/outlines), we use guided generation to produce the output. At inference, the model is forced to produce a valid `JSON` output that is compliant with the Pydantic specification. For example:\n\n\n```python\n{\"label\":[\"ad-populum\",\"loaded-language\"]}\n```\n\n\nThe Guidelines and output specification for every task are defined in [src/tasks](src/tasks)\n\n## Model finetuning\n\nWe finetune a decoder-only model (gemma-2b or Llama3.1) in a multi-task setting. This means we train a single model that works for every task. Our pretrained models are available on 🤗HuggingFace: https://huggingface.co/collections/HiTZ/odesia-challenge-2024-66ea8e1731b52eaa8f3d8cd8\n\n# Reproduce our results\n\n## Requirements\n\nYou should install the following requirements. 
All of them can be installed with `pip install [requirement]`\n\n\n```\ntorch\ntransformers\naccelerate\ndeepspeed\noutlines\npydantic\nbitsandbytes\njinja2\nseqeval\n```\n\nTo reproduce our environment, install the `requirements.txt` file:\n\n```bash\npip install -r requirements.txt\n```\n \n\nYou should unzip the `.zip` file in [data/](data/).  \nThe expected data structure is:\n```\ndata/\n    diann_2023/\n        DIANN_2023_T1_en.json\n        DIANN_2023_T1_en.json\n        ...\n    dipromats_2023/\n        ...\n    exist_2022/\n        ...\n    exist_2023/\n        ...\n    sqac_squad_2024/\n        ...\n```\n\n## Models\nWe have trained three models of different sizes:\n- HiTZ/Qwen2.5-14B-Instruct_ODESIA: https://huggingface.co/HiTZ/Qwen2.5-14B-Instruct_ODESIA\n- HiTZ/Hermes-3-Llama-3.1-8B_ODESIA: https://huggingface.co/HiTZ/Hermes-3-Llama-3.1-8B_ODESIA\n- HiTZ/gemma-2b-it_ODESIA: https://huggingface.co/HiTZ/gemma-2b-it_ODESIA\n\n## Run Evaluation/Inference\n\nYou must run this command before launching the scripts below:\n```\nexport PYTHONPATH=\"$PYTHONPATH:$PWD\"\n```\n\nYou can evaluate any model on the development set with the following command:\n\n```bash\ntorchrun --standalone --nproc_per_node=1 src/evaluate.py --tasks all --model_name HiTZ/Qwen2.5-14B-Instruct_ODESIA --output_dir results/finetune/Qwen2.5-14B-Instruct\n\n```\n\nTo reproduce our leaderboard results, you can run inference on the test sets using the following command. 
The resulting output files are ready to be submitted to the ODESIA challenge:\n\n```bash\ntorchrun --standalone --nproc_per_node=1 src/inference.py --tasks all --model_name HiTZ/Qwen2.5-14B-Instruct_ODESIA --output_dir results/finetune/Qwen2.5-14B-Instruct\n```\n\nYou can also run inference on selected tasks:\n```bash\ntorchrun --standalone --nproc_per_node=1 src/inference.py \\\n--tasks \\\nexist_2022_t1_es \\\nexist_2022_t2_es \\\nexist_2023_t1_es \\\nexist_2023_t2_es \\\nexist_2023_t3_es \\\ndipromats_2023_t1_es \\\ndipromats_2023_t2_es \\\ndipromats_2023_t3_es \\\ndiann_2023_t1_es \\\nsquad_2024_t1_es \\\n--model_name HiTZ/Qwen2.5-14B-Instruct_ODESIA \\\n--output_dir results/finetune/Qwen2.5-14B-Instruct\n```\n\n\n\u003e Warning: The test sets do not contain the labels. If you want to evaluate the predictions, you should submit them to the ODESIA leaderboard [https://leaderboard.odesia.uned.es/leaderboard/challenge](https://leaderboard.odesia.uned.es/leaderboard/challenge) or use the PyEvALL library [https://github.com/UNEDLENAR/PyEvALL/tree/main](https://github.com/UNEDLENAR/PyEvALL/tree/main)\n\n\u003e Warning: We randomly sample few-shot examples from the train split for every input. These few-shot examples vary between runs, so the evaluation results may change slightly each time you run an evaluation. \n \n\n### 4-bit quantization\nIf you do not have enough VRAM to run a model, you can use 4-bit quantization by adding the `--quantization` flag to the previous commands. Example:\n\n\n```bash\ntorchrun --standalone --nproc_per_node=1 src/inference.py --quantization --tasks all --model_name HiTZ/Qwen2.5-14B-Instruct_ODESIA --output_dir results/finetune/Qwen2.5-14B-Instruct \n```\n\n\u003e Warning: Quantization affects model performance. Expect lower scores when running the model with 4-bit quantization. \n \n\n## Run Training\n\nTo finetune a model, you first need to define a `Training config`. 
Config examples for Llama3.1, Gemma, and Qwen using Full-Finetuning and LoRA are available in the [train_configs/](train_configs/) directory. Full-Finetuning will achieve slightly better results but requires a lot of VRAM (we use 4x A100 80GB for the 8B model and 8x A100 80GB for the 14B model). LoRA uses much less VRAM and supports model quantization, so it can be run on a single GPU. \n\nWe use DeepSpeed to split the model across the GPUs. You can reproduce our fine-tuning results with the following command (it will split the model across 8 GPUs):\n\n```bash\nexport PYTHONPATH=\"$PYTHONPATH:$PWD\"\naccelerate launch --config_file train_configs/deepspeed_8.json src/train.py train_configs/qwen14B.yaml\n\n```\n\n\nIf you want to run LoRA finetuning with a single GPU, you can use the following command:\n\n\n```bash\npython3 -m src.train train_configs/gemma2B_LoRa.yaml\n```\n\nIf you want to enable model quantization, you can set `quantization:4` in the training configuration file. \n\n\u003e Warning: Our inputs are very long, as we use many few-shot examples. Therefore, training requires a lot of VRAM and might be very slow. You can reduce the number of few-shot examples by modifying the `__init__` default parameter for every task in [src/tasks](src/tasks). \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhitz-zentroa%2Fodesia-struct","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhitz-zentroa%2Fodesia-struct","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhitz-zentroa%2Fodesia-struct/lists"}