{"id":23726004,"url":"https://github.com/voidful/seq2seq-lm-trainer","last_synced_at":"2025-08-31T14:03:52.592Z","repository":{"id":170036666,"uuid":"630079394","full_name":"voidful/seq2seq-lm-trainer","owner":"voidful","description":"This is a simple example of using the T5 model for sequence-to-sequence tasks, leveraging Hugging Face's `Trainer` for efficient model training. ","archived":false,"fork":false,"pushed_at":"2023-08-29T14:50:21.000Z","size":11,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-05T00:24:56.283Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/voidful.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-19T16:12:41.000Z","updated_at":"2023-07-31T17:50:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"b115a51b-d236-4be4-816d-b0b7028f32c1","html_url":"https://github.com/voidful/seq2seq-lm-trainer","commit_stats":null,"previous_names":["voidful/t5-seq2seq-trainer","voidful/seq2seq-lm-trainer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/voidful/seq2seq-lm-trainer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fseq2seq-lm-trainer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fseq2seq-lm-trainer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fseq2seq-lm-trainer/releases","manifests_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fseq2seq-lm-trainer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/voidful","download_url":"https://codeload.github.com/voidful/seq2seq-lm-trainer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fseq2seq-lm-trainer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272988919,"owners_count":25026961,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-31T00:18:07.979Z","updated_at":"2025-08-31T14:03:52.572Z","avatar_url":"https://github.com/voidful.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# seq2seq-lm-trainer\n\nThis is a simple example of seq2seq lm training, leveraging Hugging Face's `Trainer` for efficient model training.   \nThe repository includes a configurable interface for dataset processing and evaluation metrics, allowing for seamless adaptation to various tasks and datasets.  \n\n## Features\n\n- Utilize the seq2seq lm model\n- Easy configuration for custom dataset processing and evaluation metrics\n- Integration with Hugging Face's `Trainer` for efficient training and evaluation\n\n## Usage\n\n1. 
**Dataset processing**: Modify `data_processing.py` to accommodate your own dataset. The script should load, preprocess, and tokenize the data as required by the T5 model.\n\n2. **Evaluation metric**: Customize the evaluation metric by modifying `eval_metric.py`. This script should implement the logic that computes the evaluation metric for your task (e.g., BLEU or ROUGE score).\n\n3. **Training and evaluation**: Run `main.py` to start the training and evaluation process. This script uses the dataset processing and evaluation metric functions from the previous steps, along with the Hugging Face `Trainer`, to train and evaluate the T5 model on your task.\n\n## Requirements\n\n- Python 3.6 or later\n- Hugging Face Transformers library\n- PyTorch\n- tqdm\n\nTo install the required packages, run:\n\n```\npip install -r requirements.txt\n```\n\n## Example\n\nAn example dataset and evaluation metric (e.g., machine translation with BLEU score) can be provided in the repository to demonstrate the usage and modification of the data processing and evaluation metric scripts.\n\n## License\n\nThis project is licensed under the [MIT License](LICENSE).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidful%2Fseq2seq-lm-trainer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvoidful%2Fseq2seq-lm-trainer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidful%2Fseq2seq-lm-trainer/lists"}