{"id":17989488,"url":"https://github.com/evilfreelancer/impruver","last_synced_at":"2025-08-17T01:32:01.456Z","repository":{"id":245347177,"uuid":"816205571","full_name":"EvilFreelancer/impruver","owner":"EvilFreelancer","description":"A set of scripts and configurations for pretraining of Large Language Models (LLM)","archived":false,"fork":false,"pushed_at":"2025-03-02T12:01:11.000Z","size":328,"stargazers_count":30,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-11T04:39:02.960Z","etag":null,"topics":["accelerate","ddp","gpt2","llama","llm","machine-learning","nlp","rugpt","train","transformers"],"latest_commit_sha":null,"homepage":"https://t.me/evilfreelancer","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EvilFreelancer.png","metadata":{"files":{"readme":"README.en.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-17T08:57:11.000Z","updated_at":"2025-07-05T20:07:23.000Z","dependencies_parsed_at":"2024-06-21T16:22:32.308Z","dependency_job_id":"4b80d56c-4a1a-401c-bc91-f735743b51bf","html_url":"https://github.com/EvilFreelancer/impruver","commit_stats":null,"previous_names":["evilfreelancer/impruver"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/EvilFreelancer/impruver","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fimpruver","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fimpruver/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fimpruver/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fimpruver/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EvilFreelancer","download_url":"https://codeload.github.com/EvilFreelancer/impruver/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fimpruver/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270796227,"owners_count":24647320,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accelerate","ddp","gpt2","llama","llm","machine-learning","nlp","rugpt","train","transformers"],"created_at":"2024-10-29T19:14:48.029Z","updated_at":"2025-08-17T01:32:01.446Z","avatar_url":"https://github.com/EvilFreelancer.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Impruver: Framework for Training Large Language Models (LLMs)\n\n[Русский](./README.md) | [中文](./README.zh.md) | **English**\n\nA set of scripts and configurations for training Large Language Models (LLMs) independently.\n\nInspired by projects like [saiga](https://github.com/IlyaGusev/saiga),\n[torchtune](https://github.com/pytorch/torchtune)\nand [nanoGPT](https://github.com/karpathy/nanoGPT).\n\nFeatures:\n\n- Unified configuration in YAML format for dataset preparation, training, and inference\n    - Allows specifying a tokenizer and model separately\n    - Supports training models from scratch, full training, and LoRA/Peft fine-tuning\n- Flexible dataset preparation system that enables combining multiple datasets, individually slicing and transforming\n  each, and then merging and deduplicating them\n    - Supports datasets in `instruct` or `chat` formats, converting them into OpenAI-style chat message formats\n    - Enables training function call models with `function_call` and `function_response` roles\n- Unlike other implementations, it uses classes from the `transformers` library. However, you can specify any other\n  class for the model and/or tokenizer, and `impruver` will use them\n- Supports distributed training using `accelerate`\n\nFor more details, check the project's [documentation](https://github.com/EvilFreelancer/impruver/wiki).\n\n## Requirements\n\n* Python 3.12\n* Python Virtual Environment\n* Nvidia GPU with 24GB VRAM (for GPUs with less VRAM, you can reduce the values of `per_device_*_batch_size`\n  and/or `gradient_accumulation_steps`)\n* Nvidia drivers and CUDA\n\n## Installation\n\nInstall with a single command:\n\n```shell\npip install impruver\n```\n\nThis will add the `impruver` CLI utility to your PATH.\n\nIf you plan to train models using Flash Attention, also run:\n\n```shell\npip install setuptools psutil torch flash-attn --no-build-isolation\n```\n\n## Available Configurations\n\nGet a full list of training recipes and configurations by running:\n\n```shell\nimpruver ls\n```\n\nYou can copy a configuration locally:\n\n```shell\nimpruver cp ruGPT-3.5/13B_lora_saiga2 ./ruGPT-3.5_13B_lora_saiga2.yaml\n```\n\nLearn more about [configurations](https://github.com/EvilFreelancer/impruver/wiki/Конфигурация) in the project wiki.\n\n## Usage\n\nBefore training a model, prepare and deduplicate the dataset, then split it into training and validation sample sets.\n\nThese tasks can be performed using the `compose_dataset` recipe with the specified configuration:\n\n```shell\nimpruver run compose_dataset --config ./ruGPT-3.5_13B_lora_saiga2.yaml\n```\n\nOr by using a configuration from the default set:\n\n```shell\nimpruver run compose_dataset --config ruGPT-3.5/13B_lora_saiga2\n```\n\nNext, run the `finetune` recipe to train the transformer model:\n\n```shell\nimpruver run finetune --config ./ruGPT-3.5_13B_lora_saiga2.yaml\n```\n\nThe training script supports logging to Weights and Biases (W\u0026B). By default, this is disabled, but you can enable it by\nadding the `--report-to=wandb` option to the training command.\n\nOnce training is complete, you can launch an interactive chat session using the `chat` recipe:\n\n```shell\nimpruver run chat ./ruGPT-3.5_13B_lora_saiga2.yaml\n```\n\nTo exit the chat shell, use the `Ctrl+D` or `Ctrl+C` keyboard shortcuts.\n\nLearn more about [training](https://github.com/EvilFreelancer/impruver/wiki/Обучение) in the project wiki.\n\n## License\n\nThis project is distributed under the MIT license. See the [LICENSE](./LICENSE) file for details.\n\n## Citation\n\n```\n@misc{impruver2024sources,\n    author       = {Pavel Rykov},\n    title        = {{Impruver: Framework for Training Large Language Models}},\n    howpublished = {\\url{https://github.com/EvilFreelancer/impruver}},\n    year         = {2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Fimpruver","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevilfreelancer%2Fimpruver","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Fimpruver/lists"}