{"id":17989531,"url":"https://github.com/evilfreelancer/rugpt3-custom","last_synced_at":"2026-04-05T11:31:57.322Z","repository":{"id":110658024,"uuid":"603393789","full_name":"EvilFreelancer/rugpt3-custom","owner":"EvilFreelancer","description":"Pre-training custom ruGPT3 model on books written by F.M. Dostoevski","archived":false,"fork":false,"pushed_at":"2023-09-03T01:23:50.000Z","size":4328,"stargazers_count":7,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-26T11:40:40.059Z","etag":null,"topics":["dataset","dostoevsky","gpt","prediction","rugpt","training","transformers"],"latest_commit_sha":null,"homepage":"https://t.me/evilfreelancer","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EvilFreelancer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-18T11:27:03.000Z","updated_at":"2024-04-12T20:21:20.000Z","dependencies_parsed_at":"2024-10-29T19:54:56.556Z","dependency_job_id":null,"html_url":"https://github.com/EvilFreelancer/rugpt3-custom","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/EvilFreelancer/rugpt3-custom","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Frugpt3-custom","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Frugpt3-custom/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Frugpt3-custom/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Frugpt3-custom/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EvilFreelancer","download_url":"https://codeload.github.com/EvilFreelancer/rugpt3-custom/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Frugpt3-custom/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31434624,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-05T08:13:15.228Z","status":"ssl_error","status_checked_at":"2026-04-05T08:13:11.839Z","response_time":75,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","dostoevsky","gpt","prediction","rugpt","training","transformers"],"created_at":"2024-10-29T19:14:53.236Z","updated_at":"2026-04-05T11:31:57.277Z","avatar_url":"https://github.com/EvilFreelancer.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tuned ruGPT3 on custom data\n\nThe following was used as initial data:\n\n* Archive with digitized books by F.M. 
Dostoevsky\n* Model ruGPT3small\n\nThe model was trained for five epochs, resulting in a model file of approximately 600 megabytes.\n\nThe resulting file has been uploaded to HuggingFace and can be used locally for testing.\n\n\u003e Details here: https://dzen.ru/a/ZHTfs9pggmVlGC79 (in Russian)\n\n## Requirements\n\nIf you prefer the Docker way:\n\n* Docker Engine\n* Docker Compose\n* Docker Nvidia Runtime\n* CUDA 11.7\n\nor if you prefer to install everything manually:\n\n* Python 3.10\n* CUDA 11.7\n* NVCC\n\n## How it was made\n\nAs a first step, I searched GitHub for projects that had trained a custom\nruGPT3 model on arbitrary text data.\n\nI found the [K7chyp/DostoevskyDoesntWriteIt](https://github.com/K7chyp/DostoevskyDoesntWriteIt) project, studied its\nsources, and extracted the commands, the logic, and the prepared text dataset.\n\nThe most important parts were copied to the [train.sh](train.sh) and [prompt.sh](prompt.sh) scripts.\nIn general, these were just Python scripts for running pre-training and for using a pre-trained model, taken from the original\nruGPT3 by [AI Forever](https://github.com/ai-forever/ru-gpts).\n\nAs the next step, I tried to train my own model with the default parameters passed to `pretrain_transformers.py` and\nran into the limits of my graphics card: the 8 GB of VRAM on my Nvidia RTX 3050 was not enough.\n\nAfter several unsuccessful attempts, I found that the `block_size` parameter affects the amount\nof memory used during model training. 
Therefore, I reduced it from 2048 to 512, after which the training completed\nwithout errors.\n\nFinally, I created the Dockerfile and docker-compose.yml, and the project was done.\n\n## How to install\n\nClone the repo, then switch the working directory to the source root:\n\n```shell\ngit clone --recursive git@github.com:EvilFreelancer/rugpt3-custom.git\ncd rugpt3-custom\n```\n\n### The Docker way\n\nCopy config:\n\n```shell\ncp docker-compose.dist.yml docker-compose.yml\n```\n\nBuild and start:\n\n```shell\ndocker-compose build\ndocker-compose up -d\n```\n\nEnter the container:\n\n```shell\ndocker-compose exec app bash\n```\n\n### Manually\n\n```shell\n# Install packages\napt-get install -y software-properties-common curl build-essential git\n\n# Install Rust ($HOME is used because ~ is not expanded inside double quotes)\nexport PATH=\"$HOME/.cargo/bin:${PATH}\"\ncurl https://sh.rustup.rs -sSf | bash -s -- -y\n\n# Install packages required for Apex\npip install packaging==23.0 torch==1.13.1+cu117 -f https://download.pytorch.org/whl/torch_stable.html\n\n# Download and build Apex\nexport CUDA_HOME=/usr/local/cuda\ngit clone https://github.com/NVIDIA/apex.git\ncd ./apex \u0026\u0026 git checkout 8b7a1ff183741dd8f9b87e7bafd04cfde99cea28 \u0026\u0026 cd ..\npip install -v --no-cache-dir --global-option=\"--cpp_ext\" --global-option=\"--cuda_ext\" ./apex\n\n# Install ru-gpts\ngit clone https://github.com/EvilFreelancer/ru-gpts.git ru_gpts\n\n# Install other dependencies\npip install -r requirements.txt\n\n# For ruGPT3XL, the requirements-xl.txt file is needed\npip install -r requirements-xl.txt\n```\n\n## How to train (optional)\n\nFirst, create the training and validation data from [output.csv](./data/output.csv) by running:\n\n```shell\npython3 prepare.py\n```\n\nThen execute the following script:\n\n```shell\n./train.sh\n```\n\nAnd wait for some time.\n\nTraining on my Nvidia RTX 3050 took about 35 minutes, with a GPU temperature of 64\u0026deg;C.\n\n## How to use\n\nIf you want to use your own model, run the following 
script:\n\n```shell\n./prompt.sh\n```\n\nBut if you want to use my pre-trained model uploaded to HuggingFace:\n\n```shell\n./prompt.hf.sh\n```\n\nAfter the model is loaded, you will see a command-line prompt; just type a phrase and wait for the result.\n\n## A few examples\n\n```\nМосква, 19 июня /\u003c18\u003e69.  \u003c…\u003e У меня, например, есть один приятель, очень умный человек, но которого я непонимаю. Он\nговорит мне:  –Знаете, Лев Николаич, я давно уже вас презирал, но вы, как человек умный, меня никогда не могли обидеть…\n```\n\n```\nОднажды вечером, за обедом, я вдруг увидал, что у меня как будто все лицо изменяется: глаза смыкались, губы двигались;\nнос тоже становился тоньше и суше, глаза сверкали и сверкали,– точно я что‑то предчувствовал и предугадывал. Я тотчас\nже подошел к нему, поздоровался с ним, но он не ответил мне и только молча указал мне на стул, где я сидел. Я сел и\nтотчас же опять начал его разглядывать. Он тотчас же потупил глаза и с минуту сидел неподвижно.\n```\n\n```\nМеж тем он стал меня допрашивать.  –Ну, что же?– сказал я ему,– что же?  –А вот-с, что же-с!– отвечал он,– что же-с,\nчто ж?  –А вот что, Марья Александровна, что ж?– сказал я, немного покраснев от гнева,– что ж, что же? что же?  –Ах,\nбоже мой! Да ведь это все пустяки-с.\n```\n\n## Links\n\n* https://dzen.ru/a/ZHTfs9pggmVlGC79\n* https://huggingface.co/evilfreelancer/dostoevsky_doesnt_write_it\n* https://github.com/K7chyp/DostoevskyDoesntWriteIt/\n* https://github.com/ai-forever/ru-gpts\n* https://github.com/GraphGrailAi/ruGPT3-ZhirV/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Frugpt3-custom","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevilfreelancer%2Frugpt3-custom","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Frugpt3-custom/lists"}