{"id":22253073,"url":"https://github.com/bbartling/distributedgpt","last_synced_at":"2025-10-06T02:10:48.974Z","repository":{"id":265925599,"uuid":"896600242","full_name":"bbartling/DistributedGPT","owner":"bbartling","description":"Parallel computing hobby project for learning purposes","archived":false,"fork":false,"pushed_at":"2024-12-25T18:31:12.000Z","size":782,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"develop","last_synced_at":"2025-01-30T11:23:45.650Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bbartling.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-30T19:49:41.000Z","updated_at":"2024-12-25T18:31:16.000Z","dependencies_parsed_at":"2024-12-16T17:31:53.427Z","dependency_job_id":"3995173f-0970-44ee-baad-4401454401b7","html_url":"https://github.com/bbartling/DistributedGPT","commit_stats":null,"previous_names":["bbartling/distributedgpt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bbartling%2FDistributedGPT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bbartling%2FDistributedGPT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bbartling%2FDistributedGPT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bbartling%2FDistributedGPT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/owners/bbartling","download_url":"https://codeload.github.com/bbartling/DistributedGPT/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245459689,"owners_count":20618902,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-03T07:16:26.062Z","updated_at":"2025-10-06T02:10:43.942Z","avatar_url":"https://github.com/bbartling.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DistributedGPT\n\nThis is a hobby project aimed at learning and experimentation. The primary goal is to explore running a large language model (LLM) entirely on `localhost` by breaking the model into smaller portions, loading them sequentially into memory for inference, and unloading them to optimize resource usage. At the moment we are experimenting to see whether the LLM can come up with good pseudocode for HVAC system optimizations.\n\n## Installation\n\n1. Clone the repository:\n\n2. Create and activate a Python virtual environment:\n   ```bash\n   python -m venv env\n   source env/bin/activate  # On Windows: .\\env\\Scripts\\activate\n   ```\n\n3. 
Install dependencies:\n   ```bash\n   pip install torch transformers psutil\n   ```\n\n## Notes\n\n- Check your system memory on Windows in PowerShell:\n   ```powershell\n   Get-CIMInstance Win32_OperatingSystem | Select-Object -Property *memory*\n   ```\n\n- On Windows in PowerShell, set your API key as a temporary environment variable:\n   ```powershell\n   $env:HUGGINGFACE_API_KEY = \"your_huggingface_api_key\"\n   ```\n\n## Download a model from Huggingface\n\nUse `downloader.py`, modifying it as necessary, to grab a model from Huggingface. Models on my PC are cached in `C:\\Users\\ben\\.cache\\huggingface\\hub`, and I can use PowerShell to see how big they are in gigabytes.\nThe size is used as a setting in another Python file that divides the model into parts. Change directory with `Set-Location` as shown below:\n\n```powershell\n# Change to the Hugging Face cache directory\nSet-Location -Path 'C:\\Users\\ben\\.cache\\huggingface\\hub'\n```\n\nLoop through each folder and print the model sizes (these may also be listed on Huggingface for each model):\n\n```powershell\n# Loop through directories and calculate their size\nGet-ChildItem -Directory | ForEach-Object {\n    $sizeBytes = (Get-ChildItem -Path $_.FullName -Recurse | Where-Object { $_.PSIsContainer -eq $false } | Measure-Object -Property Length -Sum).Sum\n    $sizeGB = [math]::Round($sizeBytes / 1GB, 2)\n    Write-Output \"Model: $($_.Name), Size: $sizeGB GB\"\n}\n```\n\nThis prints out as:\n\n```powershell\nModel: models--distilgpt2, Size: 0.33 GB\nModel: models--meta-llama--CodeLlama-7b-Python-hf, Size: 12.55 GB\nModel: models--meta-llama--Llama-3.1-8B, Size: 14.97 GB\nModel: models--meta-llama--Llama-3.2-1B, Size: 2.31 GB\n```\n\nRemove a model with PowerShell:\n\n```powershell\nRemove-Item -Recurse -Force \"models--meta-llama--Llama-3.2-1B\"\n```\n\n## Splitting up the model \n\nThis Python file splits up the model into a hard-coded number of 
files. In a text editor, open `split_up_model.py` and hard-code your model cache location, output directory, and number of chunks:\n```python\nFALCON_7B_CACHE_DIRECTORY = r\"C:\\Users\\ben\\.cache\\huggingface\\hub\\models--tiiuae--falcon-7b-instruct\\snapshots\\8782b5c5d8c9290412416618f36a133653e85285\"\nFALCON_7B_MODEL_PARTS_DIR = \"./7_b_model_parts\"\nFALCON_7B_HARDCODED_NUM_CHUNKS = 8  # Example hard-coded value\n```\n\n## Run an inference test\n\nNow that the model is split into parts locally in the model parts directory, try running `run_an_inf_tester.py`:\n\n```powershell\n\u003e python .\\run_an_inf_tester.py\nTokenizer loaded successfully!\nModel loaded successfully!\n```\n\n## Lessons Learned So Far...\n\n- [x] **Test on GPT2**  \n  *Works, but it’s GPT2!*\n  \n- [x] **Test on Llama-3.1-8B and Llama-3.2-1B**  \n  *Errors trying to use Llama models.*\n  * Try again in the future! To split up these models I need to research the deep learning architecture further, especially how the Python `transformers` library defines models internally.\n\n- [x] **Test on a fine-tuned model - ericzzz--falcon-rw-1b-instruct-openorca**\n  * https://huggingface.co/ericzzz/falcon-rw-1b-instruct-openorca\n\n- [x] **Test on a fine-tuned model - tiiuae/falcon-7b-instruct**\n  * https://huggingface.co/tiiuae/falcon-7b-instruct\n  * Note that it uses a different prompt template than the 1b Falcon. This was found by trial and error ☹️😒 but we got it working! 
👍😊👌💪\n\n- [ ] **Test on the largest Falcon model???**\n\n- [ ] **Test with a GPU that supports CUDA!**\n\n- [ ] **Experiment with multiple agents performing tasks for the human.**\n\n- [ ] **Brainstorm further ideas or improvements.** 🤔\n\n## TODO\nImplement conversation history and a prompt input function in Python to push the model to complete its creations.\n* https://youtu.be/v-dO88wU-jM?si=AErw3jewRs-oQ_zr\n* https://langchain-ai.github.io/langgraph/concepts/memory/\n\nExperiment with a coding LLM to get better pseudocode results.\n\n## License\n\nMIT","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbbartling%2Fdistributedgpt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbbartling%2Fdistributedgpt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbbartling%2Fdistributedgpt/lists"}