{"id":19485726,"url":"https://github.com/devforth/gpt-j-6b-gpu-docker","last_synced_at":"2025-04-25T17:30:36.913Z","repository":{"id":38348013,"uuid":"458926008","full_name":"devforth/gpt-j-6b-gpu-docker","owner":"devforth","description":null,"archived":false,"fork":false,"pushed_at":"2023-01-12T11:36:19.000Z","size":24,"stargazers_count":130,"open_issues_count":8,"forks_count":28,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-04-04T00:51:12.955Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devforth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-02-13T20:40:31.000Z","updated_at":"2024-11-25T22:03:09.000Z","dependencies_parsed_at":"2023-02-09T11:32:06.384Z","dependency_job_id":null,"html_url":"https://github.com/devforth/gpt-j-6b-gpu-docker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devforth%2Fgpt-j-6b-gpu-docker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devforth%2Fgpt-j-6b-gpu-docker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devforth%2Fgpt-j-6b-gpu-docker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devforth%2Fgpt-j-6b-gpu-docker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/devforth","download_url":"https://codeload.github.com/devforth/gpt-j-6b-gpu-docker/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github"
,"repositories_count":250861923,"owners_count":21499184,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T20:29:50.525Z","updated_at":"2025-04-25T17:30:36.682Z","avatar_url":"https://github.com/devforth.png","language":"Python","readme":"Run the GPT-J-6B model (an open-source text-generation GPT-3 analog) for inference on a server with a GPU, using a zero-dependency Docker image. \n\nFirst, the script loads the model into video RAM (this can take several minutes), then it starts an internal HTTP server listening on port 8080.\n\n# Prerequisites to run GPT-J on GPU\n\nYou can run this image only on an instance with 16 GB of video memory and Linux (e.g. Ubuntu).\n\nThe server machine should have the NVIDIA driver and a Docker daemon with the NVIDIA Container Toolkit. See below.\n\n\u003e Tested on NVIDIA Titan RTX and NVIDIA Tesla P100.\n\u003e Not supported: NVIDIA RTX 3090, RTX A5000, RTX A6000. 
Reason: the CUDA+PyTorch combination:\n\u003e CUDA capability sm_86 is not supported; the PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 (we use the latest PyTorch during the image build). See [matching sm_x to video cards](https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/)\n\n## Install NVIDIA Drivers\n\nYou can skip this step if you already have `nvidia-smi` and it outputs a table with the CUDA version:\n\n``` \nMon Feb 14 14:28:16 2022       \n+-----------------------------------------------------------------------------+\n| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |\n|-------------------------------+----------------------+----------------------+\n| ...\n\n```\n\nE.g. for Ubuntu 20.04:\n```\napt purge *nvidia*\napt autoremove\nadd-apt-repository ppa:graphics-drivers/ppa\napt update\napt install -y ubuntu-drivers-common\nubuntu-drivers autoinstall\n```\n\n\u003e Note: Unfortunately, the NVIDIA driver installation process can sometimes be quite challenging; there are some known issues, e.g. https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-390/+bug/1768050/comments/3, and Google helps a lot.\n\nAfter installing and rebooting, test it with `nvidia-smi`; you should see the table.\n\n## Install Dockerd with NVIDIA Container Toolkit\n\nHow to install it on Ubuntu:\n\n```\ndistribution=$(. 
/etc/os-release;echo $ID$VERSION_ID) \\\n   \u0026\u0026 curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add - \\\n   \u0026\u0026 curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list\n\napt update \u0026\u0026 apt -y upgrade\ncurl https://get.docker.com | sh \u0026\u0026 systemctl --now restart docker\napt install -y nvidia-docker2\n```\nThen reboot the server.\n\nTo test that CUDA works in Docker, run:\n\n```\ndocker run --rm --gpus all nvidia/cuda:11.1-base nvidia-smi\n```\n\nIf everything was installed correctly, it should show the same table as `nvidia-smi` on the host.\nIf the NVIDIA Container Toolkit is missing or you have not rebooted the server yet, you will get `docker: Error response from daemon: could not select device driver \"\" with capabilities: [[gpu]]`\n\n\n# Docker command to run the image\n\n```\ndocker run -p8080:8080 --gpus all --rm -it devforth/gpt-j-6b-gpu\n```\n\n\u003e `--gpus all` passes the GPU into the Docker container, so the bundled CUDA runtime inside can use it.\n\n\u003e Though the API uses an async FastAPI web server, the model calls that generate text are blocking, so you should not expect parallelism from this web server.\n\nThen you can call the model via its REST API:\n\n```\nPOST http://yourServerPublicIP:8080/generate/\nContent-Type: application/json\nBody: \n\n{\n  \"text\": \"Client: Hi, who are you?\\nAI: I am Vincent and I am barista!\\nClient: What do you do every day?\\nAI:\",\n  \"generate_tokens_limit\": 40,\n  \"top_p\": 0.7,\n  \"top_k\": 0,\n  \"temperature\":1.0\n}\n```\n\n\nFor development, clone the repository and run on the server:\n\n```\ndocker run -p8080:8080 --gpus all --rm -it $(docker build -q 
.)\n```\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevforth%2Fgpt-j-6b-gpu-docker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevforth%2Fgpt-j-6b-gpu-docker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevforth%2Fgpt-j-6b-gpu-docker/lists"}