{"id":13478022,"url":"https://github.com/google/gemma_pytorch","last_synced_at":"2025-05-13T18:06:47.613Z","repository":{"id":223608691,"uuid":"760663631","full_name":"google/gemma_pytorch","owner":"google","description":"The official PyTorch implementation of Google's Gemma models","archived":false,"fork":false,"pushed_at":"2025-03-21T04:36:37.000Z","size":5647,"stargazers_count":5426,"open_issues_count":6,"forks_count":534,"subscribers_count":39,"default_branch":"main","last_synced_at":"2025-04-25T14:50:36.758Z","etag":null,"topics":["gemma","google","pytorch"],"latest_commit_sha":null,"homepage":"https://ai.google.dev/gemma","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-20T17:53:21.000Z","updated_at":"2025-04-24T17:46:00.000Z","dependencies_parsed_at":"2024-02-21T06:22:49.078Z","dependency_job_id":"386cc447-90b5-4ae2-9205-53997a6d7335","html_url":"https://github.com/google/gemma_pytorch","commit_stats":null,"previous_names":["google/gemma_pytorch"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fgemma_pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fgemma_pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fgemma_pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fgemma_pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google","download_url":"https://codeload.github.com/google/gemma_pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254000848,"owners_count":21997441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gemma","google","pytorch"],"created_at":"2024-07-31T16:01:51.397Z","updated_at":"2025-05-13T18:06:47.578Z","avatar_url":"https://github.com/google.png","language":"Python","readme":"# Gemma in PyTorch\n\n**Gemma** is a family of lightweight, state-of-the art open models built from research and technology used to create Google Gemini models. They include both text-only and multimodal decoder-only large language models, with open weights, pre-trained variants, and instruction-tuned variants. For more details, please check out the following links:\n\n * [Gemma on Google AI](https://ai.google.dev/gemma)\n * [Gemma on Kaggle](https://www.kaggle.com/models/google/gemma-3)\n * [Gemma on Vertex AI Model Garden](https://pantheon.corp.google.com/vertex-ai/publishers/google/model-garden/gemma3)\n\nThis is the official PyTorch implementation of Gemma models. We provide model and inference implementations using both PyTorch and PyTorch/XLA, and support running inference on CPU, GPU and TPU.\n\n## Updates\n\n * [March 12th, 2025 🔥] Support Gemma v3. You can find the checkpoints [on Kaggle](https://www.kaggle.com/models/google/gemma-3/pytorch) and [Hugging Face](https://huggingface.co/models?other=gemma_torch)\n\n * [June 26th, 2024] Support Gemma v2. You can find the checkpoints [on Kaggle](https://www.kaggle.com/models/google/gemma-2/pytorch) and Hugging Face\n\n * [April 9th, 2024] Support CodeGemma. You can find the checkpoints [on Kaggle](https://www.kaggle.com/models/google/codegemma/pytorch) and [Hugging Face](https://huggingface.co/collections/google/codegemma-release-66152ac7b683e2667abdee11)\n\n * [April 5, 2024] Support Gemma v1.1. You can find the v1.1 checkpoints [on Kaggle](https://www.kaggle.com/models/google/gemma/frameworks/pyTorch) and [Hugging Face](https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b).\n\n## Download Gemma model checkpoint\n\nYou can find the model checkpoints on Kaggle:\n\n- [Gemma 3](https://www.kaggle.com/models/google/gemma-3/pyTorch)\n- [Gemma 2](https://www.kaggle.com/models/google/gemma-2/pyTorch)\n- [Gemma](https://www.kaggle.com/models/google/gemma/pyTorch)\n\nAlternatively, you can find the model checkpoints on the Hugging Face Hub [here](https://huggingface.co/models?other=gemma_torch). To download the models, go the the model repository of the model of interest and click the `Files and versions` tab, and download the model and tokenizer files. For  programmatic downloading, if you have `huggingface_hub` installed, you can also run:\n\n```\nhuggingface-cli download google/gemma-3-4b-it-pytorch\n```\n\nThe following model sizes are available:\n\n- **Gemma 3**: \n  - **Text only**: 1b\n  - **Multimodal**: 4b, 12b, 27b_v3\n- **Gemma 2**: \n  - **Text only**: 2b-v2, 9b, 27b\n- **Gemma**: \n  - **Text only**: 2b, 7b\n\n\nNote that you can choose between the 1B, 4B, 12B, and 27B variants.\n\n```\nVARIANT=\u003c1b, 2b, 2b-v2, 4b, 7b, 9b, 12b, 27b, 27b_v3\u003e\nCKPT_PATH=\u003cInsert ckpt path here\u003e\n```\n\n## Try it free on Colab\n\nFollow the steps at\n[https://ai.google.dev/gemma/docs/pytorch_gemma](https://ai.google.dev/gemma/docs/pytorch_gemma).\n\n## Try it out with PyTorch\n\nPrerequisite: make sure you have setup docker permission properly as a non-root user.\n\n```bash\nsudo usermod -aG docker $USER\nnewgrp docker\n```\n\n### Build the docker image.\n\n```bash\nDOCKER_URI=gemma:${USER}\n\ndocker build -f docker/Dockerfile ./ -t ${DOCKER_URI}\n```\n\n### Run Gemma inference on CPU.\n\n\u003e NOTE: This is a multimodal example. Use a multimodal variant.\n\n```bash\ndocker run -t --rm \\\n    -v ${CKPT_PATH}:/tmp/ckpt \\\n    ${DOCKER_URI} \\\n    python scripts/run_multimodal.py \\\n    --ckpt=/tmp/ckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 quantized model.\n```\n\n### Run Gemma inference on GPU.\n\n\u003e NOTE: This is a multimodal example. Use a multimodal variant.\n\n```bash\ndocker run -t --rm \\\n    --gpus all \\\n    -v ${CKPT_PATH}:/tmp/ckpt \\\n    ${DOCKER_URI} \\\n    python scripts/run_multimodal.py \\\n    --device=cuda \\\n    --ckpt=/tmp/ckpt \\\n    --variant=\"${VARIANT}\"\n    # add `--quant` for the int8 quantized model.\n```\n\n## Try It out with PyTorch/XLA\n\n### Build the docker image (CPU, TPU).\n\n```bash\nDOCKER_URI=gemma_xla:${USER}\n\ndocker build -f docker/xla.Dockerfile ./ -t ${DOCKER_URI}\n```\n\n### Build the docker image (GPU).\n\n```bash\nDOCKER_URI=gemma_xla_gpu:${USER}\n\ndocker build -f docker/xla_gpu.Dockerfile ./ -t ${DOCKER_URI}\n```\n\n### Run Gemma inference on CPU.\n\n\u003e NOTE: This is a multimodal example. Use a multimodal variant.\n\n```bash\ndocker run -t --rm \\\n    --shm-size 4gb \\\n    -e PJRT_DEVICE=CPU \\\n    -v ${CKPT_PATH}:/tmp/ckpt \\\n    ${DOCKER_URI} \\\n    python scripts/run_xla.py \\\n    --ckpt=/tmp/ckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 quantized model.\n```\n\n### Run Gemma inference on TPU.\n\nNote: be sure to use the docker container built from `xla.Dockerfile`.\n\n```bash\ndocker run -t --rm \\\n    --shm-size 4gb \\\n    -e PJRT_DEVICE=TPU \\\n    -v ${CKPT_PATH}:/tmp/ckpt \\\n    ${DOCKER_URI} \\\n    python scripts/run_xla.py \\\n    --ckpt=/tmp/ckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 quantized model.\n```\n\n### Run Gemma inference on GPU.\n\nNote: be sure to use the docker container built from `xla_gpu.Dockerfile`.\n\n```bash\ndocker run -t --rm --privileged \\\n    --shm-size=16g --net=host --gpus all \\\n    -e USE_CUDA=1 \\\n    -e PJRT_DEVICE=CUDA \\\n    -v ${CKPT_PATH}:/tmp/ckpt \\\n    ${DOCKER_URI} \\\n    python scripts/run_xla.py \\\n    --ckpt=/tmp/ckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 quantized model.\n```\n\n### Tokenizer Notes\n\n99 unused tokens are reserved in the pretrained tokenizer model to assist with more efficient training/fine-tuning. Unused tokens are in the string format of `\u003cunused[0-97]\u003e` with token id range of `[7-104]`. \n\n```\n\"\u003cunused0\u003e\": 7,\n\"\u003cunused1\u003e\": 8,\n\"\u003cunused2\u003e\": 9,\n...\n\"\u003cunused98\u003e\": 104,\n```\n\n## Disclaimer\n\nThis is not an officially supported Google product.\n","funding_links":[],"categories":["Python","Project List","A01_文本生成_文本对话","Summary","Repos"],"sub_categories":["\u003cspan id=\"tool\"\u003eLLM (LLM \u0026 Tool)\u003c/span\u003e","大语言对话模型及数据"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle%2Fgemma_pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle%2Fgemma_pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle%2Fgemma_pytorch/lists"}