{"id":16698802,"url":"https://github.com/ai-hypercomputer/jetstream","last_synced_at":"2025-10-23T02:43:29.347Z","repository":{"id":225252297,"uuid":"765455287","full_name":"AI-Hypercomputer/JetStream","owner":"AI-Hypercomputer","description":"JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).","archived":false,"fork":false,"pushed_at":"2025-05-09T19:30:29.000Z","size":6656,"stargazers_count":322,"open_issues_count":23,"forks_count":39,"subscribers_count":19,"default_branch":"main","last_synced_at":"2025-05-09T19:37:20.875Z","etag":null,"topics":["gemma","gpt","gpu","inference","jax","large-language-models","llama","llama2","llm","llm-inference","llmops","mlops","model-serving","pytorch","tpu","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AI-Hypercomputer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-03-01T00:24:07.000Z","updated_at":"2025-05-07T16:49:55.000Z","dependencies_parsed_at":"2024-04-19T16:39:30.492Z","dependency_job_id":"3b56d146-8106-4f68-8fcf-e3853a8d7b3b","html_url":"https://github.com/AI-Hypercomputer/JetStream","commit_stats":{"total_commits":121,"total_committers":21,"mean_commits":5.761904761904762,"dds":0.6446280991735538,"last_synced_commit":"d462ca9bbc55531bbe785203cb076e7797250f2a"},"previous_names":["google/jetstream","ai-hypercomputer/jetstream"],"tags_count":4,"template":false,"te
mplate_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Hypercomputer%2FJetStream","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Hypercomputer%2FJetStream/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Hypercomputer%2FJetStream/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Hypercomputer%2FJetStream/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AI-Hypercomputer","download_url":"https://codeload.github.com/AI-Hypercomputer/JetStream/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254471060,"owners_count":22076585,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gemma","gpt","gpu","inference","jax","large-language-models","llama","llama2","llm","llm-inference","llmops","mlops","model-serving","pytorch","tpu","transformer"],"created_at":"2024-10-12T18:02:32.342Z","updated_at":"2025-10-23T02:43:23.560Z","avatar_url":"https://github.com/AI-Hypercomputer.png","language":"Python","readme":"[![Unit Tests](https://github.com/google/JetStream/actions/workflows/unit_tests.yaml/badge.svg?branch=main)](https://github.com/google/JetStream/actions/workflows/unit_tests.yaml?query=branch:main)\n[![PyPI version](https://badge.fury.io/py/google-jetstream.svg)](https://badge.fury.io/py/google-jetstream)\n[![PyPI 
downloads](https://img.shields.io/pypi/dm/google-jetstream?style=flat-square\u0026logo=pypi\u0026logoColor=white)](https://pypi.org/project/google-jetstream/)\n[![Contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CONTRIBUTING.md)\n\n# JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices.\n\n## About\n\nJetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).\n\n## JetStream Engine Implementations\n\nCurrently, there are two reference engine implementations available -- one for JAX models and another for PyTorch models.\n\n### JAX\n\n- Git: https://github.com/google/maxtext\n- README: https://github.com/google/JetStream/blob/main/docs/online-inference-with-maxtext-engine.md\n\n### PyTorch\n\n- Git: https://github.com/google/jetstream-pytorch\n- README: https://github.com/google/jetstream-pytorch/blob/main/README.md\n\n## Documentation\n\n- [Online Inference with MaxText on v5e Cloud TPU VM](https://cloud.google.com/tpu/docs/tutorials/LLM/jetstream) [[README](https://github.com/google/JetStream/blob/main/docs/online-inference-with-maxtext-engine.md)]\n- [Online Inference with PyTorch on v5e Cloud TPU VM](https://cloud.google.com/tpu/docs/tutorials/LLM/jetstream-pytorch) [[README](https://github.com/google/jetstream-pytorch/tree/main?tab=readme-ov-file#jetstream-pytorch)]\n- [Serve Gemma using TPUs on GKE with JetStream](https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-tpu-jetstream)\n- [Benchmark JetStream Server](https://github.com/google/JetStream/blob/main/benchmarks/README.md)\n- [Observability in JetStream Server](https://github.com/google/JetStream/blob/main/docs/observability-prometheus-metrics-in-jetstream-server.md)\n- [Profiling in JetStream Server](https://github.com/google/JetStream/blob/main/docs/profiling-with-jax-profiler-and-tensorboard.md)\n- [JetStream Standalone Local Setup](#jetstream-standalone-local-setup)\n\n# JetStream Standalone Local Setup\n\n## Getting Started\n\n### Setup\n```bash\nmake install-deps\n```\n\n### Run local server \u0026 testing\n\nUse the following commands to run a server locally:\n```bash\n# Start a server\npython -m jetstream.core.implementations.mock.server\n\n# Test local mock server\npython -m jetstream.tools.requester\n\n# Load test local mock server\npython -m jetstream.tools.load_tester\n```\n\n### Test core modules\n```bash\n# Test JetStream core orchestrator\npython -m unittest -v jetstream.tests.core.test_orchestrator\n\n# Test JetStream core server library\npython -m unittest -v jetstream.tests.core.test_server\n\n# Test JetStream LoRA adapter tensorstore\npython -m unittest -v jetstream.tests.core.lora.test_adapter_tensorstore\n\n# Test mock JetStream engine implementation\npython -m unittest -v jetstream.tests.engine.test_mock_engine\n\n# Test JetStream token utils\npython -m unittest -v jetstream.tests.engine.test_token_utils\npython -m unittest -v jetstream.tests.engine.test_utils\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fai-hypercomputer%2Fjetstream","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fai-hypercomputer%2Fjetstream","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fai-hypercomputer%2Fjetstream/lists"}