https://github.com/lbliii/pounce
=^..^= Pounce — Free-threading-native ASGI server for Python 3.14+ with real thread parallelism and streaming-first I/O
- Host: GitHub
- URL: https://github.com/lbliii/pounce
- Owner: lbliii
- License: MIT
- Created: 2026-02-07T16:56:27.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-03-26T16:22:55.000Z (about 2 months ago)
- Last Synced: 2026-03-26T17:46:56.197Z (about 2 months ago)
- Topics: asgi, free-threading, http, nogil, python, server
- Language: Python
- Homepage: https://lbliii.github.io/pounce/
- Size: 1.03 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Roadmap: ROADMAP.md
Awesome Lists containing this project
- awesome-python-web-frameworks - Pounce - a free-threading-native ASGI server for Python 3.14t with HTTP/1.1, HTTP/2, HTTP/3, WebSocket, and streaming support. (Http servers / More)
README
# =^..^= Pounce
**Pure Python ASGI server. 7x faster HTTP parsing. True thread parallelism on Python 3.14t.**
```python
import pounce
pounce.run("myapp:app")
```
---
## What is Pounce?
Pounce is a Python ASGI server for Python 3.14+, with a worker model designed for
free-threaded Python 3.14t. It runs standard ASGI applications, supports streaming
responses, and gives you a clear upgrade path from process-based servers such as
Uvicorn.
Pounce's built-in HTTP/1.1 parser runs at ~3 µs per request (7x faster than h11), its frozen configuration eliminates locking overhead, and its rolling reload spawns new workers while draining old ones, so no requests are dropped.
On Python 3.14t, worker threads share one interpreter and one copy of your app. On GIL
builds, Pounce falls back to multi-process workers automatically.
**Why people pick it:**
- **ASGI-first** — Runs standard ASGI apps with CLI and programmatic entry points
- **Free-threading native** — True thread parallelism with frozen immutable config (zero locks)
- **7x faster parsing** — Built-in HTTP/1.1 parser (~3 µs/req) with full request smuggling protection
- **Four protocols** — HTTP/1.1, HTTP/2, HTTP/3 (QUIC), and WebSocket (including WS over H2)
- **Zero-downtime reload** — Rolling restart with generational worker swap, no dropped requests
- **Observable** — Typed lifecycle events, Prometheus metrics, OpenTelemetry, Server-Timing headers
- **Batteries included** — TLS, compression, static files, middleware, rate limiting, observability
- **Migration path** — Familiar CLI for teams moving from Uvicorn-style deployments
## Use Pounce For
- **Serving ASGI apps** — Tunable workers, TLS, graceful shutdown, and deployment controls
- **Free-threaded Python deployments** — Shared-memory worker threads on Python 3.14t
- **Streaming workloads** — Server-sent events, streamed HTML, and token-by-token responses
- **Teams migrating from Uvicorn** — Similar CLI shape with a different worker model
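The streaming case above can be sketched as a plain ASGI 3.0 application; this is a generic example of token-by-token output, not Pounce-specific code:

```python
import asyncio

# Generic ASGI app that streams three server-sent events, one chunk at a time.
async def app(scope, receive, send):
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/event-stream")],
    })
    for i in range(3):
        await send({
            "type": "http.response.body",
            "body": f"data: token {i}\n\n".encode(),
            "more_body": True,  # more chunks follow
        })
        await asyncio.sleep(0)  # yield to the event loop between chunks
    # Empty final body with more_body=False closes the response.
    await send({"type": "http.response.body", "body": b"", "more_body": False})
```

Any ASGI server can run this; with Pounce it would be `pounce myapp:app`.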
---
## Framework Compatibility
Tested in CI with 48 integration tests across every major ASGI framework:
| Framework | Status | Features Verified |
|-----------|--------|-------------------|
| **FastAPI** 0.135+ | Full | Routing, Pydantic validation, dependency injection, middleware, exception handlers, lifespan, WebSocket, streaming, OpenAPI schema |
| **Starlette** 1.0+ | Full | Routing, BaseHTTPMiddleware, lifespan with state, streaming, WebSocket, background tasks, exception handlers |
| **Django** 6.0+ | Full | Async views, URL routing, path params, JSON body, query params, middleware, error handling |
| **Litestar** 2.21+ | Core | Routing, dependency injection, middleware, lifespan, streaming, exception handlers. WebSocket: known routing issue |
Pounce achieves compatibility through correct ASGI 3.0 implementation — no framework-specific code or workarounds.
---
## Performance
Pounce matches Uvicorn on throughput while staying pure Python, with no C extensions.
| Scenario | Pounce | Uvicorn | Notes |
|----------|--------|---------|-------|
| 1 worker | ~7.2k req/s | ~6.5k req/s | Async event loop, h11 parser |
| 4 workers | ~16k req/s | ~17k req/s | Threads (pounce) vs processes (uvicorn) |
*Measured with `wrk -t4 -c100 -d10s` on macOS Apple Silicon, plain-text "hello world" ASGI app, Python 3.14t.*
Run `pounce bench --workers 4 --compare` to reproduce on your machine.
Key optimizations in the sync worker path:
- **Fast HTTP/1.1 parser** — Direct bytes parsing (~3 µs/req) replaces h11 (~22 µs/req) with full safety checks (method validation, header size limits, request smuggling detection)
- **Keep-alive connections** — Connection reuse eliminates TCP handshake overhead
- **Shared socket distribution** — Single accept queue for thread workers avoids macOS SO_REUSEPORT limitations
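To illustrate the direct-bytes approach (a toy sketch, not Pounce's actual parser, which also enforces header size limits and smuggling checks), a request line can be validated and split without a state machine:

```python
# Toy sketch of direct-bytes HTTP/1.1 request-line parsing.
_METHODS = frozenset({b"GET", b"HEAD", b"POST", b"PUT", b"DELETE", b"OPTIONS", b"PATCH"})

def parse_request_line(raw: bytes) -> tuple[str, str, str]:
    # The request line is everything before the first CRLF.
    line, sep, _rest = raw.partition(b"\r\n")
    if not sep:
        raise ValueError("incomplete request line")
    method, target, version = line.split(b" ", 2)
    if method not in _METHODS:
        raise ValueError("invalid method")
    if version not in (b"HTTP/1.1", b"HTTP/1.0"):
        raise ValueError("unsupported version")
    return method.decode("ascii"), target.decode("ascii"), version.decode("ascii")
```

Skipping a generic event-driven state machine for the common case is where most of the latency win comes from.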
---
## Installation
```bash
pip install bengal-pounce
```
Requires Python 3.14+
**Optional extras:**
```bash
pip install bengal-pounce[h2] # HTTP/2 stream multiplexing
pip install bengal-pounce[ws] # WebSocket via wsproto
pip install bengal-pounce[tls] # TLS with truststore
pip install bengal-pounce[h3] # HTTP/3 (QUIC/UDP, requires TLS)
pip install bengal-pounce[full] # All protocol extras
```
---
## Quick Start
| Usage | Command |
|-------|---------|
| **Programmatic** | `pounce.run("myapp:app")` |
| **CLI** | `pounce myapp:app` |
| **Multi-worker** | `pounce myapp:app --workers 4` |
| **TLS** | `pounce myapp:app --ssl-certfile cert.pem --ssl-keyfile key.pem` |
| **HTTP/3** | `pounce myapp:app --http3 --ssl-certfile cert.pem --ssl-keyfile key.pem` |
| **Dev reload** | `pounce myapp:app --reload` |
| **App factory** | `pounce myapp:create_app()` |
| **Testing** | `with TestServer(app) as server: ...` |
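For the commands above, `myapp:app` can be any ASGI 3.0 application. A minimal hand-rolled one looks like this (module and attribute names are illustrative):

```python
# myapp.py — minimal ASGI app for the commands above.
async def app(scope, receive, send):
    if scope["type"] != "http":
        return
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello from pounce"})
```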
---
## Features
| Feature | Description | Docs |
|---------|-------------|------|
| **Deployment** | Production workers, compression, observability, and shutdown behavior | [Deployment →](https://lbliii.github.io/pounce/docs/deployment/) |
| **Migration** | Move from Uvicorn with similar CLI concepts | [Migrate from Uvicorn →](https://lbliii.github.io/pounce/docs/tutorials/migrate-from-uvicorn/) |
| **HTTP/1.1** | h11 (async) + fast built-in parser (sync) | [HTTP/1.1 →](https://lbliii.github.io/pounce/docs/protocols/http1/) |
| **HTTP/2** | Stream multiplexing via h2 | [HTTP/2 →](https://lbliii.github.io/pounce/docs/protocols/http2/) |
| **HTTP/3** | QUIC/UDP via bengal-zoomies (requires TLS) | [HTTP/3 →](https://lbliii.github.io/pounce/docs/protocols/http3/) |
| **WebSocket** | Full RFC 6455 via wsproto (including WS over H2) | [WebSocket →](https://lbliii.github.io/pounce/docs/protocols/websocket/) |
| **Static Files** | Pre-compressed files, ETags, range requests | [Static Files →](https://lbliii.github.io/pounce/docs/features/static-files/) |
| **Middleware** | ASGI3 middleware stack support | [Middleware →](https://lbliii.github.io/pounce/docs/features/middleware/) |
| **OpenTelemetry** | Native distributed tracing (OTLP) | [OpenTelemetry →](https://lbliii.github.io/pounce/docs/deployment/observability/) |
| **Lifecycle Logging** | Structured JSON event logging | [Logging →](https://lbliii.github.io/pounce/docs/features/lifecycle-logging/) |
| **Graceful Shutdown** | Kubernetes-ready connection draining | [Shutdown →](https://lbliii.github.io/pounce/docs/deployment/lifecycle/) |
| **Dev Error Pages** | Rich tracebacks with syntax highlighting | [Errors →](https://lbliii.github.io/pounce/docs/features/error-pages/) |
| **TLS** | SSL with truststore integration | [TLS →](https://lbliii.github.io/pounce/docs/configuration/tls/) |
| **Compression** | zstd (stdlib PEP 784) + gzip + WS compression | [Compression →](https://lbliii.github.io/pounce/docs/deployment/compression/) |
| **Workers** | Auto-detect: threads (3.14t) or processes (GIL) | [Workers →](https://lbliii.github.io/pounce/docs/deployment/workers/) |
| **Auto Reload** | Graceful restart on file changes | [Reload →](https://lbliii.github.io/pounce/docs/deployment/lifecycle/) |
| **Rate Limiting** | Per-IP token bucket with 429 responses | [Rate Limiting →](https://lbliii.github.io/pounce/docs/deployment/backpressure/) |
| **Request Queueing** | Bounded queue with 503 load shedding | [Request Queueing →](https://lbliii.github.io/pounce/docs/deployment/backpressure/) |
| **Prometheus** | Built-in `/metrics` endpoint | [Metrics →](https://lbliii.github.io/pounce/docs/deployment/observability/) |
| **Sentry** | Error tracking and performance monitoring | [Sentry →](https://lbliii.github.io/pounce/docs/deployment/observability/) |
| **Testing** | `TestServer` + pytest fixture for integration tests | [Testing →](https://lbliii.github.io/pounce/docs/testing/) |
| **Benchmarking** | Built-in `pounce bench` command with comparative analysis | [Bench →](https://lbliii.github.io/pounce/docs/features/) |
| **Lifecycle Events** | Public API for typed connection/request events | [API →](https://lbliii.github.io/pounce/docs/reference/api/) |
📚 **Full documentation**: [lbliii.github.io/pounce](https://lbliii.github.io/pounce/) | **[Complete Feature List →](https://lbliii.github.io/pounce/docs/features/)**
---
## Usage
**Programmatic Configuration** — Full control from Python
```python
import pounce
pounce.run(
    "myapp:app",
    host="0.0.0.0",
    port=8000,
    workers=4,
)
```
**How It Works** — Adaptive worker model
On **Python 3.14t** (free-threading): workers are threads. One process, N threads, each with
its own asyncio event loop. Shared memory, no fork overhead, no IPC.
On **GIL builds**: workers are processes. Same API, same config. The supervisor detects the
runtime via `sys._is_gil_enabled()` and adapts automatically.
A request flows through: socket accept -> protocol parser -> ASGI scope
construction -> `app(scope, receive, send)` -> response serialization -> socket write.
Async workers use h11; sync workers use a fast built-in parser for lower latency.
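The runtime detection described above can be sketched like this (a simplified illustration; `sys._is_gil_enabled()` exists on CPython 3.13+, so older builds are guarded with `getattr`):

```python
import sys

def pick_worker_model() -> str:
    # On a free-threaded (3.14t) build with the GIL disabled,
    # sys._is_gil_enabled() returns False, so workers can be threads.
    # Builds without the attribute are treated as GIL-enabled.
    gil_check = getattr(sys, "_is_gil_enabled", lambda: True)
    return "thread" if not gil_check() else "process"
```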
**Protocol Extras** — Install only what you need
| Protocol | Backend | Install |
|----------|---------|---------|
| HTTP/1.1 | h11 (async) / fast built-in parser (sync) | built-in |
| HTTP/2 | h2 (stream multiplexing, priority signals) | `pounce[h2]` |
| WebSocket | wsproto (including WS over H2) | `pounce[ws]` |
| TLS | stdlib ssl + truststore | `pounce[tls]` |
| All | Everything above | `pounce[full]` |
Compression uses Python 3.14's stdlib `compression.zstd` — zero external dependencies.
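The stdlib-first compression choice can be mirrored with a small shim (a sketch; on Python older than 3.14 it degrades to gzip, since `compression.zstd` is 3.14-only):

```python
# Sketch: prefer the Python 3.14 stdlib zstd module (PEP 784), fall back to gzip.
try:
    from compression import zstd  # Python 3.14+ only

    def compress(data: bytes) -> bytes:
        return zstd.compress(data)
except ImportError:
    import gzip

    def compress(data: bytes) -> bytes:
        return gzip.compress(data)
```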
**Testing** — Real server for integration tests
```python
from pounce.testing import TestServer
import httpx
def test_homepage(my_app):
    with TestServer(my_app) as server:
        resp = httpx.get(f"{server.url}/")
        assert resp.status_code == 200
```
The `pounce_server` pytest fixture is auto-registered when pounce is installed:
```python
def test_api(pounce_server, my_app):
    server = pounce_server(my_app)
    resp = httpx.get(f"{server.url}/health")
    assert resp.status_code == 200
```
---
## Key Ideas
- **Free-threading first.** Threads, not processes. One interpreter, N event loops, shared
immutable state. On GIL builds, falls back to multi-process automatically.
- **Pure Python.** No Rust, no C extensions in the server core. Debuggable, hackable,
readable.
- **Typed end-to-end.** Frozen config, typed ASGI definitions, zero `type: ignore` comments.
- **One dependency.** `h11` for HTTP/1.1 parsing. Everything else is optional.
- **Observable by design.** Lifecycle events are public API — `from pounce import BufferedCollector, ResponseCompleted`. Frameworks build dashboards on typed events, not log parsing.
- **Framework tested.** Verified against FastAPI, Starlette, Django, and Litestar with 48 integration tests.
- **Batteries included.** Static files, middleware, rate limiting, request queueing,
Prometheus metrics, Sentry, and OpenTelemetry — all built in, all optional.
---
## Documentation
📚 **[lbliii.github.io/pounce](https://lbliii.github.io/pounce/)**
| Section | Description |
|---------|-------------|
| [Get Started](https://lbliii.github.io/pounce/docs/get-started/) | Installation and quickstart |
| [Protocols](https://lbliii.github.io/pounce/docs/protocols/) | HTTP/1.1, HTTP/2, WebSocket |
| [Configuration](https://lbliii.github.io/pounce/docs/configuration/) | Server config, TLS, CLI |
| [Deployment](https://lbliii.github.io/pounce/docs/deployment/) | Workers, compression, production |
| [Extending](https://lbliii.github.io/pounce/docs/extending/) | ASGI bridge, custom protocols |
| [Tutorials](https://lbliii.github.io/pounce/docs/tutorials/) | Uvicorn migration guide |
| [Troubleshooting](https://lbliii.github.io/pounce/docs/troubleshooting/) | Common issues and fixes |
| [Reference](https://lbliii.github.io/pounce/docs/reference/) | API documentation |
| [About](https://lbliii.github.io/pounce/docs/about/) | Architecture, performance, FAQ |
---
## Development
```bash
git clone https://github.com/lbliii/pounce.git
cd pounce
uv sync --group dev
pytest
```
---
## The Bengal Ecosystem
A structured reactive stack — every layer written in pure Python for 3.14t free-threading.
| | | | |
|--:|---|---|---|
| **ᓚᘏᗢ** | [Bengal](https://github.com/lbliii/bengal) | Static site generator | [Docs](https://lbliii.github.io/bengal/) |
| **∿∿** | [Purr](https://github.com/lbliii/purr) | Content runtime | — |
| **⌁⌁** | [Chirp](https://github.com/lbliii/chirp) | Web framework | [Docs](https://lbliii.github.io/chirp/) |
| **=^..^=** | **Pounce** | ASGI server ← You are here | [Docs](https://lbliii.github.io/pounce/) |
| **)彡** | [Kida](https://github.com/lbliii/kida) | Template engine | [Docs](https://lbliii.github.io/kida/) |
| **ฅᨐฅ** | [Patitas](https://github.com/lbliii/patitas) | Markdown parser | [Docs](https://lbliii.github.io/patitas/) |
| **⌾⌾⌾** | [Rosettes](https://github.com/lbliii/rosettes) | Syntax highlighter | [Docs](https://lbliii.github.io/rosettes/) |
Python-native. Free-threading ready. No npm required.
---
## License
MIT