{"id":35114011,"url":"https://github.com/born-ml/born","last_synced_at":"2026-03-04T16:03:20.794Z","repository":{"id":324631533,"uuid":"1097879358","full_name":"born-ml/born","owner":"born-ml","description":"Production-ready ML framework for Go with zero dependencies. Train and deploy neural networks as single binaries. PyTorch-like API, type-safe tensors, automatic differentiation.","archived":false,"fork":false,"pushed_at":"2026-02-27T14:06:41.000Z","size":30347,"stargazers_count":44,"open_issues_count":3,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-02-27T15:14:53.812Z","etag":null,"topics":["autodiff","automatic-differentiation","cpu-backend","cross-platform","deep-learning","go","golang","high-performance","machine-learning","neural-networks","pure-go","tensor","type-safety"],"latest_commit_sha":null,"homepage":"https://pkg.go.dev/github.com/born-ml/born","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/born-ml.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-17T01:04:23.000Z","updated_at":"2026-02-27T14:06:00.000Z","dependencies_parsed_at":"2026-02-18T14:02:42.121Z","dependency_job_id":null,"html_url":"https://github.com/born-ml/born","commit_stats":null,"previous_names":["born-ml/born"],"tags_count":25,"template":false,"template_full_name":null,"purl":"pkg:github/born-ml/born","repository_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/born-ml%2Fborn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/born-ml%2Fborn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/born-ml%2Fborn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/born-ml%2Fborn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/born-ml","download_url":"https://codeload.github.com/born-ml/born/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/born-ml%2Fborn/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30085835,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T15:40:14.053Z","status":"ssl_error","status_checked_at":"2026-03-04T15:40:13.655Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["autodiff","automatic-differentiation","cpu-backend","cross-platform","deep-learning","go","golang","high-performance","machine-learning","neural-networks","pure-go","tensor","type-safety"],"created_at":"2025-12-27T20:23:05.328Z","updated_at":"2026-03-04T16:03:20.738Z","avatar_url":"https://github.com/born-ml.png","language":"Go","readme":"# Born - Production-Ready ML for Go\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/born.png\" alt=\"Born ML Framework - Inspired 
by Burn\" width=\"800\"\u003e\n\u003c/p\u003e\n\n[![Go Version](https://img.shields.io/badge/Go-1.25+-00ADD8?style=flat\u0026logo=go)](https://go.dev/)\n[![Go Reference](https://pkg.go.dev/badge/github.com/born-ml/born.svg)](https://pkg.go.dev/github.com/born-ml/born)\n[![Go Report Card](https://goreportcard.com/badge/github.com/born-ml/born)](https://goreportcard.com/report/github.com/born-ml/born)\n[![Pure Go](https://img.shields.io/badge/100%25-Pure_Go-00ADD8)](https://golang.org/)\n[![Release](https://img.shields.io/github/v/release/born-ml/born?include_prereleases\u0026label=version)](https://github.com/born-ml/born/releases)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Test Status](https://github.com/born-ml/born/actions/workflows/test.yml/badge.svg)](https://github.com/born-ml/born/actions/workflows/test.yml)\n[![Codecov](https://codecov.io/gh/born-ml/born/branch/main/graph/badge.svg?token=CODECOV_TOKEN)](https://codecov.io/gh/born-ml/born)\n[![Discussions](https://img.shields.io/github/discussions/born-ml/born?logo=github\u0026label=Discussions)](https://github.com/born-ml/born/discussions)\n\n\u003e **\"Models are born production-ready\"**\n\nBorn is a modern deep learning framework for Go, inspired by [Burn](https://github.com/tracel-ai/burn) (Rust). Build ML models in pure Go and deploy as single binaries - no Python runtime, no complex dependencies.\n\n*Pure Go ML with GPU acceleration - no CGO required!*\n\n---\n\n## Why Born?\n\n### The Problem\nDeploying ML models is hard:\n- Python runtime required\n- Complex dependency management\n- Large Docker images\n- Slow startup times\n- Integration friction with Go backends\n\n### The Born Solution\n```go\nimport \"github.com/born-ml/born\"\n\n// Models \"born\" ready for production\nmodel := born.Load(\"resnet50.born\")\nprediction := model.Predict(image)\n\n// That's it. No Python. No containers. 
Just Go.\n```\n\n**Benefits**:\n- Single binary deployment\n- Fast startup (\u003c 100ms)\n- Small memory footprint\n- Native Go integration\n- Cross-platform out of the box\n\n---\n\n## Features\n\n### Core\n- **Pure Go** - No CGO dependencies, trivial cross-compilation\n- **Type Safe** - Generics-powered API for compile-time guarantees\n- **Autodiff** - Automatic differentiation via decorator pattern\n- **Production Ready** - Single binary deployment, fast startup\n- **WebAssembly** - Run inference in browsers natively\n\n### GPU Acceleration\n- **WebGPU Backend** - Zero-CGO GPU via [go-webgpu](https://github.com/go-webgpu/webgpu), 123x MatMul speedup\n- **38+ GPU Operations** - MatMul, BatchMatMul, Conv2D, MaxPool2D, Softmax, and more\n- **Lazy Evaluation** - GPU-resident tensors, command batching (~90s → \u003c5s/step)\n- **Multi-dim Transpose** - GPU-accelerated 3D/4D/5D/6D tensors\n- **Automatic Memory** - `runtime.SetFinalizer` for GPU buffer cleanup\n\n### LLM \u0026 Transformers\n- **Flash Attention 2** - O(N) memory, WebGPU WGSL shader, 2x+ speedup on long sequences\n- **Speculative Decoding** - Draft model + verification, 2-4x inference speedup\n- **Multi-Head Attention** - MHA, SDPA, Grouped Query Attention (GQA)\n- **KV-Cache** - Efficient autoregressive generation (3.94x speedup)\n- **Positional Encodings** - RoPE, ALiBi, Sinusoidal, Learned\n- **Modern FFN** - SwiGLU, GeGLU, ReGLU with gated activations\n- **Normalizations** - LayerNorm, RMSNorm (LLaMA style)\n- **Tokenizers** - TikToken, BPE, HuggingFace format, chat templates\n- **Sampling** - Temperature, Top-K, Top-P, Min-P, repetition penalty\n- **Text Generation** - Streaming API, stop sequences\n\n### Model Import \u0026 Export\n- **ONNX Import** - Load PyTorch/TensorFlow models via `.onnx` (30+ operators)\n- **GGUF Import** - llama.cpp format with K-quant dequantization (Q4_K, Q5_K, Q6_K, Q8_0)\n- **Native Format** - `.born` format with `nn.Save()` / `nn.Load()`\n- **Checkpoints** - Resume 
training with optimizer state preservation\n- **SafeTensors** - HuggingFace compatible export\n\n---\n\n## Quick Start\n\n### Installation\n\n```bash\n# Clone repository\ngit clone https://github.com/born-ml/born.git\ncd born\n\n# Build\nmake build\n\n# Or install CLI\nmake install\n```\n\n### Development Setup\n\n**Requirements**:\n- Go 1.25+\n- Make (optional, but recommended)\n- golangci-lint (for linting)\n\n**Build**:\n```bash\nmake build          # Build all binaries\nmake test           # Run tests\nmake lint           # Run linter\nmake bench          # Run benchmarks\n```\n\n### Example: MNIST Classification\n\n**Working example included!** See `examples/mnist/` for the complete implementation.\n\n```go\npackage main\n\nimport (\n    \"fmt\"\n\n    \"github.com/born-ml/born/autodiff\"\n    \"github.com/born-ml/born/backend/cpu\"\n    \"github.com/born-ml/born/nn\"\n    \"github.com/born-ml/born/optim\"\n)\n\nfunc main() {\n    // Create backend with autodiff\n    backend := autodiff.New(cpu.New())\n\n    // Define model (784 → 128 → 10) - NewMNISTNet is defined in examples/mnist\n    model := NewMNISTNet(backend)\n\n    // Create loss and optimizer\n    criterion := nn.NewCrossEntropyLoss(backend)\n    optimizer := optim.NewAdam(model.Parameters(), optim.AdamConfig{\n        LR:    0.001,\n        Betas: [2]float32{0.9, 0.999},\n    }, backend)\n\n    // Training loop (batch loading elided; see examples/mnist for the full code)\n    for epoch := range 10 {\n        // Forward pass\n        logits := model.Forward(batch.ImagesTensor)\n        loss := criterion.Forward(logits, batch.LabelsTensor)\n\n        // Backward pass\n        optimizer.ZeroGrad()\n        grads := backend.Backward(loss.Raw())\n        optimizer.Step(grads)\n\n        // Log progress\n        acc := nn.Accuracy(logits, batch.LabelsTensor)\n        fmt.Printf(\"Epoch %d: Loss=%.4f, Accuracy=%.2f%%\\n\",\n            epoch, loss.Raw().AsFloat32()[0], acc*100)\n    }\n}\n```\n\n**Run it:** `cd examples/mnist \u0026\u0026 go run .`\n\n### Example: LLM Text Generation\n\n```go\npackage main\n\nimport (\n    
\"fmt\"\n\n    \"github.com/born-ml/born/generate\"\n    \"github.com/born-ml/born/loader\"\n    \"github.com/born-ml/born/tokenizer\"\n)\n\nfunc main() {\n    // Load tokenizer\n    tok, _ := tokenizer.NewTikTokenForModel(\"gpt-4\")\n\n    // Load model (GGUF format)\n    model, _ := loader.OpenModel(\"llama-7b.gguf\")\n\n    // Create generator with sampling config\n    gen := generate.NewTextGenerator(model, tok, generate.SamplingConfig{\n        Temperature: 0.7,\n        TopP:        0.9,\n        TopK:        40,\n    })\n\n    // Generate text\n    result, _ := gen.Generate(\"Hello, world!\", generate.GenerateConfig{\n        MaxTokens: 100,\n    })\n    fmt.Println(result)\n\n    // Or use streaming\n    stream, _ := gen.GenerateStream(\"Once upon a time\", generate.GenerateConfig{\n        MaxTokens: 50,\n        Stream:    true,\n    })\n    for chunk := range stream {\n        fmt.Print(chunk.Token)\n    }\n}\n```\n\n**Core Features:**\n- ✅ Tensor operations (Add, MatMul, Reshape, Exp, Sqrt, Cat, etc.)\n- ✅ **38+ GPU operations** (BatchMatMul, Conv2D, MaxPool2D, Comparisons, Reductions)\n- ✅ **31 type-safe public API operations** (MulScalar, Greater, Softmax, Int32, etc.)\n- ✅ Automatic differentiation with gradient tape\n- ✅ Neural network modules (Linear, Conv2D, ReLU, SiLU, RMSNorm, Embedding)\n- ✅ Optimizers (SGD with momentum, Adam with bias correction)\n- ✅ Losses (CrossEntropyLoss with numerical stability)\n- ✅ **Complete WebGPU backend** (zero-CGO, 123x MatMul speedup)\n- ✅ Transformer primitives (for LLaMA, GPT, Mistral architectures)\n\n---\n\n## Architecture\n\n### Backend Abstraction\n\nBorn uses a backend interface for device independence:\n\n```go\ntype Backend interface {\n    Add(a, b *RawTensor) *RawTensor\n    MatMul(a, b *RawTensor) *RawTensor\n    // ... 
other operations\n}\n```\n\n**Available Backends:**\n\n| Backend | Status | Description |\n|---------|--------|-------------|\n| CPU | ✅ **Available** | Pure Go implementation, all operations |\n| WebGPU | ✅ **Available** | Zero-CGO GPU via [go-webgpu](https://github.com/go-webgpu/webgpu) |\n| Vulkan | 📋 Planned | Cross-platform GPU compute (Linux focus) |\n| CUDA | 📋 Planned | NVIDIA GPU via zero-CGO |\n| Metal | 📋 Planned | Apple GPU (macOS/iOS) |\n\n**WebGPU Operation Support** 🎉\n\n| Category | Operations | Backend |\n|----------|------------|---------|\n| **Math** | Add, Sub, Mul, Div (float32 + int32), Exp, Sqrt, Rsqrt, Log, Cos, Sin | ✅ GPU |\n| **Matrix** | MatMul, **BatchMatMul** (3D/4D), Transpose, Reshape | ✅ GPU |\n| **CNN** | **Conv2D**, **MaxPool2D** | ✅ GPU |\n| **Activation** | ReLU, Sigmoid, Tanh, Softmax | ✅ GPU |\n| **Scalar** | MulScalar, AddScalar, SubScalar, DivScalar | ✅ GPU |\n| **Reduction** | **Sum**, SumDim, MeanDim, **Argmax** | ✅ GPU/CPU hybrid |\n| **Compare** | **Greater**, **Lower**, GreaterEqual, LowerEqual, **Equal**, NotEqual | ✅ GPU |\n| **Boolean** | **And**, **Or**, **Not** | ✅ GPU |\n| **Shape** | Cat, Chunk, Unsqueeze, Squeeze, **Expand** | ✅ CPU (efficient) |\n| **Selection** | **Where**, **Gather**, **Embedding** | ✅ GPU |\n| **Type** | **Cast** (float32, int32) | ✅ CPU |\n\n**Total: 38+ GPU-accelerated operations!**\n\n*All operations required for LLM inference (Attention, RoPE, LayerNorm, etc.) are fully supported on GPU.*\n\n**GPU Backend Setup:**\n\nWebGPU requires the `wgpu_native` library. 
Download from [wgpu-native releases](https://github.com/gfx-rs/wgpu-native/releases):\n\n**Windows (x64):**\n```bash\n# Download latest release\ncurl -LO https://github.com/gfx-rs/wgpu-native/releases/latest/download/wgpu-windows-x86_64-msvc-release.zip\nunzip wgpu-windows-x86_64-msvc-release.zip\n\n# Install DLL system-wide (requires admin)\ncopy lib\\wgpu_native.dll C:\\Windows\\System32\\\n\n# Or place next to your executable\ncopy lib\\wgpu_native.dll .\\your-app\\\n```\n\n**Linux (x64):**\n```bash\ncurl -LO https://github.com/gfx-rs/wgpu-native/releases/latest/download/wgpu-linux-x86_64-release.zip\nunzip wgpu-linux-x86_64-release.zip\nsudo cp lib/libwgpu_native.so /usr/local/lib/\nsudo ldconfig\n```\n\n**macOS (ARM64):**\n```bash\ncurl -LO https://github.com/gfx-rs/wgpu-native/releases/latest/download/wgpu-macos-aarch64-release.zip\nunzip wgpu-macos-aarch64-release.zip\nsudo cp lib/libwgpu_native.dylib /usr/local/lib/\n```\n\n**Usage:**\n```go\nimport (\n    \"github.com/born-ml/born/autodiff\"\n    \"github.com/born-ml/born/backend/cpu\"\n    \"github.com/born-ml/born/backend/webgpu\"\n    \"github.com/born-ml/born/tensor\"\n)\n\n// Automatic GPU/CPU selection with graceful fallback\nvar backend tensor.Backend\nif webgpu.IsAvailable() {\n    gpu, err := webgpu.New()\n    if err == nil {\n        backend = autodiff.New(gpu)\n        defer gpu.Release() // Don't forget to release GPU resources\n    }\n}\nif backend == nil {\n    backend = autodiff.New(cpu.New())\n}\n```\n\n### Decorator Pattern\n\nFunctionality is composed via decorators (inspired by Burn):\n\n```go\n// Basic backend\nbase := cpu.New()\n\n// Add autodiff\nwithAutodiff := autodiff.New(base)\n\n// Add kernel fusion\noptimized := fusion.New(withAutodiff)\n\n// Your code works with any backend!\nmodel := createModel(optimized)\n```\n\n### Type Safety with Generics\n\n```go\ntype Tensor[T DType, B Backend] struct {\n    raw     *RawTensor\n    backend B\n}\n\n// Compile-time type checking: operands must share the same dtype and backend\nfunc (t *Tensor[T, B]) MatMul(other *Tensor[T, B]) *Tensor[T, B]\n```\n\n---\n\n## Roadmap\n\n### ✅ What's Working\n\n**Core Framework**\n- Tensor API with generics, autodiff, NN modules (Linear, Conv2D, ReLU, etc.)\n- Optimizers (SGD, Adam), losses (CrossEntropyLoss)\n- MNIST: 97.44% MLP, 98.18% CNN accuracy\n\n**GPU Acceleration**\n- WebGPU backend with 38+ operations (123x MatMul speedup)\n- Lazy evaluation, command batching (~90s → \u003c5s/step)\n- CNN support (Conv2D, MaxPool2D, BatchMatMul)\n\n**LLM \u0026 Transformers**\n- Multi-Head Attention, GQA, KV-Cache (3.94x speedup)\n- RoPE, ALiBi, RMSNorm, SwiGLU\n- Tokenizers (TikToken, BPE), text generation with streaming\n\n**Model Import \u0026 Export**\n- ONNX import (30+ operators)\n- GGUF loading (LLaMA, Mistral, DeepSeek)\n- Native `.born` format, SafeTensors export\n\n### 🚀 Upcoming\n\n**Quantization** (v0.8.0) - GPTQ/AWQ (4x smaller), KV Cache compression, Model Zoo\n\n**Production Serving** - PagedAttention, Continuous Batching, OpenAI-compatible API\n\n**Scale \u0026 Stability** - Multi-GPU, CPU SIMD (AVX2/Neon), Gradient Checkpointing\n\n**v1.0 LTS** - API freeze, 3+ years support, production hardening\n\n**Full roadmap \u0026 changelog**: See [ROADMAP.md](ROADMAP.md) and [CHANGELOG.md](CHANGELOG.md)\n\n---\n\n## Documentation\n\n### For Users\n\n- **[Philosophy](docs/PHILOSOPHY.md)** - Production-first design principles\n- **[Use Cases](docs/USE_CASES.md)** - When to use Born (and when not)\n- **[Getting Started](docs/getting-started.md)** - Installation and first steps *(coming soon)*\n- **[API Reference](https://pkg.go.dev/github.com/born-ml/born)** - Complete API documentation\n- **[Examples](examples/)** - Sample code (MNIST MLP, CNN, GPU inference)\n\n### For Contributors\n\n- **[Contributing](CONTRIBUTING.md)** - How to contribute\n- **[GitHub Issues](https://github.com/born-ml/born/issues)** - Report bugs or request features\n\n---\n\n## Philosophy\n\n### \"Born Ready\"\n\nModels trained anywhere (PyTorch, TensorFlow) 
are **imported** and **born** production-ready:\n\n```\nTraining → Birth → Production\n (Burn)    (Born)    (Run)\n\nPyTorch trains  →  Born imports  →  Born deploys\nTensorFlow trains → Born imports → Born deploys\nBorn trains    →  Born ready   →  Born serves\n```\n\n### Production First\n\n- **Single Binary**: Entire model in one executable\n- **No Runtime**: No Python, no dependencies\n- **Fast Startup**: \u003c 100ms cold start\n- **Small Memory**: Minimal footprint\n- **Cloud Native**: Natural fit for Go services\n\n### Developer Experience\n\n- **Type Safe**: Catch errors at compile time\n- **Clean API**: Intuitive and ergonomic\n- **Great Docs**: Comprehensive documentation\n- **Easy Deploy**: `go build` and you're done\n\n---\n\n## Performance\n\n**Actual Benchmarks** (AMD Ryzen 9 5950X, NVIDIA RTX 3080):\n\n### Matrix Operations (WebGPU vs CPU)\n\n| Operation | CPU | GPU | Speedup |\n|-----------|-----|-----|---------|\n| MatMul 1024x1024 | 7143ms | 58ms | **123x** |\n| MatMul 512x512 | 499ms | 12ms | **41x** |\n| MatMul 256x256 | 56ms | 3.7ms | **15x** |\n\n### Neural Network Inference\n\n| Batch Size | CPU | GPU | Speedup | Throughput |\n|------------|-----|-----|---------|------------|\n| 64 | 48ms | 19ms | 2.5x | 3,357/s |\n| 256 | 182ms | 21ms | **8.5x** | 11,883/s |\n| 512 | 348ms | 32ms | **10.9x** | 15,973/s |\n\n*Note: CPU backend uses naive O(n³) MatMul. 
SIMD optimizations planned for future releases.*\n\n### WebGPU WGSL Shaders\n\nBorn includes **30+ optimized WGSL compute shaders**:\n\n| Shader | Workgroup | Description |\n|--------|-----------|-------------|\n| `addShader` | 256 | Element-wise addition |\n| `subShader` | 256 | Element-wise subtraction |\n| `mulShader` | 256 | Element-wise multiplication |\n| `divShader` | 256 | Element-wise division |\n| `matmulShader` | 16x16 | Matrix multiplication (2D) |\n| `batchMatMulShader` | 8x8x1 | Batched matmul (3D/4D) |\n| `conv2dShader` | 8x8x1 | 2D convolution with padding |\n| `maxPool2dShader` | 8x8x1 | 2D max pooling |\n| `transposeShader` | 16x16 | Matrix transpose |\n| `reluShader` | 256 | ReLU activation |\n| `sigmoidShader` | 256 | Sigmoid activation |\n| `tanhShader` | 256 | Tanh activation |\n| `softmaxShader` | 256 | Softmax (numerically stable) |\n| `expShader` | 256 | Element-wise exp |\n| `sqrtShader` | 256 | Element-wise sqrt |\n| `rsqrtShader` | 256 | Reciprocal sqrt (1/√x) |\n| `cosShader` | 256 | Element-wise cosine |\n| `sinShader` | 256 | Element-wise sine |\n| `greaterShader` | 256 | Greater-than comparison |\n| `lowerShader` | 256 | Less-than comparison |\n| `equalShader` | 256 | Equality comparison |\n| `andShader` | 256 | Logical AND |\n| `orShader` | 256 | Logical OR |\n| `notShader` | 256 | Logical NOT |\n| `argmaxShader` | 256 | Argmax along dimension |\n| `globalSumShader` | 256 | Parallel sum reduction |\n| `scalarMulShader` | 256 | Scalar multiplication |\n| `scalarAddShader` | 256 | Scalar addition |\n| `addShaderInt32` | 256 | Int32 element-wise addition |\n| `subShaderInt32` | 256 | Int32 element-wise subtraction |\n| `mulShaderInt32` | 256 | Int32 element-wise multiplication |\n| `divShaderInt32` | 256 | Int32 element-wise division |\n\nAll shaders use **workgroup shared memory** for optimal performance and support **bounds checking** for safety.\n\n---\n\n## Inspiration\n\nBorn is inspired by and learns from:\n\n- 
**[Burn](https://github.com/tracel-ai/burn)** - Architecture patterns, decorator design\n- **[PyTorch](https://pytorch.org/)** - API ergonomics\n- **[TinyGrad](https://github.com/geohot/tinygrad)** - Simplicity principles\n- **[Gonum](https://github.com/gonum/gonum)** - Go numerical computing\n- **[HDF5 for Go](https://github.com/scigolib/hdf5)** - Model serialization, dataset storage (planned)\n\n---\n\n## Acknowledgments\n\nSpecial thanks to the projects that made Born possible:\n\n### 🙏 [go-webgpu](https://github.com/AlfredDobra662/webgpu) \u0026 [wgpu-native](https://github.com/gfx-rs/wgpu-native)\n\nBorn's GPU acceleration is powered by **go-webgpu** - a remarkable pure Go binding for WebGPU via **wgpu-native**.\n\n**Why this stack is special:**\n- **Zero CGO** - Pure Go bindings using [goffi](https://github.com/AlfredDobra662/goffi) for FFI\n- **Cross-platform** - Works on Windows (D3D12), Linux (Vulkan), macOS (Metal)\n- **Modern API** - Clean, idiomatic Go interface to WebGPU\n- **wgpu-native** - Battle-tested Rust implementation of WebGPU by [gfx-rs](https://github.com/gfx-rs)\n- **Active development** - Both projects are actively maintained\n\nWithout go-webgpu and wgpu-native, Born would need CGO for GPU support, making cross-compilation complex and defeating our \"pure Go\" goal. This stack enables us to offer **production-ready GPU acceleration** while maintaining the simplicity of `go build`.\n\nThank you to [Alfred Dobra](https://github.com/AlfredDobra662), [gfx-rs team](https://github.com/gfx-rs), and all contributors!\n\n---\n\n## Community\n\n**Project is in early development**. 
Star the repo to follow progress!\n\n- **GitHub Org**: [github.com/born-ml](https://github.com/born-ml)\n- **Main Repo**: [github.com/born-ml/born](https://github.com/born-ml/born)\n- **Discussions**: [GitHub Discussions](https://github.com/born-ml/born/discussions)\n  - [Announcements](https://github.com/born-ml/born/discussions/2)\n  - [Q\u0026A](https://github.com/born-ml/born/discussions/3)\n  - [Feature Requests](https://github.com/born-ml/born/discussions/4)\n- **Issues**: [Report bugs or request features](https://github.com/born-ml/born/issues)\n\n---\n\n## License\n\nLicensed under the **Apache License, Version 2.0**.\n\n**Why Apache 2.0?**\n- ✅ **Patent protection** - Critical for ML algorithms and production use\n- ✅ **Enterprise-friendly** - Clear legal framework for commercial adoption\n- ✅ **Industry standard** - Same as TensorFlow, battle-tested in ML ecosystem\n- ✅ **Contributor protection** - Explicit patent grant and termination clauses\n\nSee [LICENSE](LICENSE) file for full terms.\n\n---\n\n## FAQ\n\n**Q: Why not use Gorgonia?**\nA: Gorgonia is great but uses a different approach. Born focuses on modern Go (generics), pure Go (no CGO), and production-first design inspired by Burn.\n\n**Q: Can I run LLMs with Born?**\nA: Yes! Full LLM support included - GGUF model loading, tokenizers, sampling strategies, and text generation with streaming. Load LLaMA, Mistral, or DeepSeek models directly.\n\n**Q: When will it be ready?**\nA: Core features are released! CPU/GPU backends, transformers, LLM support, and ONNX import all work. See [ROADMAP.md](ROADMAP.md) for upcoming features.\n\n**Q: Can I use PyTorch models?**\nA: Yes! Via ONNX import. Train in PyTorch, export to ONNX, deploy with Born. GGUF models are also supported.\n\n**Q: WebAssembly support?**\nA: Yes! Pure Go compiles to WASM natively. Inference in browsers out of the box.\n\n**Q: What LLM architectures are supported?**\nA: LLaMA 2/3, Mistral, DeepSeek, and compatible architectures. 
GQA, RoPE, SwiGLU are all supported.\n\n**Q: How do I enable GPU acceleration?**\nA: Install `wgpu_native` library from [wgpu-native releases](https://github.com/gfx-rs/wgpu-native/releases), then use `webgpu.IsAvailable()` to check GPU support. See [Architecture](#backend-abstraction) for setup instructions. **38+ GPU operations** included - everything needed for LLM inference!\n\n**Q: What GPU operations are supported?**\nA: **All operations needed for production ML!** Math (Add, Mul, Exp, etc.), Matrix (MatMul, BatchMatMul, Conv2D), Activations (ReLU, Softmax), Comparisons (Greater, Equal), Boolean (And, Or, Not), Reductions (Sum, Argmax), and more. See the [WebGPU Operation Table](#backend-abstraction).\n\n**Q: How can I help?**\nA: Check our [Contributing Guide](CONTRIBUTING.md) and [GitHub Issues](https://github.com/born-ml/born/issues)!\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**Born for Production. Ready from Day One.**\n\nMade with ❤️ by the Born ML team\n\n[Documentation](docs/) • [Contributing](CONTRIBUTING.md) • [Community](#community)\n\n\u003c/div\u003e\n","funding_links":[],"categories":["Machine Learning"],"sub_categories":["Search and Analytic Databases"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fborn-ml%2Fborn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fborn-ml%2Fborn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fborn-ml%2Fborn/lists"}