{"id":46286057,"url":"https://github.com/libraz/mygram-db","last_synced_at":"2026-04-15T07:04:06.527Z","repository":{"id":323493154,"uuid":"1093242570","full_name":"libraz/mygram-db","owner":"libraz","description":"Fast in-memory search for MySQL without the complexity of Elasticsearch","archived":false,"fork":false,"pushed_at":"2026-04-12T01:54:13.000Z","size":2836,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-12T03:30:28.444Z","etag":null,"topics":["binlog-replication","cjk-search","cpp","database","elasticsearch-alternative","full-text-search","gtid","high-performance","in-memory","mysql","ngram","search-engine"],"latest_commit_sha":null,"homepage":"https://mygramdb.libraz.net","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/libraz.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":"support/benchmark/README.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-10T05:27:06.000Z","updated_at":"2026-04-12T01:08:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/libraz/mygram-db","commit_stats":null,"previous_names":["libraz/mygram-db"],"tags_count":25,"template":false,"template_full_name":null,"purl":"pkg:github/libraz/mygram-db","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libraz%2Fmygram-db","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libraz%2Fmygram-db/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libraz%2Fmygram-db/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libraz%2Fmygram-db/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/libraz","download_url":"https://codeload.github.com/libraz/mygram-db/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libraz%2Fmygram-db/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31830158,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T18:05:02.291Z","status":"online","status_checked_at":"2026-04-15T02:00:06.175Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binlog-replication","cjk-search","cpp","database","elasticsearch-alternative","full-text-search","gtid","high-performance","in-memory","mysql","ngram","search-engine"],"created_at":"2026-03-04T07:02:36.937Z","updated_at":"2026-04-15T07:04:06.520Z","avatar_url":"https://github.com/libraz.png","language":"C++","readme":"# MygramDB\n\n[![CI](https://img.shields.io/github/actions/workflow/status/libraz/mygram-db/ci.yml?branch=main\u0026label=CI)](https://github.com/libraz/mygram-db/actions)\n[![Version](https://img.shields.io/github/v/release/libraz/mygram-db?label=version)](https://github.com/libraz/mygram-db/releases)\n[![Docker](https://img.shields.io/badge/docker-ghcr.io-blue?logo=docker)](https://github.com/libraz/mygram-db/pkgs/container/mygram-db)\n[![codecov](https://codecov.io/gh/libraz/mygram-db/branch/main/graph/badge.svg)](https://codecov.io/gh/libraz/mygram-db)\n[![License](https://img.shields.io/github/license/libraz/mygram-db)](https://github.com/libraz/mygram-db/blob/main/LICENSE)\n[![C++17](https://img.shields.io/badge/C%2B%2B-17-blue?logo=c%2B%2B)](https://en.cppreference.com/w/cpp/17)\n[![MySQL](https://img.shields.io/badge/MySQL-8.4--9.6-blue?logo=mysql)](https://dev.mysql.com/)\n[![Platform](https://img.shields.io/badge/platform-Linux%20%7C%20macOS-lightgrey)](https://github.com/libraz/mygram-db)\n\nIn-memory full-text search engine with MySQL binlog replication. Sub-millisecond queries on million-row datasets.\n\n## Why MygramDB?\n\nMySQL FULLTEXT scans B-tree pages on disk and struggles with common terms and concurrent load. MygramDB keeps a compressed n-gram index entirely in memory, syncing via GTID binlog replication.\n\n## Performance\n\nBenchmarked on 1.1M Wikipedia articles (EN + JA), MygramDB v1.5.0 vs MySQL 8.4 FULLTEXT with ngram parser:\n\n| Query Type | MySQL | MygramDB | Speedup |\n|------------|-------|----------|---------|\n| **Search** (SORT id LIMIT 100) | 507–2,566ms | 0.08–0.42ms | 1,200–6,700x |\n| **CJK search** (Japanese bi-gram) | 4–1,204ms | 1–4ms | 2–1,100x |\n| **COUNT** | 416–1,797ms | 0.08ms | 5,500–21,600x |\n| **Concurrent** (4 connections) | 8 QPS | 11,766 QPS | 1,400x |\n\n- Sub-millisecond latency for most queries, no cache warmup needed\n- v1.5.0 `verify_text` eliminates n-gram false positives (exact match with MySQL results)\n- Reproducible: `make bench-up \u0026\u0026 make bench-run` ([details](docs/en/performance.md))\n\n## Quick Start\n\n### Docker (Production Ready)\n\n**Prerequisites:** Ensure MySQL has GTID mode enabled:\n```sql\n-- Check GTID mode (should be ON)\nSHOW VARIABLES LIKE 'gtid_mode';\n\n-- If OFF, enable GTID mode (MySQL 8.0+ / 9.x)\nSET GLOBAL enforce_gtid_consistency = ON;\nSET GLOBAL gtid_mode = OFF_PERMISSIVE;\nSET GLOBAL gtid_mode = ON_PERMISSIVE;\nSET GLOBAL gtid_mode = ON;\n```\n\n**Start MygramDB:**\n```bash\ndocker run -d --name mygramdb \\\n  -p 11016:11016 \\\n  -e MYSQL_HOST=your-mysql-host \\\n  -e MYSQL_USER=repl_user \\\n  -e MYSQL_PASSWORD=your_password \\\n  -e MYSQL_DATABASE=mydb \\\n  -e TABLE_NAME=articles \\\n  -e TABLE_PRIMARY_KEY=id \\\n  -e TABLE_TEXT_COLUMN=content \\\n  -e TABLE_NGRAM_SIZE=2 \\\n  -e REPLICATION_SERVER_ID=12345 \\\n  -e NETWORK_ALLOW_CIDRS=0.0.0.0/0 \\\n  ghcr.io/libraz/mygram-db:latest\n\n# Check logs\ndocker logs -f mygramdb\n\n# Trigger initial data sync (required on first start)\ndocker exec mygramdb mygram-cli -p 11016 SYNC articles\n\n# Try a search\ndocker exec mygramdb mygram-cli -p 11016 SEARCH articles \"hello world\"\n```\n\n**Security Note:** `NETWORK_ALLOW_CIDRS=0.0.0.0/0` allows connections from any IP address. For production, restrict to specific IP ranges:\n```bash\n# Production example: Allow only from application servers\n-e NETWORK_ALLOW_CIDRS=10.0.0.0/8,172.16.0.0/12\n```\n\n### Docker Compose (with Test MySQL)\n\n```bash\ngit clone https://github.com/libraz/mygram-db.git\ncd mygram-db\ndocker-compose up -d\n\n# Wait for MySQL to be ready (check with docker-compose logs -f)\n\n# Trigger initial data sync\ndocker-compose exec mygramdb mygram-cli -p 11016 SYNC articles\n\n# Try searching\ndocker-compose exec mygramdb mygram-cli -p 11016 SEARCH articles \"hello\"\n```\n\nIncludes MySQL 8.4 with sample data for instant testing. Also tested with MySQL 9.4 and MariaDB 10.11/11.4.\n\n## Basic Usage\n\n```bash\n# Search with pagination\nSEARCH articles \"hello world\" SORT id LIMIT 100\n\n# Sort by relevance (BM25)\nSEARCH articles \"hello world\" SORT _score DESC LIMIT 10\n\n# Highlighted results\nSEARCH articles \"hello\" HIGHLIGHT TAG \u003cb\u003e \u003c/b\u003e LIMIT 10\n\n# Fuzzy search (edit distance 1)\nSEARCH articles \"machne\" FUZZY LIMIT 10\n\n# Faceted aggregation\nFACET articles category \"tech\"\n\n# Count matches\nCOUNT articles \"hello world\"\n\n# Multi-term AND search\nSEARCH articles hello AND world\n\n# With filters\nSEARCH articles tech FILTER status=1 LIMIT 100\n\n# Get by primary key\nGET articles 12345\n```\n\nSee [Protocol Reference](docs/en/protocol.md) for all commands.\n\n## Features\n\n- **Fast**: Sub-millisecond search on million-row datasets\n- **BM25 Relevance**: `SORT _score` for TF-IDF based relevance ranking\n- **Highlighting**: `HIGHLIGHT` clause returns snippets with matched terms tagged\n- **Fuzzy Search**: `FUZZY` clause for Levenshtein edit distance matching\n- **Synonyms**: Automatic query expansion from TSV synonym dictionaries\n- **Faceted Search**: `FACET` command aggregates filter column values with counts\n- **MySQL/MariaDB Replication**: Real-time GTID-based binlog streaming (MySQL 8.4+, MariaDB 10.6+)\n- **Runtime Variables**: MySQL-style SET/SHOW VARIABLES for zero-downtime config changes\n- **MySQL Failover**: Switch MySQL servers at runtime with GTID position preservation\n- **Multiple Tables**: Index multiple tables in one instance\n- **Dual Protocol**: TCP (memcached-style) and HTTP/REST API\n- **High Concurrency**: Thread pool supporting 10,000+ connections\n- **Unicode**: ICU-based normalization for CJK/multilingual text\n- **Compression**: Hybrid Delta encoding + Roaring bitmaps\n- **Easy Deploy**: Single binary or Docker container\n\n## Architecture\n\n```mermaid\ngraph LR\n    MySQL[MySQL Primary] --\u003e|binlog GTID| MygramDB1[MygramDB #1]\n    MySQL --\u003e|binlog GTID| MygramDB2[MygramDB #2]\n\n    MygramDB1 --\u003e|Search| App[Application]\n    MygramDB2 --\u003e|Search| App\n    App --\u003e|Write| MySQL\n```\n\nMygramDB acts as a specialized read replica for full-text search, while MySQL handles writes and normal queries.\n\n## When to Use MygramDB\n\n✅ **Good fit:**\n- Search-heavy workloads (read \u003e\u003e write)\n- Millions of documents with full-text search\n- Need sub-100ms search latency\n- Simple deployment requirements\n- Japanese/CJK text with ngrams\n\n❌ **Not recommended:**\n- Write-heavy workloads\n- Dataset doesn't fit in RAM (~1-2GB per million docs)\n- Need distributed search across nodes\n- Complex aggregations/analytics\n\n## Documentation\n\n- **[CHANGELOG](CHANGELOG.md)** - Version history and release notes\n- [Docker Deployment Guide](docs/en/docker-deployment.md) - Production Docker setup\n- [Configuration Guide](docs/en/configuration.md) - All configuration options\n- [Protocol Reference](docs/en/protocol.md) - Complete command reference\n- [HTTP API Reference](docs/en/http-api.md) - REST API documentation\n- [Performance Guide](docs/en/performance.md) - Benchmarks and optimization\n- [Replication Guide](docs/en/replication.md) - MySQL replication setup\n- [Operations Guide](docs/en/operations.md) - Runtime variables and MySQL failover\n- [Installation Guide](docs/en/installation.md) - Build from source\n- [Development Guide](docs/en/development.md) - Contributing guidelines\n- [Client Library](docs/en/libmygramclient.md) - C/C++ client library\n\n### Release Notes\n\n- [Latest Release](https://github.com/libraz/mygram-db/releases/latest) - Download binaries\n- [Detailed Release Notes](docs/releases/) - Version-specific migration guides\n\n## Requirements\n\n**System:**\n- RAM: ~1-2GB per million documents\n- OS: Linux or macOS\n\n**MySQL:**\n- MySQL 8.4+ / 9.x (tested with 8.4 and 9.4)\n- MariaDB 10.6+ / 11.x (tested with 10.11 and 11.4)\n- GTID mode enabled (`gtid_mode=ON` for MySQL, GTID enabled for MariaDB)\n- Binary log format: ROW (`binlog_format=ROW`)\n- Replication privileges: `REPLICATION SLAVE`, `REPLICATION CLIENT`\n\nSee [Installation Guide](docs/en/installation.md) for details.\n\n## License\n\n[MIT License](LICENSE)\n\n## Contributing\n\nWe welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\nFor development environment setup, see [Development Guide](docs/en/development.md).\n\n## Authors\n\n- libraz \u003clibraz@libraz.net\u003e\n\n## Related Projects\n\n- [mysql-event-stream](https://github.com/libraz/mysql-event-stream) - Standalone MySQL CDC library extracted from MygramDB's replication layer\n- [go-mygram-client](https://github.com/libraz/go-mygram-client) - Go client library\n- [node-mygramdb-client](https://github.com/libraz/node-mygramdb-client) - Node.js client library ([npm](https://www.npmjs.com/package/mygramdb-client))\n- [python-mygramdb-client](https://github.com/libraz/python-mygramdb-client) - Python client library\n\n## Acknowledgments\n\n- [Roaring Bitmaps](https://roaringbitmap.org/) for compressed bitmaps\n- [ICU](https://icu.unicode.org/) for Unicode support\n- [spdlog](https://github.com/gabime/spdlog) for logging\n- [yaml-cpp](https://github.com/jbeder/yaml-cpp) for configuration\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flibraz%2Fmygram-db","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flibraz%2Fmygram-db","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flibraz%2Fmygram-db/lists"}