{"id":45950193,"url":"https://github.com/d-o-hub/chaotic_semantic_memory","last_synced_at":"2026-04-24T12:07:24.510Z","repository":{"id":338786626,"uuid":"1158497361","full_name":"d-o-hub/chaotic_semantic_memory","owner":"d-o-hub","description":"Rust crate for chaotic semantic memory - echo state networks and hyperdimensional vectors.","archived":false,"fork":false,"pushed_at":"2026-04-19T19:01:24.000Z","size":1694,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-19T20:03:25.869Z","etag":null,"topics":["ai","echo-state-network","hyperdimensional","libsql","reservoir-computing","rust","semantic-memory","wasm"],"latest_commit_sha":null,"homepage":"https://d-o-hub.github.io/chaotic_semantic_memory/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/d-o-hub.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-02-15T13:27:02.000Z","updated_at":"2026-04-19T19:01:28.000Z","dependencies_parsed_at":"2026-03-15T21:03:15.312Z","dependency_job_id":"35ad70b6-9f48-4fef-921e-6a4aeb73c873","html_url":"https://github.com/d-o-hub/chaotic_semantic_memory","commit_stats":null,"previous_names":["d-o-hub/chaotic_semantic_memory"],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/d-o-hub/chaotic_semantic_memory","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-o-hub%2Fchaotic_semantic_memory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-o-hub%2Fchaotic_semantic_memory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-o-hub%2Fchaotic_semantic_memory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-o-hub%2Fchaotic_semantic_memory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/d-o-hub","download_url":"https://codeload.github.com/d-o-hub/chaotic_semantic_memory/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-o-hub%2Fchaotic_semantic_memory/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32222537,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T10:26:35.452Z","status":"ssl_error","status_checked_at":"2026-04-24T10:25:27.643Z","response_time":64,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","echo-state-network","hyperdimensional","libsql","reservoir-computing","rust","semantic-memory","wasm"],"created_at":"2026-02-28T12:19:15.818Z","updated_at":"2026-04-24T12:07:24.491Z","avatar_url":"https://github.com/d-o-hub.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# chaotic_semantic_memory\n\n[![CI](https://github.com/d-o-hub/chaotic_semantic_memory/actions/workflows/ci.yml/badge.svg)](https://github.com/d-o-hub/chaotic_semantic_memory/actions/workflows/ci.yml)\n[![Crates.io](https://img.shields.io/crates/v/chaotic_semantic_memory.svg)](https://crates.io/crates/chaotic_semantic_memory)\n[![docs.rs](https://img.shields.io/docsrs/chaotic_semantic_memory)](https://docs.rs/chaotic_semantic_memory)\n[![npm](https://img.shields.io/npm/v/@d-o-hub/chaotic_semantic_memory)](https://www.npmjs.com/package/@d-o-hub/chaotic_semantic_memory)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n`chaotic_semantic_memory` is a Rust crate for AI memory systems built on\n**Hyperdimensional Computing (HDC)** — not transformer embeddings:\n- 10240-bit binary hypervectors with SIMD-accelerated operations\n- chaotic echo-state reservoirs for temporal processing\n- libSQL persistence (local SQLite or remote Turso)\n\nIt targets both native and `wasm32` builds with explicit threading guards.\n\n## Quick Links\n\n| Resource | Link |\n|----------|------|\n| Documentation | [docs.rs/chaotic_semantic_memory](https://docs.rs/chaotic_semantic_memory) |\n| Crates.io | [crates.io/crates/chaotic_semantic_memory](https://crates.io/crates/chaotic_semantic_memory) |\n| Issues | [GitHub Issues](https://github.com/d-o-hub/chaotic_semantic_memory/issues) |\n| Changelog | [CHANGELOG.md](CHANGELOG.md) |\n\n## Important: HDC, Not Semantic Embeddings\n\nThis crate uses **Hyperdimensional Computing (HDC)** for text encoding — it is\n**not** a transformer or embedding model. Understanding this distinction is critical:\n\n| | HDC (this crate) | Transformer Embeddings (e.g. sentence-transformers) |\n|---|---|---|\n| **Method** | Hash-based token → random hypervector | Learned neural network encodings |\n| **Similarity** | Tokens + position match → similar vectors | Semantic meaning → similar vectors |\n| **\"cat\" vs \"kitten\"** | Low similarity (different tokens) | High similarity (synonyms) |\n| **\"cat sat\" vs \"sat cat\"** | Different (position-aware) | Often similar |\n| **Compute** | CPU-only, deterministic, no GPU | GPU-accelerated, learned weights |\n| **Use case** | Keyword/lexical search, exact-match recall | Semantic search, paraphrase detection |\n\n**Bottom line:** `inject_text` / `probe_text` match on shared tokens at similar\npositions. For true semantic similarity, use an external embedding model and inject\nvectors directly via `inject_concept`.\n\n## Features\n\n- **Hyperdimensional Computing**: 10240-bit binary hypervectors with SIMD-accelerated operations\n- **Chaotic Reservoirs**: Configurable echo-state networks with spectral radius controls `[0.9, 1.1]`\n- **Semantic Memory**: Concept graphs with weighted associations and similarity search\n- **Optimized Retrieval**: Two-stage retrieval pipeline with heuristic-based candidate generation (bucket, graph) and dense-vector scoring.\n- **Persistence**: libSQL for local SQLite or remote Turso database\n- **WASM Support**: Browser-compatible with memory-based import/export\n- **CLI**: Full-featured command-line interface with shell completions\n- **Production-Ready**: Structured logging, metrics, input validation, memory guardrails\n\n## Installation\n\n### Rust Library\n\n```bash\ncargo add chaotic_semantic_memory\n```\n\nFor WASM targets, build with `--target wasm32-unknown-unknown`. No additional feature flag is needed;\nWASM support is enabled automatically when compiling for the `wasm32` target architecture.\n\n```toml\n[dependencies]\nchaotic_semantic_memory = { version = \"0.3\" }\n```\n\nFor library-only consumers who don't need the CLI binary or its dependencies:\n\n```toml\n[dependencies]\nchaotic_semantic_memory = { version = \"0.3\", default-features = false }\n```\n\n### CLI Binary\n\n**via npm (recommended for Node.js users):**\n```bash\nnpm install -g @d-o-hub/csm\n```\n\n**via cargo:**\n```bash\ncargo install chaotic_semantic_memory --bin csm\n```\n\n\u003e **Note:** Using `\"0.2\"` ensures compatibility with the latest 0.2.x patch versions.\n\n## Core Components\n\n- `hyperdim`: binary hypervector math (`HVec10240`) and similarity operations\n- `reservoir`: sparse chaotic reservoir dynamics with spectral radius controls\n- `singularity`: concept graph, associations, retrieval, and memory limits\n- `framework`: high-level async orchestration API\n- `persistence`: libSQL-backed storage (native only)\n- `wasm`: JS-facing bindings for browser/runtime integration (wasm32 target only)\n- `encoder`: text and binary encoding utilities\n- `graph_traversal`: graph walk and reachability utilities\n- `metadata_filter`: metadata query and filtering\n- `bundle`: snapshot and bundle helpers\n- `cli`: Command-line interface (`csm` binary)\n- `semantic_bridge`: Semantic Bridge Layer for concept expansion (ADR-0061)\n- `bridge_retrieval`: Bridge retrieval pipeline for zero-drift semantic search\n- `retrieval`: Hybrid BM25/HDC retrieval module (ADR-0062)\n\n## Semantic Bridge Layer (v0.3.0)\n\nZero-drift semantic expansion on top of deterministic HDC memory:\n- **CanonicalConcept**: Versioned concept with labels and related IDs\n- **ConceptGraph**: In-memory label index for token-to-concept matching\n- **BridgeRetrieval**: Pipeline: normalize → HDC recall → concept expansion → second recall\n- **MemoryPacket**: Compressed output for LLM context injection\n\n**Key principle**: Additive, non-destructive, never mutates canonical memory vectors.\n\n## Hybrid BM25+HDC Retrieval (v0.3.0)\n\nHybrid retrieval combines keyword (BM25) and semantic (HDC) search:\n- **BM25 parameters**: k1=1.2 (TF saturation), b=0.75 (doc length normalization)\n- **Query-length-dependent weights**:\n\n| Tokens | Keyword | Semantic |\n|--------|---------|----------|\n| 1-2    | 0.9     | 0.1      |\n| 3-4    | 0.7     | 0.3      |\n| 5-8    | 0.4     | 0.6      |\n| 9+     | 0.2     | 0.8      |\n\n## How Text Encoding Works (HDC Pipeline)\n\nThe built-in `TextEncoder` uses **Hyperdimensional Computing (HDC)** — a deterministic,\nhash-based encoding, **not** a learned neural network:\n\n```\n┌─────────────┐     ┌──────────────────┐     ┌──────────────────┐     ┌─────────┐\n│  \"hello     │     │ FNV-1a hash      │     │ Positional       │     │ Bundle  │\n│   world\"    │──▶ │ per token        │────▶│ permutation      │───▶│ majority│\n│             │     │ PRNG → HVec10240 │     │ (word order)     │     │ rule    │\n└─────────────┘     └──────────────────┘     └──────────────────┘     └─────────┘\n    Tokenize             Token→HVec              Position Encode         Final HV\n```\n\n**Pipeline steps:**\n\n1. **Tokenize**: Split on whitespace, lowercase (`hello world` → `[\"hello\", \"world\"]`)\n2. **Token → HVec**: FNV-1a hash → seed PRNG → generate random `HVec10240` per token\n3. **Positional encoding**: Permute each token vector by its position (order matters)\n4. **Bundle**: Majority-rule combination into a single `HVec10240`\n\n**Key properties:**\n- **Deterministic**: Same text always produces the same vector (FNV-1a is stable across Rust versions)\n- **Token-sensitive**: Similar tokens in similar positions → similar vectors\n- **NOT semantic**: Synonyms/paraphrases (\"cat\" vs \"kitten\") will NOT match\n- **Position-aware**: \"cat sat\" ≠ \"sat cat\" (order matters)\n\n### Recommended API\n\n```rust\n// HDC text encoding — good for lexical/keyword similarity\nframework.inject_text(\"doc-1\", \"reservoir computing overview\").await?;\nlet hits = framework.probe_text(\"reservoir computing\", 5).await?;\n\n// External embeddings — good for semantic similarity\nlet embedding: HVec10240 = my_model.encode(\"an overview of echo-state networks\");\nframework.inject_concept(\"doc-2\", embedding).await?;\n```\n\n**Use `inject_text`/`probe_text` for:**\n- Keyword search and exact-match recall\n- Document deduplication (same/similar text)\n- Indexing text where token overlap matters\n\n**Use external embeddings (`inject_concept`) for:**\n- Semantic search (synonyms, paraphrases)\n- Concept-level similarity across different wording\n- Cross-lingual matching\n\n### Turso Vector Alternative\n\nThis crate uses libSQL (local SQLite or remote Turso) for persistence. For\nsemantic similarity, you can add Turso's [native vector search](https://turso.tech/vector)\ntables alongside the crate's HDC storage using the same database:\n\n```rust\nuse libsql::Builder;\n\n// Connect to the same database this crate uses for persistence\nlet db = Builder::new_local(\"csm_memory.db\").build().await?;\nlet conn = db.connect()?;\n\n// Add semantic vector table alongside the crate's concepts table\nconn.execute_batch(\"\n    CREATE TABLE IF NOT EXISTS semantic_vectors (\n        id TEXT PRIMARY KEY,\n        embedding F32_BLOB(384)\n    );\n    CREATE INDEX IF NOT EXISTS semantic_idx ON semantic_vectors(\n        libsql_vector_idx(embedding, 'metric=cosine')\n    );\n\").await?;\n\n// Query with vector_top_k\nlet rows = conn.query(\n    \"SELECT id FROM vector_top_k('semantic_idx', vector(?), 10)\",\n    libsql::params![query_embedding_f32_as_string]\n).await?;\n```\n\nThis keeps HDC and semantic vectors in the **same database**: the crate manages\n`csm_concepts` and `csm_associations` tables (with `csm_` prefix for namespace isolation),\nwhile you manage `semantic_vectors` for float-vector similarity search. Both query the same libSQL/Turso instance.\n\n## CLI Usage\n\nThe `csm` binary provides command-line access:\n\n```bash\n# Inject a concept\ncsm inject my-concept --database csm_memory.db\n\n# Find similar concepts\ncsm probe my-concept -k 10 --database csm_memory.db\n\n# Create associations\ncsm associate source-concept target-concept --strength 0.8 --database csm_memory.db\n\n# Export memory state\ncsm export --output backup.json\n\n# Import memory state\ncsm import backup.json --merge\n\n# Generate shell completions\ncsm completions bash \u003e ~/.local/share/bash-completion/completions/csm\n```\n\n### CLI Commands\n\n| Command | Description |\n|---------|-------------|\n| `inject` | Inject a new concept with a random or provided vector |\n| `probe` | Find similar concepts by concept ID |\n| `associate` | Create an association between two concepts |\n| `export` | Export memory state to JSON or binary |\n| `import` | Import memory state from file |\n| `version` | Show version information |\n| `completions` | Generate shell completions |\n\n## Quick Start\n\n```rust\nuse chaotic_semantic_memory::prelude::*;\n\n#[tokio::main]\nasync fn main() -\u003e Result\u003c()\u003e {\n    let framework = ChaoticSemanticFramework::builder()\n        .without_persistence()\n        .build()\n        .await?;\n\n    let concept = ConceptBuilder::new(\"cat\".to_string()).build()?;\n    framework.inject_concept(\"cat\".to_string(), concept.vector.clone()).await?;\n\n    let hits = framework.probe(concept.vector.clone(), 5).await?;\n    println!(\"hits: {}\", hits.len());\n    Ok(())\n}\n```\n\nSee `examples/proof_of_concept.rs` for an end-to-end flow.\nSee `examples/basic_in_memory.rs` for the minimal in-memory workflow.\n\n## Configuration\n\n`ChaoticSemanticFramework::builder()` exposes runtime tuning knobs.\n\n| Parameter | Default | Valid Range | Effect |\n|---|---:|---|---|\n| `reservoir_size` | `50_000` | `\u003e 0` | Reservoir capacity and memory footprint |\n| `reservoir_input_size` | `10_240` | `\u003e 0` | Width of each sequence step |\n| `chaos_strength` | `0.1` | `0.0..=1.0` (recommended) | Noise amplitude in chaotic updates |\n| `enable_persistence` | `true` | boolean | Enables libSQL persistence setup |\n| `max_concepts` | `None` | optional positive | Evicts oldest concepts when reached |\n| `max_associations_per_concept` | `None` | optional positive | Keeps strongest associations only |\n| `connection_pool_size` | `10` | `\u003e= 1` | Turso/libSQL remote pool size |\n| `max_probe_top_k` | `10_000` | `\u003e= 1` | Input guard for `probe` and batch probes |\n| `max_metadata_bytes` | `None` | optional positive | Metadata payload size guard |\n| `concept_cache_size` | `128` | `\u003e= 1` | Similarity query cache capacity (set via `with_concept_cache_size`, stored separately from `FrameworkConfig`) |\n\n### Tuning Guide\n\n- Small workloads: disable persistence and use `reservoir_size` around `10_240`.\n- Mid-sized workloads: keep defaults and set `max_concepts` to enforce memory ceilings.\n- Large workloads: keep persistence enabled, increase `connection_pool_size`, and tune `max_probe_top_k` to practical limits.\n\n## API Patterns\n\nIn-memory flow:\n\n```rust\nlet framework = ChaoticSemanticFramework::builder()\n    .without_persistence()\n    .build()\n    .await?;\n```\n\nPersistent flow:\n\n```rust\nlet framework = ChaoticSemanticFramework::builder()\n    .with_local_db(\"csm_memory.db\")\n    .build()\n    .await?;\n```\n\nBatch APIs for bulk workloads:\n\n```rust\nframework.inject_concepts(\u0026concepts).await?;\nframework.associate_many(\u0026edges).await?;\nlet hits = framework.probe_batch(\u0026queries, 10).await?;\n```\n\nLoad semantics:\n\n- `load_replace()`: clear in-memory state, then load persisted data.\n- `load_merge()`: merge persisted state into current in-memory state.\n\n## WASM Build\n\n```bash\nrustup target add wasm32-unknown-unknown\ncargo check --target wasm32-unknown-unknown\n```\n\nNotes:\n- WASM threading-sensitive paths are guarded with `#[cfg(not(target_arch = \"wasm32\"))]`.\n- Persistence is intentionally unavailable on `wasm32` in this crate build.\n- WASM parity APIs include `processSequence`, `exportToBytes`, and `importFromBytes`.\n\n## Concurrency Model\n\nInternal state is protected by `tokio::sync::RwLock` — safe for concurrent\naccess from multiple Tokio tasks via `Arc\u003cChaoticSemanticFramework\u003e`.\n\n### Multi-Instance Safety\n\nMultiple `ChaoticSemanticFramework` instances sharing the same database file\nare safe for concurrent operation:\n\n- **Reads** (`probe`, `get_concept`, `get_associations`, `stats`) acquire\n  `RwLock` read guards and can run fully concurrently across tasks and\n  framework instances.\n- **Writes** (`inject_concept`, `associate`, `delete_concept`) acquire write\n  guards in-process and are serialized at the database layer by SQLite's WAL\n  write lock. Two instances writing to the same database will queue on WAL\n  without data corruption.\n\n### SQLite WAL Mode\n\nLocal SQLite connections explicitly enable `PRAGMA journal_mode=WAL` during\ninitialization (`src/persistence.rs`). This provides:\n\n- Concurrent readers never block each other or a writer.\n- A single writer never blocks readers (readers see the last consistent snapshot).\n- Checkpoints via `PRAGMA wal_checkpoint(TRUNCATE)` merge WAL data back into\n  the main database file.\n\nRemote Turso connections delegate concurrency to the server and do not set\nWAL mode locally.\n\n### Lock Discipline\n\nWrite locks on `singularity` are held only for in-memory operations and are\nnever held across `.await` points (see [ADR-0040](plans/adr/0040-async-lock-safety.md)).\nPersistence I/O happens after the write lock is released, so concurrent probes\nare never blocked by database writes.\n\n### Scaling Characteristics\n\n| Operation | Complexity | Notes |\n|---|---|---|\n| `inject_concept` | O(1) amortized | HashMap insert + dense vector append |\n| `associate` | O(1) amortized | HashMap insert with optional eviction |\n| `probe` (exact scan) | O(n) | Cosine similarity over all n concepts; parallelized via Rayon on native |\n| `probe` (bucket candidates) | O(n / 2^w) | w-bit bucket width narrows candidate set before exact scoring |\n| `probe` (graph candidates) | O(f^d) | BFS from nearest neighbor at depth d, fanout f |\n\nThe default retrieval path is an **exact O(n) scan** over all stored concept\nvectors. For larger corpora, two-stage candidate generation can be enabled via\n`RetrievalConfig`:\n\n- **Bucket candidates**: Coarse hash-bucketing on the first w bits of the\n  hypervector narrows the candidate set before exact scoring.\n- **Graph candidates**: BFS expansion from the nearest-neighbor seed through\n  the association graph, bounded by depth and fanout.\n\nBoth reduce the scored subset from n to a smaller candidate set while\npreserving exact similarity semantics on the reranking pass.\n\n### ANN/LSH Deferred\n\nApproximate nearest-neighbor (ANN) or locality-sensitive hashing (LSH) indexing\nis **intentionally deferred** until benchmarks demonstrate latency regression\nbeyond the current threshold. As documented in\n[ADR-0056](plans/adr/0056-performance-follow-up-priorities.md), the trigger is\n**\u003e200k concepts** with latency degradation. Current benchmarks show the exact\nscan completes in ~24ms at 200k concepts, well within acceptable bounds\n(see [ADR-0059](plans/adr/0059-retrieval-optimization.md) for retrieval\noptimization details and benchmark methodology).\n\n### Async Runtime\n\nThe framework is fully async. Do **not** wrap calls in `block_on` inside an\nexisting Tokio runtime — use `.await` directly or spawn a task. All public\nAPIs return `Result\u003cT, MemoryError\u003e` and use Tokio for I/O, with Rayon\ngated behind `#[cfg(not(target_arch = \"wasm32\"))]` for CPU parallelism.\n\n## Development Gates\n\n```bash\ncargo check --quiet\ncargo test --all-features --quiet\ncargo fmt --check --quiet\ncargo clippy --quiet -- -D warnings\n```\n\nLOC policy: each source file in `src/` must stay at or below 500 lines.\n\n## Mutation Testing\n\nInstall cargo-mutants once:\n\n```bash\ncargo install cargo-mutants\n```\n\nRun profiles:\n\n```bash\nscripts/mutation_test.sh fast\nscripts/mutation_test.sh full\n```\n\nReports are written under `progress/mutation/`.\n\n## Benchmark Gates\n\n```bash\ncargo bench --bench benchmark -- --save-baseline main\ncargo bench --bench benchmark -- --baseline main\ncargo bench --bench persistence_benchmark -- --save-baseline main\ncargo bench --bench persistence_benchmark -- --baseline main\n```\n\nPrimary perf gate: `reservoir_step_50k \u003c 100us`.\n\n## License\n\n[MIT](LICENSE)\n\n## Contributing\n\nContributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) for:\n- Code style and linting requirements\n- Test and benchmark commands\n- Pull request process\n- ADR submission for architectural changes\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-o-hub%2Fchaotic_semantic_memory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fd-o-hub%2Fchaotic_semantic_memory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-o-hub%2Fchaotic_semantic_memory/lists"}