{"id":31841575,"url":"https://github.com/dirvine/saorsa-rsps","last_synced_at":"2025-10-12T05:21:13.813Z","repository":{"id":310042437,"uuid":"1038455547","full_name":"dirvine/saorsa-rsps","owner":"dirvine","description":"Root-Scoped Provider Summaries using Golomb Coded Sets for P2P DHT","archived":false,"fork":false,"pushed_at":"2025-08-15T12:23:02.000Z","size":136,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-21T02:39:20.716Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dirvine.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-15T08:23:30.000Z","updated_at":"2025-08-15T12:23:06.000Z","dependencies_parsed_at":"2025-08-15T18:50:32.534Z","dependency_job_id":null,"html_url":"https://github.com/dirvine/saorsa-rsps","commit_stats":null,"previous_names":["dirvine/saorsa-rsps","dirvine/saorsa-rsps-foundation"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dirvine/saorsa-rsps","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dirvine%2Fsaorsa-rsps","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dirvine%2Fsaorsa-rsps/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dirvine%2Fsaorsa-rsps/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dirvine%2Fsaorsa-rsps/manifests","owner_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/owners/dirvine","download_url":"https://codeload.github.com/dirvine/saorsa-rsps/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dirvine%2Fsaorsa-rsps/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279010341,"owners_count":26084738,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-12T05:21:09.276Z","updated_at":"2025-10-12T05:21:13.808Z","avatar_url":"https://github.com/dirvine.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Saorsa RSPS\n\n[![Rust](https://github.com/dirvine/saorsa-rsps-foundation/actions/workflows/rust.yml/badge.svg)](https://github.com/dirvine/saorsa-rsps-foundation/actions/workflows/rust.yml)\n[![docs.rs](https://docs.rs/saorsa-rsps/badge.svg)](https://docs.rs/saorsa-rsps)\n[![Crates.io](https://img.shields.io/crates/v/saorsa-rsps.svg)](https://crates.io/crates/saorsa-rsps)\n\nRoot-Scoped Provider Summaries using Golomb Coded Sets (GCS) for efficient DHT lookups and cache management in P2P networks.\n\n## What This Solves\n\nIn decentralized networks like IPFS, BitTorrent, and other DHT-based P2P systems, finding content is expensive. 
Traditional approaches require:\n\n- **Broadcasting provider records** for every piece of content to the entire DHT network, consuming massive bandwidth\n- **Flooding the network** with discovery requests when searching for related content \n- **Storing individual provider records** for millions of Content IDs (CIDs), overwhelming DHT nodes with storage and lookup overhead\n\n**Saorsa RSPS** solves this by introducing **hierarchical content organization** with **ultra-compact summaries**:\n\n### The Problem: DHT Provider Record Explosion\nWhen you store a large dataset (like a website, software repository, or media collection) in a P2P network, each file chunk gets its own CID. A typical website might have thousands of CIDs, a software repository tens of thousands. Publishing provider records for each CID individually to the DHT:\n- Creates **millions of DHT messages** for large content\n- **Overwhelms DHT nodes** with storage requirements  \n- Makes **content discovery slow** due to network-wide searches\n- **Wastes bandwidth** with redundant provider advertisements\n\n### The Solution: Root-Scoped Provider Summaries\nInstead of advertising individual CIDs, RSPS lets you:\n\n1. **Group related content** under a single \"root CID\" (like a directory, repository, or collection)\n2. **Create a compact summary** using Golomb Coded Sets that represents thousands of CIDs in just a few KB\n3. **Advertise only the summary** to the DHT, reducing messages by 1000x or more\n4. 
**Enable fast batch discovery** - one lookup tells you if ANY of thousands of CIDs might be available\n\n### Real-World Use Cases\n\n- **Content Distribution**: Efficiently advertise that you host an entire website/app without flooding the DHT\n- **Software Repositories**: Let peers discover if you have specific versions/packages without individual lookups\n- **Media Collections**: Advertise entire albums, movie series, or dataset collections as single summaries\n- **Version Control**: Organize git-like repositories with hierarchical content discovery\n- **Caching Networks**: Smart cache admission - only cache content that's part of advertised collections\n\n### Performance Benefits\n\n- **20-30% more compact** than Bloom filters for the same false positive rate\n- **1000x reduction** in DHT provider record messages for large content collections\n- **Sub-millisecond membership testing** for thousands of CIDs\n- **Minimal memory overhead** - entire summaries fit in L1 cache\n- **Network-efficient serialization** - summaries transport in single UDP packets\n\n## Features\n\n- **Golomb Coded Sets**: Space-efficient Content ID (CID) summaries with configurable false positive rates\n- **Root-anchored Cache**: Cache admission policies anchored to root CIDs for hierarchical data organization\n- **TTL Management**: Sophisticated time-to-live management with hit tracking and witness receipts\n- **VRF Pseudonyms**: Verifiable Random Function pseudonyms for witness receipt systems\n- **Async/Await Support**: Full async/await support with Tokio\n- **High Performance**: Optimized for P2P DHT operations with minimal memory overhead\n\n## Quick Start\n\nAdd this to your `Cargo.toml`:\n\n```toml\n[dependencies]\nsaorsa-rsps = \"0.1.0\"\n```\n\n## Example Usage\n\n```rust\nuse saorsa_rsps::{Rsps, RspsConfig, Cid};\n\n#[tokio::main]\nasync fn main() -\u003e Result\u003c(), Box\u003cdyn std::error::Error\u003e\u003e {\n    let root_cid = [1u8; 32];\n    let cids = vec![\n        
[2u8; 32],\n        [3u8; 32],\n        [4u8; 32],\n    ];\n    let config = RspsConfig::default();\n    \n    // Create RSPS for a root with associated CIDs\n    let rsps = Rsps::new(root_cid, 1, \u0026cids, \u0026config)?;\n    \n    // Check if a CID might be under this root\n    let test_cid = [2u8; 32];\n    if rsps.contains(\u0026test_cid) {\n        println!(\"CID might be under this root\");\n    }\n    \n    // Get digest for DHT advertisement\n    let digest = rsps.digest();\n    println!(\"RSPS digest: {:?}\", digest);\n    \n    Ok(())\n}\n```\n\n## Components\n\n### Golomb Coded Sets (GCS)\nEfficient probabilistic data structure for representing sets with configurable false positive rates. Optimized for P2P networks where bandwidth and storage efficiency are critical.\n\n### Cache Management\nRoot-anchored cache policies that organize data hierarchically under root CIDs, with sophisticated TTL management based on hit frequency and witness receipts.\n\n### TTL Engine\nAdvanced time-to-live management with:\n- Base TTL for new entries\n- TTL extension per cache hit\n- TTL extension per witness receipt\n- Temporal bucketing for receipt aggregation\n\n### Witness System\nVRF-based witness receipts for distributed verification and reputation systems in P2P networks.\n\n## Architecture\n\nRSPS (Root-Scoped Provider Summaries) organize content hierarchically under root CIDs, enabling efficient DHT lookups for complex data structures. Each RSPS contains:\n\n- **Root CID**: The anchor point for hierarchical organization\n- **Epoch**: Version/time identifier for cache invalidation\n- **GCS**: Space-efficient summary of CIDs under this root\n- **Salt**: Deterministic salt for GCS construction\n- **Metadata**: Creation timestamp and configuration\n\n## Using RSPS in a Decentralized Network\n\nThis crate is designed for use in DHT-based or gossip-based networks where providers advertise summaries of content clustered under a root CID. A typical flow:\n\n1. 
**Producer builds RSPS**\n   - Select a `root_cid` and gather the set of child CIDs.\n   - Build an `Rsps` with an epoch representing the dataset version.\n   - Serialize or extract the `digest()` for lightweight advertisement in the DHT.\n\n2. **Advertisement**\n   - Publish either the RSPS bytes or its `digest` keyed by `root_cid`+`epoch` in your routing layer.\n   - Peers cache the announcement, possibly with TTL heuristics matching their local policy.\n\n3. **Discovery**\n   - A client looking for a `cid` first fetches the RSPS for the relevant `root_cid`/`epoch`.\n   - Use `rsps.contains(\u0026cid)` to check probabilistic membership.\n   - On positive result, proceed to fetch the content from providers under that root.\n\n4. **Cache integration**\n   - Register an `Rsps` with `RootAnchoredCache` to gate cache admission: only items in the RSPS are eligible.\n   - Use the `TtlEngine` to manage lifetimes based on hits and witness receipts.\n\n5. **Epoch rotation**\n   - When the dataset changes, increment `epoch` and republish a new RSPS.\n   - Consumers prefer the highest known epoch and drop stale ones per policy.\n\n### False-positive tuning\n\n- `RspsConfig.target_fpr` sets the target false positive rate. This crate uses Golomb–Rice coding internally, selecting a `p = 2^k` consistent with the requested FPR.\n- Trade-offs:\n  - Lower FPR → larger RSPS, more CPU to encode/decode, less cache pollution.\n  - Higher FPR → smaller RSPS, faster, but occasional extra fetches.\n\n### Why Golomb–Rice coding?\n\nGolomb coding represents non-negative integers as a quotient (unary-coded) and a remainder (binary-coded) with respect to a parameter `p`. When `p` is a power of two (`p = 2^k`), the scheme is called Golomb–Rice coding. We choose Rice coding for RSPS because:\n\n- Simpler and faster decode: the remainder is exactly `k = log2(p)` bits; no truncated-binary logic is needed, which keeps parsing branch-light and cache-friendly.\n- Deterministic footprint vs. 
FPR: we pick `k = ceil(log2(1 / target_fpr))` so the parameter ties directly to the requested false-positive rate.\n- Good practical compression for hashed, near-uniform deltas: the sorted hashed values modulo `n * p` produce geometric-like deltas that Rice handles efficiently.\n\nIn this crate, we enforce Rice coding by deriving `p` as a power of two and validating `p` during decode. This provides predictable encode/decode behavior and avoids edge cases that arise with general Golomb parameters.\n\n### Serialization and transport\n\n- `GolombCodedSet::to_bytes`/`from_bytes` serialize the GCS; an RSPS can be reconstructed from its components across nodes.\n- For transport, include: `root_cid`, `epoch`, `salt`, and `gcs.to_bytes()`.\n\n### Witness receipts\n\n- Nodes can issue witness receipts for successful retrievals under a root to extend TTLs and inform reputation systems.\n- Uses production-ready ed25519-dalek v2 for signatures and RFC 9381 ECVRF on ristretto255 for VRF pseudonyms.\n- Implements domain separation to prevent cross-protocol attacks.\n\n## Security Notes\n\n### Cryptographic Implementation\n- **Ed25519 signatures**: Uses `ed25519-dalek` v2.x, a battle-tested, misuse-resistant implementation\n- **VRF pseudonyms**: RFC 9381 compliant ECVRF on ristretto255 via `vrf-r255` crate\n- **Domain separation**: All cryptographic operations use distinct domain prefixes:\n  - VRF inputs: `b\"saorsa-rsps:vrf:v1:\"`\n  - Witness signatures: `b\"saorsa-rsps:witness:v1:\"`\n- **Key hygiene**: Secret keys are automatically zeroized on drop\n- **Separate key domains**: Ed25519 and VRF keys are kept completely separate\n\n### Security Properties\n- **No panics**: All cryptographic operations return `Result` types with proper error handling\n- **Strict validation**: Input validation for all key sizes and proof lengths\n- **Memory safety**: Built with Rust 2024 edition, `#![forbid(unsafe_code)]` in crypto module\n- **Audit trail**: Uses well-audited cryptographic 
libraries with extensive test coverage\n\n### Implementation Notes\n- GCS uses Golomb–Rice coding (power-of-two parameter) to ensure decode correctness and performance\n- VRF keys are separate from Ed25519 signature keys (different ciphersuites)\n- Domain separation prevents attacks where signatures/proofs from one context are replayed in another\n- All cryptographic operations are deterministic for the same inputs\n\n## Performance\n\nSaorsa RSPS is optimized for:\n- **Memory Efficiency**: GCS provides space-efficient CID summaries\n- **Network Efficiency**: Minimal bandwidth for DHT advertisements\n- **Lookup Speed**: Fast probabilistic membership testing\n- **Cache Effectiveness**: Smart TTL management with hit tracking\n\n## Safety and Security\n\n- Built with Rust 2024 edition for memory safety\n- No `unwrap()` or `expect()` in production code\n- Comprehensive error handling with `thiserror`\n- Security audit workflow with `cargo-audit`\n- Cryptographically secure random number generation\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## License\n\nThis project is licensed under the AGPL-3.0 license. See [LICENSE](LICENSE) for details.\n\n## Related Projects\n\n- [saorsa-fec](https://github.com/dirvine/saorsa-foundation) - Patent-free erasure coding\n- [saorsa-core](https://github.com/maidsafe/p2p) - P2P networking foundation","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdirvine%2Fsaorsa-rsps","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdirvine%2Fsaorsa-rsps","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdirvine%2Fsaorsa-rsps/lists"}