{"id":29613807,"url":"https://github.com/freddiehaddad/mvcompression","last_synced_at":"2025-10-19T00:12:17.933Z","repository":{"id":305526547,"uuid":"1023105541","full_name":"freddiehaddad/mvcompression","owner":"freddiehaddad","description":"🧠 Thread-safe adaptive compression decision system - learns when to skip ineffective compression using lock-free algorithms.","archived":false,"fork":false,"pushed_at":"2025-07-20T15:55:29.000Z","size":27,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-20T17:19:20.961Z","etag":null,"topics":["adaptive-algorithms","atomic-operations","compression","data-processing","lock-free","machine-learning","optimization","performance","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/freddiehaddad.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-20T14:37:50.000Z","updated_at":"2025-07-20T15:55:32.000Z","dependencies_parsed_at":"2025-07-20T17:19:27.829Z","dependency_job_id":null,"html_url":"https://github.com/freddiehaddad/mvcompression","commit_stats":null,"previous_names":["freddiehaddad/mvcompression"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/freddiehaddad/mvcompression","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/freddiehaddad%2Fmvcompression","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/freddiehaddad%2Fmvcompression/tags","releases
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/freddiehaddad%2Fmvcompression/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/freddiehaddad%2Fmvcompression/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/freddiehaddad","download_url":"https://codeload.github.com/freddiehaddad/mvcompression/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/freddiehaddad%2Fmvcompression/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266211497,"owners_count":23893347,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adaptive-algorithms","atomic-operations","compression","data-processing","lock-free","machine-learning","optimization","performance","rust"],"created_at":"2025-07-20T22:37:47.309Z","updated_at":"2025-10-19T00:12:12.915Z","avatar_url":"https://github.com/freddiehaddad.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MVCompression - Adaptive Compression Decision System\n\n[![Rust](https://img.shields.io/badge/rust-1.70+-blue.svg)](https://www.rust-lang.org)\n[![License](https://img.shields.io/badge/license-MIT%20OR%20Apache--2.0-blue.svg)](LICENSE)\n[![Documentation](https://img.shields.io/badge/docs-pending%20publication-orange.svg)](#documentation)\n\nA thread-safe, lock-free adaptive compression decision system that learns from past compression performance to intelligently decide whether to compress future data blocks.\n\n## 🚀 Features\n\n- **🧠 Adaptive Learning**: 
Automatically learns from compression effectiveness over time\n- **🔒 Thread-Safe**: Lock-free atomic operations for high-performance concurrent access\n- **⚡ High Performance**: Minimal overhead with atomic compare-and-swap operations\n- **🎯 Self-Tuning**: Automatically adjusts behavior based on data characteristics\n- **📊 Monitoring**: Built-in metrics for algorithm state and performance tracking\n- **🛡️ Safe**: Memory-safe Rust implementation with comprehensive testing\n\n## 📖 Overview\n\nThe MVCompression algorithm maintains a \"compression value\" score and moving averages of compressed/uncompressed block sizes to make intelligent compression decisions. It adapts its behavior based on historical compression effectiveness, automatically skipping compression when it's likely to be ineffective.\n\n### How It Works\n\n1. **Compression Value**: Starts at -80 and adjusts based on compression results\n   - Good compression (ratio ≤ 0.9): decreases value by 10\n   - Poor compression (ratio \u003e 0.9): increases value by 4\n   - Skip events: decreases value by 1\n\n2. **Skip Logic**: When compression value becomes positive:\n   - Compares incoming block size to historical average\n   - Skips compression if size is within 25% of expected size\n\n3. 
**Moving Averages**: Tracks compressed and uncompressed block sizes\n   - Uses exponential moving average with smoothing factor\n   - Helps predict future compression effectiveness\n\n## 🚀 Quick Start\n\nAdd this to your `Cargo.toml`:\n\n```toml\n[dependencies]\nmvcompression = \"0.1.0\"\n```\n\n### Basic Usage\n\n```rust\nuse mvcompression::MVCompression;\n\n// Placeholder helpers (not part of the crate): substitute your own\n// compressor and storage layer.\nfn compress(data: \u0026[u8]) -\u003e Vec\u003cu8\u003e { data.to_vec() }\nfn store_compressed(_data: \u0026[u8]) {}\nfn store_uncompressed(_data: \u0026[u8]) {}\n\nfn main() {\n    let mvc = MVCompression::new();\n    // Placeholder input: four 1000-byte blocks\n    let data_blocks: Vec\u003cVec\u003cu8\u003e\u003e = vec![vec![0u8; 1000]; 4];\n    \n    // Process data blocks\n    for block_data in \u0026data_blocks {\n        if mvc.should_skip_compression(block_data.len()) {\n            // Skip compression for this block\n            store_uncompressed(block_data);\n        } else {\n            // Attempt compression\n            let compressed = compress(block_data);\n            \n            // Update algorithm with results\n            mvc.update_compression_ratio(compressed.len(), block_data.len());\n            store_compressed(\u0026compressed);\n        }\n    }\n}\n```\n\n### Thread-Safe Usage\n\n```rust\nuse mvcompression::MVCompression;\nuse std::sync::Arc;\nuse std::thread;\n\nfn main() {\n    let mvc = Arc::new(MVCompression::new());\n    \n    // Spawn multiple worker threads\n    let handles: Vec\u003c_\u003e = (0..4).map(|_| {\n        let mvc = Arc::clone(\u0026mvc);\n        thread::spawn(move || {\n            // Each thread can safely use the same MVCompression instance\n            if mvc.should_skip_compression(1024) {\n                // Skip compression\n            } else {\n                // Perform compression and update\n                mvc.update_compression_ratio(512, 1024);\n            }\n        })\n    }).collect();\n    \n    for handle in handles {\n        handle.join().unwrap();\n    }\n}\n```\n\n## 📊 Algorithm Parameters\n\nThe algorithm uses several constants that determine its behavior:\n\n| Parameter | Value | Description |\n|-----------|-------|-------------|\n| `BLOCK_COMPRESSABLE_RATIO` | 0.9 | Threshold for good vs poor compression |\n| 
`INITIAL_COMPRESSION_VALUE` | -80 | Starting compression value |\n| `COMPRESSIBLE_BLOCK_WEIGHT` | -10 | Adjustment for good compression |\n| `NON_COMPRESSIBLE_BLOCK_WEIGHT` | 4 | Adjustment for poor compression |\n| `SKIP_COMPRESSION_BLOCK_WEIGHT` | -1 | Adjustment when skipping |\n| `MAX_COMPRESSION_VALUE` | 200 | Upper bound for compression value |\n| `MIN_COMPRESSION_VALUE` | -300 | Lower bound for compression value |\n\nThese parameters create a system that:\n- Starts optimistic (negative value = always compress)\n- Moves toward skipping in small steps (+4 per poor block) and back toward compressing in large ones (-10 per good block)\n- Gradually returns to compression attempts through skip penalties (-1 per skipped block)\n\n## 🔧 API Reference\n\n### Core Methods\n\n- `MVCompression::new()` - Create a new instance\n- `should_skip_compression(size: usize) -\u003e bool` - Check if compression should be skipped\n- `update_compression_ratio(compressed: usize, uncompressed: usize)` - Update algorithm with compression results\n\n### Monitoring Methods\n\n- `get_compression_value() -\u003e i32` - Get current compression bias value\n- `get_compressed_average() -\u003e usize` - Get smoothed compressed size average\n- `get_uncompressed_average() -\u003e usize` - Get smoothed uncompressed size average\n\n## 📈 Performance Characteristics\n\n- **Lock-free**: All operations use atomic compare-and-swap loops\n- **Memory efficient**: Only three atomic values per instance (24 bytes on 64-bit targets, including padding)\n- **Low overhead**: Minimal computation per decision (~10-20 CPU cycles)\n- **Scalable**: No locks, so threads never block one another\n- **Cache-friendly**: Compact memory layout with good locality\n\n## 🧪 Examples\n\n### Run the Basic Example\n\n```bash\ncargo run --example basic_usage\n```\n\nThis runs a simulation showing how the algorithm learns to skip ineffective compression over time.\n\n### Run Performance Analysis\n\n```bash\ncargo run --release --example performance_analysis\n```\n\nThis provides comprehensive performance metrics including 
throughput, latency, memory usage, and convergence analysis.\n\n### Expected Output\n\n```\nMVCompression Algorithm Demo\n============================\nSimulating compression of 30 blocks (1000 bytes each)\n...\nBlock 21: COMPRESSED 1000 -\u003e 1000 bytes (ratio: 1.00)\nBlock 22: SKIPPED compression (size: 1000 bytes)\nBlock 23: SKIPPED compression (size: 1000 bytes)\n...\n✓ Algorithm successfully learned to skip ineffective compression!\n```\n\n## 🧪 Testing\n\nRun the comprehensive test suite:\n\n```bash\n# Run all tests\ncargo test\n\n# Run tests with output\ncargo test -- --nocapture\n\n# Run specific test\ncargo test test_thread_safety\n```\n\nThe test suite includes:\n- Basic functionality tests\n- Thread safety verification\n- Edge case handling\n- Boundary condition testing\n- Performance regression tests\n\n## 🔍 Use Cases\n\nThis algorithm is particularly useful for:\n\n1. **Streaming Data Processing**: Real-time decision making for large data streams\n2. **Database Storage**: Adaptive compression for variable data types\n3. **Network Protocols**: Dynamic compression decisions based on payload characteristics\n4. **File Systems**: Intelligent compression for diverse file types\n5. **Backup Systems**: Optimizing backup speed vs storage efficiency\n6. 
**CDN/Caching**: Adaptive compression for web content delivery\n\n## 🧮 Algorithm Analysis\n\n### Convergence Behavior\n\nThe algorithm typically settles within 20-30 blocks; with the default parameters, the compression value climbs from -80 to +4 after 21 consecutive poorly compressed blocks. Behavior by data type:\n\n- **Highly compressible data**: Maintains negative compression value, rarely skips\n- **Poorly compressible data**: Develops positive compression value, frequently skips\n- **Mixed data**: Adapts dynamically based on recent block characteristics\n\n### Mathematical Properties\n\n- **Stability**: Bounded compression value (-300 to 200) prevents unbounded drift\n- **Responsiveness**: Asymmetric weights (-10 vs +4) let well-compressed blocks restore compression quickly\n- **Memory**: Exponential moving average provides historical context\n- **Convergence**: System converges to optimal skip rate for given data characteristics\n\n## ⚡ Performance Analysis\n\n### Single-Threaded Performance\n\nBased on release-mode benchmarks on modern hardware:\n\n| Operation | Throughput | Latency |\n|-----------|------------|---------|\n| `should_skip_compression()` | ~2.1 billion ops/sec | ~0.47 ns |\n| `update_compression_ratio()` | ~109 million ops/sec | ~9.18 ns |\n\n### Multi-Threaded Scalability\n\nThe lock-free atomic design lets all threads share one instance without blocking:\n\n| Threads | Combined Throughput | Efficiency |\n|---------|-------------------|------------|\n| 1 | 91M ops/sec | 100% |\n| 2 | 56M ops/sec | 62% |\n| 4 | 95M ops/sec | 52% |\n| 8 | 157M ops/sec | 43% |\n| 16 | 260M ops/sec | 36% |\n\n*Note: Efficiency decreases with thread count due to cache coherency overhead. Combined throughput dips at 2 threads and grows with each further doubling of the thread count.*\n\n### Memory Characteristics\n\n- **Struct size**: 24 bytes total on 64-bit targets\n  - `AtomicI32`: 4 bytes (compression value)\n  - `AtomicUsize` × 2: 16 bytes (moving averages)\n  - Padding: 4 bytes\n- **No heap allocations**: The struct holds only inline atomic fields\n- **Cache-friendly**: Fits in a single cache line (64 bytes)\n- **Memory bandwidth**: Minimal (3 atomic loads/stores per operation)\n\n### Convergence Performance\n\nAlgorithm 
adapts quickly to data characteristics:\n\n| Data Pattern | Convergence Point | Final Skip Rate | Stability |\n|--------------|------------------|-----------------|-----------|\n| Highly Compressible (20% ratio) | 10 blocks | 0% | Excellent |\n| Poorly Compressible (95% ratio) | 10 blocks | 48% | Excellent |\n| Mixed Data (alternating) | 10 blocks | 0% | Good |\n| Random Data (30-90% ratios) | 10 blocks | 0% | Good |\n\n### Worst-Case Scenarios\n\nThe algorithm handles pathological cases gracefully:\n\n- **Alternating patterns**: 10K operations in \u003c1ms\n- **High contention**: 32 threads × 1K operations in \u003c1ms\n- **Lock-free guarantee**: No deadlocks or priority inversion\n- **Bounded behavior**: Always terminates within value bounds\n\n### Performance Optimization Tips\n\n1. **Batch operations**: Group multiple decisions when possible\n2. **Avoid false sharing**: Keep `MVCompression` instances on separate cache lines\n3. **Release builds**: Performance is 10-100x better than debug builds\n4. 
**CPU-specific optimization**: Use `RUSTFLAGS=\"-C target-cpu=native\"`\n\n### Comparison with Alternatives\n\n| Approach | Latency | Thread Safety | Memory | Adaptability |\n|----------|---------|---------------|--------|--------------|\n| **MVCompression** | ~0.5ns | Lock-free | 24 bytes | Excellent |\n| Mutex-based | ~20-100ns | Blocking | 32+ bytes | Good |\n| Thread-local | ~0.3ns | Not shared | 24×threads | Poor |\n| Fixed threshold | ~0.1ns | Stateless | 0 bytes | None |\n\n## 🚦 Limitations\n\n- **Learning Period**: From the initial state, needs at least 21 poorly compressed blocks before the first skip\n- **Block Size Sensitivity**: Works best with relatively consistent block sizes\n- **Compression Ratio Threshold**: Fixed 0.9 threshold may not suit all use cases\n- **Memory Overhead**: 24 bytes of tracking state per instance\n\n## 🛠️ Development\n\n### Building\n\n```bash\n# Debug build\ncargo build\n\n# Release build\ncargo build --release\n\n# Release build tuned for the host CPU\nRUSTFLAGS=\"-C target-cpu=native\" cargo build --release\n```\n\n### Benchmarking\n\n```bash\n# Run comprehensive performance analysis\ncargo run --release --example performance_analysis\n\n# Run criterion benchmarks (if available)\ncargo bench\n\n# Profile with perf (Linux)\ncargo build --release\nperf record --call-graph dwarf target/release/examples/performance_analysis\n```\n\n### Documentation\n\nTo view the full API documentation locally:\n\n```bash\n# Generate and open documentation in your browser\ncargo doc --open\n```\n\nThe documentation includes:\n- Complete API reference with examples\n- Algorithm implementation details\n- Thread safety guarantees\n- Performance characteristics\n\n*Note: Online documentation will be available at [docs.rs](https://docs.rs/mvcompression) once the crate is published.*\n\n## 📝 License\n\nThis project is licensed under either of\n\n * Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or\n   http://www.apache.org/licenses/LICENSE-2.0)\n * MIT license 
([LICENSE-MIT](LICENSE-MIT) or\n   http://opensource.org/licenses/MIT)\n\nat your option.\n\n## 🤝 Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.\n\n### Development Setup\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Make your changes\n4. Add tests for your changes\n5. Ensure all tests pass (`cargo test`)\n6. Commit your changes (`git commit -m 'Add amazing feature'`)\n7. Push to the branch (`git push origin feature/amazing-feature`)\n8. Open a Pull Request\n\n### Code Style\n\n- Follow standard Rust formatting (`cargo fmt`)\n- Ensure no clippy warnings (`cargo clippy`)\n- Add documentation for public APIs\n- Include tests for new functionality\n\n## 📚 References\n\n- API Documentation: Run `cargo doc --open` to view locally\n\n## 🔗 Related Projects\n\n- [LZ4](https://github.com/lz4/lz4) - Fast compression algorithm\n- [Zstd](https://github.com/facebook/zstd) - High-performance compression\n- [Snappy](https://github.com/google/snappy) - Fast compression/decompression library\n\n---\n\n**Made with ❤️ in Rust** 🦀\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffreddiehaddad%2Fmvcompression","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffreddiehaddad%2Fmvcompression","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffreddiehaddad%2Fmvcompression/lists"}