{"id":47909550,"url":"https://github.com/mbhatt1/maif","last_synced_at":"2026-04-04T05:14:11.712Z","repository":{"id":300510000,"uuid":"998582554","full_name":"mbhatt1/maif","owner":"mbhatt1","description":"Cryptographically-secure, auditable file format for AI agent memory with provenance tracking","archived":false,"fork":false,"pushed_at":"2026-01-22T22:43:38.000Z","size":139241,"stargazers_count":6,"open_issues_count":18,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-22T23:24:25.958Z","etag":null,"topics":["ai","ai-agent-tools","ai-agents-framework","cryptography"],"latest_commit_sha":null,"homepage":"https://vineethsai.github.io/maif/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mbhatt1.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":null},"created_at":"2025-06-08T23:04:23.000Z","updated_at":"2026-01-22T22:43:41.000Z","dependencies_parsed_at":"2025-06-22T05:37:20.824Z","dependency_job_id":null,"html_url":"https://github.com/mbhatt1/maif","commit_stats":null,"previous_names":["mbhatt1/maifscratch"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/mbhatt1/maif","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbhatt1%2Fmaif","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbhatt1%2Fmaif/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbhatt1%2Fmaif/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbhatt1%2Fmaif/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mbhatt1","download_url":"https://codeload.github.com/mbhatt1/maif/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbhatt1%2Fmaif/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31388345,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T04:26:24.776Z","status":"ssl_error","status_checked_at":"2026-04-04T04:23:34.147Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-agent-tools","ai-agents-framework","cryptography"],"created_at":"2026-04-04T05:14:11.045Z","updated_at":"2026-04-04T05:14:11.705Z","avatar_url":"https://github.com/mbhatt1.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/assets/maif-logo.svg\" alt=\"MAIF Logo\" width=\"200\"/\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eMAIF\u003c/h1\u003e\n\u003ch3 align=\"center\"\u003eMultimodal Artifact File Format for Trustworthy AI Agents\u003c/h3\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://pypi.org/project/maif/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/maif.svg\" alt=\"PyPI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/maif/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/dm/maif.svg\" alt=\"Downloads\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://www.python.org/downloads/\"\u003e\u003cimg src=\"https://img.shields.io/badge/python-3.9+-blue.svg\" alt=\"Python 3.9+\"\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-green.svg\" alt=\"License: MIT\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/vineethsai/maif/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://img.shields.io/github/actions/workflow/status/vineethsai/maif/ci.yml?branch=main\u0026label=tests\" alt=\"CI Tests\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://vineethsai.github.io/maif/\"\u003e\u003cimg src=\"https://img.shields.io/badge/docs-online-blue.svg\" alt=\"Documentation\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://deepwiki.com/vineethsai/maif\"\u003e\u003cimg src=\"https://img.shields.io/badge/DeepWiki-API%20Docs-blue.svg\" alt=\"DeepWiki\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/vineethsai/maif/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/vineethsai/maif\" alt=\"Release\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/vineethsai/maif/blob/main/CODE_OF_CONDUCT.md\"\u003e\u003cimg src=\"https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg\" alt=\"Code of Conduct\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  Cryptographically-secure, auditable file format for AI agent memory with provenance tracking\n\u003c/p\u003e\n\n---\n\n## Overview\n\nMAIF is a file format and SDK designed for AI agents that need **trustworthy memory**. Every piece of data is cryptographically linked, creating tamper-evident audit trails that prove exactly what happened, when, and by which agent.\n\n**Key Capabilities:**\n\n- **Cryptographic Provenance** - Hash-chained blocks for tamper-evident audit trails\n- **Multi-Agent Coordination** - Shared artifacts with agent-specific logging\n- **Multimodal Storage** - Text, embeddings, images, video, knowledge graphs\n- **Privacy-by-Design** - Encryption, anonymization, access control built-in\n- **High Performance** - Memory-mapped I/O, streaming, semantic compression\n\n## Use Cases\n\n- **Multi-Agent Systems** - Shared memory with full provenance (see LangGraph example)\n- **RAG Pipelines** - Document storage with embeddings, search, and citation tracking\n- **Compliance \u0026 Audit** - Immutable audit trails for regulated industries\n- **Research** - Reproducible experiments with complete data lineage\n- **Enterprise AI** - Secure, auditable AI workflows with access control\n\n## Framework Integrations\n\nMAIF provides drop-in integrations for popular AI agent frameworks:\n\n| Framework | Status | Description |\n|-----------|--------|-------------|\n| LangGraph | Available | State checkpointer with provenance |\n| CrewAI | Available | Crew/Agent callbacks, Memory |\n| LangChain | Available | Callbacks, VectorStore, Memory |\n| AWS Strands | Available | Agent callbacks |\n\n```bash\npip install maif[integrations]\n```\n\n### LangGraph\n\n```python\nfrom langgraph.graph import StateGraph\nfrom maif.integrations.langgraph import MAIFCheckpointer\n\ncheckpointer = MAIFCheckpointer(\"state.maif\")\napp = graph.compile(checkpointer=checkpointer)\nresult = app.invoke(state, config)\ncheckpointer.finalize()\n```\n\n### CrewAI\n\n```python\nfrom crewai import Crew\nfrom maif.integrations.crewai import MAIFCrewCallback\n\ncallback = MAIFCrewCallback(\"crew.maif\")\ncrew = Crew(\n    agents=[...],\n    tasks=[...],\n    task_callback=callback.on_task_complete,\n    step_callback=callback.on_step,\n)\nresult = crew.kickoff()\ncallback.finalize()\n```\n\nSee the [integrations documentation](docs/guide/integrations/) for full details.\n\n---\n\n## Quick Start\n\n**Prerequisites:** Python 3.9+\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/vineethsai/maif.git\ncd maif\n\n# Install MAIF\npip install -e .\n\n# With ML features (embeddings, semantic search)\npip install -e \".[ml]\"\n```\n\n### Your First MAIF Artifact\n\n```python\nfrom maif import MAIFEncoder, MAIFDecoder, verify_maif\n\n# Create an agent memory artifact (Ed25519 signed automatically)\nencoder = MAIFEncoder(\"agent_memory.maif\", agent_id=\"my-agent\")\n\n# Add content with automatic provenance tracking\nencoder.add_text_block(\"User asked about weather in NYC\", metadata={\"type\": \"query\"})\nencoder.add_text_block(\"Temperature is 72°F, sunny\", metadata={\"type\": \"response\"})\n\n# Finalize (signs and seals the file)\nencoder.finalize()\n\n# Later: Load and verify integrity\ndecoder = MAIFDecoder(\"agent_memory.maif\")\ndecoder.load()\n\nis_valid, errors = decoder.verify_integrity()\nprint(f\"Valid: {is_valid}, Blocks: {len(decoder.blocks)}\")\n\n# Read content\nfor i, block in enumerate(decoder.blocks):\n    text = decoder.get_text_content(i)\n    print(f\"Block {i}: {text}\")\n```\n\n**Secure MAIF Format:**\n- **Self-contained** - No separate manifest files, everything in one `.maif` file\n- **Ed25519 signatures** - Fast, compact 64-byte signatures on every block\n- **Immutable blocks** - Each block is signed immediately on write\n- **Tamper detection** - Cryptographic verification catches any modification\n- **Embedded provenance** - Full audit trail built into the file\n\n---\n\n## Featured Example: Multi-Agent RAG System\n\nA multi-agent system example with **LangGraph orchestration** and **MAIF provenance tracking** for demonstration purposes.\n\n```bash\ncd examples/langgraph\n\n# Configure API key\necho \"GEMINI_API_KEY=your_key\" \u003e .env\n\n# Install dependencies\npip install -r requirements_enhanced.txt\n\n# Create knowledge base with embeddings\npython3 create_kb_enhanced.py\n\n# Run the interactive demo\npython3 demo_enhanced.py\n```\n\n**What's Included:**\n- 5 specialized agents (Retriever, Synthesizer, Fact-Checker, Citation, Web Search)\n- ChromaDB vector store with semantic search\n- Gemini API integration for LLM reasoning\n- Complete audit trail of every agent action\n- Multi-turn conversation support\n\nSee [`examples/langgraph/README.md`](examples/langgraph/README.md) for full documentation.\n\n---\n\n## NEW: Enterprise AI Governance Demo\n\nInteractive demonstration of MAIF's enterprise-grade governance features:\n\n```bash\ncd examples/integrations/langgraph_governance_demo\npython main.py\n```\n\n**Features demonstrated:**\n- Cryptographic provenance (Ed25519 signatures, hash chains)\n- Tamper detection and data integrity verification\n- Role-based access control with audit logging\n- Multi-agent coordination with clear handoffs\n- Compliance report generation (Markdown, JSON, CSV)\n\nSee [`examples/integrations/langgraph_governance_demo/README.md`](examples/integrations/langgraph_governance_demo/README.md) for details.\n\n---\n\n## Features\n\n### Cryptographic Provenance\n\nEvery block is cryptographically signed and linked - any tampering is detectable.\n\n```python\nfrom maif import MAIFEncoder, MAIFDecoder\n\n# Each block is signed with Ed25519 on creation\nencoder = MAIFEncoder(\"memory.maif\", agent_id=\"agent-1\")\nencoder.add_text_block(\"First message\")   # Signed immediately\nencoder.add_text_block(\"Second message\")  # Linked to previous via hash\nencoder.add_text_block(\"Third message\")   # Chain continues\nencoder.finalize()\n\n# Verify the entire chain + all signatures\ndecoder = MAIFDecoder(\"memory.maif\")\ndecoder.load()\nis_valid, errors = decoder.verify_integrity()\n\n# Check provenance chain\nfor entry in decoder.provenance:\n    print(f\"{entry.action} by {entry.agent_id} at {entry.timestamp}\")\n```\n\n### Privacy \u0026 Security\n\nBuilt-in encryption, anonymization, and access control.\n\n```python\nfrom maif import PrivacyLevel, EncryptionMode\n\n# Add encrypted content\nmaif.add_text(\n    \"Sensitive data\",\n    encrypt=True,\n    anonymize=True,  # Auto-redact PII\n    privacy_level=PrivacyLevel.CONFIDENTIAL\n)\n\n# Access control\nmaif.add_access_rule(AccessRule(\n    role=\"analyst\",\n    permissions=[Permission.READ],\n    resources=[\"reports\"]\n))\n```\n\n### Multimodal Support\n\nStore and search across text, images, video, embeddings, and knowledge graphs.\n\n```python\nfrom maif_api import MAIF\n\nmaif = MAIF(\"my-agent\")\n\n# Text with metadata\nmaif.add_text(\"Analysis results\", metadata={\"title\": \"Report\", \"language\": \"en\"})\n\n# Images with feature extraction\nmaif.add_image(\"chart.png\", metadata={\"title\": \"Sales Chart\"})\n\n# Semantic embeddings (pre-computed or from TF-IDF)\nmaif.add_embeddings([[0.1, 0.2, 0.3], [0.2, 0.3, 0.4]])\n\n# Multimodal content - combines text, images, and embeddings\nmaif.add_multimodal({\n    \"text\": \"Product description\",\n    \"image_path\": \"product.jpg\",\n    \"embeddings\": [[0.1, 0.2, ...]],\n    \"metadata\": {\"category\": \"electronics\"}\n})\n\nmaif.save(\"output.maif\")\n```\n\n## What's Working\n\nThe following features are fully tested and working:\n\n- **Ed25519 cryptographic signatures** - Fast, compact 64-byte signatures ✓\n- **Multiple compression formats** - ZLIB, BROTLI, GZIP, and other standard formats ✓\n- **Framework integrations** - LangGraph, CrewAI, LangChain, AWS Strands ✓\n- **Provenance tracking** - Hash-chained blocks with tamper detection ✓\n- **TF-IDF embeddings** - Lightweight semantic search with sklearn ✓\n\n## What's In Progress / Research Phase\n\nThe following are research implementations with known limitations:\n\n### Hierarchical Semantic Compression (HSC)\n- **Status**: Research implementation\n- **Current performance**: ~1.5x compression ratio on embeddings\n- **What works**: DBSCAN clustering, vector quantization, Huffman coding\n- **Limitations**: Not achieving claimed 2.5-4x ratio, not production-ready\n- **Roadmap**: Plan to implement proper Product Quantization in v2.2\n\n### Adaptive Cross-Modal Attention (ACAM)\n```python\nfrom maif.semantic import AdaptiveCrossModalAttention\nimport numpy as np\n\n# ⚠ RESEARCH IMPLEMENTATION - Use with caution\nacam = AdaptiveCrossModalAttention(embedding_dim=384, num_heads=8)\n\n# Train on multimodal data (optional but recommended)\ntraining_data = [\n    {\n        \"text\": np.random.randn(384),\n        \"image\": np.random.randn(384),\n        \"audio\": np.random.randn(384),\n    },\n    # ... more samples\n]\nstats = acam.fit(training_data, epochs=10)\n\n# Use for attention computation\nembeddings = {\n    \"text\": np.random.randn(384),\n    \"image\": np.random.randn(384),\n}\nweights = acam.compute_attention_weights(embeddings)\n\n# Get fused representation\nattended = acam.get_attended_representation(embeddings, weights, \"text\")\n\n# Save/load trained weights\nacam.save_weights(\"acam_weights.pkl\")\nacam.load_weights(\"acam_weights.pkl\")\n```\n- **Status**: Research implementation\n- **Current capability**: Computes cross-modal attention weights\n- **Known limitations**: Training uses simple gradient descent, not optimized\n\n### Cryptographic Semantic Binding (CSB)\n- **Status**: Research implementation\n- **Current capability**: SHA-256 based commitment schemes\n- **Note**: Infrastructure in place, not validated for production use\n\n### Neural Embeddings\n- **Status**: ❌ Not implemented\n- **Current**: TF-IDF only (sklearn-based)\n- **Planned**: Optional sentence-transformers integration in future versions\n- **Note**: Infrastructure exists but neural models not functional\n\n---\n\n## Performance\n\n| Metric | Performance | Notes |\n|--------|-------------|-------|\n| Semantic Search | ~30ms for 1K vectors | TF-IDF based, tested at 1K, scales linearly |\n| Standard Compression (ZLIB) | 2-3× typical | Proven, well-tested |\n| Hierarchical Semantic Compression (HSC) | ~1.5× average | Research implementation, not production-ready |\n| Integrity Verification | ~0.1ms per file | Ed25519 signature verification |\n| Tamper Detection | 100% detection in \u003c0.1ms | Hash-chain verification |\n| Signature Overhead | 64 bytes per block | Ed25519 signatures |\n\n**Note:** HSC claims of \"2.5-4x compression up to 10x maximum\" were not verified and are not guaranteed. Current implementation achieves ~1.5x on embeddings.\n\n---\n\n## Project Structure\n\n```\nmaif/\n├── maif/                  # Core library\n│   ├── core.py           # MAIFEncoder, MAIFDecoder\n│   ├── security.py       # Signing, verification\n│   ├── privacy.py        # Encryption, anonymization\n│   ├── integrations/     # Framework integrations (LangGraph, etc.)\n│   └── semantic*.py      # Embeddings, compression\n├── maif_api.py           # High-level API\n├── examples/\n│   ├── langgraph/        # Multi-agent RAG system\n│   ├── integrations/     # Framework integration demos\n│   ├── basic/            # Getting started\n│   ├── security/         # Privacy \u0026 encryption\n│   └── advanced/         # Agent framework, lifecycle\n├── tests/                # 450+ tests\n├── docs/                 # VitePress documentation\n└── benchmarks/           # Performance tests\n```\n\n---\n\n## Documentation\n\n| Resource | Description |\n|----------|-------------|\n| [Getting Started](https://vineethsai.github.io/maif/guide/getting-started) | Quick start guide |\n| [Framework Integrations](https://vineethsai.github.io/maif/guide/integrations/) | LangGraph, LangChain, CrewAI |\n| [DeepWiki](https://deepwiki.com/vineethsai/maif) | Auto-generated API docs and code exploration |\n| [Examples](examples/) | Working code examples |\n\n---\n\n## Examples\n\n### Basic Usage\n```bash\npython examples/basic/simple_api_demo.py\npython examples/basic/basic_usage.py\n```\n\n### Privacy \u0026 Security\n```bash\npython examples/security/privacy_demo.py\npython examples/security/classified_api_simple_demo.py\n```\n\n### Advanced Features\n```bash\npython examples/advanced/maif_agent_demo.py          # Agent framework\npython examples/advanced/lifecycle_management_demo.py # Lifecycle management\npython examples/advanced/video_demo.py               # Video processing\n```\n\n---\n\n## Contributing\n\nWe welcome contributions! Please ensure:\n\n1. All tests pass (`pytest tests/`)\n2. Code follows PEP 8 style\n3. New features include tests and documentation\n4. Security-sensitive changes include impact analysis\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.\n\n---\n\n## References\n\n- [FIPS 140-2 Standards](https://csrc.nist.gov/publications/detail/fips/140/2/final) - Cryptographic module requirements\n- [NIST 800-53](https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final) - Security and privacy controls\n- [ISO BMFF](https://www.iso.org/standard/68960.html) - Binary format inspiration\n\n---\n\n## License\n\nMIT License - See [LICENSE](LICENSE) for details.\n\n---\n\n## Community \u0026 Support\n\n- **[GitHub Discussions](https://github.com/vineethsai/maif/discussions)** - Ask questions, share ideas\n- **[Issue Tracker](https://github.com/vineethsai/maif/issues)** - Report bugs or request features  \n- **[Documentation](https://vineethsai.github.io/maif/)** - Complete guides and API reference\n- **[Security](SECURITY.md)** - Report security vulnerabilities\n- **[Changelog](CHANGELOG.md)** - See what's new\n- **[Specification](SPECIFICATION.md)** - MAIF file format specification\n\n---\n\n\u003cp align=\"center\"\u003e\n  \u003cb\u003eBuild trustworthy AI agents with cryptographic provenance\u003c/b\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmbhatt1%2Fmaif","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmbhatt1%2Fmaif","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmbhatt1%2Fmaif/lists"}