{"id":27442933,"url":"https://github.com/donaldfilimon/wdbx-py","last_synced_at":"2026-04-30T13:34:22.449Z","repository":{"id":287926857,"uuid":"966258403","full_name":"donaldfilimon/wdbx-py","owner":"donaldfilimon","description":"WDBX is a flexible vector database system designed for AI applications with an extensible plugin architecture. Uses can extend to just about any simple or complex case and distributed servers as well. Code is very error optimized and simplified and allows plugins for you to extend everything. Miniature implementation of this standard that I made...","archived":false,"fork":false,"pushed_at":"2025-12-19T22:43:17.000Z","size":482,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-22T09:59:52.538Z","etag":null,"topics":["ai","database","distributed","ml","performance","plugins","python","python3","secure","sharded","wdbx"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/donaldfilimon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-04-14T16:39:03.000Z","updated_at":"2025-12-19T22:52:42.000Z","dependencies_parsed_at":"2025-04-15T01:17:48.586Z","dependency_job_id":"dc39fb6f-27d9-4cd3-920f-41caa6bd4031","html_url":"https://github.com/donaldfilimon/wdbx-py","commit_stats":null,"previous_names":["donaldfilimon/wdbx-py"],"tags_count":0,"template":false,"template_full_n
ame":null,"purl":"pkg:github/donaldfilimon/wdbx-py","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donaldfilimon%2Fwdbx-py","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donaldfilimon%2Fwdbx-py/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donaldfilimon%2Fwdbx-py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donaldfilimon%2Fwdbx-py/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/donaldfilimon","download_url":"https://codeload.github.com/donaldfilimon/wdbx-py/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donaldfilimon%2Fwdbx-py/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32466333,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"ssl_error","status_checked_at":"2026-04-30T13:12:06.837Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","database","distributed","ml","performance","plugins","python","python3","secure","sharded","wdbx"],"created_at":"2025-04-15T01:17:43.522Z","updated_at":"2026-04-30T13:34:22.435Z","avatar_url":"https://github.com/donaldfilimon.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# WDBX: Vector Database for AI 
Applications\n\n[![PyPI version](https://img.shields.io/pypi/v/wdbx.svg)](https://pypi.org/project/wdbx/)\n[![Python Versions](https://img.shields.io/pypi/pyversions/wdbx.svg)](https://pypi.org/project/wdbx/)\n[![License](https://img.shields.io/pypi/l/wdbx.svg)](https://github.com/wdbx/wdbx_python/blob/main/LICENSE)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\nWDBX is a flexible vector database system designed for AI applications with an extensible plugin architecture.\n\n## Features\n\n- 🚀 High-performance vector storage and similarity search with multiple indexing options\n- 🔄 Asynchronous API for non-blocking operations\n- 🔌 Extensible plugin architecture for easy integration with external services\n- 🌐 RESTful API server for remote access\n- 🤖 Built-in support for various embedding models and LLM providers\n- 📊 Advanced visualization and analytics capabilities\n- 🔄 Distributed architecture with sharding and replication\n- 🔒 Secure storage with support for authentication and encryption\n- 💻 Command-line interface for easy management\n\n## Installation\n\n```bash\npip install wdbx\n```\n\nTo install with specific components:\n\n```bash\npip install wdbx[api]          # Install with API server\npip install wdbx[security]     # Install with security features\npip install wdbx[visualization] # Install with visualization tools\npip install wdbx[indexing]     # Install with advanced indexing\npip install wdbx[webscraper]   # Install with web scraper plugin\npip install wdbx[ollama]       # Install with Ollama integration\npip install wdbx[all]          # Install with all components\n```\n\n### Docker Installation\n\nTo run WDBX using Docker, you can use the provided `docker-compose.yml` file:\n\n```bash\ndocker-compose up -d\n```\n\nThis will start the WDBX API server and other services defined in the `docker-compose.yml` file.\n\n## Configuration\n\nWDBX can be configured using a YAML 
configuration file located at `config/wdbx_config.yaml`. Below are the available configuration options:\n\n```yaml\n# WDBX Configuration\n\n# Core settings\nvector_dimension: 384\nnum_shards: 2\ndata_dir: \"./wdbx_data\"\nenable_plugins: true\nenable_distributed: false\nenable_gpu: false\nlog_level: \"INFO\"\n\n# Vector storage settings\nvector_store:\n  save_immediately: false\n  threads: 4\n  cache_size_mb: 128\n\n# Index settings\nindexing:\n  type: \"hnsw\" # \"hnsw\" or \"faiss\"\n  hnsw:\n    m: 16\n    ef_construction: 200\n    ef_search: 50\n  faiss:\n    index_type: \"Flat\"\n    nprobe: 8\n\n# API server settings\napi:\n  host: \"0.0.0.0\"\n  port: 8000\n  enable_auth: false\n  auth_key: \"\"\n  enable_cors: true\n  cors_origins: [\"*\"]\n\n# Plugin settings\nplugins:\n  # WebScraper plugin\n  webscraper:\n    user_agent: \"WDBX WebScraper/0.2.0\"\n    respect_robots_txt: true\n    timeout: 10.0\n    max_depth: 1\n    concurrency: 5\n    rate_limit: 1.0\n    embedding_model: \"all-MiniLM-L6-v2\"\n\n  # Ollama plugin\n  ollama:\n    host: \"http://localhost:11434\"\n    model: \"llama3\"\n    timeout: 30.0\n    embedding_model: \"all-MiniLM-L6-v2\"\n\n  # LMStudio plugin\n  lmstudio:\n    host: \"localhost\"\n    port: 8000\n    model: \"\"\n    embedding_model: \"\"\n    timeout: 30.0\n\n  # Social Media plugin\n  socialmedia:\n    enabled_platforms: \"twitter,reddit\"\n    cache_ttl: 300\n    demo_mode: true\n\n# Security settings\nsecurity:\n  enable_encryption: false\n  enable_authentication: false\n  enable_access_control: false\n  token_expiry: 86400 # 24 hours\n\n# Distributed settings\ndistributed:\n  host: \"localhost\"\n  port: 7777\n  auth_enabled: false\n  auth_key: \"\"\n  replication_factor: 1\n  coordinator_host: \"localhost\"\n  coordinator_port: 7777\n```\n\n## Quick Start\n\n### Basic Usage\n\n```python\nfrom wdbx import WDBX\n\n# Create a WDBX instance\nwdbx = WDBX(\n    vector_dimension=384,  # Common dimension for modern embedding 
models\n    num_shards=2,\n    data_dir=\"./wdbx_data\",\n    enable_plugins=True,\n)\n\n# Initialize the instance\nimport asyncio\nasyncio.run(wdbx.initialize())\n\n# Store a vector\nvector = [0.1 for _ in range(384)]  # Create a 384-dimensional vector with each element set to 0.1\nmetadata = {\"source\": \"example\", \"content\": \"Sample text\"}\nvector_id = wdbx.vector_store(vector, metadata)\n\n# Search for similar vectors\nresults = wdbx.vector_search(vector, limit=5)\nfor vector_id, similarity, metadata in results:\n    print(f\"Vector ID: {vector_id}, Similarity: {similarity:.4f}\")\n    print(f\"Content: {metadata.get('content')}\")\n\n# Don't forget to close the database\nasyncio.run(wdbx.shutdown())\n```\n\n### Asynchronous API\n\n```python\nimport asyncio\nfrom wdbx import WDBX\n\nasync def main():\n    # Create and initialize WDBX instance\n    wdbx = WDBX(vector_dimension=384)\n    await wdbx.initialize()\n\n    # Store vectors asynchronously\n    vector_id = await wdbx.vector_store_async([0.1 for _ in range(384)], {\"text\": \"Example\"})\n\n    # Search asynchronously\n    results = await wdbx.vector_search_async([0.1 for _ in range(384)], limit=5)\n\n    # Clean up\n    await wdbx.shutdown()\n\n# Run the async function\nasyncio.run(main())\n```\n\n### Using Plugins\n\n```python\nfrom wdbx import WDBX\n\n# Create WDBX with plugins enabled\nwdbx = WDBX(vector_dimension=384, enable_plugins=True)\n\n# Initialize the instance\nimport asyncio\nasyncio.run(wdbx.initialize())\n\n# Get a plugin instance\nwebscraper = wdbx.get_plugin(\"webscraper\")\n\n# Use the plugin to extract content and create an embedding\ncontent = asyncio.run(webscraper.extract_content(\"https://example.com\"))\nembedding = asyncio.run(webscraper.create_embedding(content))\n\n# Store in the database\nmetadata = {\"url\": \"https://example.com\", \"content\": content}\nvector_id = wdbx.vector_store(embedding, metadata)\n\n# Clean up\nasyncio.run(wdbx.shutdown())\n```\n\n### Using the 
CLI\n\nThe Command-Line Interface provides easy access to WDBX functionality:\n\n```bash\n# Display help\nwdbx help\n\n# Store a vector from text\nwdbx store --from-text \"This is a sample text to embed\"\n\n# Search for similar vectors\nwdbx search --from-text \"sample text\" --limit 5\n\n# Start the API server\nwdbx serve --port 8000\n```\n\n### Starting the API Server\n\n```python\nfrom wdbx import WDBX\nfrom wdbx.api import WDBXAPIServer\nimport asyncio\n\nasync def main():\n    # Create and initialize WDBX\n    wdbx = WDBX(vector_dimension=384, enable_plugins=True)\n    await wdbx.initialize()\n\n    # Create and start API server\n    server = WDBXAPIServer(wdbx, port=8000)\n    await server.initialize()\n    await server.start()\n\n# Run the server\nasyncio.run(main())\n```\n\n## Components\n\n### Core System\n\n- **Vector Storage**: High-performance storage for vector embeddings\n- **Indexing**: Multiple indexing options (HNSW, Faiss) for efficient similarity search\n- **Distributed Architecture**: Sharding and replication for scalability and fault tolerance\n- **Configuration Management**: Flexible configuration system with environment variables and config files\n\n### Plugins\n\nWDBX includes several plugins for integration with external services:\n\n| Plugin | Description | Status |\n|--------|-------------|--------|\n| WebScraper | Web content extraction and analysis | Stable |\n| Ollama | Local LLM integration via Ollama API | Stable |\n| LMStudio | OpenAI-compatible local API integration | Stable |\n| Discord | Chat integration with Discord | Stable |\n| Twitch | Twitch chat and API integration | Stable |\n| YouTube | YouTube data and analytics | Stable |\n| SocialMedia | Cross-platform social media integration | Stable |\n\n### Utilities\n\n- **Visualization**: Tools for visualizing vector spaces and relationships\n- **Security**: Authentication, encryption, and access control features\n- **API Server**: RESTful API for remote access to WDBX 
functionality\n- **CLI**: Command-line interface for easy management\n\n## API Endpoints\n\nThe WDBX API server provides the following endpoints:\n\n### Health Check\n\n- **GET /api/v1/health**: Check the health of the API server.\n\n### Vector Operations\n\n- **POST /api/v1/vectors**: Store a vector.\n- **POST /api/v1/vectors/search**: Search for similar vectors.\n- **GET /api/v1/vectors/{vector_id}**: Get a vector by ID.\n- **DELETE /api/v1/vectors/{vector_id}**: Delete a vector.\n- **PUT /api/v1/vectors/{vector_id}/metadata**: Update vector metadata.\n\n### Database Operations\n\n- **GET /api/v1/stats**: Get database statistics.\n- **POST /api/v1/clear**: Clear the database.\n\n### Embedding Operations\n\n- **POST /api/v1/embeddings**: Create an embedding for a text.\n- **POST /api/v1/embeddings/batch**: Create embeddings for a batch of texts.\n\n### Plugin Operations\n\n- **GET /api/v1/plugins**: List available plugins.\n- **GET /api/v1/plugins/{plugin_name}**: Get information about a plugin.\n\n## Documentation\n\nComprehensive documentation is available in the [docs](docs/) directory:\n\n- **API Reference**: Detailed class and method references\n- **Plugin System**: How the plugin system works\n- **Security Guide**: Authentication and encryption features\n- **Visualization Guide**: Tools for visualizing vector data\n- **CLI Reference**: Command-line interface documentation\n\n## Development\n\nTo set up the development environment:\n\n```bash\n# Clone the repository\ngit clone https://github.com/donaldfilimon/wdbx-py.git\ncd wdbx-py\n\n# Create and activate a virtual environment\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\n\n# Install development dependencies\npip install -r requirements.txt -U\n\n# Set up pre-commit hooks\npre-commit install\n```\n\n## Testing\n\nRun the test suite:\n\n```bash\n# Run core tests\npytest\n\n# Run plugin-specific tests\npython wdbx/tests.test_core.py -v\npython 
wdbx/tests.test_plugins.py -v\n```\n\n## Contributing\n\nContributions are welcome! Please see our [Contributing Guide](CONTRIBUTING.md) for details.\n\n## License\n\nWDBX is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonaldfilimon%2Fwdbx-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdonaldfilimon%2Fwdbx-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonaldfilimon%2Fwdbx-py/lists"}