{"id":21994234,"url":"https://github.com/codelibs/fess-webapp-multimodal","last_synced_at":"2026-05-04T04:36:59.560Z","repository":{"id":245286483,"uuid":"817799652","full_name":"codelibs/fess-webapp-multimodal","owner":"codelibs","description":null,"archived":false,"fork":false,"pushed_at":"2025-03-06T01:31:45.000Z","size":216,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-03-06T01:31:59.998Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/codelibs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-20T13:15:44.000Z","updated_at":"2025-03-06T01:31:48.000Z","dependencies_parsed_at":"2024-06-21T05:43:06.490Z","dependency_job_id":"dc82a477-0eaf-4882-be26-e4d1a5971573","html_url":"https://github.com/codelibs/fess-webapp-multimodal","commit_stats":null,"previous_names":["codelibs/fess-webapp-multimodal"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-multimodal","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-multimodal/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-multimodal/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-multimodal/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/codelibs","download_url":"https://codeload.github.com/codelibs/fess-webapp-multimodal/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245052673,"owners_count":20553172,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-29T21:08:05.108Z","updated_at":"2026-05-04T04:36:59.548Z","avatar_url":"https://github.com/codelibs.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fess Multimodal Search Plugin\n\n[![Maven Central](https://img.shields.io/maven-central/v/org.codelibs.fess/fess-webapp-multimodal.svg)](https://search.maven.org/artifact/org.codelibs.fess/fess-webapp-multimodal)\n[![Java CI with Maven](https://github.com/codelibs/fess-webapp-multimodal/actions/workflows/maven.yml/badge.svg)](https://github.com/codelibs/fess-webapp-multimodal/actions/workflows/maven.yml)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n\nA powerful multimodal search plugin for [Fess](https://fess.codelibs.org/) that enables semantic search across text, images, and other media formats using CLIP (Contrastive Language-Image Pre-training) embeddings and vector similarity search.\n\n## 🌟 Features\n\n- **Multimodal Search**: Search across text and images using natural language queries\n- **CLIP Integration**: Leverages OpenAI's CLIP model for generating high-quality embeddings\n- **Vector Similarity**: Uses OpenSearch/Elasticsearch KNN capabilities for fast vector search\n- **Seamless Integration**: Easy installation as a Fess plugin\n- **Scalable Architecture**: Built for enterprise-scale search deployments\n- **Open Source**: Apache 2.0 licensed with full source code availability\n\n## 🏗️ Architecture\n\nThe plugin extends Fess with the following components:\n\n- **CasClient**: Communicates with CLIP-as-a-Service for embedding generation\n- **MultiModalSearchHelper**: Configures vector field mappings and query rewriting\n- **KNNQueryBuilder**: Builds k-nearest neighbor queries for vector similarity search\n- **CasExtractor**: Extracts and processes image content during crawling\n- **EmbeddingIngester**: Handles vector embedding storage and indexing\n\n## 📋 Requirements\n\n- **Fess**: Version 15.0 or higher\n- **Java**: OpenJDK 11 or higher\n- **OpenSearch/Elasticsearch**: With KNN plugin support\n- **Docker**: For running the CLIP service\n- **GPU** (optional): For faster embedding generation\n\n## 🚀 Quick Start\n\n### 1. Installation\n\nDownload the plugin JAR from [Maven Central](https://repo1.maven.org/maven2/org/codelibs/fess/fess-webapp-multimodal/) and install it via the Fess administration console.\n\nAlternatively, add the dependency to your project:\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eorg.codelibs.fess\u003c/groupId\u003e\n    \u003cartifactId\u003efess-webapp-multimodal\u003c/artifactId\u003e\n    \u003cversion\u003e15.1.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n### 2. Start CLIP Service\n\nClone the repository and start the CLIP API server:\n\n```bash\ngit clone https://github.com/codelibs/fess-webapp-multimodal.git\ncd fess-webapp-multimodal/docker\ndocker compose up -d\n```\n\nThe CLIP API will be available at `http://localhost:51000`.\n\n### 3. Configure Fess\n\nAdd the following system properties in Fess administration console:\n\n```properties\nfess.multimodal.content.field=content_vector\nfess.multimodal.content.dimension=512\nfess.multimodal.content.method=hnsw\nfess.multimodal.content.engine=lucene\nfess.multimodal.content.space_type=cosinesimil\nfess.multimodal.min_score=0.5\n```\n\n### 4. Apply Configuration\n\n1. Navigate to **Scheduler** → Execute **Config Reloader**\n2. Navigate to **Maintenance** → Execute **Re-indexing**\n\n### 5. Start Crawling\n\nConfigure and start crawling directories containing images and documents. The plugin will automatically:\n- Extract text and image content\n- Generate CLIP embeddings\n- Store vectors in the search index\n\n## 🔍 Usage Examples\n\n### Text-to-Image Search\nSearch for images using natural language descriptions:\n```\n\"red sports car on highway\"\n\"sunset over mountains\"\n\"person playing guitar\"\n```\n\n### Cross-Modal Search\nFind related content across different media types:\n```\n\"beach vacation\" → Returns both text documents and beach images\n\"cooking recipe\" → Returns recipe text and food images\n```\n\n## ⚙️ Configuration\n\n### System Properties\n\n| Property | Description | Default | Example |\n|----------|-------------|---------|---------|\n| `fess.multimodal.content.field` | Vector field name | `content_vector` | `image_vector` |\n| `fess.multimodal.content.dimension` | Vector dimensions | `512` | `768` |\n| `fess.multimodal.content.method` | KNN algorithm | `hnsw` | `ivf` |\n| `fess.multimodal.content.engine` | Search engine | `lucene` | `nmslib` |\n| `fess.multimodal.content.space_type` | Distance metric | `cosinesimil` | `l2` |\n| `fess.multimodal.min_score` | Minimum similarity score | `0.5` | `0.7` |\n\n### CLIP Service Configuration\n\nThe CLIP service can be customized by modifying `docker/clip_config.yaml`:\n\n```yaml\njtype: Flow\nversion: '1'\nwith:\n  port: 51000\n  protocol: http\n  cors: true\nexecutors:\n  - name: clip_t\n    uses:\n      jtype: CLIPEncoder\n      metas:\n        py_modules:\n          - clip_server.executors.clip_torch\n```\n\n## 🧪 Testing\n\nRun the test suite:\n\n```bash\nmvn clean test\n```\n\nFor integration testing with sample data:\n\n```bash\n# Install test data using FiftyOne\npip install fiftyone\nfiftyone zoo datasets load open-images-v7 --split validation --kwargs max_samples=1000 -d ./test-images\n\n# Configure Fess to crawl the test-images directory\n```\n\n## 📊 Performance\n\n- **Embedding Generation**: ~50ms per image (with GPU), ~200ms (CPU only)\n- **Search Latency**: \u003c100ms for vector similarity queries\n- **Throughput**: 1000+ documents/minute during indexing\n- **Index Size**: ~2KB additional storage per document for vectors\n\n## 🛠️ Development\n\n### Building from Source\n\n```bash\ngit clone https://github.com/codelibs/fess-webapp-multimodal.git\ncd fess-webapp-multimodal\nmvn clean package\n```\n\n### Project Structure\n\n```\nsrc/main/java/org/codelibs/fess/multimodal/\n├── client/          # CLIP service client\n├── crawler/         # Content extraction\n├── helper/          # Search configuration\n├── index/           # Query builders\n├── query/           # Query processing\n├── rank/            # Result ranking\n└── util/            # Utilities\n```\n\n### Contributing\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n## 📚 Documentation\n\n- [Fess Documentation](https://fess.codelibs.org/)\n- [Plugin Installation Guide](https://fess.codelibs.org/15.1/admin/plugin-guide.html)\n- [OpenSearch KNN Plugin](https://opensearch.org/docs/latest/search-plugins/knn/)\n- [CLIP Paper](https://arxiv.org/abs/2103.00020)\n\n## 🐛 Troubleshooting\n\n### Common Issues\n\n**CLIP Service Connection Failed**\n```bash\n# Check if CLIP service is running\ncurl http://localhost:51000/health\n\n# Check Docker logs\ndocker logs clip_server\n```\n\n**Vector Search Not Working**\n- Ensure KNN plugin is installed in OpenSearch/Elasticsearch\n- Verify vector field mapping in index settings\n- Check minimum score threshold configuration\n\n**Performance Issues**\n- Enable GPU support for CLIP service\n- Increase JVM heap size for Fess\n- Optimize KNN index parameters\n\n## 📄 License\n\nThis project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- [OpenAI CLIP](https://github.com/openai/CLIP) for the foundational multimodal model\n- [Jina AI](https://github.com/jina-ai) for the CLIP server implementation\n- [CodeLibs](https://www.codelibs.org/) for the Fess search platform\n- All contributors who have helped improve this project\n\n## 📞 Support\n\n- **Issues**: [GitHub Issues](https://github.com/codelibs/fess-webapp-multimodal/issues)\n- **Documentation**: [Fess Official Docs](https://fess.codelibs.org/)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodelibs%2Ffess-webapp-multimodal","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodelibs%2Ffess-webapp-multimodal","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodelibs%2Ffess-webapp-multimodal/lists"}