{"id":21994236,"url":"https://github.com/codelibs/fess-webapp-semantic-search","last_synced_at":"2026-03-10T01:31:14.873Z","repository":{"id":65509596,"uuid":"585085654","full_name":"codelibs/fess-webapp-semantic-search","owner":"codelibs","description":null,"archived":false,"fork":false,"pushed_at":"2026-02-21T22:36:55.000Z","size":150,"stargazers_count":4,"open_issues_count":1,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-22T04:48:24.751Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/codelibs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-01-04T09:29:17.000Z","updated_at":"2026-02-21T22:36:59.000Z","dependencies_parsed_at":"2024-02-24T12:27:10.941Z","dependency_job_id":"a35b8794-a29d-47e0-b3ac-26f773d4fb4d","html_url":"https://github.com/codelibs/fess-webapp-semantic-search","commit_stats":null,"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"purl":"pkg:github/codelibs/fess-webapp-semantic-search","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-semantic-search","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-semantic-search/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-semantic-search/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-semantic-search/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/codelibs","download_url":"https://codeload.github.com/codelibs/fess-webapp-semantic-search/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelibs%2Ffess-webapp-semantic-search/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30320886,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-09T20:05:46.299Z","status":"ssl_error","status_checked_at":"2026-03-09T19:57:04.425Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-29T21:08:05.138Z","updated_at":"2026-03-10T01:31:14.189Z","avatar_url":"https://github.com/codelibs.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fess Semantic Search Plugin\n\n[![Java CI with Maven](https://github.com/codelibs/fess-webapp-semantic-search/actions/workflows/maven.yml/badge.svg)](https://github.com/codelibs/fess-webapp-semantic-search/actions/workflows/maven.yml)\n[![Maven Central](https://maven-badges.herokuapp.com/maven-central/org.codelibs.fess/fess-webapp-semantic-search/badge.svg)](https://maven-badges.herokuapp.com/maven-central/org.codelibs.fess/fess-webapp-semantic-search)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n\nA powerful semantic search plugin for [Fess](https://fess.codelibs.org/), the open-source enterprise search server. This plugin extends Fess's search capabilities by integrating neural search using OpenSearch's machine learning features and vector similarity search.\n\n## ✨ Features\n\n- **Neural Search Integration**: Leverages OpenSearch ML Commons plugin for semantic vector search\n- **Automatic Query Rewriting**: Converts traditional text queries to neural queries when appropriate\n- **Rank Fusion Processing**: Combines traditional and semantic search results for improved relevance\n- **Content Chunking**: Processes long documents in chunks for better semantic matching\n- **Configurable Models**: Supports multiple pre-trained transformer models from HuggingFace\n- **Seamless Integration**: Works as a drop-in plugin for existing Fess installations\n\n## 🚀 Quick Start\n\n### Prerequisites\n\n- Fess 15.0+ (Full-text Enterprise Search Server)\n- OpenSearch 2.x with ML Commons plugin enabled\n- Docker and Docker Compose (recommended for setup)\n\n### 1. Clone and Setup Docker Environment\n\n```bash\ngit clone https://github.com/codelibs/docker-fess.git\ncd docker-fess/compose\n```\n\n### 2. Configure Plugin in Docker Compose\n\nAdd the following line to your `compose.yaml`:\n\n```yaml\nenvironment:\n  - \"FESS_PLUGINS=fess-webapp-semantic-search:15.1.0\"\n```\n\n### 3. Start Services\n\n```bash\ndocker compose -f compose.yaml -f compose-opensearch2.yaml up -d\n```\n\n### 4. Initialize ML Models and Pipeline\n\nDownload and run the setup script:\n\n```bash\ncurl -o setup.sh https://raw.githubusercontent.com/codelibs/fess-webapp-semantic-search/main/tools/setup.sh\nchmod +x setup.sh\n./setup.sh localhost:9200\n```\n\nThe setup script will:\n- Display available pre-trained models\n- Register your selected model in OpenSearch\n- Create the neural search pipeline\n- Provide the configuration settings\n\n### 5. Configure Fess\n\nIn Fess Admin Panel (Admin \u003e General \u003e System Properties), add the configuration provided by the setup script:\n\n```properties\nfess.semantic_search.pipeline=neural_pipeline\nfess.semantic_search.content.field=content_vector\nfess.semantic_search.content.dimension=384\nfess.semantic_search.content.method=hnsw\nfess.semantic_search.content.engine=lucene\nfess.semantic_search.content.space_type=cosinesimil\nfess.semantic_search.content.model_id=\u003cyour-model-id\u003e\n```\n\n#### Optional: Performance Tuning (v15.3.0+)\n\nFor better performance, you can add these optional parameters:\n\n```properties\n# HNSW search-time parameter (higher = better recall, slower search)\nfess.semantic_search.content.param.ef_search=100\n\n# Enable performance monitoring for debugging\nfess.semantic_search.performance.monitoring.enabled=true\n\n# Enable batch inference (requires compatible ML model setup)\nfess.semantic_search.batch_inference.enabled=true\n```\n\n#### Optional: Diversity with MMR (Experimental)\n\nTo improve result diversity using Maximal Marginal Relevance:\n\n```properties\n# Enable MMR\nfess.semantic_search.mmr.enabled=true\n\n# Lambda: 1.0 = only relevance, 0.0 = only diversity, 0.5 = balanced\nfess.semantic_search.mmr.lambda=0.7\n```\n\n### 6. Create Index and Start Crawling\n\n1. Go to Admin \u003e Maintenance and start reindexing\n2. Create your crawling configuration\n3. Start the crawler\n4. Begin semantic searching!\n\n## 📖 Available Models\n\nThe plugin supports various pre-trained transformer models:\n\n| Model | Dimension | Description |\n|-------|-----------|-------------|\n| all-MiniLM-L6-v2 | 384 | Fast and efficient, good for general use |\n| all-mpnet-base-v2 | 768 | Higher quality, slower performance |\n| all-distilroberta-v1 | 768 | RoBERTa-based, good performance |\n| msmarco-distilbert-base-tas-b | 768 | Optimized for passage retrieval |\n| multi-qa-MiniLM-L6-cos-v1 | 384 | Specialized for question answering |\n| paraphrase-multilingual-MiniLM-L12-v2 | 384 | Multilingual support |\n\n## ⚙️ Configuration Options\n\n### Core Settings\n\n| Property | Description | Default |\n|----------|-------------|---------|\n| `fess.semantic_search.pipeline` | Neural search pipeline name | - |\n| `fess.semantic_search.content.model_id` | ML model ID in OpenSearch | - |\n| `fess.semantic_search.content.field` | Vector field name | - |\n| `fess.semantic_search.content.dimension` | Vector dimension size | - |\n\n### Advanced Settings\n\n| Property | Description | Default |\n|----------|-------------|---------|\n| `fess.semantic_search.content.method` | Vector search method | `hnsw` |\n| `fess.semantic_search.content.engine` | Vector search engine | `lucene` |\n| `fess.semantic_search.content.space_type` | Distance calculation method | `cosinesimil` |\n| `fess.semantic_search.min_score` | Minimum similarity score | - |\n| `fess.semantic_search.min_content_length` | Minimum content length for processing | - |\n| `fess.semantic_search.content.chunk_size` | Number of chunks to return | `1` |\n\n### HNSW Parameters\n\n| Property | Description | Default |\n|----------|-------------|---------|\n| `fess.semantic_search.content.param.m` | HNSW M parameter (higher = better recall, more memory) | `16` |\n| `fess.semantic_search.content.param.ef_construction` | HNSW ef_construction parameter (higher = better quality, slower indexing) | `100` |\n| `fess.semantic_search.content.param.ef_search` | HNSW ef_search parameter (higher = better recall, slower search) | Not set (OpenSearch default) |\n\n### Performance Tuning (v15.3.0+)\n\n| Property | Description | Default |\n|----------|-------------|---------|\n| `fess.semantic_search.performance.monitoring.enabled` | Enable detailed performance logging | `false` |\n| `fess.semantic_search.batch_inference.enabled` | Enable batch inference for better GPU utilization | `false` |\n\n### Experimental Features (v15.3.0+)\n\n| Property | Description | Default |\n|----------|-------------|---------|\n| `fess.semantic_search.mmr.enabled` | Enable Maximal Marginal Relevance for diversity | `false` |\n| `fess.semantic_search.mmr.lambda` | MMR lambda (1.0=relevance, 0.0=diversity) | `0.5` |\n\n## 🏗️ Architecture\n\n### Core Components\n\n- **SemanticSearchHelper**: Central component managing neural search configuration and model interactions\n- **NeuralQueryBuilder**: Custom OpenSearch query builder for neural/vector search queries  \n- **SemanticPhraseQueryCommand**: Converts phrase queries to neural queries when appropriate\n- **SemanticTermQueryCommand**: Handles term-based semantic search queries\n- **SemanticSearcher**: Extends Fess's DefaultSearcher for rank fusion processing\n\n### Integration Points\n\n- **Query Processing**: Integrates with Fess's QueryParser to rewrite queries for semantic search\n- **Document Processing**: Adds rewrite rules for OpenSearch mapping and settings to support vector fields\n- **Rank Fusion**: Registers as a searcher in Fess's rank fusion processor\n- **DI Container**: Uses LastaDi for dependency injection\n\n## 🔧 Development\n\n### Building from Source\n\n```bash\ngit clone https://github.com/codelibs/fess-webapp-semantic-search.git\ncd fess-webapp-semantic-search\nmvn clean package\n```\n\n### Running Tests\n\n```bash\nmvn test\n```\n\n### Code Quality\n\n```bash\nmvn clean compile javadoc:javadoc\n```\n\n## 📦 Installation Methods\n\n### Maven Repository\n\nThe plugin is available from Maven Central:\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eorg.codelibs.fess\u003c/groupId\u003e\n    \u003cartifactId\u003efess-webapp-semantic-search\u003c/artifactId\u003e\n    \u003cversion\u003e15.1.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n### Manual Installation\n\n1. Download the JAR from [Maven Repository](https://repo1.maven.org/maven2/org/codelibs/fess/fess-webapp-semantic-search/)\n2. Place it in your Fess webapp/WEB-INF/lib/ directory\n3. Restart Fess\n\n### Plugin Management\n\nSee the [Fess Plugin Guide](https://fess.codelibs.org/15.0/admin/plugin-guide.html) for detailed installation instructions.\n\n## 🤝 Contributing\n\nWe welcome contributions! \n\n### Development Setup\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Make your changes\n4. Add tests for new functionality\n5. Run the test suite (`mvn test`)\n6. Commit your changes (`git commit -m 'Add some amazing feature'`)\n7. Push to the branch (`git push origin feature/amazing-feature`)\n8. Open a Pull Request\n\n### Code Style\n\nThis project uses:\n- Maven for build management\n- JUnit for testing\n- CheckStyle for code formatting\n- JavaDoc for documentation\n\n## 📄 License\n\nThis project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.\n\n## 🔗 Links\n\n- [Fess Official Website](https://fess.codelibs.org/)\n- [OpenSearch ML Commons](https://opensearch.org/docs/latest/ml-commons-plugin/)\n- [Docker Fess](https://github.com/codelibs/docker-fess)\n- [Issue Tracker](https://github.com/codelibs/fess-webapp-semantic-search/issues)\n\n## 🚀 OpenSearch 3.3 Optimization (v15.3.0+)\n\nThis plugin is optimized for OpenSearch 3.3 with significant performance improvements and new features:\n\n### Key Improvements\n- **Concurrent Segment Search**: Enabled by default, up to 2.5x faster k-NN queries\n- **Improved HNSW**: Default `space_type` changed to `cosinesimil` for better semantic search accuracy\n- **Performance Monitoring**: Optional detailed query performance tracking\n- **Advanced Tuning**: Fine-grained control over HNSW parameters including `ef_search`\n\n### Migration from Earlier Versions\nIf upgrading from v15.2.x or earlier:\n1. The default `space_type` has changed from `l2` to `cosinesimil`\n2. To maintain compatibility with existing indices, explicitly set: `fess.semantic_search.content.space_type=l2`\n3. For new deployments, the new default `cosinesimil` is recommended\n\n## 📊 Version Compatibility\n\n| Plugin Version | Fess Version | OpenSearch Version |\n|----------------|--------------|-------------------|\n| 15.3.x | 15.3+ | 3.3.x (recommended) |\n| 15.0.x | 15.0+ | 2.x |\n| 14.9.x | 14.9+ | 2.x |\n\n## 🆘 Support\n\n- **Documentation**: [Fess Documentation](https://fess.codelibs.org/15.0/)\n- **Issues**: [GitHub Issues](https://github.com/codelibs/fess-webapp-semantic-search/issues)\n- **Discussions**: [GitHub Discussions](https://github.com/codelibs/fess-webapp-semantic-search/discussions)\n- **Community**: [Fess Community](https://discuss.codelibs.org/)\n\n## 🙏 Acknowledgments\n\n- [CodeLibs](https://www.codelibs.org/) for developing and maintaining Fess\n- [HuggingFace](https://huggingface.co/) for providing pre-trained transformer models\n- [OpenSearch](https://opensearch.org/) team for ML Commons plugin\n- All contributors who have helped improve this plugin\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodelibs%2Ffess-webapp-semantic-search","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodelibs%2Ffess-webapp-semantic-search","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodelibs%2Ffess-webapp-semantic-search/lists"}