{"id":29258187,"url":"https://github.com/sakhileln/ai-search-engine","last_synced_at":"2026-05-05T09:31:34.087Z","repository":{"id":302688817,"uuid":"1013281637","full_name":"sakhileln/ai-search-engine","owner":"sakhileln","description":"A modern, production-grade backend for an AI-powered search engine.","archived":false,"fork":false,"pushed_at":"2025-08-07T20:38:36.000Z","size":128,"stargazers_count":0,"open_issues_count":19,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-07T22:16:38.698Z","etag":null,"topics":["consul","docker","jenkins","jwt","kubernetes","microservices","nginx","oauth2","rate-limiting","rbac","sql","tls13"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sakhileln.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"docs/security/.gitkeep","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-03T16:32:30.000Z","updated_at":"2025-08-07T20:38:34.000Z","dependencies_parsed_at":"2025-07-21T02:24:35.087Z","dependency_job_id":"d13d14dc-64f1-4856-b6f9-e0c6ab5cc80c","html_url":"https://github.com/sakhileln/ai-search-engine","commit_stats":null,"previous_names":["sakhileln/ai-search-engine"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sakhileln/ai-search-engine","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakhileln%2Fai-search-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakhileln%2Fai-search-engine/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/sakhileln%2Fai-search-engine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakhileln%2Fai-search-engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sakhileln","download_url":"https://codeload.github.com/sakhileln/ai-search-engine/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakhileln%2Fai-search-engine/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32643513,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"online","status_checked_at":"2026-05-05T02:00:06.033Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["consul","docker","jenkins","jwt","kubernetes","microservices","nginx","oauth2","rate-limiting","rbac","sql","tls13"],"created_at":"2025-07-04T06:00:33.170Z","updated_at":"2026-05-05T09:31:34.066Z","avatar_url":"https://github.com/sakhileln.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ai search engine\nAn AI-powered search and question-answering platform that combines web search, natural language processing, and large language models to deliver concise, accurate, and context-aware answers to user queries. 
Unlike traditional search engines, this focuses on providing direct, conversational responses with sourced references, making it ideal for research and knowledge discovery. This project draws inspiration from Perplexity's approach to intelligent search and aims to replicate a secure, scalable backend to support similar query processing, data handling, and service orchestration in a production environment.\n\n## Table of Contents\n- [Overview](#overview)\n- [Running the Application](#running-the-application)\n- [Architecture](#architecture)\n- [Microservices](#microservices)\n- [Security Principles](#security-principles)\n- [Technologies](#technologies)\n- [Setup and Installation](#setup-and-installation)\n\n## Overview\nThis project delivers a scalable, secure backend system for a search platform, leveraging Spring Boot microservices. It integrates service discovery, messaging, sharded databases, and containerized deployment, with a strong emphasis on offensive security and observability. The system is designed to handle real-world production workloads while adhering to zero-trust security principles.\n\n## Running the Application\n- Clone the project\n    ```bash\n    git clone https://github.com/sakhileln/ai-search-engine.git\n    cd ai-search-engine\n    ```\n- Run the project\n  ```bash\n  mvn clean install -DskipTests\n  sudo docker-compose up --build -d\n  docker-compose logs -f auth-service\n  ```\n- Start Kubernetes cluster (e.g., Minikube):\n    ```bash\n    minikube start  \n    ```\n- Deploy services:\n    ```bash\n    kubectl apply -f k8s/\n    ```\n- Access the API via the NGINX gateway:\n    ```bash\n    curl https://api.secure-search.com/api/v1/query -H \"Authorization: Bearer \u003cJWT_TOKEN\u003e\"\n    ```\n\n## Architecture\nThe system follows a microservices architecture, orchestrated via Kubernetes, with services communicating through an API gateway. 
Key components include:\n- **Service Discovery**: Consul for dynamic service registration and health checks.\n- **Messaging**: Pluggable RabbitMQ or Kafka for asynchronous communication.\n- **Database**: Sharded PostgreSQL for scalability and data isolation.\n- **Containerization**: Docker for consistent deployments.\n- **API Gateway**: NGINX and Spring Cloud Gateway for routing, rate limiting, and TLS.\n- **Monitoring**: Prometheus, Grafana, and ELK Stack (planned) for observability.\n- **CI/CD**: Jenkins for automated builds, tests, and deployments.\n\n## Microservices\n| Service               | Purpose                                      | Key Tech / Security Focus                       |\n|------------------------|----------------------------------------------|--------------------------------------------------|\n| `auth-service`         | Authentication, authorization, JWT, OAuth2  | Secure session, password hashing, RBAC          |\n| `query-service`        | User query intake, routing, validation      | Input validation, audit logging                 |\n| `search-service`       | Search, AI model integration (mocked)       | Output encoding, rate limiting                  |\n| `data-service`         | Data persistence, PostgreSQL sharding       | SQL injection defense, data encryption          |\n| `notification-service` | Async notifications, event processing       | Message integrity, replay protection            |\n| `gateway-service`      | API gateway, routing, rate limiting, TLS    | API security, DoS protection                    |\n| `config-service`       | Centralized config management               | Secrets management, config validation           |\n| `discovery-service`    | Service registry/discovery (Consul)         | Service identity, health checks                 |\n\n## Security Principles\n- **Zero Trust**: All service-to-service calls are authenticated and authorized using JWT/OAuth2.\n- **Defense in Depth**: Layered security at network, 
application, and data levels.\n- **Observability**: Every request is traceable; security events are logged and monitored.\n- **Fail Secure**: Services fail closed on errors to prevent unauthorized access.\n- **Continuous Verification**: Security tests and static analysis are integrated into CI/CD.\n\n## Technologies\n- Language: Java 17+ (Spring Boot)\n- Service Discovery: Consul\n- Messaging: RabbitMQ or Kafka (pluggable)\n- Database: PostgreSQL (sharded)\n- Containerization: Docker\n- Orchestration: Kubernetes\n- CI/CD: Jenkins, GitHub Actions\n- API Gateway: NGINX, Spring Cloud Gateway\n- Monitoring: Prometheus, Grafana, ELK Stack (future)\n- Security: JWT, OAuth2, secure coding standards\n\n## Contact\n- Sakhile L. Ndlazi\n- [LinkedIn Profile](https://www.linkedin.com/in/sakhile-)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsakhileln%2Fai-search-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsakhileln%2Fai-search-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsakhileln%2Fai-search-engine/lists"}