{"id":28087394,"url":"https://github.com/nasrmohammad4804/search-engine-concept","last_synced_at":"2025-05-13T11:32:02.968Z","repository":{"id":259243765,"uuid":"861160802","full_name":"nasrmohammad4804/search-engine-concept","owner":"nasrmohammad4804","description":"this repo for learning search engine such as elk and web search engine concept such as google to grow knowledge of software engineering","archived":false,"fork":false,"pushed_at":"2025-04-17T09:14:29.000Z","size":13663,"stargazers_count":9,"open_issues_count":2,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-17T23:56:21.987Z","etag":null,"topics":["bm25","crwaler","elasticsearch","etl-pipeline","google","inverted-index","kafka","kibana","microservice","mongodb","ranking","redis","search-engine","tf-idf"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nasrmohammad4804.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-09-22T06:52:23.000Z","updated_at":"2025-04-17T09:14:33.000Z","dependencies_parsed_at":"2024-10-23T20:53:14.659Z","dependency_job_id":"43f244c4-3583-4bce-a490-56d9b7a868a1","html_url":"https://github.com/nasrmohammad4804/search-engine-concept","commit_stats":null,"previous_names":["nasrmohammad4804/search-engine-concept"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasrmohammad4804%2Fsearch-engine-concept","tags_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/repositories/nasrmohammad4804%2Fsearch-engine-concept/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasrmohammad4804%2Fsearch-engine-concept/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasrmohammad4804%2Fsearch-engine-concept/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nasrmohammad4804","download_url":"https://codeload.github.com/nasrmohammad4804/search-engine-concept/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253933005,"owners_count":21986495,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bm25","crwaler","elasticsearch","etl-pipeline","google","inverted-index","kafka","kibana","microservice","mongodb","ranking","redis","search-engine","tf-idf"],"created_at":"2025-05-13T11:31:58.146Z","updated_at":"2025-05-13T11:32:02.943Z","avatar_url":"https://github.com/nasrmohammad4804.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"### \u003ci\u003eHave you ever wondered how a web search engine such as Google works? Does it seem very complex? 
Don't worry: we are building Piston Engine, which behaves like one.\u003c/i\u003e\n\n### Piston Engine Demo\n\nhttps://github.com/user-attachments/assets/09f97a12-bef5-487f-a47a-1d4923e5d1df\n\n\u003cbr/\u003e \n\n### 🤩🤩🤩 Wow, we have implemented a lightweight web search engine, just like \u003cb style=\"color:blue\"\u003eG\u003c/b\u003e\u003cb style=\"color:red\"\u003eo\u003c/b\u003e\u003cb style=\"color:orange\"\u003eo\u003c/b\u003e\u003cb style=\"color:blue\"\u003eg\u003c/b\u003e\u003cb style=\"color:green\"\u003el\u003c/b\u003e\u003cb style=\"color:red\"\u003ee\u003c/b\u003e  \u003cbr/\u003e \u003cbr/\u003e\n-------------------------------------------------------------------------------------\n\n\n## General architecture and overview of Piston Engine at a glance\u003cbr/\u003e\u003cbr/\u003e\n![](picture.png)\n\n\n\nWe implemented the web search engine with Elasticsearch. 😎😎 In the next version, we want to implement the key functionality of Elasticsearch ourselves instead of using it. \u003cbr/\u003e \u003cbr/\u003e\n\n--------------------------------------------------------------------------------------\n\n### Get started with the project\n\nThe project is very easy to run:\n\n1) \u003cb\u003eInstall Docker from the official page.\u003c/b\u003e\n2) Use the following commands to start in \u003cb\u003edevelopment mode\u003c/b\u003e. Open \u003cb\u003ePowerShell\u003c/b\u003e or \u003cb\u003eGit Bash\u003c/b\u003e and run: \n            \n       chmod +x start-development-services.sh\n       bash start-development-services.sh\n\n3) That's it: all services and dependencies are now installed and configured properly.\n--------------------------------------------------\n\nAfter all services and dependencies are running, there are three things to do for the project to work correctly:\n\n1) Add a domain through crawler-service so that it crawls that domain and its related pages. The crawled web pages are then indexed automatically in search-service.\n\nFor now we do this manually from Swagger, 
but a UI could be built for this so that only an admin user can do it.  \u003cbr/\u003e\n\u003cspan style=\"font-weight:500;font-family:Verdana\"\u003eSwagger is accessible at http://localhost:8080/swagger-ui/index.html\u003c/span\u003e\n![crawl-page.png](crawl-page.png)\n\n2) You can access all documents generated from the starting point of \u003cb\u003ecrawler-service\u003c/b\u003e in the \u003cb\u003eKibana dashboard\u003c/b\u003e\n   \u003cspan style=\"font-weight:500;font-family:Verdana\"\u003e at the address (http://localhost:5601) with \u003ci\u003eusername -\u003e \u003cb\u003eelastic\u003c/b\u003e \u0026 password -\u003e \u003cb\u003e123456\u003c/b\u003e \u003c/i\u003e \u003c/span\u003e. \u003cbr/\u003e\n\nIt has a visualization dashboard like the one below:\n\n![](kibana-dashboard-page.png)\n\n3) Use the search-service endpoint at \u003cspan style=\"font-weight:500;font-family:Verdana\"\u003e(http://localhost:8083/swagger-ui/index.html) \u003c/span\u003e  to retrieve clean results for \u003cb\u003esuggestion \u0026 search\u003c/b\u003e, which are read from the Elasticsearch index. \u003cbr/\u003e\n\nFortunately, we have also created a UI that shows suggestions and search results from \u003cspan style=\"font-weight:600\"\u003esearch-service\u003c/span\u003e, just like \u003cb style=\"color:blue\"\u003eG\u003c/b\u003e\u003cb style=\"color:red\"\u003eo\u003c/b\u003e\u003cb style=\"color:orange\"\u003eo\u003c/b\u003e\u003cb style=\"color:blue\"\u003eg\u003c/b\u003e\u003cb style=\"color:green\"\u003el\u003c/b\u003e\u003cb style=\"color:red\"\u003ee\u003c/b\u003e.\u003cbr/\u003e \u003cspan style=\"font-weight:500;font-family:Verdana\"\u003eIt is accessible at http://localhost:3000\u003c/span\u003e\n\nThe UI offers autocomplete in the search box and lets you select a suggestion, as in the picture below: \u003cbr/\u003e\n\n\n![suggestion](suggestion-page.png)\n\n-----------------------------------------------------------------\n\nAfter a suggestion is selected, we search based on the query to find the most relevant documents and rank them. 
See the picture below:\n\n![search-page](search-page.png)\n\n\u003cbr\u003e\nOver time, as our dataset grows, both our suggestions and our search results can become more precise.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnasrmohammad4804%2Fsearch-engine-concept","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnasrmohammad4804%2Fsearch-engine-concept","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnasrmohammad4804%2Fsearch-engine-concept/lists"}