https://github.com/nasrmohammad4804/search-engine-concept
This repo is for learning search engines such as ELK and web search engine concepts such as Google, to grow software engineering knowledge.
bm25 crwaler elasticsearch etl-pipeline google inverted-index kafka kibana microservice mongodb ranking redis search-engine tf-idf
- Host: GitHub
- URL: https://github.com/nasrmohammad4804/search-engine-concept
- Owner: nasrmohammad4804
- License: mit
- Created: 2024-09-22T06:52:23.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-04-17T09:14:29.000Z (6 months ago)
- Last Synced: 2025-04-17T23:56:21.987Z (6 months ago)
- Topics: bm25, crwaler, elasticsearch, etl-pipeline, google, inverted-index, kafka, kibana, microservice, mongodb, ranking, redis, search-engine, tf-idf
- Language: Java
- Homepage:
- Size: 13 MB
- Stars: 9
- Watchers: 1
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
### Have you ever wondered how a web search engine such as Google works? It seems very complex, but don't worry: we created Piston Engine, which behaves like one
### Piston Engine Demo
https://github.com/user-attachments/assets/09f97a12-bef5-487f-a47a-1d4923e5d1df
### 🤩🤩🤩 Wow, we implemented a lightweight web search engine similar to Google
-------------------------------------------------------------------------------------
## General architecture and overview of Piston Engine at a glance
We implemented the web search engine on top of Elasticsearch. 😎😎 In the next version we want to implement the key functionality of Elasticsearch ourselves instead of using it.
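One piece of "key functionality" a search engine like Elasticsearch relies on is the inverted index, which maps each term to the documents that contain it. The following is a minimal in-memory sketch of that idea, not code from this repository: the class name `InvertedIndexSketch`, its methods, and the simple tokenizer are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of the Piston Engine code base.
public class InvertedIndexSketch {
    // term -> sorted ids of the documents containing that term (the posting list)
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Lowercase and split on anything that is not a letter or digit.
    // A real analyzer (stemming, stop words, etc.) would do much more.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index a document: register its id under every term it contains.
    public void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // AND query: ids of the documents that contain every query term.
    public Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Kafka streams crawled pages");
        index.add(2, "Elasticsearch indexes crawled pages");
        System.out.println(index.search("crawled pages"));
        System.out.println(index.search("kafka pages"));
    }
}
```

Looking up a term is a single map access, which is why this structure suits full-text search better than scanning every document for every query.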
--------------------------------------------------------------------------------------
### Get started with the project
It's very easy to run:
1) Install Docker from the official page.
2) Use the following commands to start in development mode (open PowerShell or Git Bash):
chmod +x start-development-services.sh
bash start-development-services.sh
3) That's it: all services and dependencies are now installed and configured properly.
--------------------------------------------------
After all services and dependencies are running, a few things remain for the project to work correctly:
1) Add a domain through crawler-service so that it crawls that domain and its related pages; the crawled web pages are then indexed automatically in search-service.
We currently do this manually from Swagger, but a UI could be built for it so that only an admin user can do it.
It is accessible at http://localhost:8080/swagger-ui/index.html
2) You can access all documents generated from the crawler-service starting point in the Kibana dashboard
at http://localhost:5601 (username: elastic, password: 123456). It has a great visualization dashboard, like the one below.

3) You can also use the search-service endpoint at http://localhost:8083/swagger-ui/index.html to retrieve clean suggestion and search results, which are read from the Elasticsearch index.
Fortunately, we also created a UI that shows suggestions and search results from search-service, similar to Google.
It is accessible at http://localhost:3000

I built a UI for autocomplete in the search box, where you can select a suggestion, as in the picture below.
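To give a feel for what an autocomplete lookup involves, here is a small sketch of prefix suggestions over a sorted map, ordered by how often each phrase was seen. It is an in-memory stand-in and not the search-service implementation, which reads suggestions from the Elasticsearch index; the class `PrefixSuggesterSketch`, the `record` and `suggest` methods, and the popularity counter are assumptions for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of the Piston Engine code base.
public class PrefixSuggesterSketch {
    // phrase -> number of times it was recorded (an assumed popularity signal)
    private final TreeMap<String, Integer> phrases = new TreeMap<>();

    // Remember a phrase, normalised to lowercase without surrounding spaces.
    public void record(String phrase) {
        phrases.merge(phrase.toLowerCase(Locale.ROOT).trim(), 1, Integer::sum);
    }

    // Return up to `limit` phrases starting with `prefix`, most frequent first,
    // with ties broken alphabetically.
    public List<String> suggest(String prefix, int limit) {
        String p = prefix.toLowerCase(Locale.ROOT);
        Comparator<Map.Entry<String, Integer>> byCountThenKey = (a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        };
        // Because the map is sorted, every key in [p, p + '\uffff') starts with p.
        // Keys that continue with the character '\uffff' itself are excluded,
        // which is acceptable for a sketch.
        return phrases.subMap(p, p + Character.MAX_VALUE).entrySet().stream()
                .sorted(byCountThenKey)
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        PrefixSuggesterSketch suggester = new PrefixSuggesterSketch();
        suggester.record("search engine");
        suggester.record("search engine");
        suggester.record("sea turtle");
        System.out.println(suggester.suggest("sea", 5));
    }
}
```

The range scan over the sorted map avoids checking every stored phrase, which is the same reason dedicated suggesters keep their terms in a prefix-friendly structure.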

-----------------------------------------------------------------
After selecting a suggestion, we search based on the query to find the most related documents and rank them, as in the picture below.
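Ranking "most related" documents is typically done with a relevance score; BM25, one of this project's topics, is the default similarity in Elasticsearch. The sketch below scores documents with the common BM25 formula using k1 = 1.2 and b = 0.75. It is an assumption-laden illustration rather than the project's ranking code: the class `Bm25Sketch`, its methods, and the tokenizer are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of the Piston Engine code base.
public class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation
    static final double B = 0.75;  // strength of document-length normalisation

    private final List<List<String>> docs = new ArrayList<>();

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public void add(String text) {
        docs.add(tokenize(text));
    }

    // BM25 score of one document for a query (0.0 when no query term matches).
    public double score(int docIndex, String query) {
        double avgLen = docs.stream().mapToInt(List::size).average().orElse(0.0);
        List<String> doc = docs.get(docIndex);
        double total = 0.0;
        for (String term : new LinkedHashSet<>(tokenize(query))) {
            int tf = Collections.frequency(doc, term);
            if (tf == 0) {
                continue;
            }
            // Number of documents that contain the term at least once.
            long n = docs.stream().filter(d -> d.contains(term)).count();
            double idf = Math.log(1.0 + (docs.size() - n + 0.5) / (n + 0.5));
            double norm = tf + K1 * (1.0 - B + B * doc.size() / avgLen);
            total += idf * tf * (K1 + 1.0) / norm;
        }
        return total;
    }

    // Indices of matching documents, best score first.
    public List<Integer> rank(String query) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (score(i, query) > 0.0) {
                order.add(i);
            }
        }
        order.sort((x, y) -> Double.compare(score(y, query), score(x, query)));
        return order;
    }

    public static void main(String[] args) {
        Bm25Sketch bm25 = new Bm25Sketch();
        bm25.add("elasticsearch ranks documents with bm25");
        bm25.add("kafka moves crawled pages");
        bm25.add("bm25 bm25 ranking");
        System.out.println(bm25.rank("bm25"));
    }
}
```

This version recomputes statistics on every call to stay short; a real engine would precompute document frequencies and lengths at index time.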

Over time, as our dataset grows, our suggestions can become more precise and the search results more