
https://github.com/nasrmohammad4804/search-engine-concept

This repo is for learning search engine concepts, covering tools such as ELK and web search engines such as Google, in order to grow software engineering knowledge.

bm25 crwaler elasticsearch etl-pipeline google inverted-index kafka kibana microservice mongodb ranking redis search-engine tf-idf


### Have you ever wondered how a web search engine such as Google works? Does it seem very complex? Don't worry: we created Piston Engine, which behaves like one

### Piston Engine Demo

https://github.com/user-attachments/assets/09f97a12-bef5-487f-a47a-1d4923e5d1df


### 🤩🤩🤩 Wow, we implemented a lightweight web search engine similar to Google


-------------------------------------------------------------------------------------

## General architecture and overview of Piston Engine at a glance


![](picture.png)

We implemented the web search engine with Elasticsearch. 😎😎 In the next version we want to implement the key functionality of Elasticsearch ourselves instead of using it.
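
To give an idea of what "key functionality" could mean, the core data structure behind this kind of search is an inverted index, which maps each term to the documents that contain it. The following is only a minimal sketch, not the project's implementation: the function names `tokenize`, `build_inverted_index` and `search_all_terms`, and the sample documents, are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it.

    `docs` is a dict of {doc_id: text}; the sample data below is hypothetical.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_all_terms(index, query):
    """Return the ids of documents that contain every term of the query (AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents, used only to exercise the functions.
docs = {
    1: "Piston Engine is a lightweight web search engine",
    2: "The crawler collects web pages",
    3: "Search results are ranked",
}
index = build_inverted_index(docs)
print(sorted(search_all_terms(index, "web search")))
```

A real engine would also store term positions and frequencies per document so that results can be ranked, not only matched.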

--------------------------------------------------------------------------------------
### Get started with the project

It is very easy to run:

1) Install Docker from the official page.
2) Run the following commands to start in development mode. Open PowerShell or Git Bash:

chmod +x start-development-services.sh
bash start-development-services.sh

3) That's it: all services and dependencies are now installed and configured properly.

--------------------------------------------------

After all services and dependencies are running, a few more steps are needed for the project to work correctly:

1) Add a domain through crawler-service so that it crawls that domain and its related pages. The crawled web pages are then indexed automatically in search-service.

We currently do this manually from Swagger, but a UI could be built for it so that only an admin user can do it.

It is accessible at http://localhost:8080/swagger-ui/index.html
![crawl-page.png](crawl-page.png)
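
To illustrate what "crawl that domain and its related pages" involves, here is a minimal sketch of one step of such a crawler: extracting the links of a page and keeping only those on the same domain. It is an assumption about the general technique, not the code of crawler-service; the `LinkCollector` class, the `same_domain_links` function and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(page_url, html):
    """Return absolute, de-duplicated links that stay on the domain of `page_url`."""
    collector = LinkCollector()
    collector.feed(html)
    domain = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop the #fragment part.
        absolute = urljoin(page_url, href).split("#")[0]
        parsed = urlparse(absolute)
        is_web_link = parsed.scheme in ("http", "https")
        if is_web_link and parsed.netloc == domain and absolute not in links:
            links.append(absolute)
    return links


# Hypothetical page content, used only to exercise the function.
sample_html = """
<a href="/about">About</a>
<a href="https://example.com/blog#top">Blog</a>
<a href="https://other.example.org/">External</a>
<a href="mailto:admin@example.com">Mail</a>
"""
print(same_domain_links("https://example.com/index.html", sample_html))
```

A full crawler would put the returned links in a queue, remember which URLs were already visited, and hand each fetched page to the indexing step.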

2) You can access all documents generated from the crawler-service starting point in the Kibana dashboard at http://localhost:5601 with username `elastic` and password `123456`.

It has a visualization dashboard like the one below:

![](kibana-dashboard-page.png)

3) You can also use the search-service endpoints at http://localhost:8083/swagger-ui/index.html to retrieve clean suggestion and search results, which are read from the Elasticsearch index.

We also created a UI that shows suggestions and search results from search-service, similar to Google.
It is accessible at http://localhost:3000

The UI offers autocomplete in the search box, from which you can select a suggestion, as in the picture below:

![suggestion](suggestion-page.png)
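
The idea behind autocomplete can be sketched as prefix matching over a sorted list of known phrases. In the project the suggestions come from the Elasticsearch index through search-service; the plain Python below only illustrates the principle, and the `suggest` function and the sample phrases are hypothetical.

```python
from bisect import bisect_left


def suggest(sorted_phrases, prefix, limit=5):
    """Return up to `limit` phrases that start with `prefix`.

    `sorted_phrases` must be lowercase and sorted, so all phrases sharing
    a prefix are contiguous and the first one is found by binary search.
    """
    prefix = prefix.lower()
    if not prefix:
        return []
    start = bisect_left(sorted_phrases, prefix)
    matches = []
    for phrase in sorted_phrases[start:]:
        if not phrase.startswith(prefix) or len(matches) == limit:
            break
        matches.append(phrase)
    return matches


# Hypothetical phrases, used only to exercise the function.
phrases = sorted(
    ["search engine", "search ranking", "kafka", "kibana dashboard", "seed url"]
)
print(suggest(phrases, "se"))
```

This sketch returns matches in alphabetical order; a real suggester would usually order them by popularity or relevance instead.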

-----------------------------------------------------------------

After a suggestion is selected, we search based on the query to find the most relevant documents and rank them, as in the picture below:

![search-page](search-page.png)
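
Elasticsearch ranks matching documents with BM25 by default. As a rough sketch of how such a ranking works, the function below scores tokenized documents with the common textbook form of BM25, using `k1=1.2` and `b=0.75`. It is not the project's code: the `bm25_scores` function and the sample documents are hypothetical, and the exact formula inside Elasticsearch may differ in details.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with BM25.

    `docs` is a non-empty dict of {doc_id: list of terms}.
    Returns a dict of {doc_id: score}; higher means more relevant.
    """
    n_docs = len(docs)
    avg_len = sum(len(terms) for terms in docs.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for terms in docs.values():
        doc_freq.update(set(terms))

    scores = {}
    for doc_id, terms in docs.items():
        term_freq = Counter(terms)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Longer-than-average documents are penalized.
            length_norm = 1 - b + b * len(terms) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores[doc_id] = score
    return scores


# Hypothetical documents, used only to exercise the function.
docs = {
    "a": "web search engine ranking".split(),
    "b": "web crawler for web pages and more web content".split(),
    "c": "kafka pipeline".split(),
}
scores = bm25_scores(["search", "ranking"], docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Compared with plain tf-idf, BM25 limits how much repeated occurrences of a term can add to the score and normalizes by document length.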



Over time, as our dataset grows, our suggestions can become more precise and the search results more