https://github.com/iv4n-ga6l/Go-simple-FTS
Simple Full-Text engine implementation in Go
fulltext-search go golang
- Host: GitHub
- URL: https://github.com/iv4n-ga6l/Go-simple-FTS
- Owner: iv4n-ga6l
- Created: 2024-04-30T11:03:57.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-22T00:50:08.000Z (about 1 year ago)
- Last Synced: 2025-03-26T11:47:42.783Z (7 months ago)
- Topics: fulltext-search, go, golang
- Language: Go
- Homepage:
- Size: 2.22 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Full-Text engine implementation in Go
Full-Text Search (FTS) is a technique for searching text in a collection of documents. A document can be a web page, a newspaper article, an email message, or any other structured text. One of the best-known FTS engines is Elasticsearch.
Explanation of the implementation
Step 1: Document Definition
Define the document structure and the global variables that hold the documents and the indexes.
Step 2: Initialize Inverted and TF-IDF Indexes
Initialize the inverted and TF-IDF indexes in the init function, which runs when the program starts.
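Steps 1 and 2 might look like the sketch below. The `Document` fields and the variable names (`documents`, `invertedIndex`, `tfidfIndex`) are assumptions made for illustration; the repository's actual definitions may differ. Go runs `init` automatically before `main`, so no explicit call is needed.

```go
package main

// Document is a hypothetical shape for one indexed item.
// The field names are assumptions, not taken from the repository.
type Document struct {
	ID   int
	Text string
}

// Global state shared by the indexing and search steps (names assumed).
var (
	documents     []Document
	invertedIndex map[string][]int           // term -> IDs of documents containing it
	tfidfIndex    map[string]map[int]float64 // term -> document ID -> TF-IDF score
)

// init runs automatically before main and allocates the empty indexes,
// so later steps can write to the maps without a nil-map panic.
func init() {
	invertedIndex = make(map[string][]int)
	tfidfIndex = make(map[string]map[int]float64)
}
```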
Step 3: Tokenize Text
Create a function that splits the text into words and normalizes them to lowercase.
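A minimal tokenizer sketch follows. It assumes that any character that is neither a letter nor a digit separates words; the function name `tokenize` and that splitting rule are illustrative choices, not necessarily what the repository does.

```go
package main

import (
	"strings"
	"unicode"
)

// tokenize splits text on every rune that is not a letter or a digit
// and lowercases each resulting word.
func tokenize(text string) []string {
	words := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
	for i, w := range words {
		words[i] = strings.ToLower(w)
	}
	return words
}
```

With this rule, punctuation and hyphens both act as separators, so `Go-lang` yields the two tokens `go` and `lang`.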
Step 4: Calculate Term Frequency (TF)
Define a function that calculates the term frequency of each token in a document.
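One common definition of term frequency divides each term's count by the document's total token count. The sketch below assumes that normalized form; the source does not say whether raw counts or normalized values are used, and `termFrequency` is a hypothetical name.

```go
package main

// termFrequency returns, for each distinct token, its count divided by
// the total number of tokens in the document (normalized TF, an assumption).
func termFrequency(tokens []string) map[string]float64 {
	tf := make(map[string]float64)
	if len(tokens) == 0 {
		return tf
	}
	for _, t := range tokens {
		tf[t]++
	}
	total := float64(len(tokens))
	for t := range tf {
		tf[t] /= total
	}
	return tf
}
```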
Step 5: Build Inverted Index
Build an inverted index that maps each term to the list of IDs of the documents that contain it.
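The mapping from terms to document IDs can be sketched as follows. The `Document` type and `buildInvertedIndex` name are assumptions, and whitespace splitting stands in for the real tokenizer to keep the example short.

```go
package main

import "strings"

// Document is a hypothetical shape for one indexed item.
type Document struct {
	ID   int
	Text string
}

// buildInvertedIndex maps each lowercased term to the IDs of the
// documents containing it. Each ID appears at most once per term.
func buildInvertedIndex(docs []Document) map[string][]int {
	index := make(map[string][]int)
	for _, doc := range docs {
		seen := make(map[string]bool) // terms already recorded for this document
		for _, term := range strings.Fields(strings.ToLower(doc.Text)) {
			if seen[term] {
				continue
			}
			seen[term] = true
			index[term] = append(index[term], doc.ID)
		}
	}
	return index
}
```

The per-document `seen` set keeps a term that occurs several times in one document from adding that document's ID more than once.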
Step 6: Calculate Inverse Document Frequency (IDF)
Calculate the IDF of each term from the inverted index and the total number of documents.
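A sketch of this step, assuming the plain textbook formula idf(t) = ln(N / df(t)), where N is the total number of documents and df(t) is the number of documents containing t. The source does not state which IDF variant (smoothed or not) is used, and `calculateIDF` is a hypothetical name.

```go
package main

import "math"

// calculateIDF derives each term's document frequency from the length of
// its postings list and returns ln(totalDocs / documentFrequency).
// The unsmoothed formula is an assumption.
func calculateIDF(index map[string][]int, totalDocs int) map[string]float64 {
	idf := make(map[string]float64, len(index))
	for term, ids := range index {
		if len(ids) == 0 {
			continue // avoid division by zero for empty postings
		}
		idf[term] = math.Log(float64(totalDocs) / float64(len(ids)))
	}
	return idf
}
```

Under this formula a term that appears in every document gets an IDF of 0, so it contributes nothing to the final score.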
Step 7: Build TF-IDF Index
Build the TF-IDF index by multiplying each term frequency by the term's inverse document frequency.
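Combining the two quantities usually means multiplying them. The sketch below assumes the per-document TF maps and the IDF map are already computed; the function name and the term -> document ID -> score layout are illustrative assumptions.

```go
package main

// buildTFIDFIndex multiplies each document's term frequency by the
// term's IDF and stores the result as term -> document ID -> score.
func buildTFIDFIndex(tfByDoc map[int]map[string]float64, idf map[string]float64) map[string]map[int]float64 {
	index := make(map[string]map[int]float64)
	for docID, tf := range tfByDoc {
		for term, freq := range tf {
			if index[term] == nil {
				index[term] = make(map[int]float64)
			}
			index[term][docID] = freq * idf[term]
		}
	}
	return index
}
```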
Step 8: Perform TF-IDF Search
Run a TF-IDF search for the query and return a map of document IDs to their TF-IDF scores.
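One way to produce that map is to sum, for every document, the TF-IDF scores of the query terms it contains. Summation is an assumption here (the source does not say how multi-term queries are scored), and `tfidfSearch` is a hypothetical name.

```go
package main

// tfidfSearch adds up the TF-IDF scores of all query terms per document.
// Terms missing from the index contribute nothing.
func tfidfSearch(queryTerms []string, index map[string]map[int]float64) map[int]float64 {
	scores := make(map[int]float64)
	for _, term := range queryTerms {
		for docID, score := range index[term] {
			scores[docID] += score
		}
	}
	return scores
}
```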
Step 9: Perform Letter-by-Letter Search
Run a secondary letter-by-letter search that produces suggestions based on the query.
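The source does not spell out how the letter-by-letter search works. One plausible reading, sketched below, compares the query with each indexed term one character at a time and suggests every term that begins with the query. The function names and the prefix-matching interpretation are assumptions.

```go
package main

import (
	"sort"
	"strings"
)

// matchesLetterByLetter reports whether term starts with query,
// comparing one rune at a time. An empty query matches nothing.
func matchesLetterByLetter(term, query string) bool {
	t, q := []rune(term), []rune(query)
	if len(q) == 0 || len(q) > len(t) {
		return false
	}
	for i := range q {
		if t[i] != q[i] {
			return false
		}
	}
	return true
}

// letterByLetterSearch returns, in alphabetical order, the terms that
// begin with the query, ignoring case.
func letterByLetterSearch(query string, terms []string) []string {
	query = strings.ToLower(query)
	var suggestions []string
	for _, term := range terms {
		if matchesLetterByLetter(strings.ToLower(term), query) {
			suggestions = append(suggestions, term)
		}
	}
	sort.Strings(suggestions)
	return suggestions
}
```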
Step 10: Rank Search Results
Rank the search results by their TF-IDF scores.
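Ranking can be sketched as sorting document IDs by descending score. Breaking ties by ascending ID is an added assumption that makes the order deterministic; `rankResults` is a hypothetical name.

```go
package main

import "sort"

// rankResults returns the document IDs ordered from highest to lowest
// score. Equal scores are ordered by ascending ID.
func rankResults(scores map[int]float64) []int {
	ids := make([]int, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if scores[ids[i]] != scores[ids[j]] {
			return scores[ids[i]] > scores[ids[j]]
		}
		return ids[i] < ids[j]
	})
	return ids
}
```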
Step 11: Handle Search Requests
Handle search requests by combining the results of the TF-IDF and letter-by-letter searches.
Step 12: Serve Index Page and Start Server
Serve the index page and start the HTTP server.
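Steps 11 and 12 could be wired together roughly as below, using only the standard `net/http` and `encoding/json` packages. The `/search` route, the `q` query parameter, the JSON field names, and all function names are assumptions made for illustration. The search and suggestion logic is passed in as functions so the sketch stays independent of the earlier steps.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// searchResponse is a hypothetical JSON payload for one search request.
type searchResponse struct {
	Results     []int    `json:"results"`     // ranked document IDs
	Suggestions []string `json:"suggestions"` // letter-by-letter matches
}

// buildResponse combines the output of both searches for a query.
func buildResponse(query string, search func(string) []int, suggest func(string) []string) searchResponse {
	return searchResponse{Results: search(query), Suggestions: suggest(query)}
}

// newMux registers the search endpoint and serves the index page
// from indexPath for every other path.
func newMux(indexPath string, search func(string) []int, suggest func(string) []string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/search", func(w http.ResponseWriter, r *http.Request) {
		resp := buildResponse(r.URL.Query().Get("q"), search, suggest)
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(resp); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		http.ServeFile(w, r, indexPath)
	})
	return mux
}

// startServer blocks while serving requests on addr, for example ":8080".
func startServer(addr string, mux *http.ServeMux) error {
	return http.ListenAndServe(addr, mux)
}
```

A caller would build the mux with its real search and suggestion functions and then call `startServer`; the address and the index file path are left to the caller.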