https://github.com/drkostas/python_search_engine
This is a search engine created for the Gutenberg Project archive. It is implemented in python and the front end part is created with the flask framework.
- Host: GitHub
- URL: https://github.com/drkostas/python_search_engine
- Owner: drkostas
- Created: 2017-04-13T06:29:14.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2023-11-22T19:27:08.000Z (about 1 year ago)
- Last Synced: 2024-10-12T08:32:32.048Z (4 months ago)
- Topics: css, engine, flask, html, python, search
- Language: HTML
- Homepage: https://search.gkos.dev
- Size: 33 MB
- Stars: 12
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Python Search Engine
This is a search engine for the **Gutenberg Project** archive.
It is implemented in Python, and the front end is handled with the **Flask** framework.

**Demo:** [search.gkos.dev](https://search.gkos.dev)
It consists of two basic elements:
The first is BuildIndex.py, which scans the archive and creates an Inverted File and a file with the Document Names.
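
As a rough illustration of what an indexing step like this produces, here is a minimal sketch that builds a positional inverted index and a document-name list from an in-memory mapping. It is not the code in BuildIndex.py: the names `tokenize` and `build_index`, the tokenization rule, and the toy sample texts are all assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Return (doc_names, inverted_index) for a {name: text} mapping.

    doc_names is a list whose positions serve as document ids.
    inverted_index maps term -> {doc_id: [positions of the term]}.
    """
    doc_names = sorted(documents)
    inverted_index = defaultdict(dict)
    for doc_id, name in enumerate(doc_names):
        for position, term in enumerate(tokenize(documents[name])):
            inverted_index[term].setdefault(doc_id, []).append(position)
    return doc_names, dict(inverted_index)


# Toy stand-in for the archive; the real input would be the scanned files.
docs = {
    "moby_dick.txt": "Call me Ishmael. Some years ago",
    "pride.txt": "It is a truth universally acknowledged",
}
doc_names, index = build_index(docs)
```

Storing positions rather than plain counts is a design choice of this sketch: it makes phrase queries possible later, at the cost of a larger index.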
I am using *3-in-4 Front Coding* compression.
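
The sketch below shows one common reading of 3-in-4 front coding: the sorted term list is split into blocks of four, the first term of each block is stored in full, and the other three are stored as the length of the prefix shared with the preceding term plus the remaining suffix. The function names and the tuple layout are assumptions for illustration, and the on-disk format used by BuildIndex.py may differ.

```python
import os


def front_code(sorted_terms, block_size=4):
    """Encode sorted terms as blocks of (full_term, [(prefix_len, suffix), ...])."""
    blocks = []
    for start in range(0, len(sorted_terms), block_size):
        block = sorted_terms[start:start + block_size]
        head, coded, previous = block[0], [], block[0]
        for term in block[1:]:
            # Length of the character prefix shared with the previous term.
            shared = len(os.path.commonprefix([previous, term]))
            coded.append((shared, term[shared:]))
            previous = term
        blocks.append((head, coded))
    return blocks


def front_decode(blocks):
    """Rebuild the original sorted term list from the encoded blocks."""
    terms = []
    for head, coded in blocks:
        terms.append(head)
        previous = head
        for shared, suffix in coded:
            previous = previous[:shared] + suffix
            terms.append(previous)
    return terms


terms = sorted(["search", "searched", "searching", "season", "seat"])
blocks = front_code(terms)
```

Keeping one full term per block means a lookup can binary-search over the block heads and then decode at most three entries, instead of decoding the whole list from the start.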
The second part is index.py (or QueryIndex.py, as it was originally named).
The user enters the keywords through the front end (form.html).
index.py is then called and searches the inverted file using **tf*idf**.
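
To make the scoring step concrete, here is a minimal sketch of tf*idf ranking over an inverted index. It assumes raw term counts for tf and `log(N / df)` for idf; the exact weighting used in index.py is not stated here, so treat the formula, the function names, and the toy index as assumptions.

```python
import heapq
import math
from collections import defaultdict


def tfidf_scores(query_terms, inverted_index, num_docs):
    """Sum tf*idf over the query terms for every matching document.

    inverted_index maps term -> {doc_id: term frequency in that document}.
    """
    scores = defaultdict(float)
    for term in query_terms:
        postings = inverted_index.get(term, {})
        if not postings:
            continue  # the term does not occur anywhere in the collection
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return dict(scores)


def top_documents(scores, k=10):
    """Return the k highest-scoring (doc_id, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Toy index with three documents (ids 0, 1, 2).
index = {
    "whale": {0: 5, 2: 1},
    "ship": {0: 2, 1: 3, 2: 1},
}
scores = tfidf_scores(["whale", "ship"], index, num_docs=3)
best = top_documents(scores, k=2)
```

In this toy index "ship" occurs in every document, so its idf is zero and it adds nothing to any score; only "whale" separates the documents.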
The results are separated into two parts:
all the documents in which the keywords were found, and
the documents in which the keywords were found as a phrase.
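
The two result sets can be sketched with a positional index, as below. This assumes that "found" means a document contains at least one of the keywords, and that a phrase match means the keywords appear consecutively in query order. Both function names and the toy index are hypothetical and not taken from index.py.

```python
def documents_with_any(terms, index):
    """Doc ids that contain at least one of the query terms."""
    found = set()
    for term in terms:
        found.update(index.get(term, {}))
    return found


def documents_with_phrase(terms, index):
    """Doc ids where the terms occur consecutively, in query order.

    index maps term -> {doc_id: [positions of the term]}.
    """
    if not terms or any(term not in index for term in terms):
        return set()
    matches = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # Each following term must sit exactly one position further on.
            if all(start + offset in index[term].get(doc_id, ())
                   for offset, term in enumerate(terms)):
                matches.add(doc_id)
                break
    return matches


# Toy positional index for three documents.
index = {
    "white": {0: [3, 10], 1: [7]},
    "whale": {0: [4], 1: [2], 2: [0]},
}
any_match = documents_with_any(["white", "whale"], index)
phrase_match = documents_with_phrase(["white", "whale"], index)
```

Here all three documents contain at least one keyword, but only document 0 has "white" immediately followed by "whale".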
The results are then printed: first all the matching documents, then the top 10 documents ranked by **tf*idf**.