https://github.com/njmarko/googolplex-pdf-search
Python program for searching PDF text, ranking the results, and exporting highlighted search results as a PDF. Uses a trie, stack, heap, and page graph. Converts queries to postfix notation. Supports logical expressions and phrases. Offers "did you mean" suggestions.
- Host: GitHub
- URL: https://github.com/njmarko/googolplex-pdf-search
- Owner: njmarko
- Created: 2022-06-13T23:55:06.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2024-08-28T13:29:22.000Z (10 months ago)
- Last Synced: 2025-02-01T11:41:23.480Z (5 months ago)
- Topics: autocomplete, datastructures-algorithms, didyoumean, graph, heap, pdf-generation, pdf-highlighter, pdf-search, postfix-evaluation, stack, trie
- Language: Python
- Homepage:
- Size: 6.21 MB
- Stars: 5
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# googolplex-pdf-search
Python program for searching PDF text, ranking the results, and exporting highlighted search results as a PDF. Uses a trie, stack, heap, and page graph. Converts queries to postfix notation. Supports logical expressions and phrases. Offers "did you mean" suggestions.

## Required libraries
- PyMuPDF
- didyoumean.py

## How to install and run the program
1. Create a virtual environment in the project directory:
```virtualenv venv```
2. Activate the virtual environment:
2.1. For Windows:
```venv\Scripts\activate```
2.2. For Linux:
```source venv/bin/activate```
3. Install the required libraries:
```pip install -r requirements.txt```
4. Run the program:
```python main.py```
5. All in one command:
5.1. For Linux:
```virtualenv venv && source venv/bin/activate && pip install -r requirements.txt && python main.py```
5.2. For Windows (if using PowerShell):
```virtualenv venv; venv\Scripts\Activate; pip install -r requirements.txt; python main.py```

## Application screenshots
![]()
Illustration 1 - Loading bar.
![]()
Illustration 2 - Autocomplete feature.
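
Autocomplete of this kind is commonly backed by a trie. The following is a minimal sketch of that idea, not the project's actual code: the `TrieNode` and `Trie` classes, the `complete` method, and the sample words are all hypothetical names chosen for illustration.

```python
class TrieNode:
    """One node per character; `is_word` marks the end of an indexed word."""

    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def complete(self, prefix, limit=5):
        """Return up to `limit` indexed words starting with `prefix`, alphabetically."""
        prefix = prefix.lower()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no indexed word has this prefix
        results = []
        stack = [(node, prefix)]  # explicit stack for depth-first traversal
        while stack and len(results) < limit:
            current, word = stack.pop()
            if current.is_word:
                results.append(word)
            # Push children in reverse order so the smallest letter is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], word + ch))
        return results


trie = Trie()
for w in ["graph", "grape", "great", "heap", "stack"]:
    trie.insert(w)
print(trie.complete("gr"))  # ['grape', 'graph', 'great']
```

Walking down the trie costs time proportional to the prefix length, independent of how many words are indexed, which is why the structure suits suggestions typed one character at a time.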
![]()
Illustration 3 - Did you mean functionality.
![]()
Illustration 4 - Third page of results for the search query graph.
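
Ranked, paginated results like these can be produced with a heap. The sketch below is an assumption about how that might look, not the project's implementation: the function name `result_pages`, the `scores` mapping from PDF page number to score, and the sample scores are hypothetical, and the real ranking formula is not shown here.

```python
import heapq


def result_pages(scores, page_size=10):
    """Yield batches of (pdf_page, score), best score first.

    `scores` maps a PDF page number to its rank score (hypothetical input).
    """
    # heapq is a min-heap, so scores are negated to pop the highest first.
    # Ties are broken by the lower PDF page number.
    heap = [(-score, pdf_page) for pdf_page, score in scores.items()]
    heapq.heapify(heap)
    while heap:
        batch = []
        while heap and len(batch) < page_size:
            neg_score, pdf_page = heapq.heappop(heap)
            batch.append((pdf_page, -neg_score))
        yield batch


scores = {1: 3.0, 2: 7.5, 3: 1.0, 4: 7.5, 5: 0.5}
batches = list(result_pages(scores, page_size=2))
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

Because the generator pops from the heap lazily, only the result pages the user actually opens need to be extracted in sorted order.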
![]()
Illustration 5 - Complex logical query with OR, AND and grouping with brackets.
![]()
Illustration 6 - Complex logical query with negation (NOT) and grouping with brackets.
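
Logical queries like the two above can be handled by converting them to postfix notation with a stack and then evaluating the postfix form over sets of pages. The sketch below assumes uppercase `AND`, `OR`, `NOT` keywords, round brackets, and a well-formed query; the function names, the precedence table, and the sample `index` are hypothetical and may differ from what the program does.

```python
# Assumed precedence: NOT binds tightest, then AND, then OR.
PRECEDENCE = {"NOT": 3, "AND": 2, "OR": 1}


def tokenize(query):
    """Split a query into words, operators, and brackets."""
    return query.replace("(", " ( ").replace(")", " ) ").split()


def to_postfix(tokens):
    """Shunting-yard conversion from infix tokens to postfix order."""
    output, stack = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the matching "("
        elif tok in PRECEDENCE:
            # NOT is a unary prefix operator, so it never pops earlier operators.
            while (tok != "NOT" and stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        else:
            output.append(tok.lower())  # a search word
    while stack:
        output.append(stack.pop())
    return output


def evaluate(postfix, index, all_pages):
    """Evaluate postfix tokens; `index` maps a word to the set of pages containing it."""
    stack = []
    for tok in postfix:
        if tok == "NOT":
            stack.append(all_pages - stack.pop())
        elif tok in ("AND", "OR"):
            right, left = stack.pop(), stack.pop()
            stack.append(left & right if tok == "AND" else left | right)
        else:
            stack.append(index.get(tok, set()))
    return stack.pop()


index = {"trie": {1, 2}, "heap": {2, 3}, "graph": {3, 4}}  # hypothetical data
all_pages = {1, 2, 3, 4, 5}
postfix = to_postfix(tokenize("(trie OR heap) AND NOT graph"))
print(postfix)  # ['trie', 'heap', 'OR', 'graph', 'NOT', 'AND']
print(sorted(evaluate(postfix, index, all_pages)))  # [1, 2]
```

Postfix order removes the need for brackets at evaluation time: each operator simply consumes the one or two page sets on top of the stack.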
![]()
Illustration 7 - Phrase search for "skip list" by using double quotes.
![]()
Illustration 8 - Generated PDF with highlighted search query "skip list".