https://github.com/uohzxela/wordbase-solver
Proof-of-concept OpenCV project to extract characters from Wordbase screenshots and generate suitable words
https://github.com/uohzxela/wordbase-solver
Last synced: 7 months ago
JSON representation
Proof-of-concept OpenCV project to extract characters from Wordbase screenshots and generate suitable words
- Host: GitHub
- URL: https://github.com/uohzxela/wordbase-solver
- Owner: uohzxela
- Created: 2015-02-19T11:20:19.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2015-07-13T16:52:00.000Z (about 10 years ago)
- Last Synced: 2025-01-17T08:27:18.490Z (9 months ago)
- Language: Python
- Homepage: http://wordbase.alexjiao.com
- Size: 906 KB
- Stars: 2
- Watchers: 3
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# wordbase-solver
Wordbase Solver is live! http://wordbase.alexjiao.com
##Dependencies
* OpenCV and Python bindings
* [pytesseract](https://github.com/madmaze/pytesseract) for Optical Character Recognition (OCR)##Usage
* `cd wordbase-solver/src`
* `python extract_game_board_text.py wordbase.png` to print out the gameboard in console
* `python tst_wrapper.py` to try out the tree loaded with dictionary
* `python solver.py [blue/orange] [/path/to/screenshot.jpg]` to find suitable words in the given screenshot, given the player color##How it works
Image before preprocessing | Intermediate image | Preprocessed image used for OCR
:-------------------------:|:-------------------------:|:-------------------------:|:-----------------------:|
 |  | 
* The screenshot is preprocessed using OpenCV functions to generate a B&W image which makes OCR more effective
* Simple thresholding is used to convert the screenshot to B&W
* Contour finding is used to find inverted regions and invert them
* Erosion is used to deal with tricky cases where two diagonal inverted regions stick with each other, making it difficult to obtain the contours
* Tesseract is used to recognize individual characters from the preprocessed B&W image
* Characters are stored in a 2D array, along with its color mapping.
* A tree data structure is initialized to store 170k+ English words from the dictionary with O(w) lookup where w is the length of the word
* A graph of characters and their neighbors is created from the 2D array, and DFS is employed to find all valid words from the graph
* The list of valid words is sorted according to the word's promixity to the opponent's base (the nearer, the better).##To-do list
1. Dockerization since setting up dependencies is quite time consuming
2. Make web app more user-friendly