https://github.com/stackmodel/ocr-gemini
OCR app using Google Gemini Flash
gemini llm ocr-recognition python streamlit
- Host: GitHub
- URL: https://github.com/stackmodel/ocr-gemini
- Owner: stackmodel
- License: MIT
- Created: 2024-12-16T23:37:54.000Z
- Default Branch: main
- Last Pushed: 2024-12-17T22:48:37.000Z
- Last Synced: 2025-02-14T09:40:30.749Z
- Topics: gemini, llm, ocr-recognition, python, streamlit
- Language: Python
- Homepage:
- Size: 10.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
**OCR App Using Gemini Flash**
This is a Streamlit-based web application that lets users upload an image and extract the readable text from it using the Google Gemini Flash model.
**Features**
- 📸 Upload an image in JPG, JPEG, or PNG format.
- 🤖 Extract text from the uploaded image using the Google Gemini Flash model.
- 📝 Display the extracted text in a well-organized Markdown format.
- 🖥️ Simple and intuitive interface using Streamlit.
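The upload and extraction features above can be sketched as a small helper that turns an uploaded file into a request body for the Gemini API's `generateContent` endpoint. This is a minimal sketch, not the code in `app.py`: the function name `build_ocr_request`, the prompt wording, and the model ID `gemini-1.5-flash` are assumptions (the README only says "Gemini Flash"), and the app itself may use an SDK rather than building the body by hand.

```python
import base64
import os

# Assumption: the README does not name an exact model ID.
MODEL = "gemini-1.5-flash"

# The three upload formats listed under Features, mapped to MIME types.
MIME_BY_EXTENSION = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}

# Hypothetical prompt; the real app may word this differently.
PROMPT = "Extract all readable text from this image and format it as Markdown."


def build_ocr_request(filename: str, image_bytes: bytes) -> dict:
    """Build a generateContent request body for one uploaded image."""
    extension = os.path.splitext(filename)[1].lower()
    mime_type = MIME_BY_EXTENSION.get(extension)
    if mime_type is None:
        raise ValueError(f"Unsupported file type: {filename}")
    return {
        "contents": [
            {
                "parts": [
                    {"text": PROMPT},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            # The API expects image bytes as base64 text.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }
```

Rejecting unsupported extensions before encoding keeps the check in one place, so the same helper could back both the upload widget and the API call.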
**Setup Instructions**
1. Clone the repository
```
git clone https://github.com/stackmodel/ocr-gemini.git
cd ocr-gemini
```
2. Install dependencies:
- Make sure you have Python 3.7 or higher installed. Then, create a virtual environment and install the dependencies:
```
python -m venv env
source env/bin/activate # For Linux/macOS
.\env\Scripts\activate # For Windows
pip install -r requirements.txt
```
3. Rename `.env.example` to `.env` and add your Google Gemini API key to it.
You can obtain an API key from [Google AI Studio](https://aistudio.google.com/app/apikey).
4. Run the app using the following command: `streamlit run app.py`
This will launch the app in your browser.
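To illustrate step 3, here is a minimal sketch of how a key stored in `.env` could be read using only the standard library. The variable name `GOOGLE_API_KEY` and the helpers `parse_env` and `get_api_key` are hypothetical; check `.env.example` for the name the app actually expects. The app itself may well rely on a library such as `python-dotenv` instead.

```python
# Hypothetical variable name; the real one is defined in .env.example.
KEY_NAME = "GOOGLE_API_KEY"


def parse_env(text: str) -> dict:
    """Parse simple NAME=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[name.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(env_text: str, name: str = KEY_NAME) -> str:
    """Return the API key from .env contents, or fail with a clear message."""
    key = parse_env(env_text).get(name, "")
    if not key:
        raise RuntimeError(f"{name} is missing or empty in .env")
    return key
```

A typical call would be `get_api_key(open(".env").read())`. Failing early with an explicit message is easier to diagnose than an authentication error from the API later on.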
