https://github.com/sayedgamal99/pdf-to-text
- Host: GitHub
- URL: https://github.com/sayedgamal99/pdf-to-text
- Owner: sayedgamal99
- Created: 2024-10-23T19:47:02.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-10-23T20:09:13.000Z (3 months ago)
- Last Synced: 2024-10-24T07:02:54.992Z (3 months ago)
- Language: Python
- Size: 165 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# PDF Text Extractor
A simple web application that allows users to upload a PDF document and extract its text content. Built with Streamlit and pdfplumber, this app provides an easy-to-use interface for reading and extracting text from PDF files.
## Table of Contents
- [Project Summary](#project-summary)
- [Features](#features)
- [Technologies Used](#technologies-used)
- [Installation](#installation)
- [Usage](#usage)
- [Deployment](#deployment)
- [Contributing](#contributing)
- [License](#license)
## Project Summary
This project is designed to help users extract text from PDF files effortlessly. Whether you need to retrieve text for documentation, research, or any other purpose, this app provides a straightforward solution. Users can simply upload their PDF files, and the app will display the extracted text, making it accessible for copying or further analysis.
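The upload-and-extract flow described above might look roughly like the following with pdfplumber. This is a minimal sketch, not the repository's actual code: the function names `join_page_text` and `extract_pdf_text` are hypothetical, and it assumes that pages with no extractable text (for example, scanned images) should simply be skipped.

```python
from typing import Iterable, Optional


def join_page_text(pages: Iterable, separator: str = "\n\n") -> str:
    """Join the text of page-like objects that expose extract_text()."""
    chunks = []
    for page in pages:
        text: Optional[str] = page.extract_text()
        # Some pages yield no text (None or an empty string); skip them.
        if text and text.strip():
            chunks.append(text.strip())
    return separator.join(chunks)


def extract_pdf_text(source) -> str:
    """Extract text from a PDF given a file path or a binary file-like object."""
    # Imported lazily so join_page_text stays usable without pdfplumber installed.
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(source) as pdf:
        return join_page_text(pdf.pages)
```

Calling `extract_pdf_text("document.pdf")` would return the text of every page separated by blank lines; `document.pdf` is a placeholder path.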
## Features
- **Upload PDF Files**: Users can easily upload their PDF documents.
- **Text Extraction**: The app extracts and displays the text from the uploaded PDF.
- **User-Friendly Interface**: Built using Streamlit, the app is intuitive and responsive.
- **Cross-Platform Compatibility**: Works on any platform that supports Streamlit.
## Technologies Used
- [Streamlit](https://streamlit.io/): A Python library for creating web apps.
- [pdfplumber](https://github.com/jsvine/pdfplumber): A Python library for extracting text and information from PDF files.
- Python 3.x
## Installation
To run the application locally, follow these steps:
1. **Clone the repository**:
```bash
git clone https://github.com/sayedgamal99/pdf-to-text.git
cd pdf-to-text
```
2. **Create a virtual environment (optional but recommended)**:
- **Windows**:
```bash
python -m venv venv
venv\Scripts\activate
```
- **macOS/Linux**:
```bash
python3 -m venv venv
source venv/bin/activate
```
3. **Install the required packages**:
```bash
pip install -r requirements.txt
```
## Usage
1. **Run the Streamlit app**:
```bash
streamlit run app.py
```
2. **Open your browser** and navigate to `http://localhost:8501` to access the app.
3. **Upload a PDF file** using the provided file uploader.
4. **View the extracted text** displayed on the page.
5. **Copy or save the extracted text** as needed.
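The steps above could be backed by a Streamlit script along the following lines. This is a hedged sketch rather than the project's real `app.py`: `render_app` is a hypothetical helper, the widget labels are invented, and the download button is an assumed way to cover the "save" step. The Streamlit module and the extraction function are passed in as arguments so the layout logic can be exercised without a running server.

```python
from pathlib import Path


def render_app(st, extract_text):
    """Draw the upload / view / save flow using the given Streamlit module.

    `st` is expected to be the imported `streamlit` module; `extract_text`
    is any callable that turns the uploaded file into a string.
    """
    st.title("PDF Text Extractor")

    # file_uploader returns None until the user has picked a file.
    uploaded = st.file_uploader("Upload a PDF file", type=["pdf"])
    if uploaded is None:
        st.info("Upload a PDF file to see its text.")
        return None

    text = extract_text(uploaded)

    # Show the text in a box the user can select and copy from.
    st.text_area("Extracted text", value=text, height=400)

    # Offer the same text as a .txt file named after the uploaded PDF.
    txt_name = Path(uploaded.name).with_suffix(".txt").name
    st.download_button(
        "Download as .txt",
        data=text,
        file_name=txt_name,
        mime="text/plain",
    )
    return text


# In a script started with `streamlit run app.py`, this might be wired up as:
#   import streamlit as st
#   render_app(st, my_extract_function)  # my_extract_function is hypothetical
```

Keeping the extraction callable separate from the layout means the PDF library can be swapped without touching the interface code.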
## Deployment
This application is also deployed on [Streamlit Cloud](https://streamlit.io/cloud). You can access the live version of the app at [streamlit app](https://pdf-to-text-jz36s82wgqtjd4i8vwwyt7.streamlit.app/).
### Deploying on Streamlit Cloud
To deploy your app on Streamlit Cloud, follow these steps:
1. **Push your code to a GitHub repository**:
- Make sure your app is working locally.
- Commit your changes and push them to GitHub.
2. **Sign in to Streamlit Cloud**:
- Go to [Streamlit Cloud](https://streamlit.io/cloud) and sign in with your GitHub account.
3. **Create a New App**:
- Click on "New App" in the Streamlit Cloud dashboard.
- Select the GitHub repository you just pushed your code to.
4. **Select the branch** (usually `main`) and specify the file path to `app.py`.
5. **Click on "Deploy"** to publish your app.
---