# PDF MCQ Extractor

![PDF MCQ Extractor Screenshot]

A fast, accurate, and easy-to-use web application designed to extract multiple-choice questions (MCQs) from PDF files and provide them as a downloadable JSON file. This tool is perfect for educators, students, and anyone needing to quickly convert MCQ-based assessments or study materials into a structured, machine-readable format.

## ✨ Features

- **Upload PDF Files:** Easily upload your PDF documents containing MCQs.
- **Automatic Question Extraction:** Intelligently identifies and extracts questions along with their corresponding options.
- **JSON Download:** Download the extracted MCQs in a clean, organized JSON format, ready for use in other applications or databases.
- **User-Friendly Interface:** Built with a clean and intuitive design for a seamless user experience.
- **Accessible & Performance Optimized:** Designed with accessibility and performance in mind.
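
To illustrate what question extraction can look like, here is a minimal sketch that turns plain text (as produced by a PDF text extractor) into structured objects. It is not the repository's `extractMCQs.js`. The function name `parseMCQs`, the output field names, and the assumed layout (questions numbered `1.` or `1)`, options labelled `A)`, `a.`, or `(a)`) are all hypothetical.

```javascript
// Hypothetical sketch; the project's actual extraction logic may differ.
// Assumes questions start with "1." or "1)" and options with "A)", "a." or "(a)".
function parseMCQs(text) {
  const questionPattern = /^(\d+)[.)]\s+(.*)$/;
  const optionPattern = /^\(?([A-Da-d])[.)]\s+(.*)$/;
  const questions = [];
  let current = null;

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;

    const questionMatch = line.match(questionPattern);
    if (questionMatch) {
      current = {
        number: Number(questionMatch[1]),
        question: questionMatch[2],
        options: [],
      };
      questions.push(current);
      continue;
    }

    const optionMatch = line.match(optionPattern);
    if (optionMatch && current) {
      current.options.push({
        label: optionMatch[1].toUpperCase(),
        text: optionMatch[2],
      });
      continue;
    }

    // Anything else is treated as a wrapped line: append it to the most
    // recent option, or to the question if no option has been seen yet.
    if (current) {
      if (current.options.length > 0) {
        current.options[current.options.length - 1].text += " " + line;
      } else {
        current.question += " " + line;
      }
    }
  }
  return questions;
}

const sampleText = [
  "1. What is the capital of France?",
  "A) Paris",
  "B) London",
  "C) Berlin",
  "D) Madrid",
  "",
  "2. Which planet is known as the",
  "Red Planet?",
  "a. Venus",
  "b. Mars",
].join("\n");

const mcqs = parseMCQs(sampleText);
console.log(JSON.stringify(mcqs, null, 2));
```

A line-by-line parser like this is easy to follow but brittle: PDFs with multi-column layouts or unusual numbering would likely need different patterns.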

## 🚀 Demo

[Link to Live Demo (if available)] - You might want to host this on Netlify, Vercel, or Heroku for a live demo.

## 🛠️ Tech Stack

This project uses a modern JavaScript stack to deliver an efficient and scalable solution:

- **Backend:**
  - **Node.js:** A powerful JavaScript runtime for server-side logic.
  - **Express.js:** A minimalist web framework for Node.js, used for building the API and handling requests.
- **Frontend:**
  - **EJS (Embedded JavaScript):** A templating engine to render dynamic HTML pages.
  - **Bootstrap:** A popular CSS framework for responsive and aesthetically pleasing UI components.
- **File Handling:**
  - **Multer:** Middleware for handling `multipart/form-data`, primarily used for file uploads.
  - **pdf-parse / pdf-parser:** Libraries used for parsing PDF documents and extracting text content.
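
Since uploads go through Multer before any PDF parsing happens, it can be worth checking that an uploaded file really is a PDF. The sketch below is an assumption about how that might be done, not code from this repository: `looksLikePdf` is a hypothetical helper, and it presumes the upload is available as a `Buffer` (for example `req.file.buffer` when Multer is configured with memory storage).

```javascript
// Hypothetical helper, not taken from the repository.
// Returns true when the buffer starts with the PDF header bytes "%PDF-".
function looksLikePdf(buffer) {
  const signature = Buffer.from("%PDF-", "ascii");
  return (
    Buffer.isBuffer(buffer) &&
    buffer.length >= signature.length &&
    buffer.subarray(0, signature.length).equals(signature)
  );
}

// Usage: reject the upload early instead of passing it to the PDF parser.
const fakeUpload = Buffer.from("%PDF-1.7\n...rest of file...");
if (!looksLikePdf(fakeUpload)) {
  throw new Error("Uploaded file does not look like a PDF");
}
```

Checking the file header is more reliable than trusting the file extension or the client-supplied MIME type, although it does not guarantee the rest of the file is well formed.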

## ⚙️ Installation & Setup

Follow these steps to get the PDF MCQ Extractor up and running on your local machine.

### Prerequisites

- Node.js (>=18.0.0)
- npm (>=9.0.0)

### Steps

1. **Clone the Repository:**

```bash
git clone https://github.com/khushikumarigupta14/pdf-mcq-extractor.git
cd pdf-mcq-extractor
```

2. **Install Dependencies:**

```bash
npm install
```

3. **Run in Development Mode:**

This will start the server using `nodemon`, which automatically restarts the server when file changes are detected.

```bash
npm run dev
```

4. **Access the Application:**

Open your web browser and navigate to `http://localhost:3000` (or the port specified in your `server.js`).
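
If the port needs to be configurable, one common pattern is to read it from an environment variable and fall back to `3000`. The sketch below shows that pattern; whether this project's `server.js` reads `PORT` is an assumption, and `resolvePort` is a hypothetical helper name.

```javascript
// Hypothetical helper; the project's server.js may choose its port differently.
// Reads PORT from the given environment and falls back when it is missing or invalid.
function resolvePort(env = process.env, fallback = 3000) {
  const parsed = Number.parseInt(env.PORT ?? "", 10);
  const valid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  return valid ? parsed : fallback;
}

const port = resolvePort();
console.log(`Server would listen on http://localhost:${port}`);
```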

## 🐳 Docker (Optional)

For easier deployment and environment consistency, you can also use Docker.

1. **Build the Docker Image:**

```bash
docker build -t pdf-mcq-extractor .
```

2. **Run the Docker Container:**

```bash
docker run -p 3000:3000 pdf-mcq-extractor
```

The application will be accessible at `http://localhost:3000`.

## 📂 Project Structure

```text
.
├── public/              # Static assets (CSS, JS, images)
│   ├── css/
│   ├── js/
│   └── images/
├── views/               # EJS template files
│   └── index.ejs
├── server.js            # Main Express.js server file
├── extractMCQs.js       # Logic for PDF parsing and MCQ extraction
├── package.json         # Project dependencies and scripts
├── package-lock.json    # Locked dependencies
├── .env.example         # Example environment variables (if any)
├── .gitignore           # Files and folders excluded from Git
├── README.md            # This file
└── ...
```

## 📝 Usage

1. **Upload PDF:** On the homepage, click the "Choose File" button to select a PDF document from your computer.
2. **Extract MCQs:** Click the "Extract Questions" button. The application will process the PDF.
3. **Download JSON:** Once processed, a "Download JSON" button will appear. Click it to save the extracted MCQs to your device.
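
Before loading a downloaded file into another application, a quick sanity check can catch questions that were extracted incompletely. The sketch below assumes the JSON is an array of objects with a `question` string and an `options` array. That shape is a guess for illustration and may not match the file this tool actually produces, and `findProblems` is a hypothetical helper.

```javascript
// Assumed shape (hypothetical): [{ question: string, options: [...] }, ...]
// Returns a list of human-readable problems; an empty list means the data looks usable.
function findProblems(mcqs) {
  if (!Array.isArray(mcqs)) return ["top-level value is not an array"];
  const problems = [];
  mcqs.forEach((item, index) => {
    const where = `item ${index}`;
    if (typeof item?.question !== "string" || item.question.trim() === "") {
      problems.push(`${where}: missing question text`);
    }
    if (!Array.isArray(item?.options) || item.options.length < 2) {
      problems.push(`${where}: fewer than two options`);
    }
  });
  return problems;
}

// Inline sample standing in for the contents of a downloaded file.
const downloaded = JSON.parse(
  '[{"question":"2 + 2 = ?","options":[{"label":"A","text":"3"},{"label":"B","text":"4"}]},' +
    '{"question":"","options":[]}]'
);
const problems = findProblems(downloaded);
console.log(problems.length === 0 ? "All questions look complete" : problems);
```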

## 🤝 Contributing

Contributions are welcome! If you have suggestions for improvements, new features, or bug fixes, please follow these steps:

1. **Fork the repository.**
2. **Create a new branch:** `git checkout -b feature/your-feature-name` (or `git checkout -b bugfix/your-bug-fix` for a bug fix).
3. **Make your changes.**
4. **Commit your changes:** `git commit -m "feat: Add new feature"`.
5. **Push to the branch:** `git push origin feature/your-feature-name`.
6. **Open a Pull Request.**

Please ensure your code adheres to the project's coding style (run `npm run format` and `npm run lint`).

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 📞 Contact

If you have any questions or feedback, feel free to reach out:

- **Author:** Khushi Kumari
- **Email:** khushikumari00999.k@gmail.com
- **GitHub:** [https://github.com/khushikumarigupta14](https://github.com/khushikumarigupta14)

---

**Note:** Remember to replace `[Link to Live Demo (if available)]` and `[LICENSE]` with actual links if you set them up. Also, consider adding a `LICENSE` file to your repository if you haven't already.