https://github.com/adrijadastidar/vta-python
This is the Flask-based backend for the Virtual Teaching Assistant platform
- Host: GitHub
- URL: https://github.com/adrijadastidar/vta-python
- Owner: AdrijaDastidar
- Created: 2025-04-26T19:44:12.000Z (13 days ago)
- Default Branch: main
- Last Pushed: 2025-05-06T10:23:16.000Z (3 days ago)
- Last Synced: 2025-05-07T18:12:57.638Z (2 days ago)
- Topics: cors, ffmpeg, flask, langchain, llama3, noisereduce, pdfplumber, pydub, python, scipy, waitress, whisper-ai
- Language: Python
- Homepage:
- Size: 2.23 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Virtual Teaching Assistant: Server (Flask Backend)
This is the **Flask-based backend** for the **Virtual Teaching Assistant** platform, a smart educational system that uses AI to transcribe lectures, generate summaries, create quizzes, and parse learning materials.
---
## Features
- **Audio Upload & Transcription**: Accept `.wav` files and transcribe them using OpenAI Whisper
- **AI-Powered Summarization**: Summarize transcripts using LLMs (LLaMA 3 via the Groq API)
- **Quiz Generation**: Extract questions from lecture content, with retry logic
- **PDF Parsing**: Extract text from uploaded PDFs using `pdfplumber`
- **RESTful API**: Clean endpoints for frontend interaction
- **Cross-Origin Support**: Enabled via `Flask-CORS`
- **Environment-based Config**: Secure API keys via `.env`
---
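The retry logic mentioned for quiz generation is not shown in this README. As a minimal sketch of one way it could work, the helper below calls a generator function until it returns a parseable JSON object. The names `generate_with_retry`, `generate`, `max_attempts`, and `delay_seconds` are hypothetical and are not taken from the project's code.

```python
import json
import time
from typing import Callable


def generate_with_retry(
    generate: Callable[[str], str],
    transcript: str,
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> dict:
    """Call `generate` until it returns a JSON object, or give up.

    `generate` stands in for whatever function asks the LLM for a quiz;
    it is an assumption made for this sketch.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        raw = generate(transcript)
        try:
            parsed = json.loads(raw)
            if isinstance(parsed, dict):
                return parsed
            last_error = ValueError("expected a JSON object")
        except json.JSONDecodeError as exc:
            last_error = exc
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(
        f"no valid quiz JSON after {max_attempts} attempts"
    ) from last_error
```

Passing the generator in as a parameter keeps the retry loop independent of any particular LLM client, so it can be exercised with a stub function.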
## Tech Stack
| Category | Tech/Library |
|---------------------|---------------------------------------|
| Server Framework | `Flask`, `waitress` (production WSGI) |
| Audio Processing | `ffmpeg`, `whisper`, `pydub`, `noisereduce`, `soundfile`, `scipy` |
| LLM Integration | `langchain`, `langchain_groq`, `langchain_experimental`, `llama3` |
| NLP Utilities | Custom prompt chaining via `langchain` |
| File Parsing | `pdfplumber` for PDF content extraction |
| Environment Mgmt | `python-dotenv` |
| CORS Handling | `Flask-CORS` |
---
## Setup Instructions
### Prerequisites
* Python 3.9+
* FFmpeg installed (for audio preprocessing)
* `.env` file with your Groq API key
### Installation
```bash
# After cloning the repository, enter the project directory
cd server-python

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```
### .env Example
```
GROQ_API_KEY=your_groq_api_key_here
```
---
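To illustrate how the key might be checked at startup, here is a small standard-library sketch. It assumes the `.env` file has already been loaded into the process environment (for example by `load_dotenv()` from `python-dotenv`, which the tech stack lists). The helper name `require_env` is hypothetical and does not come from the project.

```python
import os
from typing import Mapping


def require_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the named variable, or raise if it is missing or blank."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Demonstration with an explicit mapping instead of the real environment.
demo_key = require_env("GROQ_API_KEY", {"GROQ_API_KEY": "your_groq_api_key_here"})
```

Failing fast with a clear message at startup is usually easier to debug than an authentication error on the first request to the Groq API.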
## Running the Server
```bash
python app.py
```
The server runs at [http://localhost:5000](http://localhost:5000).

---
## API Endpoints
| Method | Endpoint | Description |
| ------- | ---------------- | ---------------------------------------------- |
| POST | `/getTranscript` | Upload `.wav` file and generate summary + quiz |
| POST | `/summary` | Generate summary JSON from transcript |
| POST | `/quiz` | Generate quiz JSON from transcript |
| POST | `/pdf` | Extract raw text from uploaded PDF file |
---
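The README does not document the request bodies, so the following is only a sketch of how a client might build a call to `/summary` or `/quiz` with the standard library. The JSON field name `transcript` and the helper `build_json_post` are assumptions; check `app.py` for the actual payload shape before relying on it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # the address given in "Running the Server"


def build_json_post(
    endpoint: str, transcript: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given endpoint."""
    # ASSUMPTION: the server expects a JSON object with a "transcript" key.
    body = json.dumps({"transcript": transcript}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_json_post("/summary", "Today we covered binary search trees.")

# To send it against a running server:
# with urllib.request.urlopen(req) as response:
#     print(json.loads(response.read()))
```

The same helper can target `/quiz` by changing the endpoint argument. The `/getTranscript` and `/pdf` endpoints take file uploads, so they would need a multipart request instead.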
## Testing & Debugging
* Use an API client such as [Postman](https://www.postman.com/) to test the endpoints.
* Debug statements are logged for API calls to internal services.
---
## Production Deployment
You can deploy using:
* `waitress` (already integrated)
* Docker (optional Dockerfile)
* Render, Heroku, or any Linux server with Python 3.9+

Example production run:
```bash
waitress-serve --host 0.0.0.0 --port 5000 app:app
```
---
## Contributing
We welcome all contributors! You can:
* File bugs
* Suggest improvements
* Submit pull requests
---
### Let's build better classrooms with AI, one lecture at a time!