https://github.com/nilukush/resume-parser
AI-Powered Resume Parser with OCR, NLP, and AI Enhancement
https://github.com/nilukush/resume-parser
ai gpt-4 nlp-parsing openai resume resume-parser
Last synced: 3 months ago
JSON representation
AI-Powered Resume Parser with OCR, NLP, and AI Enhancement
- Host: GitHub
- URL: https://github.com/nilukush/resume-parser
- Owner: nilukush
- Created: 2026-02-22T08:02:40.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-02-22T13:57:46.000Z (4 months ago)
- Last Synced: 2026-02-22T14:06:22.927Z (4 months ago)
- Topics: ai, gpt-4, nlp-parsing, openai, resume, resume-parser
- Language: Python
- Homepage:
- Size: 335 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# ResuMate - Smart Resume Parser
An intelligent resume parsing platform that extracts structured data from resumes using OCR, NLP, and AI.
## Tech Stack
- **Backend:** FastAPI (Python 3.11+) with WebSocket support
- **Frontend:** React 18+ with TypeScript and WebSocket hooks
- **Database:** PostgreSQL with JSONB
- **AI/ML:** Tesseract OCR, spaCy NLP, OpenAI GPT-4
- **Deployment:** Railway (backend), Vercel (frontend)
- **Testing:** Pytest (backend), Vitest (frontend)
## Features
- Multi-format support (PDF, DOCX, DOC, TXT)
- Real-time parsing progress via WebSocket
- NLP-based entity extraction
- Confidence scoring
- Review and edit parsed data
- **Shareable links** with configurable expiration
- **Export to PDF** with professional formatting
- **Social sharing** (WhatsApp, Telegram, Email)
- Access tracking and share revocation
## Project Structure
```
resume-parser/
|-- backend/ # FastAPI application
| |-- app/
| | |-- api/ # API routes & WebSocket handlers
| | |-- models/ # SQLAlchemy models & progress types
| | |-- services/ # Business logic (parser, orchestrator, export)
| | |-- core/ # Storage, config, database
| | `-- main.py # FastAPI app entry
| |-- tests/
| | |-- unit/ # Unit tests
| | |-- integration/ # Integration tests
| | `-- e2e/ # End-to-end tests
| `-- requirements.txt
|-- frontend/ # React application
| |-- src/
| | |-- components/ # React components
| | |-- pages/ # Page components
| | |-- hooks/ # Custom React hooks (WebSocket)
| | |-- services/ # API client
| | `-- lib/ # Utilities
| `-- package.json
`-- docs/
`-- plans/
```
## Getting Started
### Backend Setup
```bash
cd backend
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download spaCy model
python -m spacy download en_core_web_sm
# Setup environment
cp .env.example .env
# Edit .env with your configuration
# Run development server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
Backend will be available at http://localhost:8000
### Frontend Setup
```bash
cd frontend
# Install dependencies
npm install
# Setup environment
cp .env.example .env
# Edit .env if needed (defaults to http://localhost:8000)
# Run development server
npm run dev
```
Frontend will be available at http://localhost:3000
### Testing
**Backend:**
```bash
cd backend
source .venv/bin/activate
python -m pytest tests/ -v
```
**Frontend:**
```bash
cd frontend
npm test -- --run
npm run type-check
```
## Usage Flow
1. **Upload** - Upload resume (PDF, DOCX, DOC, TXT) at http://localhost:3000
2. **Processing** - Watch real-time parsing progress via WebSocket
3. **Review** - Review extracted data with confidence scores, make corrections
4. **Share** - Create shareable links, export to PDF, or share via social media
## WebSocket Communication
**Connection:**
```
ws://localhost:8000/ws/resumes/{resume_id}
```
**Progress Update Message:**
```json
{
"type": "progress_update",
"stage": "text_extraction | nlp_parsing | ai_enhancement | complete",
"progress": 50,
"status": "Extracting text...",
"estimated_seconds_remaining": 15
}
```
## Share and Export API
### Create Shareable Link
```http
POST /v1/resumes/{resume_id}/share
```
**Response:**
```json
{
"share_token": "uuid-v4-token",
"share_url": "http://localhost:3000/share/uuid-v4-token",
"expires_at": "2026-03-20T12:00:00"
}
```
### Export Resume
**PDF:**
```http
GET /v1/resumes/{resume_id}/export/pdf
```
Returns PDF file with `Content-Type: application/pdf`
**WhatsApp:**
```http
GET /v1/resumes/{resume_id}/export/whatsapp
```
```json
{
"whatsapp_url": "https://wa.me/?text=..."
}
```
**Telegram:**
```http
GET /v1/resumes/{resume_id}/export/telegram
```
```json
{
"telegram_url": "https://t.me/share/url?url=&text=..."
}
```
**Email:**
```http
GET /v1/resumes/{resume_id}/export/email
```
```json
{
"mailto_url": "mailto:?subject=...&body=..."
}
```
### Public Share Access
```http
GET /v1/share/{share_token}
```
Returns resume data without authentication. Status codes:
- `200` - Success
- `403` - Share has been revoked
- `404` - Share not found
- `410` - Share has expired
### Revoke Share
```http
DELETE /v1/resumes/{resume_id}/share
```
**Response:**
```json
{
"message": "Share revoked successfully",
"resume_id": "resume-id"
}
```
## Environment Variables
**Backend (.env):**
- `DATABASE_URL` - PostgreSQL connection string
- `REDIS_URL` - Redis connection for Celery
- `OPENAI_API_KEY` - OpenAI API key (optional)
- `SECRET_KEY` - Secret key for signing
- `ALLOWED_ORIGINS` - CORS allowed origins
**Frontend (.env):**
- `VITE_API_BASE_URL` - Backend API base URL
- `VITE_WS_BASE_URL` - WebSocket base URL
## Test Results
### Backend Tests: 120/120 Passing
- Unit tests: 67 tests
- Integration tests: 50 tests
- E2E tests: 4 tests
### Frontend Tests: 31/31 Passing
- Component tests: 31 tests
- Type check: Passed (TypeScript strict mode)
## License
MIT