{"id":28489803,"url":"https://github.com/gaurav-singh7092/resumatch","last_synced_at":"2026-04-30T22:37:02.490Z","repository":{"id":297102022,"uuid":"995667660","full_name":"gaurav-singh7092/ResuMatch","owner":"gaurav-singh7092","description":"An AI-powered resume and job description matching application using natural language processing and machine learning techniques. This application provides intelligent analysis of resume-job compatibility with detailed scoring and recommendations.","archived":false,"fork":false,"pushed_at":"2025-06-03T22:28:04.000Z","size":1507,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-04T05:32:10.594Z","etag":null,"topics":["fastapi","keyword-extraction","nextjs","nlp","preprocessing-data","python","similarity-score","tailwind"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gaurav-singh7092.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-03T20:47:17.000Z","updated_at":"2025-06-03T22:28:06.000Z","dependencies_parsed_at":"2025-06-04T05:34:14.268Z","dependency_job_id":"aa5ff1de-6b04-425a-b756-9b894053e97e","html_url":"https://github.com/gaurav-singh7092/ResuMatch","commit_stats":null,"previous_names":["gaurav-singh7092/resumatch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/gaurav-singh7092/ResuMatch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaurav-singh7092%2FResuMatch","tags_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaurav-singh7092%2FResuMatch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaurav-singh7092%2FResuMatch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaurav-singh7092%2FResuMatch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gaurav-singh7092","download_url":"https://codeload.github.com/gaurav-singh7092/ResuMatch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaurav-singh7092%2FResuMatch/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263140352,"owners_count":23419862,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","keyword-extraction","nextjs","nlp","preprocessing-data","python","similarity-score","tailwind"],"created_at":"2025-06-08T07:06:24.466Z","updated_at":"2026-04-30T22:36:57.469Z","avatar_url":"https://github.com/gaurav-singh7092.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ResuMatch 🎯\n\n## Revolutionize Your Job Application Process\n\nResuMatch is an advanced AI-powered platform that transforms how job seekers and recruiters approach the hiring process. By leveraging cutting-edge natural language processing (NLP) and machine learning algorithms, ResuMatch analyzes the compatibility between resumes and job descriptions with unprecedented accuracy and insight.\n\n### Why ResuMatch?\n\nIn today's competitive job market, standing out is essential. 
ResuMatch provides job seekers with data-driven insights to tailor their applications, while helping recruiters identify ideal candidates efficiently. Our intelligent matching system goes beyond simple keyword matching by understanding context, skills relevance, and experience alignment.\n\n\n## Screenshots 📸\n\n### Home Page\n![ResuMatch Home Page](ResuMatch%20Home.png)\n\n### Analysis Results\n![ResuMatch Demo Results](ResuMatch%20Demo.png)\n\n## Features ✨\n\n- **Multi-format Support**: PDF, DOC, DOCX, TXT, and image files with text extraction\n- **Text Preprocessing**: NLP pipeline with entity recognition, skill extraction, and text normalization\n- **Similarity Analysis**: Semantic similarity calculation using sentence embeddings\n- **Component Scoring**: Analysis across multiple dimensions:\n  - Semantic similarity\n  - Skill matching\n  - Experience compatibility\n  - Education alignment\n  - Keyword overlap\n- **Web Interface**: React-based frontend for easy interaction\n- **REST API**: Backend API for integration with other applications\n- **Batch Processing**: Analyze multiple resumes through API endpoints\n- **Open Source**: Complete source code available for customization and extension\n\n## Quick Start 🚀\n\n### Prerequisites\n\n- Python 3.8 or higher\n- Node.js 16 or higher (for frontend)\n- pip package manager\n\n### Installation\n\n1. **Clone the repository**:\n```bash\ngit clone https://github.com/gaurav-singh7092/ResuMatch.git\ncd ResuMatch\n```\n\n2. **Backend Setup**:\n```bash\ncd backend\npython -m venv .venv\nsource .venv/bin/activate  # On Windows: .venv\\Scripts\\activate\npip install -r requirements.txt\npython -m spacy download en_core_web_sm\n```\n\n3. **Frontend Setup**:\n```bash\ncd ../frontend\nnpm install\n```\n\n4. 
**Install Tesseract OCR** (for image processing):\n   - **Ubuntu/Debian**: `sudo apt-get install tesseract-ocr`\n   - **macOS**: `brew install tesseract`\n   - **Windows**: Download from [GitHub](https://github.com/UB-Mannheim/tesseract/wiki)\n\n### Running the Application\n\n#### Quick Start with Development Script\n```bash\n# Set up everything\n./dev.sh setup-all\n\n# Run both backend and frontend\n./dev.sh run-dev\n```\n\n#### Manual Setup\n\n**Backend API Server**\n```bash\ncd backend\npython main.py\n```\nThe backend API will be available at `http://localhost:8000`\n\n**Frontend Development Server**\n```bash\ncd frontend\nnpm run dev\n```\nThe frontend will be available at `http://localhost:3000`\n\n#### Development Helper Commands\n```bash\n./dev.sh setup-backend     # Set up Python environment\n./dev.sh setup-frontend    # Set up Node.js environment\n./dev.sh setup-all         # Set up both environments\n./dev.sh run-backend       # Start only backend\n./dev.sh run-frontend      # Start only frontend\n./dev.sh run-dev           # Start both services\n./dev.sh clean             # Clean up generated files\n./dev.sh help              # Show help\n```\n\n## Project Structure 📁\n\n```\nResuMatch/\n├── README.md                    # Main project documentation\n├── backend/                     # Python backend application\n│   ├── README.md               # Backend-specific documentation\n│   ├── main.py                 # FastAPI application\n│   ├── config.py               # Configuration settings\n│   ├── similarity_engine.py    # Core matching algorithms\n│   ├── text_extractor.py       # Document text extraction\n│   ├── text_preprocessor.py    # NLP preprocessing\n│   ├── examples.py             # Example scripts\n│   ├── requirements.txt        # Python dependencies\n│   ├── examples/               # Sample files and results\n│   ├── results/                # Analysis output storage\n│   ├── static/                 # Web assets\n│   ├── templates/          
    # HTML templates\n│   └── uploads/                # File upload storage\n├── frontend/                   # Next.js frontend application\n│   ├── README.md               # Frontend documentation\n│   ├── package.json            # Node.js dependencies\n│   ├── next.config.js          # Next.js configuration\n│   ├── tailwind.config.js      # Tailwind CSS config\n│   ├── components/             # React components\n│   ├── pages/                  # Next.js pages\n│   ├── services/               # API services\n│   ├── hooks/                  # Custom React hooks\n│   └── styles/                 # CSS styles\n└── .venv/                      # Python virtual environment\n```\n\n## Usage Examples 📋\n\n### Web Interface\n\n1. **Single Resume Analysis**:\n   - Upload your resume (PDF, DOC, DOCX, or TXT)\n   - Paste the job description\n   - Click \"Analyze Match\"\n   - View detailed results with scores and recommendations\n\n2. **Batch Analysis**:\n   - Use the API endpoint `/batch-analyze` to process multiple resumes\n\n### API Usage Examples\n\nThe project now focuses on web-based interaction through the frontend interface and REST API endpoints. For automated processing, you can use the API endpoints programmatically:\n\n```bash\n# Navigate to backend directory\ncd backend\n\n# Example API usage with curl\ncurl -X POST http://localhost:8000/api/analyze \\\n  -F \"resume=@path/to/resume.pdf\" \\\n  -F \"job_description=Your job description text here\"\n\n# Or use the web interface at http://localhost:3000\n```\n\n## Architecture 🏗️\n\n### Core Components\n\n1. **Frontend (Next.js + TypeScript)**\n   - Modern React-based web interface\n   - Responsive design with Tailwind CSS\n   - Real-time analysis results\n   - File upload and management\n\n2. **Backend API (Python)**\n   - RESTful API server\n   - File processing and analysis\n   - Machine learning models\n   - Database integration\n\n3. 
**Text Extractor** (`backend/text_extractor.py`)\n   - Supports multiple file formats\n   - OCR for scanned documents\n   - Fallback mechanisms for better reliability\n\n4. **Text Preprocessor** (`backend/text_preprocessor.py`)\n   - Advanced NLP pipeline\n   - Skill and entity extraction\n   - Statistical analysis\n\n5. **Similarity Engine** (`backend/similarity_engine.py`)\n   - Transformer-based embeddings\n   - Multi-component scoring\n   - Configurable weights\n\n### Data Flow\n\n```\nResume File → Text Extraction → Preprocessing → Feature Extraction\n                                      ↓\nJob Description → Preprocessing → Feature Extraction\n                                      ↓\n                              Similarity Analysis\n                                      ↓\n                            Detailed Results \u0026 Recommendations\n```\n\n## API Documentation 📚\n\n### REST Endpoints\n\n#### `POST /analyze`\nAnalyze a single resume against a job description.\n\n**Parameters**:\n- `resume`: File upload (multipart/form-data)\n- `job_description`: Text (form field)\n\n**Response**:\n```json\n{\n  \"analysis_id\": \"uuid\",\n  \"similarity_analysis\": {\n    \"overall_score\": 75.5,\n    \"component_scores\": {\n      \"semantic_similarity\": 0.78,\n      \"skill_match\": 0.65,\n      \"experience_match\": 0.80,\n      \"education_match\": 0.90,\n      \"keyword_match\": 0.72\n    },\n    \"matched_skills\": [\"python\", \"django\", \"rest api\"],\n    \"missing_skills\": [\"react\", \"aws\", \"docker\"],\n    \"recommendations\": [\"Add React experience\", \"Include cloud technologies\"]\n  }\n}\n```\n\n#### `POST /batch-analyze`\nAnalyze multiple resumes against a job description.\n\n#### `GET /analysis/{analysis_id}`\nRetrieve detailed analysis results by ID.\n\n#### `GET /health`\nHealth check endpoint.\n\n#### `GET /api/stats`\nApplication statistics.\n\n## Configuration ⚙️\n\n### Environment Variables\n\n```bash\n# Server 
settings\nHOST=0.0.0.0\nPORT=8000\nDEBUG=False\n\n# Model settings\nSENTENCE_MODEL=all-MiniLM-L6-v2\nUSE_GPU=False\n\n# Processing settings\nMAX_FILE_SIZE=52428800  # 50MB\nMAX_BATCH_SIZE=10\n\n# Component weights\nWEIGHT_SEMANTIC=0.35\nWEIGHT_SKILL=0.25\nWEIGHT_EXPERIENCE=0.15\nWEIGHT_EDUCATION=0.10\nWEIGHT_KEYWORD=0.15\n```\n\n### Configuration Files\n\nThe application uses `config.py` for centralized configuration management with support for different environments (development, production, testing).\n\n## Supported File Formats 📄\n\n| Format | Extensions | Notes |\n|--------|------------|-------|\n| PDF | `.pdf` | Text-based and scanned (OCR) |\n| Word | `.doc`, `.docx` | Microsoft Word documents |\n| Text | `.txt` | Plain text files |\n| Images | `.png`, `.jpg`, `.jpeg`, `.tiff`, `.bmp` | OCR processing |\n\n## Scoring System 📊\n\n### Component Weights (Default)\n\n- **Semantic Similarity** (35%): Overall content alignment\n- **Skill Match** (25%): Technical and soft skills overlap\n- **Experience Match** (15%): Years and type of experience\n- **Education Match** (10%): Educational background alignment\n- **Keyword Match** (15%): Important keyword overlap\n\n### Score Interpretation\n\n- **75-100%**: Excellent match\n- **60-74%**: Good match\n- **45-59%**: Fair match\n- **0-44%**: Poor match\n\n## Advanced Features 🔧\n\n### Skill Extraction\n\nThe system automatically extracts:\n- Programming languages\n- Frameworks and libraries\n- Databases\n- Cloud platforms\n- Tools and technologies\n- Soft skills\n- Certifications\n\n### Entity Recognition\n\nIdentifies:\n- Contact information\n- Company names\n- Locations\n- Dates\n- Educational institutions\n\n### Quality Assessment\n\nEvaluates:\n- Document completeness\n- Text clarity\n- Professional formatting\n- Technical depth\n\n## Performance Optimization 🚀\n\n### Model Loading\n\n- Lazy loading of heavy models\n- Fallback to lighter models\n- GPU acceleration support\n\n### Processing\n\n- Async file 
processing\n- Batch optimization\n- Caching mechanisms\n\n### Memory Management\n\n- Streaming file processing\n- Garbage collection optimization\n- Resource cleanup\n\n## Development 👨‍💻\n\n### Project Structure\n\n```\nbackend/\n├── main.py              # Web application\n├── config.py            # Configuration management\n├── text_extractor.py    # Text extraction engine\n├── text_preprocessor.py # NLP preprocessing pipeline\n├── similarity_engine.py # AI similarity analysis\n├── requirements.txt     # Python dependencies\n├── README.md           # Documentation\n├── uploads/            # Uploaded files\n├── results/            # Analysis results\n└── static/             # Static web assets\n```\n\n### Running Tests\n\n```bash\n# Install test dependencies (pytest-cov provides the --cov option)\npip install pytest pytest-asyncio pytest-cov\n\n# Run tests\npytest tests/\n\n# Run with coverage\npytest --cov=. tests/\n```\n\n### Adding New Features\n\n1. **New File Format Support**:\n   - Add extraction method to `TextExtractor`\n   - Update supported formats list\n   - Add file type detection\n\n2. **New Skill Categories**:\n   - Update skill patterns in `TextPreprocessor`\n   - Add new regex patterns\n   - Update feature extraction\n\n3. **Custom Scoring**:\n   - Modify weights in `SimilarityEngine`\n   - Add new scoring components\n   - Update analysis logic\n\n## Troubleshooting 🔧\n\n### Common Issues\n\n1. **Model Loading Errors**:\n   ```bash\n   # Download models manually\n   python -c \"from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')\"\n   ```\n\n2. **OCR Not Working**:\n   - Ensure Tesseract is installed and in PATH\n   - Check image quality and format\n\n3. **Memory Issues**:\n   - Reduce batch size\n   - Use lighter models\n   - Enable garbage collection\n\n4. 
**Performance Issues**:\n   - Enable GPU acceleration\n   - Use caching\n   - Optimize preprocessing options\n\n### Logging\n\n```bash\n# Enable debug logging\nexport LOG_LEVEL=DEBUG\npython main.py\n```\n\n## Contributing 🤝\n\n1. Fork the repository\n2. Create a feature branch\n3. Make your changes\n4. Add tests\n5. Submit a pull request\n\n### Code Style\n\n- Follow PEP 8\n- Use type hints\n- Add docstrings\n- Write unit tests\n\n## License 📝\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Support 💬\n\nFor support, please:\n1. Check the troubleshooting section\n2. Review existing issues on GitHub\n3. Create a new issue with detailed information\n\n## Acknowledgments 🙏\n\n- **Sentence Transformers**: For semantic similarity models\n- **spaCy**: For NLP preprocessing\n- **FastAPI**: For the web framework\n- **scikit-learn**: For machine learning utilities\n- **Tesseract**: For OCR capabilities\n\n## Roadmap 🗺️\n\n### Planned Features\n\n- [ ] Database integration for result storage\n- [ ] User authentication and profiles\n- [ ] Advanced analytics dashboard\n- [ ] Multi-language support\n- [ ] Resume optimization suggestions\n- [ ] Integration with job boards\n- [ ] Machine learning model fine-tuning\n- [ ] Mobile application\n- [ ] Enterprise features\n\n---\n\n**ResuMatch** - Making resume screening intelligent and efficient! 🎯\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaurav-singh7092%2Fresumatch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgaurav-singh7092%2Fresumatch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaurav-singh7092%2Fresumatch/lists"}