{"id":29193754,"url":"https://github.com/tanush-g/llm-task","last_synced_at":"2026-02-03T13:02:30.571Z","repository":{"id":299715060,"uuid":"998911476","full_name":"tanush-g/llm-task","owner":"tanush-g","description":"A privacy-focused web application that detects Personally Identifiable Information (PII) in text using advanced NLP techniques and provides AI-powered paraphrasing to help users protect their sensitive information. It leverages Named Entity Recognition (NER) and Large Language Models (LLMs) to identify and sanitize PII","archived":false,"fork":false,"pushed_at":"2025-06-21T01:48:01.000Z","size":2097,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-02T03:09:29.760Z","etag":null,"topics":["fastapi","gemini","named-entity-recognition","ner","pii-detection","render","spacy","vercel"],"latest_commit_sha":null,"homepage":"https://llm-task-phi.vercel.app","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tanush-g.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-09T12:58:08.000Z","updated_at":"2025-06-23T13:45:04.000Z","dependencies_parsed_at":"2025-06-18T00:28:19.257Z","dependency_job_id":"40f40162-606d-46b1-a230-56135868429e","html_url":"https://github.com/tanush-g/llm-task","commit_stats":null,"previous_names":["tanush-g/llm-task"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tanush-g/llm-task","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tanush-g%2Fllm-task","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tanush-g%2Fllm-task/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tanush-g%2Fllm-task/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tanush-g%2Fllm-task/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tanush-g","download_url":"https://codeload.github.com/tanush-g/llm-task/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tanush-g%2Fllm-task/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29046503,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-03T10:09:22.136Z","status":"ssl_error","status_checked_at":"2026-02-03T10:09:16.814Z","response_time":96,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","gemini","named-entity-recognition","ner","pii-detection","render","spacy","vercel"],"created_at":"2025-07-02T03:08:49.144Z","updated_at":"2026-02-03T13:02:28.141Z","avatar_url":"https://github.com/tanush-g.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PrivChat - PII Detection Web Application\n\nA privacy-focused full-stack web application that detects Personally Identifiable Information (PII) in text using advanced NLP techniques and provides AI-powered suggestions for safer communication using Google's Gemini API.\n\n![Demo](demo.gif)\n\n## 🌐 Live Demo\n\n- **Frontend:** [https://llm-task-phi.vercel.app](https://llm-task-phi.vercel.app)\n- **Backend API:** [https://llm-task.onrender.com](https://llm-task.onrender.com)\n- **API Documentation:** [https://llm-task.onrender.com/docs](https://llm-task.onrender.com/docs)\n\n\u003e **Note:** The live demo is hosted on free tiers, so it may take a few seconds to wake up after inactivity. Make sure the backend is running before testing the frontend.\n\n## ✨ Features\n\n- **🔍 PII Detection**: Uses spaCy's Named Entity Recognition to identify personal information including:\n  - Names (PERSON)\n  - Organizations (ORG)  \n  - Locations (GPE)\n  - Dates (DATE)\n  - Money amounts (MONEY)\n  - Phone numbers and other sensitive data\n  \n- **🤖 AI-Powered Suggestions**: Integrates with Google's Gemini 2.0 Flash model to provide:\n  - Privacy-conscious text rewriting\n  - Maintains original meaning while protecting sensitive data\n  - Natural language output with professional tone\n\n- **🎨 Interactive UI**: Modern, Mac-style interface featuring:\n  - Real-time PII highlighting with pulsing visual effects\n  - Expandable details panel showing all detected entities\n  - Confidence scores for each detected PII element\n  - Responsive design with smooth animations\n  - Color-coded entity categories\n\n- **⚡ Production Ready**: Fully deployed with professional infrastructure\n\n## 🏗️ Architecture\n\n### Backend (FastAPI)\n\n- **Framework**: FastAPI with production-grade CORS configuration\n- **NLP**: spaCy with `en_core_web_sm` model for entity recognition\n- **AI Integration**: Google Gemini 2.0 Flash API for text rewriting\n- **Confidence Scoring**: Custom algorithm based on entity type, length, and context\n- **Deployment**: Render.com with auto-scaling and health monitoring\n\n### Frontend (Vanilla JavaScript)\n\n- **Design**: Mac-style window interface with sidebar navigation\n- **Highlighting**: Dynamic PII highlighting with color-coded categories\n- **Interactions**: Smooth animations and responsive layout\n- **Accessibility**: Keyboard navigation and screen reader support\n- **Deployment**: Vercel with global CDN and instant deployment\n\n## 🚀 Quick Start (Try It Now!)\n\n**Just visit:** [https://llm-task-phi.vercel.app](https://llm-task-phi.vercel.app)\n\n1. Click the **\"+\"** button to enter input mode\n2. Type or paste text containing PII (e.g., \"Hi, I'm John Doe from Google in New York\")\n3. Click **\"Send\"** to analyze\n4. View the AI-powered privacy-safe rewrite\n5. Click **\"i\"** to see detailed PII analysis\n\n## 💻 Local Development Setup\n\n### Prerequisites\n\n- **Python 3.8+** \n- **Google AI Studio API Key** ([Get one here](https://aistudio.google.com/apikey))\n\n### 1. Clone the Repository\n\n```bash\ngit clone https://github.com/tanush-g/llm-task.git\ncd llm-task\n```\n\n### 2. Backend Setup\n\n```bash\ncd backend\n\n# Create virtual environment (recommended)\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\n\n# Install Python dependencies\npip install -r requirements.txt\n\n# Download spaCy English model\npython -m spacy download en_core_web_sm\n\n# Configure API key\ncp .env.example .env\n# Edit .env and add your GEMINI_API_KEY\n```\n\n### 3. Start Development Servers\n\n```bash\n# Terminal 1: Start backend\ncd backend\npython main.py\n# Backend runs on http://localhost:8000\n\n# Terminal 2: Serve frontend\ncd frontend\npython -m http.server 3000\n# Frontend available at http://localhost:3000\n```\n\n### 4. Access Local Application\n\nNavigate to `http://localhost:3000` in your web browser.\n\n## 🎯 How to Use\n\n1. **Input Text**: Click the \"+\" button to toggle input mode\n2. **Analyze**: Type or paste text containing potential PII and click \"Send\"\n3. **Review Results**: \n   - View the AI-rewritten privacy-safe version\n   - See highlighted PII elements in the original text\n   - Click the \"i\" button to view detailed PII analysis with confidence scores\n4. **Privacy Protection**: Use the rewritten version for safer communication\n\n## 📡 API Documentation\n\n### Live API Endpoints\n\n- **Base URL**: [https://llm-task.onrender.com](https://llm-task.onrender.com)\n- **Interactive Docs**: [https://llm-task.onrender.com/docs](https://llm-task.onrender.com/docs)\n\n### `POST /analyze`\n\nAnalyzes text for PII and generates a privacy-safe rewrite using Gemini AI.\n\n**Request Body:**\n```json\n{\n  \"text\": \"Hi, I'm John Smith from Acme Corp. Call me at 555-123-4567.\"\n}\n```\n\n**Response:**\n```json\n{\n  \"entities\": [\n    {\n      \"text\": \"John Smith\",\n      \"label\": \"PERSON\",\n      \"confidence\": 0.95,\n      \"start\": 8,\n      \"end\": 18\n    },\n    {\n      \"text\": \"Acme Corp\",\n      \"label\": \"ORG\",\n      \"confidence\": 0.85,\n      \"start\": 24,\n      \"end\": 33\n    }\n  ],\n  \"ai_response\": \"Hello! I'm a professional from a technology company. Please feel free to contact me.\",\n  \"original_text\": \"Hi, I'm John Smith from Acme Corp. Call me at 555-123-4567.\",\n  \"sanitized_text\": \"Hi, I'm [Name] from [Company]. Call me at [Phone].\"\n}\n```\n\n### `GET /health`\n\nHealth check endpoint to verify service status.\n\n**Response:**\n```json\n{\n  \"status\": \"healthy\",\n  \"spacy_loaded\": true,\n  \"gemini_configured\": true,\n  \"model\": \"gemini-2.0-flash\"\n}\n```\n\n## 🔧 Configuration \u0026 Deployment\n\n### Environment Variables\n\n```env\nGEMINI_API_KEY=your_gemini_api_key_here\n```\n\n### Production Architecture\n\n- **Backend**: Deployed on [Render.com](https://render.com) with auto-scaling\n- **Frontend**: Deployed on [Vercel](https://vercel.com) with global CDN\n- **Database**: None required (stateless application)\n- **AI Model**: Google Gemini 2.0 Flash via API\n\n### Model Configuration\n\nThe Gemini API is configured with optimal settings for privacy-focused text rewriting:\n\n- **Temperature**: 0.3 (for consistent, reliable results)\n- **Max tokens**: 100 (concise responses)\n- **Top-p**: 0.8 (balanced creativity and coherence)\n\n## 🛠️ Development\n\n### Project Structure\n\n```\nllm-task/\n├── backend/\n│   ├── main.py              # FastAPI application with Gemini integration\n│   ├── requirements.txt     # Python dependencies\n│   ├── .env                 # Environment variables (API key)\n│   ├── setup.sh            # Setup script for local development\n│   └── test_config.py      # Configuration testing utility\n├── frontend/\n│   └── index.html          # Single-page application\n├── demo.gif                # Demo animation\n├── DEPLOYMENT_README.md    # Deployment documentation\n└── README.md               # This file\n```\n\n### Adding New Entity Types\n\nTo support additional PII types, modify the entity filter in `backend/main.py`:\n\n```python\nif ent.label_ in [\"PERSON\", \"ORG\", \"GPE\", \"DATE\", \"MONEY\", \"CARDINAL\", \"YOUR_NEW_TYPE\"]:\n```\n\n### Customizing the UI\n\nThe frontend uses a single HTML file with embedded CSS and JavaScript:\n\n- **Color schemes**: Modify CSS custom properties\n- **Animation timing**: Adjust CSS keyframe durations  \n- **Layout breakpoints**: Update responsive design rules\n\n## 🔍 Testing\n\n### Manual Testing\n\n```bash\n# Test health endpoint\ncurl https://llm-task.onrender.com/health\n\n# Test analyze endpoint\ncurl -X POST \"https://llm-task.onrender.com/analyze\" \\\n     -H \"Content-Type: application/json\" \\\n     -d '{\"text\": \"Hi, I am John Doe and I work at Google in Mountain View.\"}'\n```\n\n### Example API Usage\n\n```python\nimport requests\n\n# Analyze text for PII\nresponse = requests.post(\n    \"https://llm-task.onrender.com/analyze\",\n    json={\"text\": \"Hi, I'm Jane Smith from Microsoft in Seattle. Call me at (555) 123-4567.\"}\n)\n\ndata = response.json()\nprint(\"Detected entities:\", data[\"entities\"])\nprint(\"AI rewrite:\", data[\"ai_response\"])\n```\n\n## 🚨 Troubleshooting\n\n### Common Issues\n\n#### 1. API Connection Errors\n- **Issue**: Frontend can't connect to backend\n- **Solution**: Check CORS configuration and ensure backend is running\n- **Live Status**: Visit [https://llm-task.onrender.com/health](https://llm-task.onrender.com/health)\n\n#### 2. Gemini API Errors\n- **Issue**: \"API key not configured\" or quota exceeded\n- **Solution**: Verify API key in environment variables or check quota limits\n- **Get API Key**: [Google AI Studio](https://aistudio.google.com/apikey)\n\n#### 3. spaCy Model Errors\n- **Issue**: \"spaCy model not found\"\n- **Local Solution**: `python -m spacy download en_core_web_sm`\n- **Production**: Model is automatically installed during deployment\n\n#### 4. Cold Start Delays\n- **Issue**: First request takes 30+ seconds\n- **Explanation**: Render free tier apps sleep after inactivity\n- **Solution**: Upgrade to paid tier for instant responses\n\n### Performance Notes\n\n- **Cold Start**: First request after inactivity may take 30-60 seconds\n- **Response Time**: Subsequent requests typically respond in 2-5 seconds\n- **Rate Limiting**: No artificial limits, bounded by Gemini API quotas\n\n## 🔒 Security \u0026 Privacy\n\n### Data Handling\n- **No Data Storage**: Application is stateless, no user data is persisted\n- **API Security**: All API calls use HTTPS encryption\n- **CORS Protection**: Configured to allow specific domains only\n- **Input Validation**: All user inputs are validated and sanitized\n\n### Privacy Considerations\n- **PII Detection**: Identifies sensitive information locally using spaCy\n- **AI Processing**: Text is sent to Google Gemini API for rewriting\n- **No Logging**: Sensitive data is not logged or stored\n- **Minimal Data**: Only necessary text is sent to external APIs\n\n## 🚀 Deployment Details\n\n### Backend (Render.com)\n- **URL**: https://llm-task.onrender.com\n- **Auto-deployment**: Triggered by GitHub commits\n- **Environment**: Python 3.9+ with automatic dependency management\n- **Scaling**: Auto-scales based on demand (free tier limitations apply)\n\n### Frontend (Vercel)\n- **URL**: https://llm-task-phi.vercel.app\n- **CDN**: Global edge network for fast loading\n- **Auto-deployment**: Triggered by GitHub commits\n- **Performance**: Optimized static asset delivery\n\n## 🤝 Contributing\n\nWe welcome contributions to improve PrivChat! Here's how to get started:\n\n### Development Workflow\n\n1. **Fork the repository**\n2. **Create a feature branch**\n   ```bash\n   git checkout -b feature/amazing-feature\n   ```\n3. **Make your changes**\n4. **Test thoroughly** (both locally and against production API)\n5. **Submit a pull request**\n\n### Areas for Contribution\n\n- **🎨 UI/UX improvements**: Enhanced animations, better responsive design\n- **🔍 PII Detection**: Support for additional entity types or languages\n- **⚡ Performance**: Caching strategies, faster load times\n- **🛡️ Security**: Additional validation, rate limiting\n- **📚 Documentation**: Improved examples, tutorials\n\n## 📄 License\n\nThis project is created for educational and demonstration purposes. \n\n## 🙏 Acknowledgments\n\n- **[spaCy](https://spacy.io/)**: For excellent NLP capabilities and entity recognition\n- **[Google Gemini](https://ai.google.dev/)**: For powerful AI text generation\n- **[FastAPI](https://fastapi.tiangolo.com/)**: For the robust, high-performance backend framework\n- **[Render](https://render.com/)**: For reliable backend hosting\n- **[Vercel](https://vercel.com/)**: For fast frontend deployment and CDN\n\n## 📞 Support\n\n- **Live Demo**: [https://llm-task-phi.vercel.app](https://llm-task-phi.vercel.app)\n- **API Status**: [https://llm-task.onrender.com/health](https://llm-task.onrender.com/health)\n- **Issues**: [GitHub Issues](https://github.com/tanush-g/llm-task/issues)\n\n---\n\n**⚡ Quick Start**: Visit the [live demo](https://llm-task-phi.vercel.app) to try PrivChat immediately, or follow the [local setup guide](#-local-development-setup) to run it on your machine!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftanush-g%2Fllm-task","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftanush-g%2Fllm-task","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftanush-g%2Fllm-task/lists"}