{"id":25972869,"url":"https://github.com/rushilpatel21/redactify","last_synced_at":"2025-03-05T01:17:55.609Z","repository":{"id":280137492,"uuid":"936671225","full_name":"rushilpatel21/Redactify","owner":"rushilpatel21","description":"Redactify is an efficient data redaction tool that secures sensitive text using advanced NLP and rule-based methods. It combines transformer-based NER, regex, and Presidio analysis to detect and mask personal information through full redaction or partial masking—ensuring compliance while preserving data utility.","archived":false,"fork":false,"pushed_at":"2025-03-01T13:23:13.000Z","size":401,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-01T14:26:52.937Z","etag":null,"topics":["anonymization","data-redaction","flask","nlp","pii","presidio","privacy","pseudonymization","python","redactify","regex","transformers"],"latest_commit_sha":null,"homepage":"https://redactify.vercel.app","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rushilpatel21.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-21T13:40:27.000Z","updated_at":"2025-03-01T13:23:16.000Z","dependencies_parsed_at":"2025-03-01T14:37:04.386Z","dependency_job_id":null,"html_url":"https://github.com/rushilpatel21/Redactify","commit_stats":null,"previous_names":["rushilpatel21/redactify"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rushilpatel
21%2FRedactify","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rushilpatel21%2FRedactify/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rushilpatel21%2FRedactify/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rushilpatel21%2FRedactify/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rushilpatel21","download_url":"https://codeload.github.com/rushilpatel21/Redactify/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241945530,"owners_count":20046870,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anonymization","data-redaction","flask","nlp","pii","presidio","privacy","pseudonymization","python","redactify","regex","transformers"],"created_at":"2025-03-05T01:17:55.018Z","updated_at":"2025-03-05T01:17:55.596Z","avatar_url":"https://github.com/rushilpatel21.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Redactify - Advanced PII Anonymization Platform\n\n![Redactify](https://img.shields.io/badge/Redactify-1.0.0-blue)\n![React](https://img.shields.io/badge/React-18.2.0-61DAFB)\n![Flask](https://img.shields.io/badge/Flask-2.3.2-000000)\n![Python](https://img.shields.io/badge/Python-3.9+-3776AB)\n\nRedactify is a comprehensive solution for detecting and anonymizing personally identifiable information (PII) in text documents. 
The application combines a modern React frontend with a powerful Python Flask backend to provide an intuitive and effective PII redaction service.\n\n![Redactify Screenshot](assets/Redactify_1.png)\n![Redactify Screenshot](assets/Redactify_2.png)\n![Redactify Screenshot](assets/Redactify_3.png)\n\n### PS: Still figuring out a way to host the server for free. (Service providers like Render impose memory constraints.)\n\n## 🔍 Overview\n\nRedactify helps organizations comply with data privacy regulations by identifying and removing sensitive information from text. The platform uses a sophisticated multi-method detection approach combining machine learning models, rule-based patterns, and Microsoft's Presidio Analyzer to achieve high-accuracy PII detection.\n\n### Project Architecture\n\n```\nredactify/\n├── client/          # React frontend application\n│   ├── public/\n│   ├── src/\n│   ├── .env\n│   └── package.json\n│\n├── server/          # Flask backend API\n│   ├── server.py    # Main API entry point\n│   ├── .env\n│   └── requirements.txt\n│\n└── README.md        # This file\n```\n\n## ✨ Key Features\n\n### Detection Capabilities\n- **Multi-Method Detection**: Combines machine learning (Hugging Face NER), rule-based patterns (regex), and Presidio Analyzer\n- **13+ PII Types**: Identifies personal names, organizations, locations, email addresses, phone numbers, credit cards, SSNs, IP addresses, URLs, dates, passwords, API keys, and roll numbers\n- **Customizable Confidence Thresholds**: Configure detection sensitivity based on your needs\n\n### Anonymization Options\n- **Full Redaction**: Replace PII with categorized placeholders (e.g., `[PERSON-611732]`)\n- **Partial Redaction**: Preserve some characters while masking others for better context retention\n- **Selective Anonymization**: Enable/disable specific PII types for targeted redaction\n\n### User Experience\n- **Intuitive Interface**: Clean, responsive design works across devices\n- **Real-time 
Feedback**: Toast notifications provide operation status\n- **Animated UI**: Smooth transitions make interaction pleasant\n- **Performance Optimizations**: Concurrent processing for faster results\n\n## 🛠 Technologies Used\n\n### Frontend (Client)\n- React 18.2.0\n- Framer Motion (animations)\n- React Icons\n- SweetAlert2 (notifications)\n- CSS3 with custom styling\n\n### Backend (Server) \n- Python 3.9+\n- Flask web framework\n- Microsoft Presidio Analyzer\n- Hugging Face Transformers (BERT NER model)\n- Regular expressions for pattern matching\n- Thread pool for concurrent processing\n\n## 🚀 Installation \u0026 Setup\n\n### Prerequisites\n- Node.js (v16+)\n- Python (v3.9+)\n- npm or yarn\n- pip\n\n### Clone the Repository\n\n```bash\ngit clone https://github.com/rushilpatel21/Redactify.git\ncd Redactify\n```\n\n### Backend Setup\n\n```bash\n# Navigate to server directory\ncd server\n\n# Create and activate a virtual environment\npython -m venv venv\nsource venv/bin/activate   # On Windows, use `venv\\Scripts\\activate`\n\n# Install dependencies\npip install -r requirements.txt\n\n# Run the server\npython server.py   # Runs on http://localhost:8000 by default\n```\n\n### Frontend Setup\n\n```bash\n# Navigate to client directory\ncd client\n\n# Install dependencies\nnpm install\n\n# Create .env file\necho \"VITE_BACKEND_BASE_URL='http://localhost:8000'\" \u003e .env\n\n# Start development server\nnpm run dev   # Access at http://localhost:5173 by default\n```\n\n## 🔧 Environment Configuration\n\n### Backend\nThe server uses environment variables for configuration:\n- `PORT`: Server port (default: 8000)\n- `CONFIDENCE_THRESHOLD`: Detection confidence threshold (default: 0.6)\n\n### Frontend\nThe client uses a `.env` file for configuration:\n- `VITE_BACKEND_BASE_URL`: Backend API URL (default: 'http://localhost:8000')\n\n## 📝 API Documentation\n\n### Anonymize Endpoint\n\n```\nPOST /anonymize\n```\n\n#### Request Format\n\n```json\n{\n  \"text\": \"The input text 
containing PII to anonymize\",\n  \"options\": {\n    \"PERSON\": true,\n    \"ORGANIZATION\": true,\n    \"LOCATION\": true,\n    \"EMAIL_ADDRESS\": true,\n    \"PHONE_NUMBER\": true,\n    \"CREDIT_CARD\": true,\n    \"SSN\": true,\n    \"IP_ADDRESS\": true,\n    \"URL\": true,\n    \"DATE_TIME\": true,\n    \"PASSWORD\": true,\n    \"API_KEY\": true,\n    \"ROLL_NUMBER\": true\n  },\n  \"full_redaction\": true\n}\n```\n\n#### Response Format\n\nSuccess:\n```json\n{\n  \"anonymized_text\": \"The redacted text with PII anonymized\"\n}\n```\n\nError:\n```json\n{\n  \"error\": \"Description of the error that occurred\"\n}\n```\n\n## 📋 Usage Examples\n\n### Example 1: Full Redaction\n\n**Input:**\n```\nThis agreement is made between Generic \u0026 Associates (email: john.doe@example.com, phone: 555-123-4567) and Mr. John Smith (SSN: 123-45-6789).\n```\n\n**Output:**\n```\nThis agreement is made between [ORGANIZATION-0458a5] (email: [EMAIL_ADDRESS-8eb1b5], phone: [PHONE_NUMBER-ca71de]) and Mr. [PERSON-611732] (SSN: [SSN-1e8748]).\n```\n\n### Example 2: Partial Redaction\n\n**Input:**\n```\nPlease contact John Smith at john.smith@example.com or 555-123-4567.\n```\n\n**Output:**\n```\nPlease contact Jo*****ith at jo****ith@*******.com or 55*******567.\n```\n\n---\n\n\u0026copy; 2025 Redactify. All rights reserved.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frushilpatel21%2Fredactify","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frushilpatel21%2Fredactify","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frushilpatel21%2Fredactify/lists"}