{"id":48790749,"url":"https://github.com/divyamohan1993/async-document-processing-workflow","last_synced_at":"2026-04-13T19:43:56.396Z","repository":{"id":348091718,"uuid":"1196463852","full_name":"divyamohan1993/async-document-processing-workflow","owner":"divyamohan1993","description":"Production-grade async document processing workflow system. Next.js 14 + FastAPI + Celery + Redis Pub/Sub + PostgreSQL. Real-time SSE progress, JWT auth, Docker Compose deployment.","archived":false,"fork":false,"pushed_at":"2026-04-12T17:40:46.000Z","size":250,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-13T19:43:55.230Z","etag":null,"topics":["async","celery","docker","document-processing","fastapi","full-stack","nextjs","postgresql","python","react","real-time","redis","tailwindcss","typescript","workflow"],"latest_commit_sha":null,"homepage":"https://docprocessor.dmj.one","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/divyamohan1993.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-30T18:14:58.000Z","updated_at":"2026-04-09T07:17:49.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/divyamohan1993/async-document-processing-workflow","commit_stats":null,"previous_names":["divyamohan1993/async-document-processing-workflow"],"tags_count":0,"template":false,"template_full_name":null
,"purl":"pkg:github/divyamohan1993/async-document-processing-workflow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divyamohan1993%2Fasync-document-processing-workflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divyamohan1993%2Fasync-document-processing-workflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divyamohan1993%2Fasync-document-processing-workflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divyamohan1993%2Fasync-document-processing-workflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/divyamohan1993","download_url":"https://codeload.github.com/divyamohan1993/async-document-processing-workflow/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divyamohan1993%2Fasync-document-processing-workflow/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31768649,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T15:25:13.801Z","status":"ssl_error","status_checked_at":"2026-04-13T15:25:09.162Z","response_time":93,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["async","celery","docker","document-processing","fastapi","full-stack","nextjs","postgresql","python","react","real-time","redis","tailwindcss","typescript","workflow"],"created_at":"2026-04-13T19:43:55.793Z","updated_at":"2026-04-13T19:43:56.384Z","avatar_url":"https://github.com/divyamohan1993.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Async Document Processing Workflow System\n\n![Docker](https://img.shields.io/badge/Docker-24.0+-2496ED?logo=docker\u0026logoColor=white)\n![Python](https://img.shields.io/badge/Python-3.11+-3776AB?logo=python\u0026logoColor=white)\n![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-3178C6?logo=typescript\u0026logoColor=white)\n![License](https://img.shields.io/badge/License-MIT-green)\n\nA production-grade asynchronous document processing system with a multi-stage workflow (upload, process, review, finalize, export), real-time progress tracking via SSE, and a modern web interface.\n\n---\n\n## Table of Contents\n\n- [Overview](#overview)\n- [Architecture Overview](#architecture-overview)\n- [Tech Stack](#tech-stack)\n- [Features](#features)\n- [Prerequisites](#prerequisites)\n- [Quick Start](#quick-start)\n- [Configuration](#configuration)\n- [API Documentation](#api-documentation)\n- [Architecture Details](#architecture-details)\n- [Development](#development)\n- [Deployment](#deployment)\n- [Testing](#testing)\n- [Assumptions and Tradeoffs](#assumptions-and-tradeoffs)\n- 
[Limitations](#limitations)\n- [Sample Files](#sample-files)\n\n---\n\n## Overview\n\nThis system allows users to upload documents (PDF, TXT, CSV, JSON, Markdown), which are then processed asynchronously through a multi-stage pipeline. Each document goes through: upload, text extraction, analysis/categorization, human review, finalization, and optional bulk export. Real-time progress is streamed to the frontend via Server-Sent Events (SSE).\n\n---\n\n## Architecture Overview\n\n```\n                          +-------------------+\n                          |     Browser       |\n                          +--------+----------+\n                                   |\n                          +--------v----------+\n                          |      Nginx        |\n                          |  (Reverse Proxy)  |\n                          |  Ports 80 / 443   |\n                          +----+--------+-----+\n                               |        |\n                    +----------+        +----------+\n                    |                              |\n           +--------v--------+          +----------v--------+\n           |    Frontend     |          |     Backend       |\n           |   (Next.js 14) |          |    (FastAPI)      |\n           |   Port 3000    |          |   Port 8000       |\n           +-----------------+          +---+----------+----+\n                                            |          |\n                                +-----------+    +-----v------+\n                                |                |   Redis 7   |\n                                |                |  (Broker +  |\n                                |                |   Cache)    |\n                                |                +-----+-------+\n                                |                      |\n                          +-----v------+        +------v--------+\n                          |PostgreSQL  |        | Celery Worker |\n                          |    16      
|        | (Async Tasks) |\n                          +------------+        +---------------+\n```\n\n**Data Flow: Document Processing Pipeline**\n\n```\nUpload --\u003e Validate --\u003e Extract Text --\u003e Analyze/Categorize --\u003e Ready for Review\n   |                                                                |\n   |  (async via Celery)                                           |\n   |                                                                v\n   +-- SSE Progress Stream ----\u003e Frontend UI \u003c---- Review (Human) --+\n                                                        |\n                                                        v\n                                                   Finalize --\u003e Export (JSON/CSV)\n```\n\n---\n\n## Tech Stack\n\n| Component        | Technology              | Version |\n|------------------|-------------------------|---------|\n| Frontend         | Next.js (React)         | 14.x    |\n| Frontend Styling | Tailwind CSS            | 3.x     |\n| Frontend Language| TypeScript              | 5.x     |\n| Backend          | FastAPI (Python)        | 0.109+  |\n| Task Queue       | Celery                  | 5.3+    |\n| Task Monitoring  | Flower                  | 2.0+    |\n| Database         | PostgreSQL              | 16      |\n| Cache/Broker     | Redis                   | 7       |\n| Reverse Proxy    | Nginx                   | Alpine  |\n| ORM              | SQLAlchemy (async)      | 2.0+    |\n| Migrations       | Alembic                 | 1.13+   |\n| Containerization | Docker + Compose        | 24.0+   |\n\n---\n\n## Features\n\n**Document Management**\n- Multi-format file upload (PDF, TXT, CSV, JSON, Markdown)\n- Automatic text extraction and content analysis\n- Document categorization and keyword extraction\n- Configurable file size limits (default 50MB)\n\n**Async Processing Pipeline**\n- Multi-stage workflow: Upload -\u003e Process -\u003e Review -\u003e Finalize -\u003e Export\n- Background 
processing via Celery workers\n- Real-time progress tracking via Server-Sent Events (SSE)\n- Automatic retry on transient failures\n\n**Review and Export**\n- Human-in-the-loop review step before finalization\n- Approve, reject, or request reprocessing\n- Bulk export to JSON and CSV formats\n\n**Infrastructure**\n- Production-grade Docker Compose setup with health checks\n- Nginx reverse proxy with rate limiting and security headers\n- PostgreSQL with UUID and trigram search extensions\n- Redis for caching, task brokering, and pub/sub\n- Celery Beat for scheduled tasks\n- Flower dashboard for task monitoring\n- Resource limits on all containers\n- Network isolation via Docker bridge network\n\n**Developer Experience**\n- Development mode with hot reload (source mount)\n- Comprehensive Makefile with 20+ targets\n- Database backup and restore scripts\n- Automated deployment script for Ubuntu servers\n\n---\n\n## Prerequisites\n\n- **Docker** \u003e= 24.0 and **Docker Compose** \u003e= 2.20 (plugin version)\n- **Git**\n- **Make** (optional, for Makefile targets)\n- 4 GB RAM minimum (8 GB recommended)\n- 10 GB free disk space\n\n---\n\n## Quick Start\n\n```bash\n# Clone the repository\ngit clone \u003crepository-url\u003e\n# git names the directory after the repository unless you pass a target path\ncd async-document-processing-workflow\n\n# Copy and configure environment\ncp .env.example .env\n# Edit .env with your preferred editor - at minimum change passwords for production\n\n# Build and start all services\nmake build\nmake up\n\n# Or without Make:\ndocker compose build\ndocker compose up -d\n\n# Check that services are running\nmake status\nmake health\n\n# View logs\nmake logs\n```\n\nOnce running, access:\n\n| Service          | URL                          |\n|------------------|------------------------------|\n| Web Application  | http://localhost              |\n| API Documentation| http://localhost/api/v1/docs  |\n| API (direct)     | http://localhost:8000         |\n| Flower Dashboard | http://localhost:5555         |\n| 
PostgreSQL       | localhost:5432               |\n| Redis            | localhost:6379               |\n\n---\n\n## Configuration\n\nAll configuration is managed via environment variables in the `.env` file.\n\n| Variable              | Description                                    | Default                    | Required |\n|-----------------------|------------------------------------------------|----------------------------|----------|\n| `DEBUG`               | Enable debug mode                              | `false`                    | No       |\n| `SECRET_KEY`          | JWT signing key (min 32 chars)                 | -                          | Yes      |\n| `DOMAIN`              | Application domain name                        | `localhost`                | No       |\n| `POSTGRES_USER`       | PostgreSQL username                            | `postgres`                 | No       |\n| `POSTGRES_PASSWORD`   | PostgreSQL password                            | -                          | Yes      |\n| `POSTGRES_DB`         | PostgreSQL database name                       | `docprocess`               | No       |\n| `REDIS_PASSWORD`      | Redis password                                 | -                          | Yes      |\n| `FLOWER_USER`         | Flower dashboard username                      | `admin`                    | No       |\n| `FLOWER_PASSWORD`     | Flower dashboard password                      | `admin123`                 | No       |\n| `NEXT_PUBLIC_API_URL` | API URL used by the frontend                   | `http://localhost/api/v1`  | No       |\n| `MAX_FILE_SIZE`       | Maximum upload file size in bytes               | `52428800` (50MB)          | No       |\n\n---\n\n## API Documentation\n\nThe backend serves interactive API documentation at `/api/v1/docs` (Swagger UI) and `/api/v1/redoc` (ReDoc) when running.\n\n### Endpoints Reference\n\n#### Health\n\n| Method | Endpoint     | Description         
|\n|--------|-------------|---------------------|\n| GET    | `/health`   | Service health check |\n\n#### Authentication\n\n| Method | Endpoint             | Description          |\n|--------|---------------------|----------------------|\n| POST   | `/api/v1/auth/register` | Register new user |\n| POST   | `/api/v1/auth/login`    | Login and get JWT |\n| POST   | `/api/v1/auth/refresh`  | Refresh JWT token |\n\n#### Documents\n\n| Method | Endpoint                          | Description                        |\n|--------|----------------------------------|------------------------------------|\n| POST   | `/api/v1/documents/upload`       | Upload a new document              |\n| GET    | `/api/v1/documents/`             | List all documents (paginated)     |\n| GET    | `/api/v1/documents/{id}`         | Get document details               |\n| DELETE | `/api/v1/documents/{id}`         | Delete a document                  |\n| GET    | `/api/v1/documents/{id}/status`  | Get processing status              |\n| GET    | `/api/v1/documents/{id}/stream`  | SSE stream for real-time progress  |\n\n#### Review\n\n| Method | Endpoint                            | Description                   |\n|--------|-------------------------------------|-------------------------------|\n| GET    | `/api/v1/review/pending`            | List documents pending review |\n| POST   | `/api/v1/review/{id}/approve`       | Approve a processed document  |\n| POST   | `/api/v1/review/{id}/reject`        | Reject a processed document   |\n| POST   | `/api/v1/review/{id}/reprocess`     | Request reprocessing          |\n\n#### Export\n\n| Method | Endpoint                    | Description                          |\n|--------|-----------------------------|--------------------------------------|\n| POST   | `/api/v1/export/json`       | Export finalized documents as JSON   |\n| POST   | `/api/v1/export/csv`        | Export finalized documents as CSV    |\n| GET    | `/api/v1/export/{id}/download` 
| Download a specific export file   |\n\n### Example: Upload a Document\n\n```bash\ncurl -X POST http://localhost/api/v1/documents/upload \\\n  -H \"Authorization: Bearer \u003ctoken\u003e\" \\\n  -F \"file=@sample-files/sample-report.txt\"\n```\n\n**Response:**\n```json\n{\n  \"id\": \"550e8400-e29b-41d4-a716-446655440000\",\n  \"filename\": \"sample-report.txt\",\n  \"file_type\": \"text/plain\",\n  \"file_size\": 1234,\n  \"status\": \"uploaded\",\n  \"created_at\": \"2024-01-15T10:00:00Z\"\n}\n```\n\n### Example: Stream Processing Progress (SSE)\n\n```bash\ncurl -N http://localhost/api/v1/documents/550e8400-e29b-41d4-a716-446655440000/stream \\\n  -H \"Authorization: Bearer \u003ctoken\u003e\"\n```\n\n**SSE Events:**\n```\ndata: {\"stage\": \"extracting\", \"progress\": 25, \"message\": \"Extracting text content...\"}\n\ndata: {\"stage\": \"analyzing\", \"progress\": 50, \"message\": \"Analyzing document structure...\"}\n\ndata: {\"stage\": \"categorizing\", \"progress\": 75, \"message\": \"Categorizing document...\"}\n\ndata: {\"stage\": \"completed\", \"progress\": 100, \"message\": \"Processing complete. Ready for review.\"}\n```\n\n---\n\n## Architecture Details\n\n### Backend Architecture\n\nThe backend follows a layered architecture:\n\n```\napp/\n  api/           # Route handlers (thin controllers)\n    v1/\n      documents.py\n      review.py\n      export.py\n      auth.py\n  models/        # SQLAlchemy ORM models\n  schemas/       # Pydantic request/response schemas\n  services/      # Business logic layer\n  workers/       # Celery task definitions\n  core/          # Configuration, security, database setup\n```\n\n**Key patterns:**\n- Async/await throughout (asyncpg for DB, httpx for HTTP)\n- Repository pattern for data access\n- Dependency injection via FastAPI's Depends()\n- Pydantic v2 for validation and serialization\n\n### Async Processing Flow\n\n1. **Upload**: File is saved to disk, metadata stored in PostgreSQL, status set to `uploaded`\n2. 
**Processing triggered**: A Celery task is dispatched to the `documents` queue\n3. **Text extraction**: Worker extracts text based on file type (stage: `extracting`)\n4. **Analysis**: Worker analyzes content structure, extracts summary (stage: `analyzing`)\n5. **Categorization**: Worker categorizes document and extracts keywords (stage: `categorizing`)\n6. **Ready for review**: Status set to `processed`, awaiting human review\n7. **Review**: Human approves, rejects, or requests reprocessing\n8. **Finalization**: Approved documents are marked `finalized`\n9. **Export**: Finalized documents can be exported as JSON or CSV\n\n### Progress Tracking\n\nProgress updates flow through Redis Pub/Sub:\n- Celery worker publishes progress events to a Redis channel per document\n- Backend SSE endpoint subscribes to the channel and streams events to the client\n- Frontend EventSource API receives and displays real-time updates\n\n### File Storage\n\n- Files are stored on a Docker volume (`upload_data`) mounted at `/app/uploads`\n- Each file is stored with a UUID-based path to avoid collisions\n- For production at scale, the volume can be replaced with cloud storage (S3, GCS)\n\n---\n\n## Development\n\n### Running in Development Mode\n\nDevelopment mode mounts source code for hot reload and enables debug logging:\n\n```bash\n# Start with dev overrides\nmake dev\n\n# Or manually:\ndocker compose -f docker-compose.yml -f docker-compose.dev.yml up --build\n```\n\n### Code Structure\n\n```\n.\n+-- docker-compose.yml           # Production compose\n+-- docker-compose.dev.yml       # Development overrides\n+-- .env.example                 # Environment template\n+-- init-db.sql                  # Database initialization\n+-- Makefile                     # Build and management targets\n+-- nginx/\n|   +-- nginx.conf               # Main Nginx configuration\n|   +-- conf.d/default.conf      # Virtual host configuration\n|   +-- ssl/                     # SSL certificates 
(gitignored)\n+-- backend/                     # FastAPI application\n|   +-- Dockerfile\n|   +-- app/\n|   +-- tests/\n|   +-- requirements.txt\n+-- frontend/                    # Next.js application\n|   +-- Dockerfile\n|   +-- src/\n|   +-- package.json\n+-- scripts/\n|   +-- deploy.sh                # Deployment automation\n|   +-- backup.sh                # Database backup\n+-- sample-files/                # Test documents\n+-- sample-outputs/              # Expected output examples\n```\n\n### Useful Make Targets\n\n```bash\nmake help            # Show all available targets\nmake dev             # Start in dev mode with hot reload\nmake logs-backend    # Stream backend logs\nmake logs-worker     # Stream worker logs\nmake shell-backend   # Open bash in backend container\nmake shell-db        # Open psql in database container\nmake test            # Run backend tests\nmake test-frontend   # Run frontend tests\nmake migrate         # Run database migrations\nmake backup-db       # Backup the database\n```\n\n---\n\n## Deployment\n\n### Production Deployment (Ubuntu Server)\n\nAn automated deployment script is provided:\n\n```bash\n# On a fresh Ubuntu 22.04+ server:\nsudo bash scripts/deploy.sh --domain your-domain.com --email you@email.com\n```\n\nThe script will:\n1. Install Docker and Docker Compose\n2. Configure UFW firewall (ports 22, 80, 443)\n3. Set up fail2ban for brute-force protection\n4. Generate secure random passwords\n5. Obtain SSL certificates via Let's Encrypt\n6. Build and start all services\n7. Verify deployment health\n\n### Manual Production Setup\n\n```bash\n# 1. Clone and configure\ngit clone \u003crepo-url\u003e /opt/docprocess\ncd /opt/docprocess\ncp .env.example .env\n\n# 2. Generate secure values for .env\n#    - SECRET_KEY: openssl rand -hex 32\n#    - POSTGRES_PASSWORD: openssl rand -base64 24\n#    - REDIS_PASSWORD: openssl rand -base64 24\n#    - Set DOMAIN to your actual domain\n#    - Set DEBUG=false\n\n# 3. 
Set up SSL certificates\nmkdir -p nginx/ssl\n# Copy fullchain.pem and privkey.pem to nginx/ssl/\n# Uncomment the HTTPS server block in nginx/conf.d/default.conf\n\n# 4. Build and start\ndocker compose build --no-cache\ndocker compose up -d\n\n# 5. Run migrations\ndocker compose exec backend alembic upgrade head\n\n# 6. Verify\nmake health\n```\n\n### SSL/TLS with Let's Encrypt\n\n```bash\n# Install certbot\nsudo apt install certbot\n\n# Obtain certificate (stop nginx first)\ndocker compose stop nginx\nsudo certbot certonly --standalone -d your-domain.com\n\n# Copy certificates\ncp /etc/letsencrypt/live/your-domain.com/fullchain.pem nginx/ssl/\ncp /etc/letsencrypt/live/your-domain.com/privkey.pem nginx/ssl/\n\n# Uncomment HTTPS server block in nginx/conf.d/default.conf\n# Restart nginx\ndocker compose up -d nginx\n```\n\n### Security Checklist\n\n- [ ] Change all default passwords in `.env`\n- [ ] Set `DEBUG=false`\n- [ ] Generate a strong `SECRET_KEY` (min 32 chars)\n- [ ] Configure SSL/TLS\n- [ ] Set `DOMAIN` to actual domain name\n- [ ] Change Flower credentials\n- [ ] Configure firewall (UFW) to only allow ports 22, 80, 443\n- [ ] Enable fail2ban\n- [ ] Restrict direct database port access (remove `5432:5432` from compose)\n- [ ] Restrict direct Redis port access (remove `6379:6379` from compose)\n- [ ] Set up automated database backups\n- [ ] Review and tighten Content-Security-Policy header\n\n---\n\n## Testing\n\n### Backend Tests\n\n```bash\n# Run all backend tests\nmake test\n\n# Run with coverage\nmake test-cov\n\n# Run specific test file\ndocker compose exec backend pytest tests/test_documents.py -v\n\n# Run specific test\ndocker compose exec backend pytest tests/test_documents.py::test_upload_document -v\n```\n\n### Frontend Tests\n\n```bash\nmake test-frontend\n\n# Or directly:\ndocker compose exec frontend npm test\n```\n\n### Manual API Testing\n\nSample files are provided in `sample-files/` for testing uploads:\n\n```bash\n# Upload a text file\ncurl 
-X POST http://localhost/api/v1/documents/upload \\\n  -F \"file=@sample-files/sample-report.txt\"\n\n# Upload a CSV\ncurl -X POST http://localhost/api/v1/documents/upload \\\n  -F \"file=@sample-files/sample-invoice.csv\"\n\n# Upload JSON\ncurl -X POST http://localhost/api/v1/documents/upload \\\n  -F \"file=@sample-files/sample-technical-doc.json\"\n\n# List documents\ncurl http://localhost/api/v1/documents/\n\n# Check processing status\ncurl http://localhost/api/v1/documents/\u003cdocument-id\u003e/status\n```\n\n---\n\n## Assumptions and Tradeoffs\n\n| Decision | Rationale |\n|----------|-----------|\n| **Local file storage** | Simplifies initial setup. Can be swapped for S3/GCS via a storage abstraction layer. |\n| **Redis for both broker and cache** | Reduces infrastructure complexity. For high-throughput production, consider separate Redis instances or RabbitMQ for the broker. |\n| **Synchronous Celery DB access** | Celery workers use psycopg2 (sync) instead of asyncpg because Celery tasks are synchronous by nature. |\n| **Single Nginx instance** | Adequate for moderate traffic. For high availability, use a cloud load balancer in front. |\n| **JWT authentication** | Stateless auth suitable for API-first architecture. Tokens should be short-lived with refresh rotation. |\n| **PostgreSQL for all data** | Single database simplifies operations. Document metadata, user data, and processing results all in one place. |\n| **Human-in-the-loop review** | Ensures quality control before finalization. Can be made optional per document type. |\n| **Docker Compose (not Kubernetes)** | Suitable for single-server deployments. Migrate to Kubernetes for multi-node horizontal scaling. |\n\n---\n\n## Limitations\n\n- **No horizontal scaling of Celery workers** in the current Compose setup (single worker container). Scale by increasing `--concurrency` or adding worker replicas.\n- **File storage is local** to the Docker volume. 
Not suitable for multi-node deployments without shared storage.\n- **No built-in user management UI** -- user registration and management is API-only.\n- **SSE connections are long-lived** and consume a server connection each. Under heavy load, consider WebSocket or polling as alternatives.\n- **No document versioning** -- reprocessing overwrites previous results.\n- **Single-region deployment** -- no built-in multi-region or DR support.\n- **No rate limiting per user** -- rate limiting is IP-based via Nginx.\n\n---\n\n## Sample Files\n\nThe `sample-files/` directory contains test documents for exercising the upload and processing pipeline:\n\n| File | Type | Description |\n|------|------|-------------|\n| `sample-report.txt` | Text | Annual technology report with findings and recommendations |\n| `sample-invoice.csv` | CSV | Invoice data with line items and totals |\n| `sample-memo.md` | Markdown | Engineering priorities memo with structured sections |\n| `sample-letter.txt` | Text | Formal business partnership letter |\n| `sample-technical-doc.json` | JSON | Microservices architecture specification |\n\nThe `sample-outputs/` directory contains examples of expected export output:\n\n| File | Description |\n|------|-------------|\n| `sample-export.json` | JSON export of processed document results |\n| `sample-export.csv` | CSV export of processed document results |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdivyamohan1993%2Fasync-document-processing-workflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdivyamohan1993%2Fasync-document-processing-workflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdivyamohan1993%2Fasync-document-processing-workflow/lists"}