{"id":49882801,"url":"https://github.com/smartcraze/evaluation-engine","last_synced_at":"2026-05-15T16:06:29.551Z","repository":{"id":352815535,"uuid":"1206084992","full_name":"smartcraze/evaluation-engine","owner":"smartcraze","description":"AI-Based Exam Paper Evaluation Engine (End-to-End)","archived":false,"fork":false,"pushed_at":"2026-04-21T07:52:34.000Z","size":75,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-21T09:41:36.727Z","etag":null,"topics":["ai","datalab","fastapi","ocr","python","uv"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/smartcraze.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-04-09T15:02:54.000Z","updated_at":"2026-04-21T07:53:02.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/smartcraze/evaluation-engine","commit_stats":null,"previous_names":["smartcraze/evaluation-engine"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/smartcraze/evaluation-engine","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smartcraze%2Fevaluation-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smartcraze%2Fevaluation-engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smartcraze%2Fevaluation-engine/releases","m
anifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smartcraze%2Fevaluation-engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/smartcraze","download_url":"https://codeload.github.com/smartcraze/evaluation-engine/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smartcraze%2Fevaluation-engine/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33071615,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-15T11:35:32.926Z","status":"ssl_error","status_checked_at":"2026-05-15T11:35:31.362Z","response_time":103,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","datalab","fastapi","ocr","python","uv"],"created_at":"2026-05-15T16:06:27.182Z","updated_at":"2026-05-15T16:06:29.545Z","avatar_url":"https://github.com/smartcraze.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Evaluation Engine\n\nProduction-oriented FastAPI service that converts exam documents to markdown (OCR), evaluates student answers with an LLM, and stores full job lifecycle data in PostgreSQL.\n\nThis project demonstrates practical full-stack engineering skills recruiters look for: API design, external service integration, schema-backed persistence, frontend integration, and 
environment-driven deployment workflows.\n\n## Why This Project Matters\n\n- Solves a real workflow: document ingestion -\u003e OCR extraction -\u003e AI evaluation -\u003e persisted results.\n- Uses modern backend patterns: FastAPI, async SQLAlchemy, typed service layers, and explicit error handling.\n- Built for extension: current architecture supports migration to queue-based workers without breaking API contracts.\n\n## Core Features\n\n- File conversion to markdown via Datalab SDK (`/convert/localhost`) and webhook mode (`/convert/webhook`).\n- Secure Datalab webhook receiver with shared-secret validation (`/webhook/datalab`).\n- LLM-based exam evaluation with strict JSON contract and deterministic score normalization (`/evaluate/{request_id}`).\n- Persistent result lookup API (`/result/{request_id}`).\n- OCR quality metrics endpoint with heuristic scoring (`/metrics/markdown`).\n- Static frontend served from FastAPI for manual testing and demos (`/`).\n\n## API Docs\n\n\u003cimg width=\"2852\" height=\"1610\" alt=\"image\" src=\"https://github.com/user-attachments/assets/037edae1-808b-46b1-b759-72b47fde560d\" /\u003e\n\n## System Architecture\n\n1. Client uploads an exam file.\n2. Service sends the file to Datalab for OCR/markdown conversion.\n3. Extracted markdown is stored in PostgreSQL and written to `public/extracted`.\n4. Evaluation endpoint sends the extracted text to an OpenRouter-compatible model.\n5. 
Marks, remarks, and keyword coverage are persisted and returned through result APIs.\n\n### High-Level Components\n\n- `main.py`: FastAPI app, route handlers, startup lifecycle, static mount.\n- `services/datalab.py`: OCR conversion and webhook submission integrations.\n- `services/evaluator.py`: LLM prompt contract, JSON parsing, deterministic score derivation.\n- `services/storage.py`: SQLAlchemy model, async DB session, upsert/result/metrics operations.\n- `public/index.html`: lightweight UI for end-to-end manual verification.\n\n## Tech Stack\n\n- Python (async-first backend)\n- FastAPI + Uvicorn\n- SQLAlchemy 2.0 (async) + asyncpg\n- PostgreSQL\n- Datalab OCR API / SDK\n- OpenAI Python SDK (OpenRouter endpoint)\n- uv (dependency and environment management)\n\n## Full-Stack Engineering Lens\n\n- Backend: async FastAPI APIs, webhook handling, and PostgreSQL persistence.\n- AI integration: strict JSON evaluation contract, deterministic post-processing, and robust validation.\n- Frontend: browser UI in `public/index.html` for upload/evaluate/result workflows.\n- Product flow: complete user journey from file ingestion to stored evaluation output.\n\n## Project Structure\n\n```text\nevaluation-engine/\n|- main.py\n|- pyproject.toml\n|- docker-compose.yml\n|- Dockerfile\n|- services/\n|  |- datalab.py\n|  |- evaluator.py\n|  |- storage.py\n|- public/\n|  |- index.html\n|  |- extracted/\n```\n\n## API Surface\n\n### 1) Convert (Local SDK auto-poll)\n\n- `POST /convert`\n- `POST /convert/localhost`\n- Accepts multipart file (`pdf`, `png`, `jpg`, `jpeg`, `webp`)\n- Returns `request_id`, markdown metadata, and public markdown URL\n\n### 2) Convert (Webhook mode)\n\n- `POST /convert/webhook`\n- Submits to Datalab with callback URL derived from `BASE_URL`\n- Returns `request_id` and optional Datalab check URL\n\n### 3) Datalab Webhook Receiver\n\n- `POST /webhook/datalab`\n- Validates `DATALAB_WEBHOOK_SECRET`\n- Upserts extracted markdown and marks job as received\n\n### 
4) Evaluate Extracted Answer\n\n- `POST /evaluate/{request_id}`\n- Body: `max_marks` (int), `model` (string)\n- Returns marks, remarks, matched/missing keywords, model used\n\n### 5) Get Result\n\n- `GET /result/{request_id}`\n- Returns status + evaluation payload for a job\n\n### 6) Markdown Quality Metrics\n\n- `GET /metrics/markdown`\n- Aggregates OCR quality heuristics across processed jobs\n\n## Database Model\n\n`evaluation_jobs` table fields include:\n\n- `request_id` (PK)\n- `status`\n- `mode`\n- `request_check_url`\n- `extracted_text`\n- `marks`\n- `remarks`\n- `matched_keywords` (JSON)\n- `missing_keywords` (JSON)\n- `model_name`\n- `payload` (JSON)\n- `created_at`, `updated_at`\n\n## Clone and Setup (Quick Start)\n\n### 1) Clone the repository\n\n```bash\ngit clone https://github.com/\u003cyour-org-or-username\u003e/evaluation-engine.git\ncd evaluation-engine\n```\n\n### 2) Install uv (if needed)\n\n```bash\npip install uv\n```\n\n### 3) Install project dependencies\n\n```bash\nuv sync\n```\n\n### 4) Create `.env` file\n\n```env\n# App\nAPP_NAME=evaluation-engine\nBASE_URL=http://localhost:8000\n\n# Database\nDATABASE_URL=postgresql+asyncpg://myuser:mypassword@localhost:5432/mydb\n\n# OCR\nDATALAB_API_KEY=your_datalab_api_key\nDATALAB_WEBHOOK_SECRET=your_webhook_secret\n\n# LLM (OpenRouter)\nOPEN_ROUTER_API_KEY=your_openrouter_api_key\n```\n\n### 5) Start PostgreSQL\n\nUsing Docker Compose:\n\n```bash\ndocker compose up -d db\n```\n\nOr run your own local PostgreSQL instance and point `DATABASE_URL` to it.\n\n### 6) Run the API\n\n```bash\nuv run uvicorn main:app --host 0.0.0.0 --port 8000 --reload\n```\n\n### 7) Verify the app\n\n- Open `http://localhost:8000/` for the web UI.\n- Use `http://localhost:8000/docs` for interactive API docs.\n\n## Local Setup (Detailed)\n\n### Prerequisites\n\n- Python 3.14+ (as declared in `pyproject.toml`)\n- `uv`\n- PostgreSQL running locally (or Docker)\n\n### 1) Install dependencies\n\n```bash\nuv sync\n```\n\n### 2) 
Configure environment variables\n\nCreate a `.env` file in project root:\n\n```env\n# App\nAPP_NAME=evaluation-engine\nBASE_URL=http://localhost:8000\n\n# Database\nDATABASE_URL=postgresql+asyncpg://myuser:mypassword@localhost:5432/mydb\n\n# OCR\nDATALAB_API_KEY=your_datalab_api_key\nDATALAB_WEBHOOK_SECRET=your_webhook_secret\n\n# LLM (OpenRouter)\nOPEN_ROUTER_API_KEY=your_openrouter_api_key\n```\n\n### 3) Start PostgreSQL (optional via Docker)\n\n```bash\ndocker compose up -d db\n```\n\n### 4) Run API\n\n```bash\nuv run uvicorn main:app --host 0.0.0.0 --port 8000 --reload\n```\n\n### 5) Open UI\n\n- `http://localhost:8000/`\n\n## cURL Examples\n\n### Convert (localhost mode)\n\n```bash\ncurl -X POST http://localhost:8000/convert/localhost \\\n\t-F \"file=@sample_exam.pdf\"\n```\n\n### Convert (webhook mode)\n\n```bash\ncurl -X POST http://localhost:8000/convert/webhook \\\n\t-F \"file=@sample_exam.pdf\"\n```\n\n### Evaluate\n\n```bash\ncurl -X POST http://localhost:8000/evaluate/\u003crequest_id\u003e \\\n\t-H \"Content-Type: application/json\" \\\n\t-d '{\"max_marks\": 100, \"model\": \"openai/gpt-oss-120b:free\"}'\n```\n\n### Result\n\n```bash\ncurl http://localhost:8000/result/\u003crequest_id\u003e\n```\n\n## Engineering Decisions\n\n- Async IO for external API calls and DB operations to improve throughput.\n- Job lifecycle persistence first, evaluation second, ensuring auditability.\n- Deterministic score derived from keyword coverage to reduce LLM variance.\n- Structured prompt contract and strict JSON parsing to harden model output handling.\n\n## Reliability and Security Notes\n\n- Webhook secret verification enforced for callback ingestion.\n- Missing or malformed upstream responses return explicit HTTP errors.\n- Upload validation includes empty file checks and content-type handling.\n- Data persists in PostgreSQL for traceability and debugging.\n\n## Recruiter-Focused Highlights\n\n- End-to-end ownership: API design, DB modeling, third-party 
integrations, and UX demo page.\n- AI-in-production mindset: schema validation, deterministic post-processing, and failure handling.\n- Clean separation of concerns through service modules.\n- Ready for scale evolution: straightforward path to background worker queue in next iteration.\n- Full-stack delivery signal: backend services plus browser-accessible workflow for demos and stakeholder reviews.\n\n## Current Limitations\n\n- Database schema is auto-created at startup; migrations are not yet added.\n- No dedicated background worker yet; evaluation is API-triggered.\n- No automated tests committed yet.\n\n## Suggested Next Iteration\n\n1. Add Alembic migrations and environment-specific config.\n2. Introduce worker process with DB locking or queue backend.\n3. Add integration tests for OCR/evaluation/result flow.\n4. Add structured logging, tracing IDs, and health/readiness endpoints.\n\n## License\n\nThis project is provided for educational and portfolio demonstration purposes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsmartcraze%2Fevaluation-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsmartcraze%2Fevaluation-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsmartcraze%2Fevaluation-engine/lists"}