{"id":40714877,"url":"https://github.com/psaboia/pad-salience-annotations","last_synced_at":"2026-01-21T13:08:47.700Z","repository":{"id":332141422,"uuid":"1132816915","full_name":"psaboia/pad-salience-annotations","owner":"psaboia","description":"Web-based annotation platform for pharmaceutical drug sample classification. Features multi-role authentication (admin/specialist), study management with specialist assignments, real-time annotation with audio recording, session replay for review, and optional eye-tracking integration.","archived":false,"fork":false,"pushed_at":"2026-01-19T17:42:53.000Z","size":48243,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-01-19T23:29:04.493Z","etag":null,"topics":["annotation-tool","fastapi","pharmaceutical","python","research"],"latest_commit_sha":null,"homepage":null,"language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/psaboia.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-12T13:46:56.000Z","updated_at":"2026-01-19T04:36:33.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/psaboia/pad-salience-annotations","commit_stats":null,"previous_names":["psaboia/pad-salience-annotations"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/psaboia/pad-salience-annotations","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/psaboia%2Fpad-salience-annotations","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psaboia%2Fpad-salience-annotations/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psaboia%2Fpad-salience-annotations/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psaboia%2Fpad-salience-annotations/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/psaboia","download_url":"https://codeload.github.com/psaboia/pad-salience-annotations/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psaboia%2Fpad-salience-annotations/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28633748,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-21T04:47:28.174Z","status":"ssl_error","status_checked_at":"2026-01-21T04:47:22.943Z","response_time":86,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotation-tool","fastapi","pharmaceutical","python","research"],"created_at":"2026-01-21T13:08:47.217Z","updated_at":"2026-01-21T13:08:47.694Z","avatar_url":"https://github.com/psaboia.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PAD Salience Annotations\n\nA system for capturing expert annotations on PAD (Paper 
Analytical Device) card images to build training datasets for AI models.\n\n## What is PAD?\n\n**PAD (Paper Analytical Device)** is a paper-based test card developed by the [Notre Dame PAD Project](https://padproject.nd.edu/) for screening pharmaceutical quality. Each card has 12 lanes (A-L) with different chemical reagents that produce color reactions to identify drugs and detect counterfeits.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"sample_images/amoxicillin_15214_processed.png\" alt=\"PAD Card Example - Amoxicillin\" width=\"300\"\u003e\n  \u003cbr\u003e\n  \u003cem\u003eExample PAD card showing 12 lanes (A-L) with color reactions for Amoxicillin\u003c/em\u003e\n\u003c/p\u003e\n\n## Purpose\n\nThis project builds a structured annotation system where:\n- **Specialists** mark salient regions on PAD card images\n- **Audio explanations** capture expert reasoning\n- **Eye-tracking** captures gaze patterns during annotation\n- **Data** is formatted for training multimodal AI models (fine-tuning, distillation, embeddings)\n\n## Features\n\n### Current\n- **Study Management System** with SQLite database\n  - Admin interface for creating and managing studies\n  - Specialist dashboard with assignment tracking\n  - Randomized sample order per specialist\n  - Progress tracking and statistics\n- **Authentication System** with JWT tokens and bcrypt password hashing\n- Web-based annotation interface with two layout options\n- Rectangle and polygon drawing tools\n- Automatic lane detection (A-L)\n- Continuous audio recording with timestamps\n- **Eye-tracking support** with AprilTag markers for Pupil Labs surface tracking\n- **Unique AprilTag identification** per sample for automatic image correlation with gaze data\n- YAML configuration file for easy customization\n- Export to JSONL format\n- 26 drug samples from FHI2020 project\n\n### Planned\n- Audio transcription integration (OpenAI API)\n- Export pipeline for HuggingFace/Ollama\n- Live gaze overlay from 
eye-tracker\n\n## Quick Start\n\n```bash\n# Clone the repository\ngit clone https://github.com/psaboia/pad-salience-annotations.git\ncd pad-salience-annotations\n\n# Install dependencies (requires uv)\nuv sync\n\n# Create admin user\nuv run python scripts/create_admin.py --email admin@example.com --password yourpassword\n\n# Run the server\nuv run uvicorn app.main:app --reload --port 8765\n\n# Open in browser\n# http://localhost:8765\n```\n\n## System Architecture\n\n### Admin Workflow\n1. Admin logs in at `/login`\n2. Creates study from `/admin/studies`\n3. Selects samples and assigns specialists\n4. Monitors progress from dashboard\n\n### Specialist Workflow\n1. Specialist logs in at `/login`\n2. Views assigned studies at `/specialist`\n3. Starts study (samples randomized)\n4. Annotates each sample sequentially (no skipping/going back)\n5. Progress automatically tracked\n\n## Configuration\n\nSettings are stored in `config.yaml`:\n\n```yaml\n# AprilTag settings\napriltags:\n  size_px: 60          # Tag size in pixels (recommended: 60-80)\n  margin_px: 10        # Margin between tags and PAD image\n  family: \"tag36h11\"\n  ids: [0, 3, 7, 4]    # Default tags (overridden per sample)\n\n# Layout settings\nlayout:\n  sidebar_width_px: 240\n  background_color: \"#1a1a2e\"\n  sidebar_color: \"#16213e\"\n\n# PAD image settings\npad_image:\n  max_height_vh: 85    # Max height as % of viewport\n  border_px: 3\n  border_color: \"#333333\"\n\n# Lane detection\nlanes:\n  start_percent: 0.082\n  end_percent: 0.986\n  labels: [\"A\", \"B\", \"C\", \"D\", \"E\", \"F\", \"G\", \"H\", \"I\", \"J\", \"K\", \"L\"]\n```\n\n## Eye-Tracking Setup\n\nFor Pupil Labs integration, see [Eye-Tracking Integration](docs/eye-tracking-integration.md).\n\n**AprilTag Identification System:**\n- Each sample has 4 unique AprilTags (tag36h11 family, 587 tags available)\n- Minimum distance of 2 between any pair of samples\n- Enables automatic correlation of gaze data with the correct image\n- 
Supports 1000+ unique samples\n- See [AprilTag Identification System](docs/apriltag-identification-system.md) for details\n\n**AprilTag size recommendations:**\n- Minimum detectable: ~32 pixels (white border to white border)\n- Recommended: 60-80 pixels for reliable detection at 50-70cm distance\n- Tags at corners should be larger if detection issues occur at angles\n\n## Documentation\n\n| Document | Description |\n|----------|-------------|\n| [Requirements](docs/requirements.md) | Full system requirements and data architecture |\n| [Study System](docs/study-system.md) | Database schema and study workflow design |\n| [Prototype Specs](docs/prototype-specifications.md) | Current prototype implementation details |\n| [Eye-Tracking Integration](docs/eye-tracking-integration.md) | Pupil Labs setup and AprilTag configuration |\n| [AprilTag Identification](docs/apriltag-identification-system.md) | Unique tag allocation for automatic sample identification |\n| [Feedback Questionnaire](docs/feedback-questionnaire.md) | Questions for users and specialists |\n\n## Project Structure\n\n```\npad-salience-annotations/\n├── app/                           # FastAPI backend\n│   ├── main.py                    # Application entry point\n│   ├── database.py                # SQLite helpers\n│   ├── models/                    # Pydantic models\n│   ├── routers/                   # API endpoints\n│   │   ├── auth.py               # Authentication\n│   │   ├── admin.py              # Admin endpoints\n│   │   └── specialist.py         # Specialist endpoints\n│   └── services/                  # Business logic\n├── frontend/                      # HTML templates\n│   ├── static/                    # CSS and JS\n│   └── templates/                 # Jinja2 templates\n│       ├── login.html\n│       ├── admin/                 # Admin pages\n│       └── specialist/            # Specialist pages\n├── migrations/                    # SQL migrations\n├── scripts/                       # 
Utility scripts\n│   ├── create_admin.py           # Create users\n│   ├── allocate_tags.py          # Allocate unique AprilTags\n│   └── generate_apriltags.py     # Generate tag images\n├── sample_images/\n│   ├── manifest.json             # Image metadata\n│   └── *.png                     # PAD card images\n├── assets/\n│   └── apriltags/                # AprilTag markers (587 tags)\n├── data/\n│   ├── pad_annotations.db        # SQLite database\n│   └── audio/                    # Audio recordings\n├── docs/                         # Documentation\n├── config.yaml                   # Configuration file\n└── pyproject.toml                # Python dependencies\n```\n\n## API Endpoints\n\n### Authentication\n| Endpoint | Method | Description |\n|----------|--------|-------------|\n| `/api/auth/login` | POST | Login with email/password |\n| `/api/auth/logout` | POST | Logout |\n| `/api/auth/me` | GET | Get current user |\n\n### Admin\n| Endpoint | Method | Description |\n|----------|--------|-------------|\n| `/api/admin/studies` | GET/POST | List/Create studies |\n| `/api/admin/studies/{id}` | GET/PUT/DELETE | Get/Update/Delete a study |\n| `/api/admin/studies/{id}/samples` | GET/POST | Manage samples |\n| `/api/admin/studies/{id}/assignments` | GET/POST/DELETE | Manage assignments |\n| `/api/admin/users` | GET/POST | Manage users |\n\n### Specialist\n| Endpoint | Method | Description |\n|----------|--------|-------------|\n| `/api/specialist/studies` | GET | List assigned studies |\n| `/api/specialist/studies/{id}/start` | POST | Start study |\n| `/api/specialist/studies/{id}/current` | GET | Get current sample |\n| `/api/specialist/sessions/{uuid}/complete` | POST | Complete annotation |\n\n## Data Format\n\nAnnotations are saved in a SQLite database with normalized coordinates (0-999) compatible with DeepSeek-OCR-style grounding:\n\n```json\n{\n  \"session_id\": \"session_123\",\n  \"sample\": {\"drug_name\": \"amoxicillin\", \"card_id\": 15214},\n  \"annotations\": [\n    
{\n      \"type\": \"rectangle\",\n      \"lanes\": [\"D\", \"E\"],\n      \"timestamp_start_ms\": 12500,\n      \"timestamp_end_ms\": 15800,\n      \"bbox_normalized\": {\"x1\": 225, \"y1\": 298, \"x2\": 335, \"y2\": 411}\n    }\n  ],\n  \"audio\": {\"filename\": \"session_123.webm\", \"duration_ms\": 45000}\n}\n```\n\n## Dependencies\n\n- Python 3.12+\n- [uv](https://github.com/astral-sh/uv) - Package manager\n- [FastAPI](https://fastapi.tiangolo.com/) - Web framework\n- [aiosqlite](https://aiosqlite.omnilib.dev/) - Async SQLite\n- [python-jose](https://python-jose.readthedocs.io/) - JWT tokens\n- [passlib](https://passlib.readthedocs.io/) - Password hashing\n- [pad-analytics](https://github.com/PaperAnalyticalDeviceND/pad-analytics) - PAD database API\n- [Pillow](https://pillow.readthedocs.io/) - Image processing\n- [PyYAML](https://pyyaml.org/) - Configuration file parsing\n\n## Contributing\n\nWe welcome feedback! Please:\n- Open an [issue](https://github.com/psaboia/pad-salience-annotations/issues) for bugs or suggestions\n- Review the [feedback questionnaire](docs/feedback-questionnaire.md) and share your thoughts\n\n## License\n\nTBD\n\n## Acknowledgments\n\n- [Notre Dame PAD Project](https://padproject.nd.edu/) for PAD technology and data\n- [pad-analytics](https://github.com/PaperAnalyticalDeviceND/pad-analytics) package for API access\n- [Pupil Labs](https://pupil-labs.com/) for eye-tracking technology\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsaboia%2Fpad-salience-annotations","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpsaboia%2Fpad-salience-annotations","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsaboia%2Fpad-salience-annotations/lists"}