{"id":13992629,"url":"https://github.com/davidjurgens/potato","last_synced_at":"2026-04-02T13:31:51.773Z","repository":{"id":40588058,"uuid":"258270256","full_name":"davidjurgens/potato","owner":"davidjurgens","description":"potato: the portable annotation tool","archived":false,"fork":false,"pushed_at":"2026-03-26T16:13:53.000Z","size":63758,"stargazers_count":374,"open_issues_count":9,"forks_count":69,"subscribers_count":8,"default_branch":"master","last_synced_at":"2026-03-26T19:01:00.649Z","etag":null,"topics":["agentic-ai","agentic-workflow","agents","annotation","annotation-tool","audio","data-labeling","image","labeling-tool","nlp","speech","vision"],"latest_commit_sha":null,"homepage":"https://www.potatoannotator.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/davidjurgens.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-04-23T16:49:28.000Z","updated_at":"2026-03-26T16:14:32.000Z","dependencies_parsed_at":"2024-12-07T00:02:11.439Z","dependency_job_id":"6a6278bf-191b-46e8-a471-872b076746c7","html_url":"https://github.com/davidjurgens/potato","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/davidjurgens/potato","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidjurgens%2Fpotato","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidjurgens%2Fpotato/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidjurgens%2Fpotato/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidjurgens%2Fpotato/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/davidjurgens","download_url":"https://codeload.github.com/davidjurgens/potato/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidjurgens%2Fpotato/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31307134,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T12:59:32.332Z","status":"ssl_error","status_checked_at":"2026-04-02T12:54:48.875Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","agentic-workflow","agents","annotation","annotation-tool","audio","data-labeling","image","labeling-tool","nlp","speech","vision"],"created_at":"2024-08-09T14:02:04.207Z","updated_at":"2026-04-02T13:31:51.767Z","avatar_url":"https://github.com/davidjurgens.png","language":"Python","funding_links":[],"categories":["Jupyter Notebook","Codebases","Data Management","Libraries","Python"],"sub_categories":["2020 and before","Annotation Tools"],"readme":"# Potato: The Portable Annotation Tool\n\n[![Documentation](https://img.shields.io/badge/docs-readthedocs-blue)](https://potatoannotator.readthedocs.io/)\n[![PyPI](https://img.shields.io/pypi/v/potato-annotation)](https://pypi.org/project/potato-annotation/)\n[![License](https://img.shields.io/badge/license-Polyform%20Shield-green)](LICENSE)\n[![Paper](https://img.shields.io/badge/paper-EMNLP%202022-orange)](https://aclanthology.org/2022.emnlp-demos.33/)\n[![Live Demo](https://img.shields.io/badge/demo-HuggingFace%20Spaces-yellow)](https://huggingface.co/spaces/Blablablab/potato)\n\n**Potato** is a free, self-hosted annotation platform for NLP, Agentic, and GenAI research. Annotate text, audio, video, images, documents, agent traces, and more — configured entirely through YAML. No coding required.\n\n**[Try the live demo on HuggingFace Spaces](https://huggingface.co/spaces/Blablablab/potato)** — no installation needed.\n\n---\n\n## Quick Start\n\n```bash\npip install potato-annotation\n\n# List available templates\npotato list all\n\n# Get a template and start annotating\npotato get sentiment_analysis\npotato start sentiment_analysis\n```\n\nOr run from source:\n\n```bash\ngit clone https://github.com/davidjurgens/potato.git\ncd potato \u0026\u0026 pip install -r requirements.txt\npython potato/flask_server.py start examples/classification/single-choice/config.yaml -p 8000\n```\n\nOpen [http://localhost:8000](http://localhost:8000) and start annotating.\n\n---\n\n## What Can You Annotate?\n\nPotato handles the full spectrum of annotation tasks — from traditional NLP labeling to evaluating the latest AI agent systems.\n\n### Data Types\n\n| Modality | Capabilities |\n|----------|-------------|\n| **Text** | Classification, span labeling, entity linking, coreference, pairwise comparison ([docs](docs/schemas_and_templates.md)) |\n| **Agent Traces** | Step-by-step evaluation of LLM agents, tool calls, ReAct chains, and multi-agent systems ([docs](docs/agent_traces.md)) |\n| **Web Agents** | Screenshot-based review with SVG click/scroll overlays, or live browsing with automatic trace recording ([docs](docs/web_agent_annotation.md)) |\n| **RAG Pipelines** | Retrieval relevance, answer faithfulness, citation accuracy, hallucination detection |\n| **Audio** | Waveform visualization, segment labeling, ELAN-style tiered annotation ([docs](docs/audio_annotation.md)) |\n| **Video** | Frame-by-frame labeling, temporal segments, playback sync ([docs](docs/video_annotation.md)) |\n| **Images** | Bounding boxes, polygons, landmarks, classification ([docs](docs/image_annotation.md)) |\n| **Dialogue** | Turn-level annotation, conversation trees, interactive chat evaluation |\n| **Documents** | PDF, Word, Markdown, code, and spreadsheets with coordinate mapping ([docs](docs/format_support.md)) |\n\n### Annotation Schemes\n\n| Scheme | Use Case |\n|--------|----------|\n| Radio / Checkbox / Likert | Classification, multi-label, rating scales |\n| Span annotation | NER, highlighting, hallucination marking |\n| Pairwise comparison | A/B testing, best-worst scaling |\n| Per-step ratings | Evaluate individual agent actions or dialogue turns |\n| Free text | Open-ended responses with validation |\n| Triage | Rapid accept/reject/skip curation ([docs](docs/triage.md)) |\n| Conditional logic | Adaptive forms that respond to prior answers ([docs](docs/conditional_logic.md)) |\n\n---\n\n## Agent \u0026 LLM Evaluation\n\nPotato provides purpose-built tooling for evaluating AI agents at every level of granularity.\n\n### Trace Formats\n\nImport traces from any major agent framework with the built-in converter:\n\n```bash\npython -m potato.trace_converter --input traces.json --input-format openai --output data.jsonl\n```\n\nSupported formats: **OpenAI**, **Anthropic/Claude**, **ReAct**, **LangChain**, **LangFuse**, **WebArena**, **SWE-bench**, **OpenTelemetry**, **CrewAI/AutoGen/LangGraph**, **MCP**, and more. Auto-detection is available with `--auto-detect`.\n\n### Evaluation Levels\n\n| Level | What You Annotate | Example |\n|-------|------------------|---------|\n| **Trajectory** | Overall task success, efficiency, safety | \"Did the agent complete the task?\" |\n| **Step** | Individual action correctness, reasoning quality | Per-turn Likert ratings on each agent step |\n| **Span** | Specific text segments within agent output | Highlight hallucinated claims, factual errors |\n| **Comparison** | Side-by-side A/B agent evaluation | \"Which agent performed better?\" |\n\n### Web Agent Viewer\n\nAn interactive viewer for GUI agent traces — navigate step-by-step through screenshots with SVG overlays showing clicks, bounding boxes, mouse paths, and scroll actions. Annotators rate each step with inline controls while a filmstrip bar provides quick navigation.\n\n### Ready-to-Use Agent Examples\n\n| Example | What It Evaluates |\n|---------|-------------------|\n| [agent-trace-evaluation](examples/agent-traces/agent-trace-evaluation/) | Text agent traces with MAST error taxonomy + hallucination spans |\n| [visual-agent-evaluation](examples/agent-traces/visual-agent-evaluation/) | GUI agents with screenshot grounding accuracy |\n| [agent-comparison](examples/agent-traces/agent-comparison/) | Side-by-side A/B agent comparison |\n| [rag-evaluation](examples/agent-traces/rag-evaluation/) | RAG retrieval relevance and citation accuracy |\n| [openai-evaluation](examples/agent-traces/openai-evaluation/) | OpenAI Chat API traces with tool calls |\n| [anthropic-evaluation](examples/agent-traces/anthropic-evaluation/) | Claude messages with tool_use blocks |\n| [swebench-evaluation](examples/agent-traces/swebench-evaluation/) | Coding agents with patch correctness ratings |\n| [multi-agent-evaluation](examples/agent-traces/multi-agent-evaluation/) | Multi-agent coordination (CrewAI, AutoGen, LangGraph) |\n| [web-agent-review](examples/agent-traces/web-agent-review/) | Pre-recorded web traces with step-by-step overlay viewer |\n| [web-agent-creation](examples/agent-traces/web-agent-creation/) | Live web browsing with automatic trace recording |\n\n---\n\n## AI-Powered Annotation\n\n### LLM Label Suggestions\n\nIntegrate any LLM provider to pre-annotate instances and suggest labels. Annotators review and correct — dramatically faster than labeling from scratch.\n\nSupported backends: **OpenAI**, **Anthropic**, **Ollama**, **vLLM**, **Gemini**, **HuggingFace**, **OpenRouter**\n\n### Active Learning\n\nPotato reorders your annotation queue based on model uncertainty so annotators label the most informative instances first. Supports uncertainty sampling, BADGE, BALD, diversity, and hybrid strategies ([docs](docs/active_learning_guide.md)).\n\n### Solo Mode\n\nA human-LLM collaborative workflow where the system learns from annotator feedback and progressively transitions to autonomous LLM labeling as agreement improves ([docs](docs/solo_mode.md)).\n\n### Chat Assistant\n\nAn LLM-powered sidebar where annotators can ask questions about difficult instances. The AI provides guidance informed by your task description and annotation guidelines — helping annotators think through decisions without auto-labeling ([docs](docs/chat_support.md)).\n\n---\n\n## Quality Control \u0026 Workflows\n\n### Quality Assurance\n\n| Feature | Description |\n|---------|-------------|\n| Attention checks | Automatically inserted known-answer items to verify engagement |\n| Gold standards | Track annotator accuracy against expert labels |\n| Inter-annotator agreement | Built-in Krippendorff's alpha and Cohen's kappa |\n| Training phase | Practice annotations with feedback before the real task |\n| Behavioral tracking | Timing, click patterns, and annotation change history |\n\n### Annotation Workflows\n\n| Workflow | Description |\n|----------|-------------|\n| **Multi-annotator** | Multiple annotators per item with overlap control and agreement metrics |\n| **Adjudication** | Expert review of annotator disagreements to produce gold labels ([docs](docs/admin_dashboard.md)) |\n| **Solo mode** | Human-LLM collaboration with progressive automation ([docs](docs/solo_mode.md)) |\n| **Crowdsourcing** | Prolific and MTurk integration with platform-specific auth ([docs](docs/crowdsourcing.md)) |\n| **Triage** | Rapid accept/reject/skip for data curation ([docs](docs/triage.md)) |\n\n---\n\n## Authentication \u0026 Deployment\n\nPotato supports multiple authentication methods, from passwordless quick-start to enterprise SSO:\n\n| Method | Use Case |\n|--------|----------|\n| **In-memory** | Local development, quick studies |\n| **Password + file persistence** | Team annotation with shared credential files ([docs](docs/password_management.md)) |\n| **Database** | Production deployments with SQLite or PostgreSQL ([docs](docs/password_management.md#database-authentication-backend)) |\n| **OAuth / SSO** | Google, GitHub, or institutional OIDC login ([docs](docs/sso_authentication.md)) |\n| **Passwordless** | Low-stakes tasks where ease of access matters ([docs](docs/passwordless_login.md)) |\n\nPasswords are hashed with per-user PBKDF2-SHA256 salts. Admins can reset passwords via CLI (`potato reset-password`) or REST API. Self-service token-based reset is also available.\n\n---\n\n## Example Projects\n\nReady-to-use templates organized by type in [`examples/`](examples/):\n\n| Category | Examples |\n|----------|----------|\n| [Classification](examples/classification/) | Radio, checkbox, Likert, slider, pairwise comparison |\n| [Span](examples/span/) | NER, span linking, coreference, entity linking |\n| [Agent Traces](examples/agent-traces/) | LLM agents, web agents, RAG, multi-agent, code agents |\n| [Audio](examples/audio/) | Waveform annotation, classification, ELAN-style tiered |\n| [Video](examples/video/) | Frame-level labeling, temporal segments |\n| [Image](examples/image/) | Bounding boxes, PDF/document annotation |\n| [Advanced](examples/advanced/) | Solo mode, adjudication, quality control, conditional logic |\n| [AI-Assisted](examples/ai-assisted/) | LLM suggestions, Ollama integration |\n| [Custom Layouts](examples/custom-layouts/) | Content moderation, dialogue QA, medical review |\n\n### Research Showcase\n\nThe **[Potato Showcase](https://github.com/davidjurgens/potato-showcase/)** contains annotation projects from published research — sentiment analysis, dialogue evaluation, summarization, and more.\n\n```bash\npotato list all          # Browse available projects\npotato get \u003cproject\u003e     # Download one\n```\n\n---\n\n## Documentation\n\n| Topic | Link |\n|-------|------|\n| Quick Start | [docs/quick-start.md](docs/quick-start.md) |\n| Configuration Reference | [docs/configuration.md](docs/configuration.md) |\n| Schema Gallery | [docs/schemas_and_templates.md](docs/schemas_and_templates.md) |\n| Agent Trace Evaluation | [docs/agent_traces.md](docs/agent_traces.md) |\n| Web Agent Annotation | [docs/web_agent_annotation.md](docs/web_agent_annotation.md) |\n| AI Support | [docs/ai_support.md](docs/ai_support.md) |\n| Active Learning | [docs/active_learning_guide.md](docs/active_learning_guide.md) |\n| Solo Mode | [docs/solo_mode.md](docs/solo_mode.md) |\n| Quality Control | [docs/quality_control.md](docs/quality_control.md) |\n| Password Management | [docs/password_management.md](docs/password_management.md) |\n| SSO \u0026 OAuth | [docs/sso_authentication.md](docs/sso_authentication.md) |\n| Admin Dashboard | [docs/admin_dashboard.md](docs/admin_dashboard.md) |\n| Crowdsourcing | [docs/crowdsourcing.md](docs/crowdsourcing.md) |\n| Export Formats | [docs/export_formats.md](docs/export_formats.md) |\n| Full Documentation Index | [docs/index.md](docs/index.md) |\n\n---\n\n## Development\n\n```bash\n# Run tests\npytest tests/ -v\n\n# By category\npytest tests/unit/ -v        # Unit tests (fast)\npytest tests/server/ -v      # Integration tests\npytest tests/selenium/ -v    # Browser tests\n\n# With coverage\npytest --cov=potato --cov-report=html\n```\n\n---\n\n## Support\n\n- **Issues**: [GitHub Issues](https://github.com/davidjurgens/potato/issues)\n- **Questions**: jurgens@umich.edu\n- **Docs**: [potatoannotator.readthedocs.io](https://potatoannotator.readthedocs.io/)\n\n---\n\n## License\n\nPotato is licensed under [Polyform Shield](LICENSE). Non-commercial applications can use Potato however they want. Commercial applications can use Potato to annotate all they want, but cannot integrate Potato into a commercial product.\n\n\u003cdetails\u003e\n\u003csummary\u003eLicense FAQ\u003c/summary\u003e\n\n| Use Case | Allowed? |\n|----------|----------|\n| Academic research | Yes |\n| Company annotation | Yes |\n| Fork for personal development | Yes |\n| Integration in open-source pipelines | Yes |\n| Commercial annotation service | Contact us |\n| Competing annotation platform | Contact us |\n\n\u003c/details\u003e\n\n---\n\n## Citation\n\n```bibtex\n@inproceedings{pei2022potato,\n  title={POTATO: The Portable Text Annotation Tool},\n  author={Pei, Jiaxin and Ananthasubramaniam, Aparna and Wang, Xingyao and Zhou, Naitian and Dedeloudis, Apostolos and Sargent, Jackson and Jurgens, David},\n  booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},\n  year={2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidjurgens%2Fpotato","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavidjurgens%2Fpotato","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidjurgens%2Fpotato/lists"}