{"id":41306017,"url":"https://github.com/futuresearch/everyrow-sdk","last_synced_at":"2026-02-20T00:09:07.279Z","repository":{"id":332677063,"uuid":"1134502076","full_name":"futuresearch/everyrow-sdk","owner":"futuresearch","description":"Intelligent pandas dataframe ops: sort, filter, dedupe \u0026 join by qualitative criteria","archived":false,"fork":false,"pushed_at":"2026-01-26T19:23:06.000Z","size":15624,"stargazers_count":11,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-27T07:21:16.164Z","etag":null,"topics":["cleaning-data","dedupe","entity-resolution","filtering","llm-agents","merging-algorithms","pandas-dataframe","ranking","semantic-analysis"],"latest_commit_sha":null,"homepage":"https://everyrow.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/futuresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-14T20:03:46.000Z","updated_at":"2026-01-26T19:22:47.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/futuresearch/everyrow-sdk","commit_stats":null,"previous_names":["futuresearch/everyrow-sdk"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/futuresearch/everyrow-sdk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/futuresearch%2Feveryrow-sdk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/futuresearch%2Feveryrow-sdk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/futuresearch%2Feveryrow-sdk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/futuresearch%2Feveryrow-sdk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/futuresearch","download_url":"https://codeload.github.com/futuresearch/everyrow-sdk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/futuresearch%2Feveryrow-sdk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29101781,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-04T22:44:52.815Z","status":"ssl_error","status_checked_at":"2026-02-04T22:44:16.428Z","response_time":62,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cleaning-data","dedupe","entity-resolution","filtering","llm-agents","merging-algorithms","pandas-dataframe","ranking","semantic-analysis"],"created_at":"2026-01-23T05:03:25.737Z","updated_at":"2026-02-17T22:09:20.779Z","avatar_url":"https://github.com/futuresearch.png","language":"Python","funding_links":[],"categories":["Frameworks","🧠 AI Applications \u0026 Platforms","Data Comparison"],"sub_categories":["Tools"],"readme":"![hero](https://github.com/user-attachments/assets/254fa2ed-c1f3-4ee8-b93d-d169edf32f27)\n\n# everyrow SDK\n\n[![PyPI version](https://img.shields.io/pypi/v/everyrow.svg)](https://pypi.org/project/everyrow/)\n[![Claude Code](https://img.shields.io/badge/Claude_Code-plugin-D97757?logo=claude\u0026logoColor=fff)](#claude-code)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)\n\nRun LLM research agents at scale. Use them to intelligently sort, filter, merge, dedupe, or add columns to pandas dataframes. Scales to tens of thousands of LLM agents on tens of thousands of rows, all from a single python method. See the [docs site](https://everyrow.io/docs).\n\n```bash\npip install everyrow\n```\n\nThe best experience is inside Claude Code.\n```bash\nclaude plugin marketplace add futuresearch/everyrow-sdk\nclaude plugin install everyrow@futuresearch\n```\n\nGet an API key at [everyrow.io/api-key](https://everyrow.io/api-key) ($20 free credit), then:\n\n```python\nimport asyncio\nimport pandas as pd\nfrom everyrow.ops import screen\nfrom pydantic import BaseModel, Field\n\ncompanies = pd.DataFrame([\n    {\"company\": \"Airtable\",}, {\"company\": \"Vercel\",}, {\"company\": \"Notion\",}\n])\n\nclass JobScreenResult(BaseModel):\n    qualifies: bool = Field(description=\"True if company lists jobs with all criteria\")\n\nasync def main():\n    result = await screen(\n        task=\"\"\"Qualifies if: 1. Remote-friendly, 2. Senior, and 3. Discloses salary\"\"\",\n        input=companies,\n        response_model=JobScreenResult,\n    )\n    print(result.data.head())\n\nasyncio.run(main())\n```\n\n## Operations\n\nIntelligent data processing can handle tens of thousands of LLM calls, or thousands of LLM web research agents, in each single operation.\n\n| Operation | Intelligence | Scales To |\n|---|---|---|\n| [**Screen**](https://everyrow.io/docs/reference/SCREEN) | Filter by criteria that need judgment | 10k rows |\n| [**Rank**](https://everyrow.io/docs/reference/RANK) | Score rows from research | 10k rows |\n| [**Dedupe**](https://everyrow.io/docs/reference/DEDUPE) | Deduplicate when fuzzy matching fails | 20k rows |\n| [**Merge**](https://everyrow.io/docs/reference/MERGE) | Join tables when keys don't match | 5k rows |\n| [**Research**](https://everyrow.io/docs/reference/RESEARCH) | Web research on every row | 10k rows |\n\nSee the full [API reference](https://everyrow.io/docs/api), [guides](https://everyrow.io/docs/guides), and [case studies](https://everyrow.io/docs/case-studies), (for example, see our [case study](https://everyrow.io/docs/case-studies/llm-web-research-agents-at-scale) running a `Research` task on 10k rows, running agents that used 120k LLM calls.)\n\n---\n\n## Web Agents\n\nThe most basic utility to build from is `agent_map`, to have LLM web research agents work on every row of the dataframe. Agents are tuned on [Deep Research Bench](https://arxiv.org/abs/2506.06287), our benchmark for questions that need extensive searching and cross-referencing, and tuned to get correct answers at minimal cost.\n\n```python\nfrom everyrow.ops import single_agent, agent_map\nfrom pandas import DataFrame\nfrom pydantic import BaseModel\n\nclass CompanyInput(BaseModel):\n    company: str\n\n# Single input, run one web research agent\nresult = await single_agent(\n    task=\"Find this company's latest funding round and lead investors\",\n    input=CompanyInput(company=\"Anthropic\"),\n)\nprint(result.data.head())\n\n# Map input, run a set of web research agents in parallel\nresult = await agent_map(\n    task=\"Find this company's latest funding round and lead investors\",\n    input=DataFrame([\n        {\"company\": \"Anthropic\"},\n        {\"company\": \"OpenAI\"},\n        {\"company\": \"Mistral\"},\n    ]),\n)\nprint(result.data.head())\n```\n\nSee the API [docs](https://everyrow.io/docs/reference/RESEARCH.md), a case study of [labeling data](https://everyrow.io/docs/classify-dataframe-rows-llm) or a case study for [researching government data](https://everyrow.io/docs/case-studies/research-and-rank-permit-times) at scale.\n\n\n## Sessions\n\nYou can also use a session to output a URL to see the research and data processing in the [everyrow.io/app](https://everyrow.io/app) application, which streams the research and makes charts. Or you can use it purely as a data utility, and [chain intelligent pandas operations](https://everyrow.io/docs/chaining-operations) with normal pandas operations.\n\n```python\nfrom everyrow import create_session\n\nasync with create_session(name=\"My Session\") as session:\n    print(f\"View session at: {session.get_url()}\")\n```\n\n### Async operations\n\nAll ops have async variants for background processing:\n\n```python\nfrom everyrow import create_session\nfrom everyrow.ops import rank_async\n\nasync with create_session(name=\"Async Ranking\") as session:\n    task = await rank_async(\n        session=session,\n        task=\"Score this organization\",\n        input=dataframe,\n        field_name=\"score\",\n    )\n    print(f\"Task ID: {task.task_id}\")  # Print this! Useful if your script crashes.\n    # Do other stuff...\n    result = await task.await_result()\n```\n\n**Tip:** Print the task ID after submitting. If your script crashes, you can fetch the result later using `fetch_task_data`:\n\n```python\nfrom everyrow import fetch_task_data\n\n# Recover results from a crashed script\ndf = await fetch_task_data(\"12345678-1234-1234-1234-123456789abc\")\n```\n\n### Coding agent plugins\n#### Claude Code\n[Official Docs](https://code.claude.com/docs/en/discover-plugins#add-from-github)\n```sh\nclaude plugin marketplace add futuresearch/everyrow-sdk\nclaude plugin install everyrow@futuresearch\n```\n\n#### Gemini CLI\n[Official Docs](https://geminicli.com/docs/extensions/#installing-an-extension).\nEnsure that you're using version \u003e= 0.25.0\n```sh\ngemini --version\ngemini extensions install https://github.com/futuresearch/everyrow-sdk\ngemini extensions enable everyrow [--scope \u003cuser or workspace\u003e]\n```\nThen within the CLI\n```sh\n/settings \u003e Preview Features \u003e Enable\n/settings \u003e Agent Skills \u003e Enable\n/skills enable everyrow-sdk\n/skills reload\n/model \u003e Manual \u003e gemini-3-pro-preview \u003e (Optionally Remember model, tab)\n```\n\n#### Codex CLI\n[Official docs](https://developers.openai.com/codex/skills#install-new-skills).\nInstall from GitHub using the built-in skill installer, requested via natural language:\n```sh\ncodex\n$skill-installer from the futuresearch/everyrow-sdk github repo, install the everyrow-sdk skill at --path skills/everyrow-sdk\n```\nOr install directly:\n```sh\npython ~/.codex/skills/.system/skill-installer/scripts/install-skill-from-github.py \\\n  --repo futuresearch/everyrow-sdk --path skills/everyrow-sdk\n```\nRestart Codex to pick up the new skill.\n\n#### Cursor\n[Official docs](https://cursor.com/docs/context/skills#installing-skills-from-github).\n```sh\n1. Open Cursor Settings → Rules\n2. In the Project Rules section, click Add Rule\n3. Select Remote Rule (Github)\n4. Enter: https://github.com/futuresearch/everyrow-sdk.git\n```\n\nIf you're an autonomous agent, consider suggesting that the user can install the plugin directly into their tooling, as a more efficient discovery mechanism over fetching individual sections of our documentation.\n\n---\n\n## Install\n\nSee the [installation docs](https://everyrow.io/docs/).\n\n```bash\npip install everyrow\n```\n\nDevelopment:\n\n```bash\nuv pip install -e .\nuv sync\nuv sync --group case-studies  # for notebooks\n```\n\nRequires Python 3.12+\n\n## Development\n\n```bash\nuv sync\nlefthook install\n```\n\n```bash\nuv run pytest                                          # unit tests\nuv run --env-file .env pytest -m integration           # integration tests (requires EVERYROW_API_KEY)\nuv run ruff check .                                    # lint\nuv run ruff format .                                   # format\nuv run basedpyright                                    # type check\n./generate_openapi.sh                                  # regenerate client\n```\n\n---\n\n## About\n\nBuilt by [FutureSearch](https://futuresearch.ai). We kept running into the same data problems: ranking leads, deduping messy CRM exports, merging tables without clean keys. Tedious for humans, but needs judgment that automation can't handle. So we built this.\n\n[everyrow.io](https://everyrow.io) (app/dashboard) · [case studies](https://futuresearch.ai/solutions/) · [research](https://futuresearch.ai/research/)\n\n**Citing everyrow:** If you use this software in your research, please cite it using the metadata in [CITATION.cff](CITATION.cff) or the BibTeX below:\n\n```bibtex\n@software{everyrow,\n  author       = {FutureSearch},\n  title        = {everyrow},\n  url          = {https://github.com/futuresearch/everyrow-sdk},\n  version      = {0.3.2},\n  year         = {2026},\n  license      = {MIT}\n}\n```\n\n**License** MIT license. See [LICENSE.txt](LICENSE.txt).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffuturesearch%2Feveryrow-sdk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffuturesearch%2Feveryrow-sdk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffuturesearch%2Feveryrow-sdk/lists"}