{"id":50261160,"url":"https://github.com/valuein/valuein","last_synced_at":"2026-05-27T10:04:27.200Z","repository":{"id":349271308,"uuid":"1201696227","full_name":"valuein/valuein","owner":"valuein","description":"We provide documentation and resources for 12M+ SEC filings and 105M+ raw facts. This is a survivor-bias free and Point-in-Time dataset.","archived":false,"fork":false,"pushed_at":"2026-05-20T01:19:17.000Z","size":4988,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-20T04:36:07.539Z","etag":null,"topics":["community","documentation","issues","support"],"latest_commit_sha":null,"homepage":"https://valuein.biz","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/valuein.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":"SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null},"funding":{"custom":["https://valuein.biz/pricing"]}},"created_at":"2026-04-05T03:04:22.000Z","updated_at":"2026-05-20T01:19:22.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/valuein/valuein","commit_stats":null,"previous_names":["valuein/valuein"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/valuein/valuein","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valuein%2Fvaluein","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valuein%2Fvaluein/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valuein%2Fvaluein/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valuein%2Fvaluein/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/valuein","download_url":"https://codeload.github.com/valuein/valuein/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valuein%2Fvaluein/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33560731,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-27T02:00:06.184Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["community","documentation","issues","support"],"created_at":"2026-05-27T10:04:27.130Z","updated_at":"2026-05-27T10:04:27.194Z","avatar_url":"https://github.com/valuein.png","language":"Python","funding_links":["https://valuein.biz/pricing"],"categories":[],"sub_categories":[],"readme":"[![Valuein](https://www.valuein.biz/valuein/twitter-rounded.png)](https://valuein.biz)\n\n[![PyPI version](https://img.shields.io/pypi/v/valuein-sdk?cacheSeconds=300)](https://pypi.org/project/valuein-sdk/)\n[![PyPI downloads](https://img.shields.io/pypi/dm/valuein-sdk?label=pypi%20downloads\u0026cacheSeconds=3600)](https://pypi.org/project/valuein-sdk/)\n[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue)](https://pypi.org/project/valuein-sdk/)\n[![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-green)](LICENSE)\n[![GitHub stars](https://img.shields.io/github/stars/valuein/valuein?style=flat\u0026cacheSeconds=3600)](https://github.com/valuein/valuein/stargazers)\n[![MCP Registry](https://img.shields.io/badge/MCP-registry.modelcontextprotocol.io-blue)](https://registry.modelcontextprotocol.io)\n[![Docs](https://img.shields.io/badge/docs-valuein.biz-purple)](https://valuein.biz/developers/catalog)\n\n# Valuein — SEC EDGAR fundamentals for analysts, quants, and AI agents\n\n\u003e **Survivorship-bias-free, point-in-time US fundamentals — streamed as Parquet, queried with DuckDB or natural language.**\n\nThis repository is the **public home and discovery hub** for the Valuein data platform. It hosts the documentation, examples, notebooks, and the [MCP registry manifest](server.json) used by AI agents to find us. Source code for the SDK, MCP server, and data pipeline lives in dedicated repositories — this is the front door.\n\n```bash\npip install valuein-sdk          # data for code\n# or add this URL to any MCP-capable AI client:\n# https://mcp.valuein.biz/mcp     # data for agents\n```\n\n---\n\n## What's in here\n\n| You want to… | Go to |\n|---|---|\n| Try the SDK in 30 seconds without a token | [Quickstart](#quickstart-30-seconds-no-token) |\n| See every channel we ship through | [Distribution channels](#distribution-channels) |\n| Check pricing and what each plan unlocks | [Plans \u0026 access](#plans--access) |\n| Connect an AI agent (Claude, Cursor, Codex…) | [MCP for AI agents](#mcp-for-ai-agents) |\n| Set up the Workspace by role (analyst, PM, quant, creator) | [`docs/WORKSPACE_GUIDE.md`](docs/WORKSPACE_GUIDE.md) |\n| Read the data model | [Data model](#data-model) |\n| Find a quick recipe by role | [Recipes by role](#recipes-by-role) |\n| Run end-to-end Python examples | [`examples/python/`](examples/python/) |\n| Run interactive notebooks (Colab) | [`examples/notebooks/`](examples/notebooks/) |\n| Read the methodology / SLA / compliance | [Documentation](#documentation) |\n| Report a data error or request a feature | [Support \u0026 community](#support--community) |\n| Contribute an example or notebook | [`CONTRIBUTING.md`](CONTRIBUTING.md) |\n\n---\n\n## The data product\n\nSurvivorship-bias-free, point-in-time US fundamentals sourced directly from SEC EDGAR.\n\n- **12M+ filings** — 10-K, 10-Q, 8-K, 20-F, 40-F, and amendments since **1993**\n- **105M+ standardized facts** across **19,000+** active and delisted US public-company entities\n- **11,966 raw XBRL tags** normalized to **~286 canonical `standard_concept`** values (95%+ coverage)\n- **Cloud Parquet** on Cloudflare R2 — stream with DuckDB; no database setup, no local downloads\n- **PIT-correct** — every fact carries `filing_date` and millisecond-precision `accepted_at`\n- **Semantic core** — every 10-K / 10-Q / 20-F's narrative sections (Risk Factors, MD\u0026A, Business, Legal, Controls) chunked and indexed for natural-language search via the MCP server\n\n### Why it's different\n\n| Property | What it means for you |\n|---|---|\n| 🕒 **Point-in-time** | `filing_date \u003c= trade_date` removes look-ahead bias. `accepted_at` gives intraday resolution for same-day signals. |\n| ⚖️ **Survivorship-bias free** | Delisted, bankrupt, and acquired companies remain in every snapshot — your backtest sees the universe the market saw. |\n| 📊 **Standardized concepts** | Both the raw XBRL tag (`fact.concept`) and the canonical name (`fact.standard_concept`) are on every row. No hidden mapping table. |\n| 🔍 **CPA-verified catalog** | Every `standard_concept` carries a `review_confidence` — `1.0` once an accountant has signed off on its name, statement and rule (then it's locked; the pipeline only ever adds new concepts, never mutates a verified one), `0.7` while provisional. Filter `review_confidence \u003e= 1.0` for the labels analysts, quants and AI models can agree on and train against. |\n| 🚀 **DuckDB-native** | Millisecond analytics over remote Parquet via `httpfs`. Zero database provisioning. |\n| 🔁 **Append-only restatements** | A `10-K/A` adds a new row — the original stays. Reconstruct the as-reported view of any historical date. |\n| 🔐 **One token, every channel** | The same Bearer token authenticates the SDK, MCP server, and bulk-data API. |\n\n---\n\n## Distribution channels\n\nThe same dataset, delivered four ways so it lands where you already work.\n\n| Channel | Audience | Endpoint / install |\n|---|---|---|\n| **Python SDK** | Quants, engineers, data scientists | `pip install valuein-sdk` · [PyPI](https://pypi.org/project/valuein-sdk/) |\n| **MCP server** | AI agents (Claude, Cursor, Codex, custom) | `https://mcp.valuein.biz/mcp` · [server.json](server.json) |\n| **Web dashboard** | Retail, executives, non-technical users | [valuein.biz](https://valuein.biz) |\n| **Bulk data API** | B2B partners, fintech platforms | `https://data.valuein.biz` · [contact us](mailto:sales@valuein.biz) |\n\nA single Stripe-issued token unlocks every channel at your tier — no per-channel billing.\n\n---\n\n## Plans \u0026 access\n\nPricing and feature scope are mirrored from [valuein.biz/pricing](https://valuein.biz/pricing) — the website is the source of truth and our checkout flow routes to the correct Stripe product.\n\n| Plan | Universe | History | Data freshness | Price | Get it |\n|---|---|---|---|---|---|\n| **Sample** | S\u0026P 500 (~500 tickers) | 5-year window | Quarterly snapshots | **Free** · no signup | Just `pip install valuein-sdk` |\n| **Free** | S\u0026P 500 (~500 tickers) | 1993 – present | Daily | **Free** · register | [Register](https://valuein.biz/signup/free) |\n| **Pro** | Full active + delisted US universe (19,000+ entities) — fundamentals dataset only | 15-year rolling (2011 → present) | 24h after SEC | **$49 / mo** · $490 / yr | [Subscribe](https://valuein.biz/checkout?tier=pro\u0026billing=monthly) |\n| **Institutional** | Same universe + **smart-money dataset** (insider transactions on Forms 3/4/5/144 + institutional ownership on Forms 13F/13D/13G) | 1993 – present (unlimited) | 4h priority + filing-event webhooks | **$499 / mo** · $4,790 / yr | [Subscribe](https://valuein.biz/checkout?tier=full\u0026billing=monthly) |\n| **Enterprise** | Negotiated · dedicated infrastructure · expanded redistribution scope | Custom | Real-time 8-K + zero-retention option | Talk to us | [sales@valuein.biz](mailto:sales@valuein.biz) |\n\nEach tier removes a *different* buyer objection — Pro removes the universe + history limits on the fundamentals dataset; Institutional adds the smart-money dataset (insider transactions + institutional ownership), unlimited history back to 1993, filing-event webhooks, and a commercial redistribution license under a business-hours SLA; Enterprise adds dedicated infrastructure and bespoke contracts.\n\n### Pay-per-call (MPP)\n\nAutonomous AI agents that hit a rate or tier limit can pay per request using **Stripe card tokens** — no human checkout loop. Payment uses the [Machine Payment Protocol](https://mpp.dev). The agent quotes a price, charges a card Shared Payment Token, then retries the MCP call with the confirmed token.\n\n**Payment is card-only today.** Fetch `https://api.valuein.biz/api/mpp/well-known` to see which networks are live before paying.\n\n| Category | Examples | Price |\n|---|---|---|\n| Provenance / schema | `describe_schema`, `verify_fact_lineage` | Free |\n| Discovery | `search_companies`, `get_sec_filing_links` | **$0.01 / entity** |\n| Fundamentals | `get_company_fundamentals`, `get_financial_ratios` | **$0.10 / entity** |\n| Analytics | `get_valuation_metrics`, `get_peer_comparables`, `compare_periods`, `get_capital_allocation_profile`, `get_earnings_signals` | **$0.50 / entity** |\n| Compute | `compute_dcf`, `forensic_audit`, `generate_dcf_xlsx`, `generate_research_brief_docx`, `generate_comps_xlsx` | **$2.50 / call** |\n| Screens / universe | `screen_universe`, `get_pit_universe` | **$5.00 / call** |\n| Smart money (Institutional dataset) | `get_insider_transactions`, `get_insider_sentiment`, `get_institutional_holdings`, `get_manager_portfolio`, `get_blockholders`, `get_top_holders`, `get_smart_money_flow` | **$5.00 / entity** |\n\nPAYG is priced at 5× the subscription-equivalent rate — steady-state agent usage is almost always cheaper with a [Pro or Institutional subscription](https://valuein.biz/pricing). Daily spend caps exist per token as abuse protection; caps are raisable on request. See [`AGENTS.md`](AGENTS.md) for the full three-step MPP flow.\n\nRate limits per tier (canonical at `https://data.valuein.biz/v1/plans`):\n\n| Plan | Per minute | Per hour |\n|---|---:|---:|\n| Sample (anonymous) | 15 | 150 |\n| Free | 60 | 1,000 |\n| Pro | 100 | 3,000 |\n| Institutional / Enterprise | 300 | 10,000 |\n\n---\n\n## Quickstart (30 seconds, no token)\n\nPick whichever Python workflow you already use — both work in any virtual environment, and both run the same code below:\n\n```bash\n# Option A — pip (universal, ships with Python)\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\npip install valuein-sdk\n```\n\n```bash\n# Option B — uv (10–100× faster; install from https://docs.astral.sh/uv/)\nuv venv \u0026\u0026 source .venv/bin/activate\nuv pip install valuein-sdk\n```\n\n\u003e **Zero-friction by design.** No `VALUEIN_API_KEY`? No problem. The SDK detects the missing token and falls back to the SAMPLE dataset (S\u0026P 500, last 5 years); the edge gateway does the same — `GET /v1/{sp500,pro,full}/:table` with no `Authorization` header automatically 302-redirects to `/v1/sample/:table`. The snippet below runs as-is.\n\n```python\nfrom valuein_sdk import ValueinClient\n\nwith ValueinClient() as client:\n    print(client.me())               # {plan, status, email, createdAt}\n    print(client.manifest())         # snapshot id, last_updated, tables\n    print(client.tables())           # currently loaded tables\n\n    df = client.run_query(\"\"\"\n        SELECT r.symbol, r.name, r.sector\n        FROM   \"references\" r\n        JOIN   index_membership im ON im.cik = r.cik\n        WHERE  im.index_name = 'SP500'\n          AND  im.removal_date IS NULL\n          AND  r.is_active = TRUE\n        ORDER  BY r.name\n        LIMIT  10\n    \"\"\")\n    print(df)\n```\n\nThat's a real query against the live S\u0026P 500 sample. Add a token only when you need full universe or full history:\n\n```bash\n# optional — sample tier works without a key\necho 'VALUEIN_API_KEY=\"your_token_here\"' \u003e\u003e .env\n```\n\nThe same code now reads from your tier — no other changes.\n\n### Production pattern — context manager, typed errors, pre-built templates\n\n```python\nfrom valuein_sdk import (\n    ValueinClient,\n    ValueinAuthError,\n    ValueinPlanError,\n    ValueinRateLimitError,\n    ValueinAPIError,\n    ValueinError,\n)\n\n# Two-level try/except is intentional:\n#   outer = init errors raised by ValueinClient.__enter__ (auth, manifest, 503)\n#   inner = per-query errors raised by run_query / run_template (rate-limit,\n#           plan denial, bad SQL). Each level dispatches by exception type so\n#           you can act on the right cause — exit on auth, sleep on rate-limit,\n#           upsell on plan, log + skip on a single bad row.\n\ntry:\n    with ValueinClient() as client:\n        try:\n            # 1) Build \u0026 run a raw SQL query → pandas DataFrame\n            sql = \"SELECT COUNT(cik) FROM entity\"\n            result = client.run_query(sql)\n            print(result)\n\n            # 2) Run a named SQL template with kwargs (the SDK quotes safely)\n            df = client.run_template(\n                \"fundamentals_by_ticker\",\n                ticker=\"AAPL\",\n                start_date=\"2020-01-01\",\n                end_date=\"2024-12-31\",\n                form_types=[\"10-K\", \"10-Q\"],\n                metrics=[\"TotalRevenue\", \"NetIncome\", \"OperatingCashFlow\"],\n            )\n            print(df.head())\n        except ValueinPlanError:\n            print(\"This query needs a higher plan — see valuein.biz/pricing.\")\n        except ValueinRateLimitError as e:\n            print(f\"Rate limited; retry in {e.retry_after}s.\")\n        except ValueinError as ve:\n            # Catch-all for any other per-query failure (validation, bad SQL, etc.)\n            print(f\"Query failed: {ve}\")\nexcept ValueinAuthError:\n    raise SystemExit(\"Token missing or expired — set VALUEIN_API_KEY.\")\nexcept ValueinAPIError as e:\n    print(f\"Gateway error during init (HTTP {e.status_code}).\")\nexcept Exception as e:\n    print(f\"Initialization failed: {e}\")\n```\n\nThe SDK ships **54 named SQL templates** for the most common screens, ratios, and PIT backtests. List them:\n\n```python\nfrom valuein_sdk import ValueinClient\nwith ValueinClient() as c:\n    print(c.list_templates())\n```\n\nReference: [`docs/QUERY_COOKBOOK.md`](docs/QUERY_COOKBOOK.md) (DuckDB recipes) · [`docs/data_catalog.md`](docs/data_catalog.md) (canonical concepts) · [PyPI README](https://pypi.org/project/valuein-sdk/) (SDK quickstart).\n\n---\n\n## Recipes by role\n\nEvery link below points to a runnable script in [`examples/python/`](examples/python/) (mirror notebook in [`examples/notebooks/`](examples/notebooks/)). The Sample tier runs every example — no token, no signup.\n\n| You are a… | Start with | What you'll see |\n|---|---|---|\n| **Financial analyst** | [`financial_analysis.py`](examples/python/financial_analysis.py) | Revenue trend, margin walk, peer comparison from one ticker |\n| **Quant / researcher** | [`pit_backtest.py`](examples/python/pit_backtest.py) | PIT-correct factor query, restatement impact, common mistakes |\n| **Portfolio manager** | [`factor_screen.py`](examples/python/factor_screen.py) | Quality + Growth + Efficiency composite z-score over the S\u0026P 500 |\n| **Trader / signals** | [`earnings_momentum.py`](examples/python/earnings_momentum.py) | YoY revenue \u0026 earnings acceleration ranking |\n| **Asset manager** | [`survivorship_bias.py`](examples/python/survivorship_bias.py) | Quantify how survivorship bias inflates returns |\n| **Valuation modeler** | [`dcf_inputs.py`](examples/python/dcf_inputs.py) | Free-cash-flow assembly, balance sheet, Valuein's pre-computed DCF |\n| **Data engineer** | [`production-ready.py`](examples/python/production-ready.py) | Service pattern for FastAPI / Celery / Airflow |\n| **First-time user** | [`getting_started.py`](examples/python/getting_started.py) | First query, token check, sector counts |\n| **Building an AI agent** | [MCP for AI agents](#mcp-for-ai-agents) | Use natural language — no SDK required |\n\nRun any of them:\n\n```bash\n# Sample tier — works without a token\npython examples/python/getting_started.py\n\n# Paid tier\nVALUEIN_API_KEY=xxx python examples/python/factor_screen.py\n```\n\n---\n\n## Data model\n\nFull schema in [`docs/schema.json`](docs/schema.json) (machine-readable) and [`docs/data_catalog.md`](docs/data_catalog.md) (canonical concept names).\n\n| Table | What it is | Why it matters |\n|---|---|---|\n| **`references`** | **Start here.** Flat join of `entity` + `security`. One row per security with `cik`, `is_active`, sector, exchange, FIGI. For membership, JOIN `index_membership` on `cik = cik`. | One scan for cross-company metadata; index membership stays in its own table so historical entry/exit is preserved. |\n| `entity` | Company metadata — CIK, name, sector, SIC, status, fiscal year end | The legal entity dimension. |\n| `security` | Ticker history (SCD Type 2 with `valid_from` / `valid_to`) | Resolve historical tickers, share classes, exchanges. |\n| `filing` | Filing metadata — `accession_id`, `filing_date`, `report_date`, form type, amendment flag | The \"what was filed when\" dimension. |\n| `fact` | Standardized financial facts — both raw `concept` and canonical `standard_concept` on every row | The numbers. PIT-safe via `accepted_at`. |\n| `ratio` | Pipeline-computed financial ratios per filing | Skip the SQL — margins, returns, leverage, efficiency pre-calculated. |\n| `valuation` | Two-stage DCF + DDM intrinsic values per entity per period | Cross-check your model against ours. |\n| `taxonomy_guide` | 2026 US GAAP Taxonomy | Definitions for every `standard_concept`. |\n| `index_membership` | Historical index constituents (SP500, RUSSELL1000, RUSSELL2000, RUSSELL3000) — keyed on `cik`, with `effective_date` / `removal_date` half-open windows | Reconstruct any index on any historical date. JOIN `references.cik = index_membership.cik` for company metadata. |\n| `factor_scores` | Cross-sectional factor scores + percentile ranks + a proprietary composite | Quality / value / momentum screens with one query — no recomputation. |\n| `earnings_signals` | Proprietary TTM EPS trend estimate + surprise %, plus YoY revenue growth | Earnings-momentum signals without re-deriving them from `fact`. |\n| `filing_text` | Narrative chunks from 10-K / 10-Q / 20-F TextBlocks (Risk Factors, MD\u0026A, Business, Legal, Controls) | Source of the Vectorize index that powers semantic search via MCP. |\n\n### Date columns — which to use when\n\n| Column | Table | Use for |\n|---|---|---|\n| `report_date` / `period_end` | `filing` / `fact` | Aligning to the fiscal calendar |\n| `filing_date` | `filing` | **PIT backtest filter** — when the SEC received it |\n| `accepted_at` | `fact`, `valuation`, `filing_text` | Millisecond-precision PIT for intraday research |\n\n\u003e For any cross-company backtest, **always** filter by `filing_date \u003c= trade_date`. Filtering by `report_date` introduces look-ahead bias.\n\n### Three patterns that pay off in DuckDB\n\n**1. Start from `references`** (one join for cross-company filters; membership is in `index_membership`):\n\n```sql\nSELECT r.symbol, r.name, r.sector\nFROM   \"references\" r\nJOIN   index_membership im ON im.cik = r.cik\nWHERE  im.index_name = 'SP500'\n  AND  im.removal_date IS NULL          -- current member\n  AND  r.is_active     = TRUE\n  AND  r.sector ILIKE '%technology%'\n```\n\n**2. `LATERAL` for the latest filing per company:**\n\n```sql\nJOIN LATERAL (\n    SELECT accession_id, filing_date FROM filing\n    WHERE  entity_id = r.cik AND form_type = '10-K'\n    ORDER  BY filing_date DESC LIMIT 1\n) f ON TRUE\n```\n\n**3. Pivot multiple concepts in one `fact` scan:**\n\n```sql\nSELECT\n    MAX(CASE WHEN standard_concept = 'TotalRevenue'       THEN numeric_value END) AS revenue,\n    MAX(CASE WHEN standard_concept = 'StockholdersEquity' THEN numeric_value END) AS equity\nFROM   fact\nWHERE  standard_concept IN ('TotalRevenue', 'StockholdersEquity')\nGROUP  BY accession_id\n```\n\n\u003e Quarterly cash flows: use `COALESCE(derived_quarterly_value, numeric_value)` — Q2/Q3 10-Qs report YTD; this column isolates the single quarter. CAPEX sign varies by filer — always `ABS(capex)`.\n\nThe full cookbook — 20 recipes, 8 anti-patterns, end-to-end factor screen — lives in [`docs/QUERY_COOKBOOK.md`](docs/QUERY_COOKBOOK.md).\n\n### Canonical concept names\n\nQuery `fact.standard_concept` with canonical names like `'TotalRevenue'`, `'NetIncome'`, `'OperatingCashFlow'`, `'CAPEX'`, `'StockholdersEquity'` — **not** raw XBRL tags (`'Revenues'`, `'NetIncomeLoss'`, `'Assets'`). The full list lives in [`docs/data_catalog.md`](docs/data_catalog.md) and the machine-readable form is in [`docs/data_catalog.json`](docs/data_catalog.json).\n\n---\n\n## MCP for AI agents\n\nValuein ships a remote Model Context Protocol server so any MCP-capable agent (Claude Desktop, Cursor, Codex, custom) can answer fundamentals questions without writing code.\n\n- **Endpoint:** `https://mcp.valuein.biz/mcp` (Streamable HTTP, MCP spec 2025-11-25)\n- **Auth:** `Authorization: Bearer \u003cyour_api_token\u003e` — same token as the SDK and bulk-data API\n- **Manifest:** [`server.json`](server.json) — published to [registry.modelcontextprotocol.io](https://registry.modelcontextprotocol.io) as `io.github.valuein/mcp-sec-edgar`\n- **Reference:** [`docs/MCP_TOOLS.md`](docs/MCP_TOOLS.md) — every tool, every parameter, every tier gate\n\n### Tools\n\n\u003c!-- GEN:mcp-summary --\u003e\nThe server exposes **57 live tools + 1 stub** (58 total), plus **22 agentic SOP prompts** (two flagship cross-persona briefs — `equity_research_brief` and `screen_and_shortlist` — plus specialised chains for analyst, PM, quant, ratio, smart-money, and workflow personas) and **3 data resources** (`schema://{table}`, `reference://sp500`, `pricing://current`). Tier gating happens at the data layer — Sample / Free tokens see Sample / S\u0026P 500 data; Pro sees the full 19,000+-entity universe with a 15-year point-in-time window (2011 → present); Institutional unlocks the smart-money tools (insider transactions on Forms 3 / 4 / 5 / 144 + institutional ownership on Forms 13F / 13D / 13G), unlimited history back to 1993, filing-event webhooks, and the commercial redistribution license.\n\u003c!-- /GEN:mcp-summary --\u003e\n\n**Discovery \u0026 schema**\n\n| Tool | What it does |\n|---|---|\n| `search_companies` | Look up tickers, names, CIKs; filter by sector, S\u0026P 500, active status |\n| `describe_schema` | Return columns, types, and descriptions for any table |\n| `get_pit_universe` | The live constituent list (S\u0026P 500 or all) for any historical `as_of_date` |\n\n**Fundamentals \u0026 ratios**\n\n| Tool | What it does |\n|---|---|\n| `get_company_fundamentals` | Income statement, balance sheet, cash flow per ticker per period |\n| `get_financial_ratios` | Margins, returns, leverage, efficiency, FCF yield (per category) |\n| `get_valuation_metrics` | Margins + ROIC + DCF inputs + Valuein's pre-computed valuations |\n| `get_capital_allocation_profile` | CapEx intensity, buyback yield, dividend history |\n\n**Filings \u0026 lineage**\n\n| Tool | What it does |\n|---|---|\n| `get_sec_filing_links` | Direct EDGAR URLs for 10-K / 10-Q / 8-K / 20-F / 40-F |\n| `verify_fact_lineage` | Trace any number back to the exact filing + accession ID it came from |\n\n**Comparison \u0026 analytics**\n\n| Tool | What it does |\n|---|---|\n| `compare_periods` | Side-by-side comparison across periods with material-change flags |\n| `get_peer_comparables` | Peer set + comparable metrics by sector |\n| `screen_universe` | Factor-score-driven screen across the universe |\n| `get_earnings_signals` | EPS trends and surprise metrics around earnings releases |\n\n**Bulk \u0026 semantic**\n\n| Tool | What it does |\n|---|---|\n| `get_compute_ready_stream` | Issue presigned R2 URLs for direct Parquet streaming (skip the gateway) |\n| `search_filing_text` | Semantic search over Risk Factors / MD\u0026A / Business across every 10-K / 10-Q / 20-F (rolling out — Vectorize backfill in progress) |\n\n**Smart money — Institutional tier and above**\n\nThe smart-money bundle replaces Bloomberg's INSIDER\\\u003cGO\\\u003e / OWNER\\\u003cGO\\\u003e / HDS\\\u003cGO\\\u003e screens with a single Valuein token. Each tool reads a per-CIK Parquet partition and returns structured rows with the `lineage` envelope for one-click SEC verification.\n\n| Tool | What it does |\n|---|---|\n| `get_insider_transactions` | Form 3 / 4 / 5 / 144 line items per issuer — joined to insider_party for name + role |\n| `get_institutional_holdings` | Form 13F top holders for one issuer with HHI concentration + 13F-lag staleness flag |\n| `get_manager_portfolio` | Form 13F filer's full portfolio with QoQ deltas (new / increased / decreased / exited) |\n| `get_blockholders` | SC 13D / 13G with the first-class `going_active` flag (13G→13D = control-change signal) |\n\n### Configure in Claude Desktop\n\nAdd to `claude_desktop_config.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"valuein\": {\n      \"url\": \"https://mcp.valuein.biz/mcp\",\n      \"headers\": { \"Authorization\": \"Bearer YOUR_VALUEIN_API_KEY\" }\n    }\n  }\n}\n```\n\nSame URL + Bearer token works for any MCP client that supports Streamable HTTP remotes — Cursor, Codex, your own LangGraph / CrewAI agent.\n\n---\n\n## Examples in this repository\n\nEvery script and notebook works against the SDK published on PyPI. The Sample tier runs without a token; add `VALUEIN_API_KEY` to use a paid tier.\n\n### Python scripts ([`examples/python/`](examples/python/))\n\n| Script | Level | What it shows |\n|---|---|---|\n| [`getting_started.py`](examples/python/getting_started.py) | Beginner | First query, auth check, entity counts by sector |\n| [`usage.py`](examples/python/usage.py) | Reference | Every public SDK method demonstrated end-to-end |\n| [`entity_screening.py`](examples/python/entity_screening.py) | Beginner | Screen by sector, SIC code, active vs inactive |\n| [`financial_analysis.py`](examples/python/financial_analysis.py) | Intermediate | Revenue trends, margins, concept normalization, peer comparison |\n| [`pit_backtest.py`](examples/python/pit_backtest.py) | Intermediate | PIT discipline, restatement impact, `filing_date` vs `report_date` |\n| [`survivorship_bias.py`](examples/python/survivorship_bias.py) | Intermediate | Delisted companies, index membership, bias quantification |\n| [`factor_screen.py`](examples/python/factor_screen.py) | Intermediate | Composite Quality + Growth + Efficiency z-score ranking |\n| [`earnings_momentum.py`](examples/python/earnings_momentum.py) | Intermediate | YoY revenue \u0026 earnings acceleration across the S\u0026P 500 |\n| [`dcf_inputs.py`](examples/python/dcf_inputs.py) | Intermediate | FCF history, balance sheet, Valuein's pre-computed DCF |\n| [`production-ready.py`](examples/python/production-ready.py) | Advanced | Service pattern for FastAPI / Celery / Airflow integrations |\n\n### Jupyter notebooks ([`examples/notebooks/`](examples/notebooks/))\n\n| Notebook | Open in Colab |\n|---|---|\n| [Quickstart](examples/notebooks/quickstart.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/valuein/valuein/blob/main/examples/notebooks/quickstart.ipynb) |\n| [Fundamental Analysis](examples/notebooks/fundamental_analysis.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/valuein/valuein/blob/main/examples/notebooks/fundamental_analysis.ipynb) |\n| [PIT Backtest](examples/notebooks/pit_backtest.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/valuein/valuein/blob/main/examples/notebooks/pit_backtest.ipynb) |\n| [Survivorship Bias](examples/notebooks/survivorship_bias.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/valuein/valuein/blob/main/examples/notebooks/survivorship_bias.ipynb) |\n| [Factor Screen](examples/notebooks/factor_screen.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/valuein/valuein/blob/main/examples/notebooks/factor_screen.ipynb) |\n| [Earnings Momentum](examples/notebooks/earnings_momentum.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/valuein/valuein/blob/main/examples/notebooks/earnings_momentum.ipynb) |\n| [DCF Inputs](examples/notebooks/dcf_inputs.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/valuein/valuein/blob/main/examples/notebooks/dcf_inputs.ipynb) |\n\n---\n\n## Documentation\n\nEverything in [`docs/`](docs/) is kept in sync with the production data and the SDK on PyPI.\n\n| Document | What's in it |\n|---|---|\n| [`docs/WORKSPACE_GUIDE.md`](docs/WORKSPACE_GUIDE.md) | Workspace welcome guide — 15-minute setup + daily/weekly/monthly playbooks per role (analyst, PM, quant, creator) |\n| [`docs/METHODOLOGY.md`](docs/METHODOLOGY.md) | Sourcing, PIT architecture, restatement handling, XBRL normalization, valuation models |\n| [`docs/accuracy/`](docs/accuracy/) | **Accuracy proof** — 99.58% on 12,048 S\u0026P 500 FY filings, citable to FactSet PIT / FASB ASC / Penman, reproducible via `duckdb -c \".read scripts/accuracy/accuracy_check.sql\"` |\n| [`docs/QUERY_COOKBOOK.md`](docs/QUERY_COOKBOOK.md) | 20 copy-pasteable DuckDB recipes — `LATERAL`, pivots, PIT, factor screens |\n| [`docs/MCP_TOOLS.md`](docs/MCP_TOOLS.md) | Reference for every MCP tool — parameters, tier gates, examples |\n| [`docs/data_catalog.md`](docs/data_catalog.md) | Canonical `standard_concept` names and definitions |\n| [`docs/DATA_CATALOG.xlsx`](docs/DATA_CATALOG.xlsx) | Same catalog as a workbook — columns, types, sample values |\n| [`docs/data_catalog.json`](docs/data_catalog.json) | Machine-readable catalog (used by SDK metadata + docs sites) |\n| [`docs/schema.json`](docs/schema.json) | Machine-readable table + column schema |\n| [`docs/COMPLIANCE_AND_DDQ.md`](docs/COMPLIANCE_AND_DDQ.md) | Data provenance, MNPI policy, PIT integrity, security, SLA summary |\n| [`docs/SLA.md`](docs/SLA.md) | Uptime targets, data freshness, support response times, SLA credits |\n\n---\n\n## Support \u0026 community\n\nGitHub Issues is the primary support channel. Use the right template — it routes correctly and gets faster triage.\n\n| I want to… | Open |\n|---|---|\n| Report incorrect or suspicious data | [Data Quality Report](https://github.com/valuein/valuein/issues/new?template=01_data_quality_report.yml) |\n| Request a feature, concept, or dataset | [Feature Request](https://github.com/valuein/valuein/issues/new?template=02_feature_request.yml) |\n| Report an outage or degraded service | [Service Outage](https://github.com/valuein/valuein/issues/new?template=03_service_outage.yml) |\n| Ask a general question | [Q\u0026A](https://github.com/valuein/valuein/issues/new?template=04_general_question.yml) |\n| Report a security issue privately | See [`SECURITY.md`](SECURITY.md) |\n| Get general help | See [`SUPPORT.md`](SUPPORT.md) |\n\nFor private or contractual matters (DPAs, procurement, DDQs, enterprise SLAs): **[support@valuein.biz](mailto:support@valuein.biz)**.\n\nContributions — examples, notebook improvements, documentation fixes, query recipes — are very welcome. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the workflow and [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md) for community standards.\n\n---\n\n## License \u0026 disclosure\n\n[Apache 2.0](LICENSE). See [NOTICE](NOTICE) for attribution.\n\nThis repository is provided for research and educational purposes. **It is not investment advice.** No warranty of fitness for any particular trading, investment, or regulatory purpose is implied.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvaluein%2Fvaluein","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvaluein%2Fvaluein","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvaluein%2Fvaluein/lists"}