{"id":34694420,"url":"https://github.com/robofinsystems/robosystems","last_synced_at":"2026-06-07T06:02:02.716Z","repository":{"id":317467575,"uuid":"1035641625","full_name":"RoboFinSystems/robosystems","owner":"RoboFinSystems","description":"RoboSystems is a financial intelligence platform that unifies structured data, document search, and AI memory to transform complex financial data into actionable intelligence. Fork-ready with full GitHub Actions CI/CD for deploying CloudFormation infrastructure to your AWS account.","archived":false,"fork":false,"pushed_at":"2026-06-04T05:25:33.000Z","size":50910,"stargazers_count":16,"open_issues_count":0,"forks_count":6,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-06-04T05:26:49.274Z","etag":null,"topics":["accounting","ai","arelle","aws","context-graph","dagster","dbt","duckdb","fastapi","financial","financial-analysis","financial-data","knowledge-graph","ladybugdb","mcp","mcp-server","opensearch","postgresql","robosystems","xbrl"],"latest_commit_sha":null,"homepage":"https://robosystems.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RoboFinSystems.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-10T20:37:56.000Z","updated_at":"2026-06-04T05:25:36.000Z","dependencies_parsed_at":"2025-10-01T04:23:51.480Z","dependency_job_id":"dd062110-8153-4205-95ec-614689abc7d4","html_url":"https://git
hub.com/RoboFinSystems/robosystems","commit_stats":null,"previous_names":["robofinsystems/robosystems"],"tags_count":192,"template":false,"template_full_name":null,"purl":"pkg:github/RoboFinSystems/robosystems","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoboFinSystems%2Frobosystems","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoboFinSystems%2Frobosystems/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoboFinSystems%2Frobosystems/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoboFinSystems%2Frobosystems/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RoboFinSystems","download_url":"https://codeload.github.com/RoboFinSystems/robosystems/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoboFinSystems%2Frobosystems/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34010556,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-07T02:00:07.652Z","response_time":124,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accounting","ai","arelle","aws","context-graph","dagster","dbt","duckdb","fastapi","financial","financial-analysis","financial-data","knowledge-graph","ladybugdb","mcp","mcp-server","opensearch","postgresql",
"robosystems","xbrl"],"created_at":"2025-12-24T22:29:44.014Z","updated_at":"2026-06-07T06:01:59.002Z","avatar_url":"https://github.com/RoboFinSystems.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RoboSystems\n\nRoboSystems is an open-source financial intelligence platform built on a unified operational and analytical graph architecture — a transactional Postgres backbone for ledger-grade correctness paired with an analytical LadybugDB graph for AI retrieval and reporting. Purpose-built for accounting, financial reporting, investment management, and analysis. Powers [RoboLedger](https://roboledger.ai) and [RoboInvestor](https://roboinvestor.ai).\n\n- **Unified Operational + Analytical Graph**: Graph workloads split the same way relational workloads do — transactional stores for writes, analytical stores for queries. Extension schemas drive both a Postgres operational backbone for ledger-grade correctness and a LadybugDB analytical graph for relationship traversal and AI retrieval, bound by a shared schema and Cypher query surface\n- **LadybugDB Graph Database**: Embedded columnar graph database with native DuckDB staging, LanceDB vector search, and tiered infrastructure\n- **Extensions**: Domain schemas that drive OLTP tables, data pipelines, and dedicated frontend apps, all surfaced through the Extensions API. 
Schema-per-tenant isolation in a single Postgres database; materialized to the graph for analytics\n- **Document Search**: Full-text and semantic search across SEC filings, uploaded documents, and connected sources via OpenSearch\n- **AI-Native Architecture**: Context graphs with embeddings, semantic enrichment, and confidence scoring for LLM-powered analytics\n- **Model Context Protocol (MCP)**: Standardized server and [client](https://www.npmjs.com/package/@robosystems/mcp) for LLM integration with schema-aware tools\n- **Multi-Source Data Integration**: SEC XBRL filings, QuickBooks accounting data via dbt pipelines, and custom financial datasets\n- **Enterprise-Ready Infrastructure**: Multi-tenant architecture with tiered scaling and production-grade query management\n- **Core REST API** (`/v1`): Auth, orgs, billing, graph lifecycle, Cypher, and MCP. Reads as REST GETs; graph lifecycle writes (subgraphs, backups, materialize, tier changes) as named `OperationEnvelope` operations\n- **Extensions API** (`/extensions/{graph_id}`): Strawberry GraphQL for typed reads over extensions OLTP, plus named REST operations for domain writes and analytical views over the materialized graph\n- **Unified Write Contract**: Every write across both surfaces is a named `OperationEnvelope` operation with `Idempotency-Key` support, audit logging, and SSE progress streaming via `/v1/operations/{id}/stream`\n\n## Platform\n\nThe platform provides the core infrastructure that all extensions build on:\n\n- **Dedicated Infrastructure**: Tiered graph infrastructure with dedicated instances and configurable memory allocation\n- **AI Agent System**: Autonomous financial operations — graph queries, taxonomy mapping, report generation — with automatic credit tracking and SSE progress streaming\n- **Shared Repositories**: SEC XBRL filings knowledge graph for context mining and benchmarking\n- **Document Management**: Upload, index, and search documents with full-text and semantic search via 
OpenSearch\n- **DuckDB Staging System**: High-performance data validation and bulk ingestion pipeline\n- **Dagster Orchestration**: Data pipeline orchestration for SEC filings, QuickBooks sync, backups, billing, and scheduled jobs\n- **Credit-Based Billing**: Flexible credits for AI operations based on token usage\n- **Subgraphs (Workspaces)**: AI memory graphs and isolated environments for development and team collaboration\n\n## Extensions\n\nExtensions are domain-specific subsystems that bring their own schema, OLTP tables, API routes, data pipelines, and dedicated frontend apps. They share a single PostgreSQL database with schema-per-tenant isolation and materialize to the graph for analytical queries. See [Schema Extensions](/robosystems/schemas/README.md) for the authoring contract.\n\nThe extensions API surface is **graph-scoped at the URL level** — `graph_id` is always a path parameter, never a query argument — and splits reads from writes by transport:\n\n- **Reads** → `POST /extensions/{graph_id}/graphql` — Strawberry GraphQL, GraphiQL in dev, schema composed dynamically from enabled domains\n- **Writes** → `POST /extensions/{roboledger|roboinvestor}/{graph_id}/operations/{operation_name}` — named REST commands (see Unified Write Contract above)\n\nBehind the API is a CQRS operations kernel (`reads/` + `commands/` per domain) that's the single source of truth for business logic — GraphQL resolvers, REST operation routes, and MCP tools all delegate to the same functions. Per-domain feature flags (`ROBOLEDGER_ENABLED`, `ROBOINVESTOR_ENABLED`) gate both the routers and the GraphQL schema composition.\n\nSee [GraphQL Extensions](/robosystems/graphql/README.md) for the read-path implementation details, the Strawberry-Pydantic auto-derivation pattern, and the walkthrough for adding a new read field.\n\n### [RoboLedger](https://roboledger.ai)\n\nAccounting and financial reporting extension. 
OLTP general ledger in schema-per-tenant PostgreSQL (accounts, transactions, journal entries, line items, dimensions); 29 GraphQL read fields covering entities, accounts, trial balance, fiscal calendar, schedules, taxonomies, mappings, reports, and publish lists; 23 named command operations for closing periods, creating schedules and closing entries, managing CoA→GAAP mapping associations, and authoring multi-period reports; analytical view operations over the materialized graph; QuickBooks ELT pipeline via dbt/Dagster; SEC XBRL financial reporting; AI-powered CoA→GAAP mapping via the MappingAgent. Dedicated frontend app.\n\n### [RoboInvestor](https://roboinvestor.ai)\n\nPortfolio management and investment tracking extension. OLTP database with portfolios, securities, and positions in schema-per-tenant PostgreSQL; 7 GraphQL read fields (portfolios, securities, positions, holdings) and 9 named command operations for portfolio CRUD and position management. Securities can link to entities for cross-graph research between investor portfolios and SEC public-company data via the shared repository. 
Dedicated frontend app.\n\n## Quick Start\n\n### Docker Development Environment\n\n```bash\n# Install uv and just\nbrew install uv just\n\n# Start robosystems backend api\njust start\n\n# Start frontend apps - robosystems-app, roboledger-app, roboinvestor-app\njust start apps\n```\n\nThis initializes the `.env` file and starts the complete RoboSystems stack with:\n\n- Graph API with LadybugDB and DuckDB backends\n- Dagster for data pipeline orchestration\n- PostgreSQL for IAM, graph metadata, extensions and Dagster\n- Valkey for caching, SSE messaging, and rate limiting\n- OpenSearch for full-text and semantic document search\n- Localstack for S3 and DynamoDB emulation\n\n**Service URLs:**\n\n| Service    | URL                   |\n| ---------- | --------------------- |\n| Main API   | http://localhost:8000 |\n| Graph API  | http://localhost:8001 |\n| Dagster UI | http://localhost:8002 |\n\nWith `just start apps` (frontend apps):\n\n| App              | URL                   |\n| ---------------- | --------------------- |\n| RoboSystems App  | http://localhost:3000 |\n| RoboLedger App   | http://localhost:3001 |\n| RoboInvestor App | http://localhost:3002 |\n\n### Local Development\n\n```bash\n# Setup Python environment (uv automatically handles Python versions)\njust init\n```\n\n## Examples\n\nSee RoboSystems in action with runnable demos that create graphs, load data, and execute queries with the `robosystems-client`:\n\n```bash\njust demo-sec               # Loads NVIDIA's SEC XBRL data via Dagster pipeline\njust demo-roboledger        # End-to-end RoboLedger demo: bulk OLTP, schedules, FY 2025 filed report, AI close\njust demo-custom-graph      # Builds custom graph schema with relationship networks\n```\n\nEach demo has a corresponding [Wiki article](https://github.com/RoboFinSystems/robosystems/wiki) with detailed guides.\n\n## Development Commands\n\n### Testing\n\n```bash\njust test-all               # Tests with code quality\njust test                   # 
Default test suite\njust test adapters          # Test specific module\njust test-cov               # Tests with coverage\n```\n\n### Log Monitoring\n\n```bash\njust logs api                 # View API logs (last 100 lines)\njust logs graph-api           # View Graph API logs (last 100 lines)\njust logs dagster-webserver   # View Dagster Webserver logs\njust logs dagster-daemon      # View Dagster Daemon logs\n```\n\n**See [justfile](justfile) for 50+ development commands** including database migrations, CloudFormation linting, graph operations, administration, and more.\n\n### Prerequisites\n\n#### System Requirements\n\n- Docker \u0026 Docker Compose\n- 8GB RAM minimum\n- 20GB free disk space\n\n#### Required Tools\n\n- `uv` for Python package and version management\n- `just` for project command runner\n\n#### Deployment Requirements\n\n- Fork this repo\n- AWS account with IAM Identity Center (SSO)\n- Run `just bootstrap` to configure OIDC and GitHub variables\n\nSee the **[Bootstrap Guide](https://github.com/RoboFinSystems/robosystems/wiki/Bootstrap-Guide)** for complete instructions.\n\n## Architecture\n\nRoboSystems is built on a modern, scalable architecture with:\n\n**Application Layer:**\n\n- FastAPI REST API with versioned endpoints\n- Extension GraphQL API for reads with command operations\n- MCP Server for AI-powered graph database access with schema-aware tools\n- AI Agent System for autonomous financial operations with automatic credit tracking\n- Dagster for data pipeline orchestration and background jobs\n\n**LadybugDB Graph Database:** ([configuration](/.github/configs/graph.yml))\n\n- Embedded columnar graph database purpose-built for financial analytics\n- Base + extension schema architecture — extensions define domain models\n- Native DuckDB integration for high-performance staging and ingestion\n- LanceDB vector search for semantic element resolution (IVF-PQ indexes, 384-dim embeddings)\n- Tiered infrastructure with configurable memory, rate 
limits, and subgraph allocations\n- Shared tier hosts public repositories with read replicas\n\n**Data Layer:**\n\n- PostgreSQL for IAM, graph metadata, Dagster, and extension OLTP databases (schema-per-tenant)\n- OpenSearch for full-text and semantic document search (BM25 + KNN)\n- Valkey for caching, SSE messaging, and rate limiting\n- AWS S3 for data lake storage and static assets\n- DynamoDB for instance/graph/volume registry\n\n**Infrastructure:**\n\n- ECS Fargate for API and Dagster\n- EC2 ASG for LadybugDB writer clusters\n- EC2 ALB + ASG for LadybugDB shared replica clusters\n- RDS PostgreSQL + ElastiCache Valkey\n- OpenSearch for full-text and semantic document search\n- CloudFormation infrastructure deployed via GitHub Actions with OIDC\n\n**For detailed architecture documentation, see the [Architecture Overview](https://github.com/RoboFinSystems/robosystems/wiki/Architecture-Overview) in the Wiki.**\n\n## SEC Shared Repository\n\nA curated knowledge graph of US public company financial data from SEC EDGAR XBRL filings. 
Runs on the shared LadybugDB tier, accessible via MCP tools, Cypher queries, and the AI agent.\n\n- **Pipeline**: EDGAR → Download → Process (Parquet) → Stage (DuckDB) → Enrich (fastembed) → Materialize (LadybugDB) → Index + Embed (OpenSearch)\n- **Graph**: 14 node types and 24 relationship types modeling the full XBRL reporting hierarchy\n- **Search**: Hybrid BM25 + KNN vector search across XBRL text blocks, narrative sections, and iXBRL disclosures\n- **Enrichment**: Semantic element mapping, statement classification, and disclosure tagging via the [Seattle Method](http://xbrlsite.com/seattlemethod/SeattleMethod.pdf) taxonomy\n\n```bash\njust sec-load NVDA 2025  # Load NVIDIA filings for 2025\njust sec-health          # Check SEC database health\n```\n\nSee [SEC Adapter](/robosystems/adapters/sec/README.md) and [SEC Pipeline](/robosystems/adapters/sec/pipeline/README.md) for detailed documentation.\n\n## AI\n\n### Model Context Protocol (MCP)\n\n- **Financial Analysis**: Natural language queries across enterprise data and public benchmark data\n- **Cross-Database Queries**: Compare user graph data against SEC shared repository data\n- **Tools**: Rich toolkit for graph queries, schema introspection, fact discovery, financial analysis, document search, and AI memory operations\n- **Handler Pool**: Managed MCP handler instances with resource limits\n\n### Agent System\n\n- Unified architecture: stateless agents with protocol-based service injection\n- Dual execution: API (sync/SSE) and background worker (Valkey queue + SSE progress)\n- Automatic credit tracking per AI call — agents cannot forget billing\n- Extensible: new agents implement `run(ctx)` and register with a decorator\n- See [Agent README](/robosystems/operations/agents/README.md) for details\n\n### Credit System\n\n- **AI Operations Only**: Credits are consumed exclusively by AI agent calls (Anthropic Claude via AWS Bedrock)\n- **Token-Based Billing**: Credits based on actual token usage and model 
cost\n- **MCP Tool Access**: No credits consumed for MCP calls or database operations\n\n## Client Libraries\n\nRoboSystems provides comprehensive client libraries for building applications:\n\n### MCP (Model Context Protocol) Client\n\nAI integration client for connecting Claude and other LLMs to RoboSystems.\n\n```bash\nnpx -y @robosystems/mcp\n```\n\n- **Features**: Claude Desktop integration, natural language queries, graph traversal, financial analysis\n- **Use Cases**: AI agents, chatbots, intelligent assistants, automated research\n- **Documentation**: [npm](https://www.npmjs.com/package/@robosystems/mcp) | [GitHub](https://github.com/RoboFinSystems/robosystems-mcp-client)\n\n### TypeScript/JavaScript Client\n\nFull-featured SDK for web and Node.js applications with TypeScript support.\n\n```bash\nnpm install @robosystems/client\n```\n\n- **Features**: Type-safe API calls, automatic retry logic, connection pooling, streaming support\n- **Use Cases**: Web applications, Node.js backends, React/Vue/Angular frontends\n- **Documentation**: [npm](https://www.npmjs.com/package/@robosystems/client) | [GitHub](https://github.com/RoboFinSystems/robosystems-typescript-client)\n\n### Python Client\n\nNative Python SDK for backend services and data science workflows.\n\n```bash\npip install robosystems-client\n```\n\n- **Features**: Async/await support, pandas integration, Jupyter compatibility, batch operations\n- **Use Cases**: Data pipelines, ML workflows, backend services, analytics\n- **Documentation**: [PyPI](https://pypi.org/project/robosystems-client/) | [GitHub](https://github.com/RoboFinSystems/robosystems-python-client)\n\n## Documentation\n\n### User Guides (Wiki)\n\n- **[Getting Started](https://github.com/RoboFinSystems/robosystems/wiki)** - Quick start and overview\n- **[Bootstrap Guide](https://github.com/RoboFinSystems/robosystems/wiki/Bootstrap-Guide)** - Fork and deploy to your AWS account\n- **[Architecture 
Overview](https://github.com/RoboFinSystems/robosystems/wiki/Architecture-Overview)** - System design and components\n- **[Data Pipeline Guide](https://github.com/RoboFinSystems/robosystems/wiki/Pipeline-Guide)** - Dagster data orchestration and custom integrations\n- **[SEC XBRL Pipeline](https://github.com/RoboFinSystems/robosystems/wiki/SEC-XBRL-Pipeline)** - Working with SEC financial data\n- **[Custom Graph Demo](https://github.com/RoboFinSystems/robosystems/wiki/Custom-Graph-Schema)** - Guide for creating a custom schema graph demo\n\n### Developer Documentation (Codebase)\n\n**Core Services:**\n\n- **[Adapters](/robosystems/adapters/README.md)** - External service integrations\n- **[Operations](/robosystems/operations/README.md)** - Business workflow orchestration, CQRS reads/commands kernels for extensions\n- **[Schemas](/robosystems/schemas/README.md)** - Graph schema definitions\n- **[Extensions GraphQL](/robosystems/graphql/README.md)** - Strawberry GraphQL read surface, Pydantic auto-derivation, resolver patterns\n- **[Configuration](/robosystems/config/README.md)** - Configuration management\n- **[Dagster](/robosystems/dagster/README.md)** - Data pipeline and task orchestration\n\n**Database Models:**\n\n- **[Platform Models](/robosystems/models/core/README.md)** - SQLAlchemy models for the platform database\n- **[Extensions Models](/robosystems/models/extensions/README.md)** - SQLAlchemy models for the extensions database with schema-per-graph tenancy\n- **[API Models](/robosystems/models/api/README.md)** - Pydantic request/response models for core platform and extensions surfaces\n\n**Graph Database System:**\n\n- **[Graph API](/robosystems/graph_api/README.md)** - Graph API overview\n- **[Client Factory](/robosystems/graph_api/client/README.md)** - Client factory system\n- **[Core Services](/robosystems/graph_api/core/README.md)** - Core services layer\n\n**Middleware Components:**\n\n- **[Authentication](/robosystems/middleware/auth/README.md)** - 
Authentication and authorization\n- **[Graph Routing](/robosystems/middleware/graph/README.md)** - Graph routing layer\n- **[MCP](/robosystems/middleware/mcp/README.md)** - MCP tools and pooling\n- **[Billing](/robosystems/middleware/billing/README.md)** - Subscription and billing management\n- **[Observability](/robosystems/middleware/otel/README.md)** - OpenTelemetry observability\n- **[Robustness](/robosystems/middleware/robustness/README.md)** - Circuit breakers and retry policies\n\n**Infrastructure:**\n\n- **[CloudFormation](/cloudformation/README.md)** - AWS infrastructure templates\n- **[Setup Scripts](/bin/setup/README.md)** - Bootstrap and configuration scripts\n\n**Development Resources:**\n\n- **[Examples](/examples/README.md)** - Runnable demos and integration examples\n- **[Tests](/tests/README.md)** - Testing strategy and organization\n- **[Admin Tools](/robosystems/admin/README.md)** - Administrative utilities and CLI\n\n**Security \u0026 Compliance:**\n\n- **[SECURITY.md](/SECURITY.md)** - Security features and compliance configuration\n\n## API Reference\n\n- [API reference](https://api.robosystems.ai)\n- [API documentation](https://api.robosystems.ai/docs)\n- [OpenAPI specification](https://api.robosystems.ai/openapi.json)\n\n## Support\n\n- [Issues](https://github.com/RoboFinSystems/robosystems/issues)\n- [Wiki](https://github.com/RoboFinSystems/robosystems/wiki)\n- [Projects](https://github.com/orgs/RoboFinSystems/projects)\n- [Discussions](https://github.com/orgs/RoboFinSystems/discussions)\n\n## License\n\nThis project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.\n\nApache-2.0 © 2026 RFS LLC\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobofinsystems%2Frobosystems","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frobofinsystems%2Frobosystems","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobofinsystems%2Frobosystems/lists"}