{"id":42804812,"url":"https://github.com/astronomer/agents","last_synced_at":"2026-02-27T00:17:30.157Z","repository":{"id":334273068,"uuid":"1133647197","full_name":"astronomer/agents","owner":"astronomer","description":"AI agent tooling for data engineering workflows.","archived":false,"fork":false,"pushed_at":"2026-02-23T20:35:14.000Z","size":1451,"stargazers_count":238,"open_issues_count":17,"forks_count":21,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-02-24T00:29:41.541Z","etag":null,"topics":["agents","ai","airflow","apache-airflow","claude","cursor","data-engineering","mcp","skills"],"latest_commit_sha":null,"homepage":"https://astronomer.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/astronomer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-01-13T16:23:31.000Z","updated_at":"2026-02-23T20:35:16.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/astronomer/agents","commit_stats":null,"previous_names":["astronomer/agents"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/astronomer/agents","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fagents","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fagents/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/a
stronomer%2Fagents/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fagents/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/astronomer","download_url":"https://codeload.github.com/astronomer/agents/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fagents/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29878464,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-26T23:51:21.483Z","status":"ssl_error","status_checked_at":"2026-02-26T23:50:46.793Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","ai","airflow","apache-airflow","claude","cursor","data-engineering","mcp","skills"],"created_at":"2026-01-30T03:52:49.955Z","updated_at":"2026-02-27T00:17:30.148Z","avatar_url":"https://github.com/astronomer.png","language":"Python","funding_links":[],"categories":["Browse The Shelves"],"sub_categories":["Agent skill packs"],"readme":"# agents\n\nAI agent tooling for data engineering workflows. 
Includes an [MCP server](./astro-airflow-mcp/) for Airflow, a [CLI tool (`af`)](./astro-airflow-mcp/README.md#airflow-cli-tool) for interacting with Airflow from your terminal, and [skills](#skills) that extend AI coding agents with specialized capabilities for working with Airflow and data warehouses. Works with [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [Cursor](https://cursor.com), and other agentic coding tools.\n\nBuilt by [Astronomer](https://www.astronomer.io/). [Apache 2.0 licensed](https://github.com/astronomer/agents/blob/main/LICENSE) and compatible with open-source Apache Airflow.\n\n## Table of Contents\n\n\u003c!-- START doctoc generated TOC please keep comment here to allow auto update --\u003e\n\u003c!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --\u003e\n\n- [Installation](#installation)\n  - [Quick Start](#quick-start)\n  - [Compatibility](#compatibility)\n  - [Claude Code](#claude-code)\n  - [Cursor](#cursor)\n  - [Other MCP Clients](#other-mcp-clients)\n- [Features](#features)\n  - [MCP Server](#mcp-server)\n  - [Skills](#skills)\n- [Why Astro?](#why-astro)\n  - [User Journeys](#user-journeys)\n  - [Airflow CLI (`af`)](#airflow-cli-af)\n- [Configuration](#configuration)\n  - [Warehouse Connections](#warehouse-connections)\n  - [Airflow](#airflow)\n- [Usage](#usage)\n  - [Getting Started](#getting-started)\n- [Development](#development)\n  - [Local Development Setup](#local-development-setup)\n  - [Adding Skills](#adding-skills)\n- [Troubleshooting](#troubleshooting)\n  - [Common Issues](#common-issues)\n- [Contributing](#contributing)\n- [Roadmap](#roadmap)\n- [License](#license)\n\n\u003c!-- END doctoc generated TOC please keep comment here to allow auto update --\u003e\n\n## Installation\n\n### Quick Start\n\n```bash\nnpx skills add astronomer/agents --skill '*'\n```\n\nThis installs all Astronomer skills into your project via [skills.sh](https://skills.sh). 
You'll be prompted to choose which agents to install them for. To choose skills individually as well, omit the `--skill` flag.\n\n\u003e [!IMPORTANT]\n\u003e **Claude Code users:** We recommend using the plugin instead (see the [Claude Code](#claude-code) section below) for better integration with MCP servers and hooks.\n\n### Compatibility\n\n**Skills:** Works with [25+ AI coding agents](https://github.com/vercel-labs/add-skill?tab=readme-ov-file#available-agents) including Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, Cline, and more.\n\n**MCP Server:** Works with any [MCP-compatible client](https://modelcontextprotocol.io/clients) including Claude Desktop, VS Code, and others.\n\n\u003e [!NOTE]\n\u003e **Open-source Airflow users:** The MCP server works with any Airflow 2.x/3.x REST API. Set `AIRFLOW_API_URL` to your self-hosted instance. Skills are tool-agnostic and work with any Airflow deployment.\n\n### Claude Code\n\n```bash\n# Add the marketplace and install the plugin\nclaude plugin marketplace add astronomer/agents\nclaude plugin install data@astronomer\n```\n\nThe plugin includes the Airflow MCP server, which runs via `uvx` from PyPI. 
Data warehouse queries are handled by the `analyzing-data` skill using a background Jupyter kernel.\n\n### Cursor\n\nCursor supports both MCP servers and skills.\n\n**MCP Server** - Click to install:\n\n\u003ca href=\"https://cursor.com/en-US/install-mcp?name=astro-airflow-mcp\u0026config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyJhc3Ryby1haXJmbG93LW1jcCIsIi0tdHJhbnNwb3J0Iiwic3RkaW8iXX0\"\u003e\u003cimg src=\"https://cursor.com/deeplink/mcp-install-dark.svg\" alt=\"Add Airflow MCP to Cursor\" height=\"32\"\u003e\u003c/a\u003e\n\n**Skills** - Install to your project:\n\n```bash\nnpx skills add astronomer/agents --skill '*' -a cursor\n```\n\nThis installs skills to `.cursor/skills/` in your project.\n\n\u003cdetails\u003e\n\u003csummary\u003eManual MCP configuration\u003c/summary\u003e\n\nAdd to `~/.cursor/mcp.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"airflow\": {\n      \"command\": \"uvx\",\n      \"args\": [\"astro-airflow-mcp\", \"--transport\", \"stdio\"]\n    }\n  }\n}\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eEnable hooks (skill suggestions, session management)\u003c/summary\u003e\n\nCreate `.cursor/hooks.json` in your project:\n\n```json\n{\n  \"version\": 1,\n  \"hooks\": {\n    \"beforeSubmitPrompt\": [\n      {\n        \"command\": \"$CURSOR_PROJECT_DIR/.cursor/skills/airflow/hooks/airflow-skill-suggester.sh\",\n        \"timeout\": 5\n      }\n    ],\n    \"stop\": [\n      {\n        \"command\": \"uv run $CURSOR_PROJECT_DIR/.cursor/skills/analyzing-data/scripts/cli.py stop\",\n        \"timeout\": 10\n      }\n    ]\n  }\n}\n```\n\n**What these hooks do:**\n- `beforeSubmitPrompt`: Suggests data skills when you mention Airflow keywords\n- `stop`: Cleans up kernel when session ends\n\n\u003c/details\u003e\n\n### Other MCP Clients\n\nFor any MCP-compatible client (Claude Desktop, VS Code, etc.):\n\n```bash\n# Airflow MCP\nuvx astro-airflow-mcp --transport stdio\n\n# With remote 
Airflow\nAIRFLOW_API_URL=https://your-airflow.example.com \\\nAIRFLOW_USERNAME=admin \\\nAIRFLOW_PASSWORD=admin \\\nuvx astro-airflow-mcp --transport stdio\n```\n\n## Features\n\nThe `data` plugin bundles an MCP server and skills into a single installable package.\n\n### MCP Server\n\n| Server | Description |\n|--------|-------------|\n| **[Airflow](https://github.com/astronomer/agents/tree/main/astro-airflow-mcp)** | Full Airflow REST API integration via [astro-airflow-mcp](https://github.com/astronomer/agents/tree/main/astro-airflow-mcp): DAG management, triggering, task logs, system health |\n\n### Skills\n\n#### Data Discovery \u0026 Analysis\n\n| Skill | Description |\n|-------|-------------|\n| [warehouse-init](./skills/warehouse-init/) | Initialize schema discovery - generates `.astro/warehouse.md` for instant lookups |\n| [analyzing-data](./skills/analyzing-data/) | SQL-based analysis to answer business questions (uses background Jupyter kernel) |\n| [checking-freshness](./skills/checking-freshness/) | Check how current your data is |\n| [profiling-tables](./skills/profiling-tables/) | Comprehensive table profiling and quality assessment |\n\n#### Data Lineage\n\n| Skill | Description |\n|-------|-------------|\n| [tracing-downstream-lineage](./skills/tracing-downstream-lineage/) | Analyze what breaks if you change something |\n| [tracing-upstream-lineage](./skills/tracing-upstream-lineage/) | Trace where data comes from |\n| [annotating-task-lineage](./skills/annotating-task-lineage/) | Add manual lineage to tasks using inlets/outlets |\n| [creating-openlineage-extractors](./skills/creating-openlineage-extractors/) | Build custom OpenLineage extractors for operators |\n\n#### DAG Development\n\n| Skill | Description |\n|-------|-------------|\n| [airflow](./skills/airflow/) | Main entrypoint - routes to specialized Airflow skills |\n| [setting-up-astro-project](./skills/setting-up-astro-project/) (Astro) | Initialize and configure new Astro/Airflow 
projects |\n| [managing-astro-local-env](./skills/managing-astro-local-env/) (Astro) | Manage local Airflow environment (start, stop, logs, troubleshoot) |\n| [authoring-dags](./skills/authoring-dags/) | Create and validate Airflow DAGs with best practices |\n| [testing-dags](./skills/testing-dags/) | Test and debug Airflow DAGs locally |\n| [debugging-dags](./skills/debugging-dags/) | Deep failure diagnosis and root cause analysis |\n| [deploying-airflow](./skills/deploying-airflow/) | Deploy Airflow DAGs and projects (Astro, Docker Compose, Kubernetes) |\n| [airflow-hitl](./skills/airflow-hitl/) | Human-in-the-loop workflows: approval gates, form input, branching (Airflow 3.1+) |\n\n#### dbt Integration\n\n| Skill | Description |\n|-------|-------------|\n| [cosmos-dbt-core](./skills/cosmos-dbt-core/) | Run dbt Core projects as Airflow DAGs using [Astronomer Cosmos](https://github.com/astronomer/astronomer-cosmos) |\n| [cosmos-dbt-fusion](./skills/cosmos-dbt-fusion/) | Run dbt Fusion projects with Cosmos (Snowflake/Databricks only) |\n\n#### Migration\n\n| Skill | Description |\n|-------|-------------|\n| [migrating-airflow-2-to-3](./skills/migrating-airflow-2-to-3/) | Migrate DAGs from Airflow 2.x to 3.x |\n\n## Why Astro?\n\nAstro is Astronomer's managed Airflow platform. It's optional, but a good fit if you want managed deployments, built-in alerting, and centralized observability across environments. If you run open-source Airflow, everything in this repo still applies—you'll just configure your own Airflow URL and infrastructure.\n\n### User Journeys\n\n#### Data Analysis Flow\n\n```mermaid\nflowchart LR\n    init[\"/data:warehouse-init\"] --\u003e analyzing[\"/data:analyzing-data\"]\n    analyzing --\u003e profiling[\"/data:profiling-tables\"]\n    analyzing --\u003e freshness[\"/data:checking-freshness\"]\n```\n\n1. **Initialize** (`/data:warehouse-init`) - One-time setup to generate `warehouse.md` with schema metadata\n2. 
**Analyze** (`/data:analyzing-data`) - Answer business questions with SQL\n3. **Profile** (`/data:profiling-tables`) - Deep dive into specific tables for statistics and quality\n4. **Check freshness** (`/data:checking-freshness`) - Verify data is up to date before using it\n\n#### DAG Development Flow\n\nFor open-source Airflow, use Docker Compose for local dev and the Helm chart for production (see `deploying-airflow`) instead of Astro setup skills.\n\n```mermaid\nflowchart LR\n    setup[\"/data:setting-up-astro-project\"] --\u003e authoring[\"/data:authoring-dags\"]\n    setup --\u003e env[\"/data:managing-astro-local-env\"]\n    authoring --\u003e testing[\"/data:testing-dags\"]\n    testing --\u003e debugging[\"/data:debugging-dags\"]\n```\n\n1. **Setup** (`/data:setting-up-astro-project`) - Initialize project structure and dependencies\n2. **Environment** (`/data:managing-astro-local-env`) - Start/stop local Airflow for development\n3. **Author** (`/data:authoring-dags`) - Write DAG code following best practices\n4. **Test** (`/data:testing-dags`) - Run DAGs and fix issues iteratively\n5. **Debug** (`/data:debugging-dags`) - Deep investigation for complex failures\n\n### Airflow CLI (`af`)\n\nThe `af` command-line tool lets you interact with Airflow directly from your terminal. Run it with `uvx`, no separate install step required:\n\n```bash\nuvx --from astro-airflow-mcp af --help\n```\n\nFor frequent use, add an alias to your shell config (`~/.bashrc` or `~/.zshrc`):\n\n```bash\nalias af='uvx --from astro-airflow-mcp af'\n```\n\nThen use it for quick operations like `af health`, `af dags list`, or `af runs trigger \u003cdag_id\u003e`.\n\nSee the [full CLI documentation](./astro-airflow-mcp/README.md#airflow-cli-tool) for all commands and instance management.\n\n\u003e **Telemetry:** The `af` CLI collects anonymous usage telemetry to help improve the tool. Only the command name is collected (e.g., `dags list`), never the arguments or their values. 
Opt out with `af telemetry disable`.\n\n## Configuration\n\n### Warehouse Connections\n\nConfigure data warehouse connections at `~/.astro/agents/warehouse.yml`:\n\n```yaml\nmy_warehouse:\n  type: snowflake\n  account: ${SNOWFLAKE_ACCOUNT}\n  user: ${SNOWFLAKE_USER}\n  auth_type: private_key\n  private_key_path: ~/.ssh/snowflake_key.p8\n  private_key_passphrase: ${SNOWFLAKE_PRIVATE_KEY_PASSPHRASE}\n  warehouse: COMPUTE_WH\n  role: ANALYST\n  query_tag: claude-code\n  databases:\n    - ANALYTICS\n    - RAW\n```\n\n\u003e [!NOTE]\n\u003e The `account` field requires your Snowflake **account identifier** (e.g., `orgname-accountname` or `xy12345.us-east-1`), not your account name. Find this in your Snowflake console under Admin \u003e Accounts.\n\nStore credentials in `~/.astro/agents/.env`:\n\n```bash\nSNOWFLAKE_ACCOUNT=myorg-myaccount  # Use your Snowflake account identifier (format: orgname-accountname or accountname.region)\nSNOWFLAKE_USER=myuser\nSNOWFLAKE_PRIVATE_KEY_PASSPHRASE=your-passphrase-here  # Only required if using an encrypted private key\n```\n\n**Supported databases:**\n\n| Type | Package | Description |\n|------|---------|-------------|\n| `snowflake` | Built-in | Snowflake Data Cloud |\n| `postgres` | Built-in | PostgreSQL |\n| `bigquery` | Built-in | Google BigQuery |\n| `sqlalchemy` | Any SQLAlchemy driver | Auto-detects packages for 25+ databases (see below) |\n\n\u003cdetails\u003e\n\u003csummary\u003eAuto-detected SQLAlchemy databases\u003c/summary\u003e\n\nThe connector automatically installs the correct driver packages for:\n\n| Database | Dialect URL |\n|----------|-------------|\n| PostgreSQL | `postgresql://` or `postgres://` |\n| MySQL | `mysql://` or `mysql+pymysql://` |\n| MariaDB | `mariadb://` |\n| SQLite | `sqlite:///` |\n| SQL Server | `mssql+pyodbc://` |\n| Oracle | `oracle://` |\n| Redshift | `redshift://` |\n| Snowflake | `snowflake://` |\n| BigQuery | `bigquery://` |\n| DuckDB | `duckdb:///` |\n| Trino | `trino://` |\n| 
ClickHouse | `clickhouse://` |\n| CockroachDB | `cockroachdb://` |\n| Databricks | `databricks://` |\n| Amazon Athena | `awsathena://` |\n| Cloud Spanner | `spanner://` |\n| Teradata | `teradata://` |\n| Vertica | `vertica://` |\n| SAP HANA | `hana://` |\n| IBM Db2 | `db2://` |\n\nFor unlisted databases, install the driver manually and use standard SQLAlchemy URLs.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eExample configurations\u003c/summary\u003e\n\n```yaml\n# PostgreSQL\nmy_postgres:\n  type: postgres\n  host: localhost\n  port: 5432\n  user: analyst\n  password: ${POSTGRES_PASSWORD}\n  database: analytics\n  application_name: claude-code\n\n# BigQuery\nmy_bigquery:\n  type: bigquery\n  project: my-gcp-project\n  credentials_path: ~/.config/gcloud/service_account.json\n  location: US\n  labels:\n    team: data-eng\n    env: prod\n\n# SQLAlchemy (any supported database)\nmy_duckdb:\n  type: sqlalchemy\n  url: duckdb:///path/to/analytics.duckdb\n  databases: [main]\n\n# SQLAlchemy with connect_args (passed to the DBAPI driver)\nmy_pg_sqlalchemy:\n  type: sqlalchemy\n  url: postgresql://${PG_USER}:${PG_PASSWORD}@localhost/analytics\n  databases: [analytics]\n  connect_args:\n    application_name: claude-code\n\n# Redshift (via SQLAlchemy)\nmy_redshift:\n  type: sqlalchemy\n  url: redshift+redshift_connector://${REDSHIFT_USER}:${REDSHIFT_PASSWORD}@${REDSHIFT_HOST}:5439/${REDSHIFT_DATABASE}\n  databases: [my_database]\n```\n\n\u003c/details\u003e\n\n### Airflow\n\nThe Airflow MCP auto-discovers your project when you run Claude Code from an Airflow project directory (contains `airflow.cfg` or `dags/` folder).\n\nFor remote instances, set environment variables:\n\n| Variable | Description |\n|----------|-------------|\n| `AIRFLOW_API_URL` | Airflow webserver URL |\n| `AIRFLOW_USERNAME` | Username |\n| `AIRFLOW_PASSWORD` | Password |\n| `AIRFLOW_AUTH_TOKEN` | Bearer token (alternative to username/password) |\n\n## Usage\n\nSkills are invoked 
automatically based on what you ask. You can also invoke them directly with `/data:\u003cskill-name\u003e`.\n\n### Getting Started\n\n1. **Initialize your warehouse** (recommended first step):\n   ```\n   /data:warehouse-init\n   ```\n   This generates `.astro/warehouse.md` with schema metadata for faster queries.\n\n2. **Ask questions naturally**:\n   - \"What tables contain customer data?\"\n   - \"Show me revenue trends by product\"\n   - \"Create a DAG that loads data from S3 to Snowflake daily\"\n   - \"Why did my etl_pipeline DAG fail yesterday?\"\n\n## Development\n\nSee [CLAUDE.md](./CLAUDE.md) for plugin development guidelines.\n\n### Local Development Setup\n\n```bash\n# Clone the repo\ngit clone https://github.com/astronomer/agents.git\ncd agents\n\n# Test with local plugin\nclaude --plugin-dir .\n\n# Or install from local marketplace\nclaude plugin marketplace add .\nclaude plugin install data@astronomer\n```\n\n### Adding Skills\n\nCreate a new skill in `skills/\u003cname\u003e/SKILL.md` with YAML frontmatter:\n\n```yaml\n---\nname: my-skill\ndescription: When to invoke this skill\n---\n\n# Skill instructions here...\n```\n\nAfter adding skills, reinstall the plugin:\n```bash\nclaude plugin uninstall data@astronomer \u0026\u0026 claude plugin install data@astronomer\n```\n\n## Troubleshooting\n\n### Common Issues\n\n| Issue | Solution |\n|-------|----------|\n| Skills not appearing | Reinstall plugin: `claude plugin uninstall data@astronomer \u0026\u0026 claude plugin install data@astronomer` |\n| Warehouse connection errors | Check credentials in `~/.astro/agents/.env` and connection config in `warehouse.yml` |\n| Airflow not detected | Ensure you're running from a directory with `airflow.cfg` or a `dags/` folder |\n\n## Contributing\n\nContributions welcome! 
Please read our [Code of Conduct](./CODE_OF_CONDUCT.md) and [Contributing Guide](./CONTRIBUTING.md) before getting started.\n\n## Roadmap\n\nSkills we're likely to build:\n\n**DAG Operations**\n- CI/CD pipelines for DAG deployment\n- Performance optimization and tuning\n- Monitoring and alerting setup\n- Data quality and validation workflows\n\n**Astronomer Open Source**\n- [DAG Factory](https://github.com/astronomer/dag-factory) - Generate DAGs from YAML\n- Other open source projects we maintain\n\n**Conference Learnings**\n- Reviewing talks from Airflow Summit, Coalesce, Data Council, and other conferences to extract reusable skills and patterns\n\n**Broader Data Practitioner Skills**\n- Churn prediction, data modeling, ML training, and other workflows that span DE/DS/analytics roles\n\n**Don't see a skill you want? [Open an issue](https://github.com/astronomer/agents/issues) or submit a PR!**\n\n## License\n\nApache 2.0\n\n---\n\nMade with :heart: by Astronomer\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastronomer%2Fagents","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fastronomer%2Fagents","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastronomer%2Fagents/lists"}