{"id":50459526,"url":"https://github.com/bruin-data/python-sdk","last_synced_at":"2026-06-17T20:01:22.170Z","repository":{"id":353966889,"uuid":"1162852526","full_name":"bruin-data/python-sdk","owner":"bruin-data","description":"Bruin Python SDK — eliminate boilerplate in Bruin Python assets","archived":false,"fork":false,"pushed_at":"2026-04-26T12:56:54.000Z","size":298,"stargazers_count":4,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-26T14:34:31.095Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bruin-data.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-20T19:22:51.000Z","updated_at":"2026-04-24T19:44:58.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/bruin-data/python-sdk","commit_stats":null,"previous_names":["bruin-data/python-sdk"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/bruin-data/python-sdk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruin-data%2Fpython-sdk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruin-data%2Fpython-sdk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruin-data%2Fpython-sdk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruin-dat
a%2Fpython-sdk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bruin-data","download_url":"https://codeload.github.com/bruin-data/python-sdk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruin-data%2Fpython-sdk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34463558,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-01T04:00:38.876Z","updated_at":"2026-06-17T20:01:22.163Z","avatar_url":"https://github.com/bruin-data.png","language":"Python","funding_links":[],"categories":["\u003ca name=\"Python\"\u003e\u003c/a\u003ePython"],"sub_categories":[],"readme":"# Bruin Python SDK\n\nThe official Python SDK for [Bruin CLI](https://github.com/bruin-data/bruin). 
Query databases, access connections, and read pipeline context — all with zero boilerplate.\n\n```python\nfrom bruin import query, get_connection, context\n\n# One-liner: query any database Bruin manages\ndf = query(\"SELECT * FROM users WHERE created_at \u003e '{{start_date}}'\")\n\n# Access pipeline context\nprint(context.start_date)    # datetime.date(2024, 6, 1)\nprint(context.pipeline)      # \"my_pipeline\"\nprint(context.asset_name)    # \"my_asset\"\n\n# Get a typed database client\nconn = get_connection(\"my_bigquery\")\nclient = conn.client  # google.cloud.bigquery.Client, ready to use\n```\n\n## Installation\n\nAdd `bruin-sdk` to the `requirements.txt` that sits next to your Python assets:\n\n```\nbruin-sdk\npandas\n```\n\nFor specific database connections, install the corresponding extras:\n\n```\nbruin-sdk[bigquery]     # Google BigQuery\nbruin-sdk[snowflake]    # Snowflake\nbruin-sdk[postgres]     # PostgreSQL / Redshift\nbruin-sdk[redshift]     # Redshift (alias for postgres extra)\nbruin-sdk[mssql]        # Microsoft SQL Server\nbruin-sdk[mysql]        # MySQL\nbruin-sdk[duckdb]       # DuckDB\nbruin-sdk[sheets]       # Google Sheets (for GCP connections)\nbruin-sdk[all]          # Everything\n```\n\n## Quick Start\n\n### Before (manual boilerplate)\n\n```python\n\"\"\" @bruin\nname: my_asset\nconnection: bigquery_conn\nsecrets:\n    - key: bigquery_conn\n@bruin \"\"\"\n\nimport os\nimport json\nfrom google.cloud import bigquery\n\n# Parse connection JSON from env var\nraw = json.loads(os.environ[\"bigquery_conn\"])\nsa_info = json.loads(raw[\"service_account_json\"])\n\n# Create client manually\nclient = bigquery.Client.from_service_account_info(\n    sa_info, project=raw[\"project_id\"]\n)\n\n# Execute query\nstart = os.environ[\"BRUIN_START_DATE\"]\ndf = client.query(f\"SELECT * FROM users WHERE dt \u003e= '{start}'\").to_dataframe()\n```\n\n### After (with SDK)\n\n```python\n\"\"\" @bruin\nname: my_asset\nconnection: bigquery_conn\n@bruin 
\"\"\"\n\nfrom bruin import query, context\n\ndf = query(f\"SELECT * FROM users WHERE dt \u003e= '{context.start_date}'\")\n```\n\n---\n\n## API Reference\n\n### `context`\n\nA module-level object that provides access to all `BRUIN_*` environment variables as properly typed Python values. Each property reads the env var fresh on every access — no caching, no stale values.\n\n```python\nfrom bruin import context\n```\n\n| Property | Type | Env Var | Description |\n|----------|------|---------|-------------|\n| `context.start_date` | `date \\| None` | `BRUIN_START_DATE` | Pipeline run start date |\n| `context.start_datetime` | `datetime \\| None` | `BRUIN_START_DATETIME` | Start date with time |\n| `context.start_timestamp` | `datetime \\| None` | `BRUIN_START_TIMESTAMP` | Start timestamp with timezone |\n| `context.end_date` | `date \\| None` | `BRUIN_END_DATE` | Pipeline run end date |\n| `context.end_datetime` | `datetime \\| None` | `BRUIN_END_DATETIME` | End date with time |\n| `context.end_timestamp` | `datetime \\| None` | `BRUIN_END_TIMESTAMP` | End timestamp with timezone |\n| `context.execution_date` | `date \\| None` | `BRUIN_EXECUTION_DATE` | Execution date |\n| `context.execution_datetime` | `datetime \\| None` | `BRUIN_EXECUTION_DATETIME` | Execution date with time |\n| `context.execution_timestamp` | `datetime \\| None` | `BRUIN_EXECUTION_TIMESTAMP` | Execution timestamp with timezone |\n| `context.run_id` | `str \\| None` | `BRUIN_RUN_ID` | Unique run identifier |\n| `context.pipeline` | `str \\| None` | `BRUIN_PIPELINE` | Pipeline name |\n| `context.asset_name` | `str \\| None` | `BRUIN_ASSET` | Current asset name |\n| `context.connection` | `str \\| None` | `BRUIN_CONNECTION` | Asset's default connection |\n| `context.is_full_refresh` | `bool` | `BRUIN_FULL_REFRESH` | `True` when `--full-refresh` flag is set |\n| `context.commit_hash` | `str \\| None` | `BRUIN_COMMIT_HASH` | Git commit hash of the pipeline's repository |\n| `context.vars` | `dict` | 
`BRUIN_VARS` | Pipeline variables (types preserved from JSON Schema) |\n\nAll properties return `None` when the corresponding env var is missing (except `is_full_refresh`, which returns `False`, and `vars`, which returns `{}`).\n\n```python\nfrom bruin import context, query\n\n# Dates\nprint(context.start_date)       # datetime.date(2024, 6, 1)\nprint(context.end_date)         # datetime.date(2024, 6, 2)\n\n# Pipeline variables (types preserved from pipeline.yml JSON Schema)\nsegment = context.vars[\"segment\"]     # str: \"enterprise\"\nhorizon = context.vars[\"horizon\"]     # int: 30\ncohorts = context.vars[\"cohorts\"]     # list[dict]\n\n# Conditional logic\nif context.is_full_refresh:\n    df = query(\"SELECT * FROM users\")\nelse:\n    df = query(f\"SELECT * FROM users WHERE dt \u003e= '{context.start_date}'\")\n```\n\n---\n\n### `query(sql, connection=None)`\n\nExecute SQL and return results.\n\n```python\nfrom bruin import query\n```\n\n**Parameters:**\n\n| Parameter | Type | Default | Description |\n|-----------|------|---------|-------------|\n| `sql` | `str` | *(required)* | SQL statement to execute |\n| `connection` | `str \\| None` | `None` | Connection name. 
When `None`, uses the asset's default connection (`BRUIN_CONNECTION`) |\n\n**Returns:** `pandas.DataFrame` for data-returning statements (`SELECT`, `WITH`, `SHOW`, `DESCRIBE`, `EXPLAIN`), `None` for DDL/DML (`CREATE`, `INSERT`, `UPDATE`, `DELETE`, `DROP`, etc.).\n\n```python\n# Uses the asset's default connection (from the `connection:` field in asset definition)\ndf = query(\"SELECT * FROM users\")\n\n# Explicit connection name\ndf = query(\"SELECT * FROM users\", connection=\"my_bigquery\")\n\n# DDL/DML returns None\nquery(\"CREATE TABLE temp_users AS SELECT * FROM users\")\nquery(\"INSERT INTO audit_log VALUES ('ran_asset', NOW())\")\n\n# Works with any supported database\ndf_bq = query(\"SELECT * FROM users\", connection=\"my_bigquery\")\ndf_sf = query(\"SELECT * FROM users\", connection=\"my_snowflake\")\ndf_pg = query(\"SELECT * FROM users\", connection=\"my_postgres\")\n```\n\nEvery query is automatically annotated with `@bruin.config` metadata for observability and cost tracking.\n\n---\n\n### `get_connection(name)`\n\nGet a typed connection object with a lazy database client.\n\n```python\nfrom bruin import get_connection\n```\n\n**Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `name` | `str` | Connection name as defined in `.bruin.yml` (auto-injected from `connection:` or listed in `secrets`) |\n\n**Returns:** `Connection` or `GCPConnection` depending on the connection type.\n\n```python\nconn = get_connection(\"my_bigquery\")\nconn.name    # \"my_bigquery\"\nconn.type    # \"google_cloud_platform\"\nconn.raw     # dict — the parsed connection JSON\nconn.client  # Lazy-initialized database client\n```\n\n#### Connection types\n\n| Type | `.client` returns | Install extra |\n|------|-------------------|---------------|\n| `google_cloud_platform` | `bigquery.Client` | `bruin-sdk[bigquery]` |\n| `snowflake` | `snowflake.connector.Connection` | `bruin-sdk[snowflake]` |\n| `postgres` | `psycopg2.connection` | 
`bruin-sdk[postgres]` |\n| `redshift` | `psycopg2.connection` | `bruin-sdk[redshift]` |\n| `mssql` | `pymssql.Connection` | `bruin-sdk[mssql]` |\n| `mysql` | `mysql.connector.Connection` | `bruin-sdk[mysql]` |\n| `duckdb` | `duckdb.DuckDBPyConnection` | `bruin-sdk[duckdb]` |\n| `generic` | N/A (raises error) | — |\n\nClient creation is **lazy** — the actual database connection is only established when `.client` is first accessed.\n\n#### GCP connections\n\nGCP connections have extra methods since one connection can access multiple Google services:\n\n```python\nconn = get_connection(\"my_gcp\")\n\n# BigQuery (most common — also available as .client)\nbq_client = conn.bigquery()\ndf = bq_client.query(\"SELECT 1\").to_dataframe()\n\n# Google Sheets\nsheets_client = conn.sheets()  # requires bruin-sdk[sheets]\n\n# Cloud Storage\ngcs_client = conn.storage()  # requires google-cloud-storage\n\n# Raw credentials for any Google API\ncreds = conn.credentials  # google.oauth2.Credentials\n```\n\n#### Generic connections\n\nGeneric connections hold a raw string value (like an API key or webhook URL). 
They don't have a database client:\n\n```python\nconn = get_connection(\"slack_webhook\")\nconn.type    # \"generic\"\nconn.raw     # \"https://hooks.slack.com/services/T00/B00/xxx\"\nconn.client  # raises ConnectionTypeError\n```\n\n---\n\n### `Connection.query(sql)`\n\nConnections also have a `.query()` method — an alternative to the top-level `query()`:\n\n```python\nconn = get_connection(\"my_bigquery\")\n\n# These are equivalent:\ndf = conn.query(\"SELECT * FROM users\")\ndf = query(\"SELECT * FROM users\", connection=\"my_bigquery\")\n```\n\nSame return behavior: `DataFrame` for SELECT, `None` for DDL/DML.\n\n---\n\n## Exceptions\n\nAll SDK exceptions inherit from `BruinError`:\n\n```python\nfrom bruin.exceptions import (\n    BruinError,              # Base class\n    ConnectionNotFoundError, # Connection name not found or env var missing\n    ConnectionParseError,    # Invalid JSON in connection env var\n    ConnectionTypeError,     # Unsupported or generic connection type\n    QueryError,              # SQL execution failed\n)\n```\n\n```python\ntry:\n    df = query(\"SELECT * FROM users\", connection=\"missing\")\nexcept ConnectionNotFoundError as e:\n    print(e)\n    # Connection 'missing' not found. 
Available connections: my_bigquery, my_snowflake.\n```\n\nMissing optional dependencies give clear install instructions:\n\n```python\nconn = get_connection(\"my_snowflake\")\nconn.client\n# ImportError: Install bruin-sdk[snowflake] to use Snowflake connections:\n#   pip install 'bruin-sdk[snowflake]'\n```\n\n---\n\n## Asset Setup\n\nWhen you set the `connection` field in your asset definition, Bruin automatically injects the connection's credentials — no need to list it in `secrets`:\n\n```python\n\"\"\" @bruin\nname: my_asset\nconnection: my_bigquery\n@bruin \"\"\"\n\nfrom bruin import query\n\n# Uses my_bigquery automatically\ndf = query(\"SELECT * FROM users\")\n```\n\nIf you need additional connections beyond the default, add them to `secrets`:\n\n```python\n\"\"\" @bruin\nname: my_asset\nconnection: my_bigquery\nsecrets:\n    - key: my_postgres\n@bruin \"\"\"\n\nfrom bruin import query, get_connection\n\n# Default connection (my_bigquery)\ndf = query(\"SELECT * FROM users\")\n\n# Additional connection via secrets\npg = get_connection(\"my_postgres\")\n```\n\n---\n\n## Examples\n\n### Incremental load with date filtering\n\n```python\n\"\"\" @bruin\nname: analytics.daily_events\nconnection: my_bigquery\n@bruin \"\"\"\n\nfrom bruin import query, context\n\nif context.is_full_refresh:\n    df = query(\"SELECT * FROM raw.events\")\nelse:\n    df = query(f\"\"\"\n        SELECT * FROM raw.events\n        WHERE event_date BETWEEN '{context.start_date}' AND '{context.end_date}'\n    \"\"\")\n\nprint(f\"Loaded {len(df)} events\")\n```\n\n### Cross-database ETL\n\n```python\n\"\"\" @bruin\nname: sync.postgres_to_bigquery\nsecrets:\n    - key: my_postgres\n    - key: my_bigquery\n@bruin \"\"\"\n\nfrom bruin import query, get_connection\n\n# Read from Postgres\ndf = query(\"SELECT * FROM users WHERE active = true\", connection=\"my_postgres\")\n\n# Write to BigQuery\nbq = get_connection(\"my_bigquery\")\ndf.to_gbq(\n    \"staging.active_users\",\n    
project_id=bq.raw[\"project_id\"],\n    credentials=bq.credentials,\n    if_exists=\"replace\",\n)\n```\n\n### Using pipeline variables\n\n```yaml\n# pipeline.yml\nname: marketing\nvariables:\n  segment:\n    type: string\n    default: \"enterprise\"\n  lookback_days:\n    type: integer\n    default: 30\n```\n\n```python\n\"\"\" @bruin\nname: marketing.segment_report\nconnection: my_snowflake\n@bruin \"\"\"\n\nfrom bruin import query, context\n\nsegment = context.vars[\"segment\"]\nlookback = context.vars[\"lookback_days\"]\n\ndf = query(f\"\"\"\n    SELECT * FROM customers\n    WHERE segment = '{segment}'\n    AND created_at \u003e= DATEADD(day, -{lookback}, CURRENT_DATE())\n\"\"\")\n\nprint(f\"Found {len(df)} {segment} customers in last {lookback} days\")\n```\n\n### DDL operations\n\n```python\n\"\"\" @bruin\nname: setup.create_tables\nconnection: my_postgres\n@bruin \"\"\"\n\nfrom bruin import query\n\n# DDL returns None\nquery(\"CREATE TABLE IF NOT EXISTS audit_log (event TEXT, ts TIMESTAMP)\")\nquery(\"INSERT INTO audit_log VALUES ('setup_complete', NOW())\")\n\n# SELECT returns DataFrame\ndf = query(\"SELECT COUNT(*) as cnt FROM audit_log\")\nprint(f\"Audit log has {df['cnt'][0]} entries\")\n```\n\n## Disclaimer\n\nThis project is written entirely by machines.\n\nNot a single line of code in this repository was authored by a human. Every function, module, and commit is generated by AI. We intend to keep it that way.\n\nThe engineering team at [Bruin](https://getbruin.com) does not write the code. We guide the machines that do.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbruin-data%2Fpython-sdk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbruin-data%2Fpython-sdk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbruin-data%2Fpython-sdk/lists"}