{"id":13990500,"url":"https://github.com/astronomer/astro-sdk","last_synced_at":"2025-04-13T02:00:13.945Z","repository":{"id":36950545,"uuid":"435573069","full_name":"astronomer/astro-sdk","owner":"astronomer","description":"Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.","archived":false,"fork":false,"pushed_at":"2025-04-11T00:19:02.000Z","size":7905,"stargazers_count":368,"open_issues_count":193,"forks_count":48,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-04-13T02:00:02.471Z","etag":null,"topics":["airflow","apache-airflow","bigquery","dags","data-analysis","data-science","elt","etl","gcs","pandas","postgres","python","s3","snowflake","sql","sqlite","workflows"],"latest_commit_sha":null,"homepage":"https://astro-sdk-python.rtfd.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/astronomer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-06T16:43:50.000Z","updated_at":"2025-04-07T18:54:04.000Z","dependencies_parsed_at":"2023-12-26T05:22:54.008Z","dependency_job_id":"600868f7-4875-4085-a5e0-8aa376ac218f","html_url":"https://github.com/astronomer/astro-sdk","commit_stats":null,"previous_names":["astro-projects/astro"],"tags_count":90,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fastro-sdk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fastro-
sdk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fastro-sdk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astronomer%2Fastro-sdk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/astronomer","download_url":"https://codeload.github.com/astronomer/astro-sdk/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248654045,"owners_count":21140235,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["airflow","apache-airflow","bigquery","dags","data-analysis","data-science","elt","etl","gcs","pandas","postgres","python","s3","snowflake","sql","sqlite","workflows"],"created_at":"2024-08-09T13:02:48.964Z","updated_at":"2025-04-13T02:00:13.901Z","avatar_url":"https://github.com/astronomer.png","language":"Python","readme":"\u003ch1 align=\"center\"\u003e\n  astro\n\u003c/h1\u003e\n  \u003ch3 align=\"center\"\u003e\n  workflows made easy\u003cbr\u003e\u003cbr\u003e\n\u003c/h3\u003e\n\n[![Python versions](https://img.shields.io/pypi/pyversions/astro-sdk-python.svg)](https://pypi.org/pypi/astro-sdk-python)\n[![License](https://img.shields.io/pypi/l/astro-sdk-python.svg)](https://pypi.org/pypi/astro-sdk-python)\n[![Development Status](https://img.shields.io/pypi/status/astro-sdk-python.svg)](https://pypi.org/pypi/astro-sdk-python)\n[![PyPI 
downloads](https://img.shields.io/pypi/dm/astro-sdk-python.svg)](https://pypistats.org/packages/astro-sdk-python)\n[![Contributors](https://img.shields.io/github/contributors/astronomer/astro-sdk)](https://github.com/astronomer/astro-sdk)\n[![Commit activity](https://img.shields.io/github/commit-activity/m/astronomer/astro-sdk)](https://github.com/astronomer/astro-sdk)\n[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/astronomer/astro-sdk/main.svg)](https://results.pre-commit.ci/latest/github/astronomer/astro-sdk/main)\n[![CI](https://github.com/astronomer/astro-sdk/actions/workflows/ci-python-sdk.yaml/badge.svg)](https://github.com/astronomer/astro-sdk)\n[![codecov](https://codecov.io/gh/astronomer/astro-sdk/branch/main/graph/badge.svg?token=MI4SSE50Q6)](https://codecov.io/gh/astronomer/astro-sdk)\n\n**Astro Python SDK** is a Python SDK for rapid development of extract, transform, and load workflows in [Apache Airflow](https://airflow.apache.org/). It allows you to express your workflows as a set of data dependencies without having to worry about ordering and tasks. The Astro Python SDK is maintained by [Astronomer](https://astronomer.io).\n\n## Prerequisites\n\n- Apache Airflow \u003e= 2.1.0.\n\n## Install\n\nThe Astro Python SDK is available at [PyPI](https://pypi.org/project/astro-sdk-python/). Use the standard Python\n[installation tools](https://packaging.python.org/en/latest/tutorials/installing-packages/).\n\nTo install a cloud-agnostic version of the SDK, run:\n\n```shell\npip install astro-sdk-python\n```\n\nYou can also install dependencies for using the SDK with popular cloud providers:\n\n```shell\npip install astro-sdk-python[amazon,google,snowflake,postgres]\n```\n\n\n## Quickstart\n1. 
Ensure that your Airflow environment is set up correctly by running the following commands:\n\n    ```shell\n    export AIRFLOW_HOME=`pwd`\n    airflow db init\n    ```\n\n   \u003e **Note:**\n   \u003e - `AIRFLOW__CORE__ENABLE_XCOM_PICKLING` no longer needs to be enabled as of astro-sdk-python release 1.2.\n   \u003e - For Airflow versions \u003c 2.5 and astro-sdk-python releases \u003c 1.3, users can either use the custom XCom backend [AstroCustomXcomBackend](https://astro-sdk-python.readthedocs.io/en/latest/guides/xcom_backend.html#xcom-backend) with XCom pickling disabled, or enable XCom pickling.\n   \u003e - For Airflow versions \u003e= 2.5 and astro-sdk-python releases \u003e= 1.3.3, users can either use [Airflow's XCom backend](https://astro-sdk-python.readthedocs.io/en/latest/guides/xcom_backend.html#airflow_xcom_backend) with XCom pickling disabled, or enable XCom pickling.\n\n    The data format used by pickle is Python-specific. This has the advantage that there are no restrictions imposed by external standards such as JSON or XDR (which can’t represent pointer sharing); however, it means that non-Python programs may not be able to reconstruct pickled Python objects.\n\n    Read more: [enable_xcom_pickling](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#enable-xcom-pickling) and [pickle](https://docs.python.org/3/library/pickle.html#comparison-with-json).\n\n\n\n2. Create a SQLite database for the example to run with:\n\n    ```shell\n    # The sqlite_default connection has a different host on macOS vs. Linux\n    export SQL_TABLE_NAME=`airflow connections get sqlite_default -o yaml | grep host | awk '{print $2}'`\n    sqlite3 \"$SQL_TABLE_NAME\" \"VACUUM;\"\n    ```\n\n3. 
Copy the following workflow into a file named `calculate_popular_movies.py` and add it to the `dags` directory of your Airflow project:\n\n   https://github.com/astronomer/astro-sdk/blob/d5aa768b2d4bca72ef98f8d533fe3f99624b172f/example_dags/calculate_popular_movies.py#L1-L37\n\n   Alternatively, you can download `calculate_popular_movies.py`\n   ```shell\n    curl -O https://raw.githubusercontent.com/astronomer/astro-sdk/main/example_dags/calculate_popular_movies.py\n   ```\n\n4. Run the example DAG:\n\n    ```sh\n    airflow dags test calculate_popular_movies `date -Iseconds`\n    ```\n\n5. Check the result of your DAG by running:\n\n    ```shell\n    sqlite3 \"$SQL_TABLE_NAME\" \"select * from top_animation;\" \".exit\"\n    ```\n\n    You should see the following output:\n\n    ```shell\n    $ sqlite3 \"$SQL_TABLE_NAME\" \"select * from top_animation;\" \".exit\"\n    Toy Story 3 (2010)|8.3\n    Inside Out (2015)|8.2\n    How to Train Your Dragon (2010)|8.1\n    Zootopia (2016)|8.1\n    How to Train Your Dragon 2 (2014)|7.9\n    ```\n\n## Supported technologies\n\n| FileLocation |\n| :----------- |\n| local        |\n| http         |\n| https        |\n| gs           |\n| gdrive       |\n| s3           |\n| wasb         |\n| wasbs        |\n| azure        |\n| sftp         |\n| ftp          |\n\n| FileType |\n| :------- |\n| csv      |\n| json     |\n| ndjson   |\n| parquet  |\n| xls      |\n| xlsx     |\n\n| Database  |\n| :-------- |\n| postgres  |\n| sqlite    |\n| delta     |\n| bigquery  |\n| snowflake |\n| redshift  |\n| mssql     |\n| duckdb    |\n| mysql     |\n\n## Available operations\n\nThe following are some key functions available in the SDK:\n\n- [`load_file`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/load_file.html): Load a given file into a SQL table\n- [`transform`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/transform.html): Applies a SQL select statement to a source table and saves the 
result to a destination table\n- [`drop_table`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/drop_table.html): Drop a SQL table\n- [`run_raw_sql`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/raw_sql.html): Run any SQL statement without handling its output\n- [`append`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/append.html): Insert rows from the source SQL table into the destination SQL table, if there are no conflicts\n- [`merge`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/merge.html): Insert rows from the source SQL table into the destination SQL table, depending on conflicts:\n  - `ignore`: Do not add rows that already exist\n  - `update`: Replace existing rows with new ones\n- [`export_file`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/export.html): Export SQL table rows into a destination file\n- [`dataframe`](https://astro-sdk-python.readthedocs.io/en/stable/astro/sql/operators/dataframe.html): Export a given SQL table into an in-memory pandas DataFrame\n\nFor a full list of available operators, see the [SDK reference documentation](https://astro-sdk-python.readthedocs.io/en/stable/operators.html).\n\n## Documentation\n\nThe documentation is a work in progress; we aim to follow the [Diátaxis](https://diataxis.fr/) system:\n\n- **[Getting Started Tutorial](https://docs.astronomer.io/tutorials/astro-python-sdk)**: A hands-on introduction to the Astro Python SDK\n- **How-to guides**: Simple step-by-step user guides to accomplish specific tasks\n- **[Reference guide](https://astro-sdk-python.readthedocs.io/)**: Commands, modules, classes and methods\n- **Explanation**: Clarification and discussion of key decisions when designing the project\n\n## Changelog\n\nThe Astro Python SDK follows semantic versioning for releases. 
Check the [changelog](python-sdk/docs/CHANGELOG.md) for the latest changes.\n\n## Release management\n\nTo learn more about our release philosophy and steps, see [Managing Releases](python-sdk/docs/development/RELEASE.md).\n\n## Contribution guidelines\n\nAll contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.\n\nRead the [Contribution Guideline](python-sdk/docs/development/CONTRIBUTING.md) for a detailed overview of how to contribute.\n\nContributors and maintainers should abide by the [Contributor Code of Conduct](CODE_OF_CONDUCT.md).\n\n## License\n\n[Apache License 2.0](LICENSE)\n","funding_links":[],"categories":["Python","Building"],"sub_categories":["Workflows"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastronomer%2Fastro-sdk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fastronomer%2Fastro-sdk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastronomer%2Fastro-sdk/lists"}