{"id":28701941,"url":"https://github.com/rasgointelligence/rasgoql","last_synced_at":"2025-06-14T12:08:30.770Z","repository":{"id":36982476,"uuid":"450978755","full_name":"rasgointelligence/RasgoQL","owner":"rasgointelligence","description":"Write python locally, execute SQL in your data warehouse","archived":false,"fork":false,"pushed_at":"2022-07-05T16:54:56.000Z","size":6420,"stargazers_count":269,"open_issues_count":6,"forks_count":12,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-06-13T19:45:34.587Z","etag":null,"topics":["data-analysis","data-science","pandas","python","sql"],"latest_commit_sha":null,"homepage":"https://docs.rasgoql.com","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rasgointelligence.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-01-23T01:39:32.000Z","updated_at":"2025-05-14T13:52:40.000Z","dependencies_parsed_at":"2022-07-12T16:13:00.251Z","dependency_job_id":null,"html_url":"https://github.com/rasgointelligence/RasgoQL","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/rasgointelligence/RasgoQL","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rasgointelligence%2FRasgoQL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rasgointelligence%2FRasgoQL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rasgointelligence%2FRasgoQL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rasgointelligence%2FRasgoQL/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rasgointelligence","download_url":"https://codeload.github.com/rasgointelligence/RasgoQL/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rasgointelligence%2FRasgoQL/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259813026,"owners_count":22915201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","pandas","python","sql"],"created_at":"2025-06-14T12:08:29.987Z","updated_at":"2025-06-14T12:08:30.761Z","avatar_url":"https://github.com/rasgointelligence.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Downloads](https://pepy.tech/badge/rasgoql/month)](https://pepy.tech/project/rasgoql)\n[![PyPI version](https://badge.fury.io/py/rasgoql.svg)](https://badge.fury.io/py/rasgoql)\n[![Docs](https://img.shields.io/badge/RasgoQL-DOCS-GREEN.svg)](https://docs.rasgoql.com/)\n[![Chat on Slack](https://img.shields.io/badge/chat-on%20Slack-brightgreen.svg)](https://join.slack.com/t/rasgousergroup/shared_invite/zt-nytkq6np-ANEJvbUSbT2Gkvc8JICp3g)\n[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)\n\n![RasgoQL Hero](https://f.hubspotusercontent30.net/hubfs/20517936/rasgoql/RasgoQL%20Hero%20Image.png)\n\n\u003ch1 align=\"center\"\u003eRasgoQL\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n    \u003cstrong\u003eWrite python locally, execute SQL in your data warehouse\u003c/strong\u003e\n    \u003cbr 
/\u003e\n    \u003ca href=\"https://docs.rasgoql.com/\"\u003e\u003cstrong\u003e≪  Read the Docs\u003c/strong\u003e\u003c/a\u003e\n    \u0026nbsp · \u0026nbsp\n    \u003ca href=\"https://join.slack.com/t/rasgousergroup/shared_invite/zt-nytkq6np-ANEJvbUSbT2Gkvc8JICp3g\"\u003e\n    \u003cstrong\u003eJoin Our Slack »\u003c/strong\u003e\u003c/a\u003e\n\u003c/p\u003e\n\nRasgoQL is a Python package that enables you to easily query and transform tables in your Data Warehouse directly from a notebook.\n\nYou can quickly create new features, sample data, apply complex aggregates... all without having to write SQL!\n\nChoose from our library of predefined transformations or make your own to streamline the feature engineering process.\n\n![RasgoQL 30-second demo](https://f.hubspotusercontent30.net/hubfs/20517936/rasgoql/rasgo_intro2.gif)\n\n# Why is this package useful?\nData scientists spend much of their time in pandas preparing data for modelling. When they are ready to deploy or scale, two pain points arise:\n1. pandas cannot handle larger volumes of data, forcing the use of VMs or code refactoring.\n2. feature data must be added to the Enterprise Data Warehouse for future processing, requiring refactoring to SQL\n\nWe created RasgoQL to solve these two pain points.\n\nLearn more at [https://docs.rasgoql.com](https://docs.rasgoql.com).\n\n# How does it work?\nUnder the covers, RasgoQL sends all processing to your Data Warehouse, enabling the efficient transformation of massive datasets. 
RasgoQL only needs basic metadata to execute transforms, so your private data remains secure.\n\n![RasgoQL workflow diagram](https://f.hubspotusercontent30.net/hubfs/20517936/rasgoql/RasgoQL-flow.png)\n\nRasgoQL does these things well:\n- Pulls existing Data Warehouse tables into pandas DataFrames for analysis\n- Constructs SQL queries using a syntax that feels like pandas\n- Creates views in your Data Warehouse to save transformed data\n- Exports runnable SQL in .sql files or dbt-compliant .yaml files\n- Offers dozens of free SQL transforms to use\n- Coming Soon: allows users to create \u0026 add custom transforms\n\nRasgo supports [Snowflake](https://docs.rasgoql.com/datawarehouses/credentials), [BigQuery](https://docs.rasgoql.com/datawarehouses/bigquery), [Postgres](https://www.postgresql.org/), and [Amazon Redshift](https://aws.amazon.com/redshift/)\nwith more Data Warehouses being added soon. If you'd like to suggest another database type,\nsubmit your idea to our [GitHub Discussions page](https://github.com/rasgointelligence/RasgoQL/discussions) so that other community members can weigh in and show their support.\n\n# Can RasgoQL help you?\n\n* If you use pandas to build features, but you are working on a massive set of data that won't fit in your machine's memory. RasgoQL can help!\n\n* If your organization uses dbt or another SQL tool to run production data flows, but you prefer to build features in pandas. RasgoQL can help!\n\n* If you know pandas, but not SQL and want to learn how queries will translate. 
RasgoQL can help!\n\n# Where to get it\nJust run a simple pip install.\n\n`pip install rasgoql~=1.0`\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://github.com/rasgointelligence/RasgoQL/issues/new?assignees=\u0026labels=\u0026template=bug_report.md\u0026title=%5BBUG%5D\"\u003e\nReport Bug\u003c/a\u003e\n·\n\u003ca href=\"https://github.com/rasgointelligence/RasgoQL/issues/new?assignees=\u0026labels=\u0026template=feedback.md\u0026title=%5BFEEDBACK%5D\"\u003e\nSuggest Improvement\u003c/a\u003e\n·\n\u003ca href=\"https://github.com/rasgointelligence/RasgoQL/issues/new?assignees=\u0026labels=\u0026template=feature_request.md\u0026title=%5BRFE%5D\"\u003e\nRequest Feature\u003c/a\u003e\n\u003c/p\u003e\n\n\n# Quick Start\n```python\n# Install first from a shell: pip install rasgoql --upgrade\nimport rasgoql\n\n# Define your data warehouse credentials\ncreds = rasgoql.SnowflakeCredentials(\n    account=\"\",\n    user=\"\",\n    password=\"\",\n    role=\"\",\n    warehouse=\"\",\n    database=\"\",\n    schema=\"\"\n)\n\n# Connect to DW\nrql = rasgoql.connect(creds)\n\n# List available tables\nrql.list_tables('ADVENTUREWORKS').head(10)\n\n# Allow rasgoQL to interact with an existing Table in your Data Warehouse\ndataset = rql.dataset('ADVENTUREWORKS.PUBLIC.FACTINTERNETSALES')\n\n# Take a peek at the data\ndataset.preview()\n\n# Use the datetrunc transform to separate things into weeks\nweekly_sales = dataset.datetrunc(dates={'ORDERDATE':'week'})\n\n# Aggregate to sum of sales for each week\nagg_weekly_sales = weekly_sales.aggregate(\n    group_by=['PRODUCTKEY', 'ORDERDATE_WEEK'],\n    aggregations={'SALESAMOUNT': ['SUM']},\n)\n\n# Quickly validate output\nagg_weekly_sales.to_df()\n\n# Print the SQL\nprint(agg_weekly_sales.sql())\n```\n\n## Getting Started Tutorials\nThe best way to get familiar with the RasgoQL basics is by running through [these notebooks](https://github.com/rasgointelligence/RasgoQL/tree/main/tutorials) in the tutorials folder.\n\n# Advanced Examples\n\n## Joins\nEasily join tables together 
using the `join` transform.\n\n```python\n# rql is the connection returned by rasgoql.connect(creds) in the Quick Start\nsales_dataset = rql.dataset('ADVENTUREWORKS.PUBLIC.FACTINTERNETSALES')\n\nsales_product_dataset = sales_dataset.join(\n  join_table='DIM_PRODUCT',\n  join_columns={'PRODUCTKEY': 'PRODUCTKEY'},\n  join_type='LEFT',\n  join_prefix='PRODUCT')\n\nsales_product_dataset.sql()\nsales_product_dataset.preview()\n```\n\n![Rasgo Join Example](https://f.hubspotusercontent30.net/hubfs/20517936/rasgoql/rasgo_join.gif)\n\n## Chain transforms together\nCreate a rolling aggregation and then drop unnecessary columns.\n\n```python\nsales_agg_drop = sales_dataset.rolling_agg(\n    aggregations={\"SALESAMOUNT\": [\"MAX\", \"MIN\", \"SUM\"]},\n    order_by=\"ORDERDATE\",\n    offsets=[-7, 7],\n    group_by=[\"PRODUCTKEY\"],\n).drop_columns(exclude_cols=[\"ORDERDATEKEY\"])\n\nsales_agg_drop.sql()\nsales_agg_drop.preview()\n```\n\n![Multiple rasgoql transforms](https://f.hubspotusercontent30.net/hubfs/20517936/rasgoql/rasgoql_chain.gif)\n\n## Transpose unique values with pivots\nQuickly generate pivot tables of your data.\n\n```python\nsales_by_product = sales_dataset.pivot(\n    dimensions=['ORDERDATE'],\n    pivot_column='SALESAMOUNT',\n    value_column='PRODUCTKEY',\n    agg_method='SUM',\n    list_of_vals=['310', '345'],\n)\n\nsales_by_product.sql()\nsales_by_product.preview()\n```\n\n![Rasgoql pivot example](https://f.hubspotusercontent30.net/hubfs/20517936/rasgoql/rasgoql_pivot.gif)\n\n# Does any of my data get collected?\nRasgo will not collect any personal information. We log execution of methods in `transforms.py` for success and failure so that we can more accurately track what's useful and what's problematic.\n\n# Where do I go for help?\nIf you have any questions, please check:\n\n1. [RasgoQL Docs](https://docs.rasgoql.com/)\n2. [Slack](https://join.slack.com/t/rasgousergroup/shared_invite/zt-nytkq6np-ANEJvbUSbT2Gkvc8JICp3g)\n3. 
[GitHub Issues](https://github.com/rasgointelligence/RasgoQL/issues)\n\n\n# How can I contribute?\nReview the [contributors guide](https://github.com/rasgointelligence/RasgoQL/blob/main/CONTRIBUTING.md).\n\n## License\nRasgoQL uses the GNU AGPL license, as found in the [LICENSE file](./LICENSE).\n\nThis project is sponsored by RasgoML. Find out more at [https://www.rasgoml.com/](https://www.rasgoml.com/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frasgointelligence%2Frasgoql","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frasgointelligence%2Frasgoql","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frasgointelligence%2Frasgoql/lists"}