{"id":19520084,"url":"https://github.com/malloydata/malloy-py","last_synced_at":"2025-03-17T15:14:13.682Z","repository":{"id":63464000,"uuid":"560939064","full_name":"malloydata/malloy-py","owner":"malloydata","description":"Python package for executing Malloy","archived":false,"fork":false,"pushed_at":"2025-02-11T19:07:32.000Z","size":2463,"stargazers_count":27,"open_issues_count":6,"forks_count":8,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-03-02T13:10:06.338Z","etag":null,"topics":["business-analytics","business-intelligence","data","data-modeling","python","semantic-modeling","sql"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/malloydata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-02T15:36:57.000Z","updated_at":"2024-12-26T21:14:02.000Z","dependencies_parsed_at":"2023-09-26T05:07:23.598Z","dependency_job_id":"7c80eeb9-6ff4-4dca-aa5c-78b2b81a1a57","html_url":"https://github.com/malloydata/malloy-py","commit_stats":null,"previous_names":[],"tags_count":165,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malloydata%2Fmalloy-py","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malloydata%2Fmalloy-py/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malloydata%2Fmalloy-py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malloydata%2Fmalloy-py/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/malloydata","download_url":"https://codeload.github.com/malloydata/malloy-py/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244056424,"owners_count":20390719,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["business-analytics","business-intelligence","data","data-modeling","python","semantic-modeling","sql"],"created_at":"2024-11-11T00:23:45.245Z","updated_at":"2025-03-17T15:14:13.662Z","avatar_url":"https://github.com/malloydata.png","language":"JavaScript","readme":"![Malloy Logo](https://raw.githubusercontent.com/malloydata/malloy-py/main/assets/malloy_square_centered.png)\n\n## What is it?\n\nMalloy is an experimental language for describing data relationships and transformations. It is both a semantic modeling language and a querying language that runs queries against a relational database. Malloy currently connects to BigQuery, and natively supports DuckDB. We've built a Visual Studio Code extension to facilitate building Malloy data models, querying and transforming data, and creating simple visualizations and dashboards.\n\n_Note: These APIs are still in development and are subject to change._\n\n## How do I get it?\n\nBinary installers for the latest released version are available at the [Python Package Index](https://pypi.org/project/malloy/) (PyPI).\n\n```sh\npython3 -m pip install malloy\n```\n\n## Resources\n\n- [Malloy Language GitHub](https://github.com/looker-open-source/malloy/) - Primary location for the malloy language source, documentation, and information\n- [Malloy Language](https://looker-open-source.github.io/malloy/documentation/language/basic.html) - A quick introduction to the language\n- [eCommerce Example Analysis](https://looker-open-source.github.io/malloy/documentation/examples/ecommerce.html) - A walkthrough of the basics on an ecommerce dataset (BigQuery public dataset)\n- [Modeling Walkthrough](https://looker-open-source.github.io/malloy/documentation/examples/iowa/iowa.html) - An introduction to modeling via the Iowa liquor sales public data set (BigQuery public dataset)\n- [Malloy on YouTube](https://www.youtube.com/channel/UCfN2td1dzf-fKmVtaDjacsg) - Watch demos / walkthroughs of Malloy\n\n## Join The Community\n\n- Join our [Malloy Slack Community!](https://malloydata.github.io/slack) Use this community to ask questions, meet other Malloy users, and share ideas with one another.\n- Use [GitHub issues](https://github.com/looker-open-source/malloy/issues) to provide feedback, suggest improvements, report bugs, and start new discussions.\n\n## Syntax Examples\n\n### Run a named query from a Malloy file\n\n```python\nimport asyncio\n\nimport malloy\nfrom malloy.data.duckdb import DuckDbConnection\n\nasync def main():\n  home_dir = \"/path/to/samples/duckdb/imdb\"\n  with malloy.Runtime() as runtime:\n    runtime.add_connection(DuckDbConnection(home_dir=home_dir))\n\n    data = await runtime.load_file(home_dir + \"/imdb.malloy\").run(\n        named_query=\"genre_movie_map\")\n\n    dataframe = data.to_dataframe()\n    print(dataframe)\n\nif __name__ == \"__main__\":\n  asyncio.run(main())\n```\n\n### Get SQL from an in-line query, using a Malloy file as a source\n\n```python\nimport asyncio\n\nimport malloy\nfrom malloy.data.duckdb import DuckDbConnection\n\nasync def main():\n  home_dir = \"/path/to/samples/duckdb/faa\"\n  with malloy.Runtime() as runtime:\n    runtime.add_connection(DuckDbConnection(home_dir=home_dir))\n\n    [sql, connection\n    ] = await runtime.load_file(home_dir + \"/flights.malloy\").get_sql(query=\"\"\"\n                  run: flights -\u003e {\n                    where: carrier ? 'WN' | 'DL', dep_time ? @2002-03-03\n                    group_by:\n                      flight_date is dep_time.day\n                      carrier\n                    aggregate:\n                      daily_flight_count is flight_count\n                      aircraft.aircraft_count\n                    nest: per_plane_data is {\n                      limit: 20\n                      group_by: tail_num\n                      aggregate: plane_flight_count is flight_count\n                      nest: flight_legs is {\n                        order_by: 2\n                        group_by:\n                          tail_num\n                          dep_minute is dep_time.minute\n                          origin_code\n                          dest_code is destination_code\n                          dep_delay\n                          arr_delay\n                      }\n                    }\n                }\n            \"\"\")\n\n    print(sql)\n\nif __name__ == \"__main__\":\n  asyncio.run(main())\n```\n\n### Write an in-line Malloy model, and run a query\n\n```python\nimport asyncio\n\nimport malloy\nfrom malloy.data.duckdb import DuckDbConnection\n\n\nasync def main():\n  home_dir = \"/path/to/samples/duckdb/imdb/data\"\n  with malloy.Runtime() as runtime:\n    runtime.add_connection(DuckDbConnection(home_dir=home_dir))\n\n    data = await runtime.load_source(\"\"\"\n        source:titles is duckdb.table('titles.parquet') extend {\n          primary_key: tconst\n          dimension:\n            movie_url is concat('https://www.imdb.com/title/',tconst)\n        }\n        \"\"\").run(query=\"\"\"\n        run: titles -\u003e {\n          group_by: movie_url\n          limit: 5\n        }\n        \"\"\")\n\n    dataframe = data.to_dataframe()\n    print(dataframe)\n\n\nif __name__ == \"__main__\":\n  asyncio.run(main())\n  \n```\n\n### Querying BigQuery tables\n\nBigQuery auth via OAuth using gcloud.\n```\ngcloud auth login --update-adc\ngcloud config set project {my_project_id} --installation\n```\n\nActual usage is similar to DuckDB.\n\n```python\nimport asyncio\nimport malloy\nfrom malloy.data.bigquery import BigQueryConnection\n\nasync def main():\n  with malloy.Runtime() as runtime:\n    runtime.add_connection(BigQueryConnection())\n\n    data = await runtime.load_source(\"\"\"\n        source:ga_sessions is bigquery.table('bigquery-public-data.google_analytics_sample.ga_sessions_20170801') extend {\n          measure:\n            hits_count is hits.count()\n        }\n        \"\"\").run(query=\"\"\"\n        run: ga_sessions -\u003e {\n            where: trafficSource.`source` != '(direct)'\n            group_by: trafficSource.`source`\n            aggregate: hits_count\n            limit: 10\n          }\n        \"\"\")\n\n    dataframe = data.to_dataframe()\n    print(dataframe)\n\nif __name__ == \"__main__\":\n  asyncio.run(main())\n\n```\n\n## Development\n\n### Initial setup\n\n```sh\ngit submodule init\ngit submodule update\npython3 -m pip install -r requirements.dev.txt\nscripts/gen-services.sh\n```\n\n### Regenerate Protobuf files\n\n```sh\nscripts/gen-protos.sh\n```\n\n### Tests\n\n```sh\npython3 -m pytest\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmalloydata%2Fmalloy-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmalloydata%2Fmalloy-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmalloydata%2Fmalloy-py/lists"}