{"id":16591698,"url":"https://github.com/amol-/datapyground","last_synced_at":"2025-07-28T23:34:11.399Z","repository":{"id":256038112,"uuid":"852437990","full_name":"amol-/datapyground","owner":"amol-","description":"Easy to study Data Platform for fun and profit","archived":false,"fork":false,"pushed_at":"2024-10-16T21:31:28.000Z","size":357,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-22T02:03:46.372Z","etag":null,"topics":["compute-engine","data","data-engineering","database","python"],"latest_commit_sha":null,"homepage":"https://alessandro.molina.fyi/datapyground","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amol-.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-09-04T20:00:45.000Z","updated_at":"2025-05-29T16:40:36.000Z","dependencies_parsed_at":"2024-10-19T03:20:22.267Z","dependency_job_id":null,"html_url":"https://github.com/amol-/datapyground","commit_stats":null,"previous_names":["amol-/datapyground"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/amol-/datapyground","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amol-%2Fdatapyground","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amol-%2Fdatapyground/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amol-%2Fdatapyground/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amol-%2Fdatapyground/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amol-","download_url":"https://codeload.github.com/amol-/datapyground/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amol-%2Fdatapyground/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267604302,"owners_count":24114521,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-28T02:00:09.689Z","response_time":68,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compute-engine","data","data-engineering","database","python"],"created_at":"2024-10-11T23:17:39.084Z","updated_at":"2025-07-28T23:34:11.381Z","avatar_url":"https://github.com/amol-.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"docs/logo.png\" alt=\"DataPyground\" width=\"180\"/\u003e\n\n# DataPyground\n\n[![Tests](https://img.shields.io/github/actions/workflow/status/amol-/datapyground/pytest.yml?branch=main\u0026label=tests)](https://github.com/amol-/datapyground/actions)\n[![Coverage](https://img.shields.io/coveralls/github/amol-/datapyground)](https://coveralls.io/github/amol-/datapyground)\n\nData Analysis framework and Compute Engine for fun,\nit was started as a foundation for the [**How Data Platforms Work**](https://github.com/amol-/datapyground/tree/main/book)\nbook associated to the [**Monthly Python Data Engineering 
Newsletter**](https://alessandromolina.substack.com/) \nwhile writing the book to showcase the concepts explained in its chapters.\n\nThe main priority of the codebase is to be as feature complete\nas possible while making it easy to understand and contribute to \nfor people who have no prior knowledge of compute\nengines or data processing frameworks in general.\n\nThe codebase is heavily documented and commented to make it easy to understand\nand modify, and contributions are welcomed and encouraged. It is meant\nto be a safe playground for learning and experimentation.\n\n## Documentation\n\nEach component of the data platform is self-documented in a way inspired\nby the literate programming concept. The complete documentation\nis available at [Documentation](http://alessandro.molina.fyi/datapyground/).\n\nFor further understanding of the codebase and the concepts,\nreading the [**How Data Platforms Work**](https://github.com/amol-/datapyground/tree/main/book) \nbook is recommended.\n\n## Getting Started\n\nInstall the `datapyground` package with pip:\n\n```bash\npip install datapyground\n```\n\nOnce installed, refer to the [Documentation](http://alessandro.molina.fyi/datapyground/) \nof each component to learn how to use it.\n\n### Commands\n\n`DataPyground` exposes some commands to play around with its features.\nCurrently the following commands are provided:\n\n#### pyground-fquery\n\nRuns SQL queries on CSV and Parquet files:\n\n```bash\n$ pyground-fquery -t sales=examples/data/sales.csv \"SELECT Product, Quantity, Price, Quantity*Price AS Total FROM sales WHERE Product='Videogame' OR Product='Laptop' ORDER BY Total DESC LIMIT 5\"\nProduct   | Quantity | Price | Total \n--------- | -------- | ----- | ------\nVideogame | 10       | 98.31 | 983.10\nLaptop    | 10       | 97.24 | 972.40\nVideogame | 10       | 97.21 | 972.10\nVideogame | 10       | 96.12 | 961.20\nLaptop    | 10       | 92.23 | 922.30\n```\n\n## Contributing\n\nContributions are welcomed 
and encouraged; the project is meant\nto be a safe playground for learning and experimentation.\n\nThe only requirement is that the contributions maintain\nor increase the level of quality of the documentation and codebase.\nContributions that are not properly documented won't be merged:\nconsider quality of documentation more important than elegance or performance\nof the codebase for this project.\n\nThe contributions are currently meant to be in **pure Python**.\nThis does not prevent the use of C extensions and Cython for performance\nin the future, but that will have to happen when the benefit they provide\noutweighs the added complexity they introduce in the context of a learning\nproject.\n\n### Setup development environment\n\nInstall the `uv` Python package:\n\n```bash\npip install uv\n```\n\nThen install the dependencies and the project in editable mode:\n\n```bash\nuv sync --dev\n```\n\n### Running tests\n\n```bash\nuv run pytest -v\n```\n\n### Building Docs\n\n```bash\ncd docs\nuv run make html\n```\n\nThe documentation is readable at ``docs/build/html``\nafter being built.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famol-%2Fdatapyground","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famol-%2Fdatapyground","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famol-%2Fdatapyground/lists"}