https://github.com/deepyaman/jaffle-shop
Example project for building scalable data pipelines with Kedro and Ibis.
duckdb ibis kedro python
- Host: GitHub
- URL: https://github.com/deepyaman/jaffle-shop
- Owner: deepyaman
- Created: 2023-12-06T00:25:04.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-01-31T06:17:30.000Z (over 1 year ago)
- Last Synced: 2025-03-25T13:02:55.576Z (7 months ago)
- Topics: duckdb, ibis, kedro, python
- Language: Python
- Homepage: https://kedro.org/blog/building-scalable-data-pipelines-with-kedro-and-ibis
- Size: 1.47 MB
- Stars: 13
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Jaffle Shop
## Overview
This is your new Kedro project, which was generated using `kedro 0.19.1`.
Take a look at the [Kedro documentation](https://docs.kedro.org) to get started.
## Rules and guidelines
In order to get the best out of the template:
* Don't remove any lines from the `.gitignore` file we provide
* Make sure your results can be reproduced by following a [data engineering convention](https://docs.kedro.org/en/stable/faq/faq.html#what-is-data-engineering-convention)
* Don't commit data to your repository
* Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in `conf/local/`

## How to install dependencies
Declare any dependencies in `requirements.txt` for `pip` installation.
To install them, run:
```
pip install -r requirements.txt
```

## How to run your Kedro pipeline
You can run your Kedro project with:
```
kedro run
```

## How to test your Kedro project
Have a look at the files `src/tests/test_run.py` and `src/tests/pipelines/test_data_science.py` for instructions on how to write your tests. Run the tests as follows:
```
pytest
```

To configure the coverage threshold, look at the `.coveragerc` file.
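
As a rough illustration of the kind of test `pytest` collects, here is a minimal sketch for a node written as a plain Python function. The function `total_order_amount` and its data are hypothetical and are not taken from this project; substitute your own node functions.

```python
# Hypothetical node function; the name and logic are illustrative only
# and do not come from this project.
def total_order_amount(orders: list[dict]) -> float:
    """Sum the `amount` field over a list of order records."""
    return sum(order["amount"] for order in orders)


def test_total_order_amount():
    # Two small records with amounts that add up exactly in floating point.
    orders = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.5}]
    assert total_order_amount(orders) == 15.5


def test_total_order_amount_empty():
    # sum() over an empty iterable returns 0.
    assert total_order_amount([]) == 0
```

Because the node is a pure function, the test needs no Kedro session or catalog; it only calls the function with in-memory data.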
## Project dependencies
To see and update the dependency requirements for your project, use `requirements.txt`. Install the project requirements with `pip install -r requirements.txt`.
[Further information about project dependencies](https://docs.kedro.org/en/stable/kedro_project_setup/dependencies.html#project-specific-dependencies)
## How to work with Kedro and notebooks
> Note: Using `kedro jupyter` or `kedro ipython` to run your notebook provides these variables in scope: `catalog`, `context`, `pipelines` and `session`.
>
> Jupyter, JupyterLab, and IPython are already included in the project requirements by default, so once you have run `pip install -r requirements.txt` you will not need to take any extra steps before you use them.

### Jupyter
To use Jupyter notebooks in your Kedro project, you need to install Jupyter:

```
pip install jupyter
```

After installing Jupyter, you can start a local notebook server:
```
kedro jupyter notebook
```

### JupyterLab
To use JupyterLab, you need to install it:

```
pip install jupyterlab
```

You can also start JupyterLab:
```
kedro jupyter lab
```

### IPython
And if you want to run an IPython session:

```
kedro ipython
```

### How to ignore notebook output cells in `git`
To automatically strip out all output cell contents before committing to `git`, you can use tools like [`nbstripout`](https://github.com/kynan/nbstripout). For example, you can add a hook in `.git/config` with `nbstripout --install`. This will run `nbstripout` before anything is committed to `git`.

> *Note:* Your output cells will be retained locally.
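
To make "stripping output cells" concrete, here is a minimal sketch of the idea using only the standard library. It assumes the usual nbformat v4 JSON layout (code cells carrying `outputs` and `execution_count` keys) and is not how `nbstripout` itself is implemented; the helper `strip_outputs` and the sample notebook are hypothetical.

```python
import copy
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs cleared.

    Assumes an nbformat-v4-style structure; hypothetical helper for illustration.
    """
    stripped = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Tiny in-memory notebook used only for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

clean = strip_outputs(notebook)
print(json.dumps(clean["cells"][1], indent=2))
```

The source of each cell is preserved and only the execution results are dropped, which is why diffs of stripped notebooks stay small.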
[Further information about using notebooks for experiments within Kedro projects](https://docs.kedro.org/en/develop/notebooks_and_ipython/kedro_and_notebooks.html).
## Package your Kedro project

[Further information about building project documentation and packaging your project](https://docs.kedro.org/en/stable/tutorial/package_a_project.html).