https://github.com/feast-dev/feast-workshop

A workshop with several modules to help learn Feast, an open-source feature store

Host: GitHub
URL: https://github.com/feast-dev/feast-workshop
Owner: feast-dev
Created: 2022-04-28T17:04:26.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2025-01-09T18:38:57.000Z (about 1 year ago)
Last Synced: 2025-03-29T07:09:24.381Z (11 months ago)
Language: Jupyter Notebook
Homepage: http://feast.dev
Size: 6.97 MB
Stars: 87
Watchers: 10
Forks: 57
Open Issues: 5
Metadata Files:
- Readme: README.md


## Workshop: Learning Feast

This workshop aims to teach users about [Feast](http://feast.dev), an open-source feature store. 

We explain concepts & best practices by example, and also showcase how to address common use cases.

### What is Feast?

Feast is an operational system for managing and serving machine learning features to models in production. It can serve features from a low-latency online store (for real-time prediction) or from an offline store (for batch scoring).
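To make the online / offline distinction concrete, here is a minimal pure-Python sketch. It is a toy model and not Feast's API: `offline_rows`, `materialize`, and the `driver_id` / `conv_rate` names are hypothetical. The idea it illustrates is that an offline store keeps the full timestamped history, while an online store keeps only the latest value per entity so that a lookup is a single key access.

```python
from datetime import datetime

# Toy offline store: full history of timestamped feature rows (hypothetical data).
offline_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2022, 5, 1, 10), "conv_rate": 0.40},
    {"driver_id": 1001, "event_timestamp": datetime(2022, 5, 1, 12), "conv_rate": 0.55},
    {"driver_id": 1002, "event_timestamp": datetime(2022, 5, 1, 11), "conv_rate": 0.30},
]


def materialize(rows):
    """Build a toy online store that keeps only the latest row per entity key."""
    online = {}
    for row in rows:
        current = online.get(row["driver_id"])
        if current is None or row["event_timestamp"] > current["event_timestamp"]:
            online[row["driver_id"]] = row
    return online


online_store = materialize(offline_rows)

# Real-time prediction: one key lookup returns the freshest value.
latest = online_store[1001]["conv_rate"]

# Batch scoring: read the full history held in the offline store.
history = [r["conv_rate"] for r in offline_rows if r["driver_id"] == 1001]

print(latest, history)
```

In Feast itself the two stores are backed by real systems (for example a data warehouse and a key-value store); the dictionaries here only stand in for them.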



### What is Feast not?

- Feast does not orchestrate data pipelines (e.g. batch / stream transformation or materialization jobs), but provides a framework to integrate with adjacent tools like dbt, Airflow, and Spark.

- Feast also does not solve other commonly faced issues like data quality, experiment management, etc. 

See more details at [What Feast is not](https://docs.feast.dev/#what-feast-is-not).

### Why Feast?

Feast solves several common challenges teams face:

1. Lack of feature reuse across teams

2. Complex point-in-time-correct data joins for generating training data

3. Difficulty operationalizing features for online inference while minimizing training / serving skew
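The point-in-time-correct join in item 2 can be sketched in plain Python. This is an illustration of the idea under simplifying assumptions, not Feast's implementation: the function name `point_in_time_join` and the tuple layouts are hypothetical, and details such as a time-to-live window are ignored. For each training row, only feature values observed at or before that row's timestamp are eligible, which is what prevents future data from leaking into training.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Attach to each (key, ts) the latest feature value observed at or before ts.

    entity_rows:  iterable of (entity_key, event_timestamp)
    feature_rows: iterable of (entity_key, feature_timestamp, value)
    """
    # Group the feature history per entity, sorted by timestamp.
    history = {}
    for key, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        timestamps, values = history.setdefault(key, ([], []))
        timestamps.append(ts)
        values.append(value)

    joined = []
    for key, ts in entity_rows:
        timestamps, values = history.get(key, ([], []))
        idx = bisect_right(timestamps, ts)  # count of feature rows with timestamp <= ts
        joined.append((key, ts, values[idx - 1] if idx else None))
    return joined


features = [
    (1001, datetime(2022, 5, 1), 0.40),
    (1001, datetime(2022, 5, 3), 0.55),
    (1002, datetime(2022, 5, 2), 0.30),
]
entities = [
    (1001, datetime(2022, 5, 2)),  # sees only the May 1 value
    (1001, datetime(2022, 5, 4)),  # sees the May 3 value
    (1002, datetime(2022, 5, 1)),  # no feature value exists yet
]

training_rows = point_in_time_join(entities, features)
for row in training_rows:
    print(row)
```

A naive join on the entity key alone would attach the May 3 value to the May 2 row, which is exactly the kind of leakage this join avoids.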

### Pre-requisites

This workshop assumes you have the following installed:

- A local development environment that supports running Jupyter notebooks (e.g. VSCode with Jupyter plugin)

- Python 3.8+

- pip

- Docker & Docker Compose (e.g. `brew install docker docker-compose`)

- **Module 0 pre-requisites**:

  - Terraform ([docs](https://learn.hashicorp.com/tutorials/terraform/install-cli#install-terraform))

  - Either AWS or GCP setup:

    - AWS

      - AWS CLI

      - An AWS account set up with credentials via `aws configure` (e.g. see [AWS credentials quickstart](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html#cli-configure-quickstart-creds))

    - GCP

      - GCP account

      - `gcloud` CLI

- **Module 1 pre-requisites**:

  - Java 11 (for Spark, e.g. `brew install java11`)
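One quick way to confirm the prerequisites above is a short script that looks for each tool on your `PATH`. This is a convenience sketch rather than part of the workshop: the function name `check_prerequisites` is hypothetical, and the executable names passed to it (`pip`, `docker`, `terraform`, `java`) are assumptions that may differ on your platform. It only checks that the tools exist, not which versions they are.

```python
import shutil
import sys


def check_prerequisites(tools, min_python=(3, 8)):
    """Return a dict mapping each check name to True (found) or False (missing)."""
    results = {"python>=%d.%d" % min_python: sys.version_info >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    # Executable names are assumptions; adjust them for your platform and modules.
    report = check_prerequisites(["pip", "docker", "terraform", "java"])
    for name, ok in report.items():
        print(f"{'ok     ' if ok else 'MISSING'}  {name}")
```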

Since we'll be learning how to leverage Feast in CI/CD, you'll also need to fork this workshop repository.

**Caveats** 

- M1 MacBook development is untested with this flow. See also [How to run / develop for Feast on M1 Macs](https://github.com/feast-dev/feast/issues/2105).

- Windows development has only been tested with WSL. You will need to follow this [guide](https://docs.docker.com/desktop/windows/wsl/) to have Docker play nicely.

## Modules

*See also: [Feast quickstart](https://docs.feast.dev/getting-started/quickstart), [Feast x Great Expectations tutorial](https://docs.feast.dev/tutorials/validating-historical-features)*

The modules are meant to be done mostly in order, since later examples build on earlier concepts.

| Time (min) | Description                                                                      | Module                                                                                                 |
| :--------: | :------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------- |
|   30-45    | Setting up Feast projects & CI/CD + powering batch predictions                   | [Module 0](module_0/README.md)                                                                         |
|   15-20    | Streaming ingestion & online feature retrieval with Kafka, Spark, Airflow, Redis | [Module 1](module_1/README.md)                                                                         |
|   10-15    | Real-time feature engineering with on demand transformations                     | [Module 2](module_2/README.md)                                                                         |
|     30     | Orchestrated batch/stream transformations using dbt + Airflow with Feast         | [Module 3 (Snowflake)](module_3_sf/README.md)                                                          |
|     30     | (WIP) Orchestrated batch/stream transformations using dbt + Airflow with Feast   | [Module 3 (Databricks)](module_3_db/README.md)                                                         |
|     30     | Book recommender system with dbt + Airflow + Feast                               | [Feast x Book Recommendations (on Databricks)](https://github.com/tecton-ai/book-recsys-apply-workshop/tree/main/feast_repo) |
