https://github.com/cmpadden/dagster-essentials-capstone
Dagster Essentials Capstone Project - Letterboxd Movie Summary
- Host: GitHub
- URL: https://github.com/cmpadden/dagster-essentials-capstone
- Owner: cmpadden
- Created: 2023-11-19T22:43:41.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-11-25T01:44:19.000Z (over 2 years ago)
- Last Synced: 2025-04-22T10:22:47.606Z (about 1 year ago)
- Language: Python
- Size: 39.1 KB
- Stars: 7
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Dagster Essentials Capstone - Movie Summaries
Collect movie metadata from Letterboxd and OpenSubtitles, then generate a full movie
summary using LLMs and LangChain.
## Getting started
First, install your Dagster code location as a Python package. With the `--editable` flag (`-e` in the command below), pip installs the package in ["editable mode"](https://pip.pypa.io/en/latest/topics/local-project-installs/#editable-installs), so local code changes apply automatically as you develop.
```bash
pip install -e ".[dev]"
```
Then, start the Dagster UI web server:
```bash
dagster dev
```
Open http://localhost:3000 with your browser to see the project.
You can start writing assets in `dagster_essentials_capstone/assets.py`. The assets are automatically loaded into the Dagster code location as you define them.
## Development
### Adding new Python dependencies
You can specify new Python dependencies in `setup.py`.
### Unit testing
Tests are in the `dagster_essentials_capstone_tests` directory and you can run tests using `pytest`:
```bash
pytest dagster_essentials_capstone_tests
```
### Schedules and sensors
If you want to enable Dagster [Schedules](https://docs.dagster.io/concepts/partitions-schedules-sensors/schedules) or [Sensors](https://docs.dagster.io/concepts/partitions-schedules-sensors/sensors) for your jobs, the [Dagster Daemon](https://docs.dagster.io/deployment/dagster-daemon) process must be running. This is done automatically when you run `dagster dev`.
Once your Dagster Daemon is running, you can start turning on schedules and sensors for your jobs.
### Exploring DuckDB
You can explore the contents of the local DuckDB database with the DuckDB CLI
by running:
```sh
duckdb data/data.duckdb
```
## Deploy on Dagster Cloud
The easiest way to deploy your Dagster project is to use Dagster Cloud.
Check out the [Dagster Cloud Documentation](https://docs.dagster.cloud) to learn more.