https://github.com/cmpadden/dagster-pipes-rust
Dagster pipes implementation in Rust
- Host: GitHub
- URL: https://github.com/cmpadden/dagster-pipes-rust
- Owner: cmpadden
- Created: 2024-11-27T17:50:29.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-12-06T20:05:04.000Z (10 months ago)
- Last Synced: 2024-12-06T21:01:35.091Z (10 months ago)
- Topics: dagster, data, integrations, orchestration, rust
- Language: Rust
- Homepage:
- Size: 45.9 KB
- Stars: 6
- Watchers: 2
- Forks: 2
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
> [!NOTE]
> This project has been moved to the official Dagster `community-integrations` repo. The maintained version can be found [here](https://github.com/dagster-io/community-integrations/tree/c7238204fcd539532d67abb3b8dda3bf30a24c07/libraries/pipes/implementations/rust).

---
# dagster-pipes-rust
Get full observability into your Rust workloads when orchestrating through Dagster. With this lightweight interface, you can retrieve data directly from the Dagster context, report asset materializations, report asset checks, provide structured logging, and more.
[dagster_pipes_rust on crates.io](https://crates.io/crates/dagster_pipes_rust)
## Usage
### Installation
```sh
cargo add dagster_pipes_rust
```

### Example
An example project can be found in [./example-dagster-pipes-rust-project](./example-dagster-pipes-rust-project).
The project contains a `rust_processing_jobs` binary that demonstrates how to use the Dagster context to report materializations to Dagster through the `context.report_asset_materialization` method.
```rust
use dagster_pipes_rust::open_dagster_pipes;
use serde_json::json;

fn main() {
    let mut context = open_dagster_pipes();

    // Each metadata entry pairs a raw value with a type tag.
    let metadata = json!({"row_count": {"raw_value": 100, "type": "int"}});

    // The asset key matches the asset defined on the Dagster side.
    context.report_asset_materialization("example_rust_subprocess_asset", metadata);
}
```
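For orientation, the call above results in a Pipes message being sent back to Dagster. The following Python sketch shows roughly what such a message could look like when serialized as a single JSON line. The envelope keys (`__dagster_pipes_version`, `method`, `params`) and the version string are assumptions based on the Pipes protocol as implemented in the Python `dagster-pipes` package, and `build_materialization_message` is a hypothetical helper that is not part of this crate.

```python
import json

# Assumed protocol version string; consult the Pipes schema for the real value.
PIPES_PROTOCOL_VERSION = "0.1"


def build_materialization_message(asset_key: str, metadata: dict) -> str:
    """Serialize a hypothetical report_asset_materialization message as one JSON line."""
    message = {
        "__dagster_pipes_version": PIPES_PROTOCOL_VERSION,
        "method": "report_asset_materialization",
        "params": {
            "asset_key": asset_key,
            "metadata": metadata,
            "data_version": None,
        },
    }
    return json.dumps(message)


# Same metadata shape as the Rust example: a raw value paired with a type tag.
metadata = {"row_count": {"raw_value": 100, "type": "int"}}
line = build_materialization_message("example_rust_subprocess_asset", metadata)
print(line)
```

Keeping each message on one line makes it straightforward for the orchestrating process to read messages incrementally from a stream or file.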
It also demonstrates how to run the Rust binary in a subprocess from Dagster. Note that it is also possible to launch processes in external compute environments such as Kubernetes.
```python
import shutil

import dagster as dg


@dg.asset(
    group_name="pipes",
    kinds={"rust"},
)
def example_rust_subprocess_asset(
    context: dg.AssetExecutionContext, pipes_subprocess_client: dg.PipesSubprocessClient
) -> dg.MaterializeResult:
    """Demonstrates running Rust binary in a subprocess."""
    cmd = [shutil.which("cargo"), "run"]
    cwd = dg.file_relative_path(__file__, "../rust_processing_jobs")
    return pipes_subprocess_client.run(
        command=cmd,
        cwd=cwd,
        context=context,
    ).get_materialize_result()


defs = dg.Definitions(
    assets=[example_rust_subprocess_asset],
    resources={
        "pipes_subprocess_client": dg.PipesSubprocessClient(
            context_injector=dg.PipesEnvContextInjector(),
        )
    },
)
```

## Contributing
### Pipes Schema
We use [JSON Schema](https://json-schema.org/) to define the Pipes protocol and [quicktype](https://quicktype.io/) to generate the Rust structs. Currently, the JSON schemas live in `jsonschema/pipes`, but they should be hosted and defined in a centralized repository in the future.
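To make the idea concrete, here is a minimal Python sketch of what a JSON Schema for a Pipes message might look like, together with a hand-rolled check of its `required` and `type` keywords. The schema contents and the `check_message` helper are illustrative assumptions; the real schemas are the files in `jsonschema/pipes`, and a full validator would cover far more of JSON Schema than this.

```python
# Hypothetical, heavily simplified schema; not copied from jsonschema/pipes.
MESSAGE_SCHEMA = {
    "type": "object",
    "required": ["__dagster_pipes_version", "method"],
    "properties": {
        "__dagster_pipes_version": {"type": "string"},
        "method": {"type": "string"},
        "params": {"type": "object"},
    },
}

# Map the JSON Schema type names used above to Python types.
JSON_TYPES = {"object": dict, "string": str}


def check_message(message: dict, schema: dict) -> list[str]:
    """Return a list of problems found; an empty list means the message passed."""
    problems = []
    # Every field listed under "required" must be present.
    for key in schema["required"]:
        if key not in message:
            problems.append(f"missing required field: {key}")
    # Fields that are present must match their declared type.
    for key, rule in schema["properties"].items():
        if key in message and not isinstance(message[key], JSON_TYPES[rule["type"]]):
            problems.append(f"field {key} should be of type {rule['type']}")
    return problems


print(check_message({"method": 42}, MESSAGE_SCHEMA))
```

Generating typed structs from the schema moves this kind of check into deserialization, so a malformed message fails when it is parsed instead of later in the program.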
To generate the Rust structs, make sure to install quicktype with `npm install -g quicktype`. Then run:
```bash
./quicktype.sh
```