{"id":26445560,"url":"https://github.com/cisco-open/path-warden","last_synced_at":"2025-03-18T11:19:22.996Z","repository":{"id":249826864,"uuid":"814064898","full_name":"cisco-open/path-warden","owner":"cisco-open","description":"This repository contains the Path Warden system, a system to enable dynamic label tracking, data provenance, and data governance enforcement in the service mesh.","archived":false,"fork":false,"pushed_at":"2024-07-22T20:53:19.000Z","size":2237,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2024-07-23T16:06:03.343Z","etag":null,"topics":["dift","governance","istio","labeled-data","policy-enforcement","provenance","service-mesh"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cisco-open.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-12T09:18:56.000Z","updated_at":"2024-07-23T16:07:03.538Z","dependencies_parsed_at":"2024-07-23T16:06:40.974Z","dependency_job_id":"eb7d6dd6-0165-4d96-88b6-e3b50b32b8dc","html_url":"https://github.com/cisco-open/path-warden","commit_stats":null,"previous_names":["cisco-open/path-warden"],"tags_count":0,"template":false,"template_full_name":"cisco-ospo/oss-template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cisco-open%2Fpath-warden","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cisco-open%2Fpath-warden/tags","releases_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/cisco-open%2Fpath-warden/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cisco-open%2Fpath-warden/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cisco-open","download_url":"https://codeload.github.com/cisco-open/path-warden/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244208595,"owners_count":20416110,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dift","governance","istio","labeled-data","policy-enforcement","provenance","service-mesh"],"created_at":"2025-03-18T11:19:22.388Z","updated_at":"2025-03-18T11:19:22.985Z","avatar_url":"https://github.com/cisco-open.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Path Warden\n\nThis project aims to improve security in cloud systems. Specifically, it demonstrates:\n\n1) A system for tracing the Lineage of data moving through services\n2) A strategy for storing the Lineage information for a piece of data\n3) An enforcement point for pieces of data based on their Lineage\n4) A mechanism for evaluating Policies that are being enforced\n\nThe design targets these problems in dynamically constructed edge systems.\n\n## Overview\n\nSo far, the system has four key components:\n\n1. Lineage Propagation: This refers to the propagation of labels related to a piece of data moving through a set of services. The entirety of this set of labels is the `Data Lineage`. 
The original source of the data is the `Data Provenance`. The goal of lineage propagation is to preserve the original Data Provenance and append each processing step the data undergoes, producing a Data Lineage that can be evaluated at each subsequent step and ultimately stored for future reference.\n\n2. Lineage Storage: The Data Lineage propagated through the system for each series of actions is stored so that it can be referenced or altered later.\n\n3. Policy Enforcement: Using the propagated labels, the system enforces data management policies at each service in the chain, even before the data reaches its final destination. Currently, enforcement yields a pass/fail result that either allows or blocks a request. The enforcement module caches label evaluations to reduce overhead.\n\n4. Policy Evaluation: At the enforcement points, labels that have not yet been evaluated are sent to a separate service, which determines their pass/fail status. Previously evaluated labels are retrieved from the cache.\n\n### Key Technologies \u0026 Concepts\n\n- Label Propagation built using OpenTelemetry\n- Enforcement done in Istio Sidecars using Go Wasm Plugin\n- Policies written in Rego \u0026 evaluated using OPA\n- Developers must instrument their applications with our lineage propagation (tracing) libraries. These libraries are designed to be lightweight and easy to use.\n\n### Current State\n\n#### Summary of System\n\n1. Label Propagation is achieved using OpenTelemetry's Baggage concept. We store a Label Set in JSON format under the baggage key `lineage_label_set`.\n2. Label Storage is achieved in MySQL by creating a separate table whose primary key matches the primary key of the table being labeled. The current example does this with a small library of Python functions. This allows labeling to be enabled or disabled in existing systems without updating or destroying existing tables.\n3. 
We enforce data label policies in Istio's Service Mesh sidecars using a Go-Wasm Plugin. See the Istio and Go Wasm SDK documentation for more information.\n4. We write policies for labels in Rego and evaluate them in OPA. The policies are currently written as part of the OPA sidecar manifest, which creates the container in the service's pod.\n\n#### Directory Summary\n\n- Account-CRUD is a demo app with basic CRUD functionality, connected to a MySQL DB, created to test the various lineage label propagation \u0026 enforcement technologies.\n\n- wasm-lineage-headers contains all files relevant to the development of the plugin written for the Istio sidecar, which parses, validates \u0026 caches LabelSets.\n\n##### (Provided as Reference)\n\n- OTel Basic contains a series of services used to initially develop \u0026 test OpenTelemetry. Generally, the functionality created here is less mature than that in account-CRUD. These files are provided for general reference.\n\n- OPA contains files relevant to testing \u0026 developing the OPA implementations. Ultimately, the plug-and-play solution of OPA for Istio was not used; however, these files are provided as reference.\n\n#### Implementation Summary\n\nAs mentioned before, account-CRUD contains the demo of this system. See the README in that directory.\n\n### Pre-reqs\n\n- minikube installed on system\n- Istio installed on Minikube cluster\n- gsutil installed\n- Go installed\n- tinygo installed\n\n#### Recommended Additional Software\n\n- VSCode Server on instance for remote IDE access\n\n#### Getting Up and Running with Cluster on EC2 instance\n\n1. Create a tunnel from your terminal: `ssh -L 8080:localhost:8080 \u003cremote-host\u003e`\n2. Launch VSCode server: `code-server --auth none`\n3. Open a new terminal on your local machine \u0026 ssh: `ssh \u003cremote-host\u003e`\n4. Create a tunnel from the EC2 instance to the minikube gateway: `minikube tunnel`\n5. 
Launch a new terminal \u0026 ssh in; this serves as your working CLI\n\n#### Accounts Required For\n\n- GoogleCloud: Remotely storing \u0026 deploying the Wasm Plugin using a GC bucket. Archived files show how it can be deployed with a local file (see envoyFilter.yaml). Deploying EnvoyFilter with GoogleCloud means saving it to a ConfigMap and injecting the ConfigMap into the app deployment. See docs for more information.\n- Docker: only if you want to push images. Not necessary.\n- Use Jaeger for OpenTelemetry trace visualization\n\n##### Early Steps\n\nIn addition to connecting \u0026 establishing the minikube tunnel, early on you will need to do the following:\n\n1. Should you ever need to edit or redeploy the wasm-lineage-headers plugins (which is very likely), you'll need a Google Cloud account and a bucket to hold the wasm file. See the Git repository linked at the top of `WASM-Label-Lineage.md`.\n\n### Future Work\n\n- More Languages supported for Label Propagation\n- More Databases supported for Label Storage\n- Integrate OPAL for simpler Policy distribution\n- More Policies written for particular labels\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcisco-open%2Fpath-warden","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcisco-open%2Fpath-warden","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcisco-open%2Fpath-warden/lists"}