{"id":19329732,"url":"https://github.com/outerbounds/ob-content-universe","last_synced_at":"2025-06-26T02:03:15.279Z","repository":{"id":243316444,"uuid":"784036624","full_name":"outerbounds/ob-content-universe","owner":"outerbounds","description":null,"archived":false,"fork":false,"pushed_at":"2024-04-10T21:44:07.000Z","size":531,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-06-26T02:03:14.706Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/outerbounds.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-09T04:01:34.000Z","updated_at":"2024-09-26T15:11:36.000Z","dependencies_parsed_at":"2024-06-08T01:38:33.743Z","dependency_job_id":"cace7501-67b0-41e4-8609-8a8b5ca2b98b","html_url":"https://github.com/outerbounds/ob-content-universe","commit_stats":null,"previous_names":["outerbounds/ob-content-universe"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/outerbounds/ob-content-universe","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fob-content-universe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fob-content-universe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fob-content-universe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fob-content-universe/ma
nifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/outerbounds","download_url":"https://codeload.github.com/outerbounds/ob-content-universe/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fob-content-universe/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261984644,"owners_count":23240302,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T02:29:43.343Z","updated_at":"2025-06-26T02:03:15.244Z","avatar_url":"https://github.com/outerbounds.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ELT to RAG with Metaflow, dbt, and Snowflake\n\u003cimg style=\"display: block; float: left; max-width: 20%; height: auto; margin: auto; float: none!important;\" src=\"static/cover.png\"/\u003e\n\n## How to use this repository\n\n## Workflows and notebooks\nMetaflow workflows and their code dependencies are in the [`flows`](./flows/) directory. \nThe [`utils`](./flows/utils/) contained therein have corresponding notebooks used for iterative development and EDA in [`notebooks`](./notebooks/).\n\n### Docker\nThe [`TransformAndIndex` workflow](./flows/transform_and_refresh_index.py) depends on a docker image. 
One is already configured in a public image repository, or you can rebuild the image yourself using [this Dockerfile](./dependencies/Dockerfile.embedding).\n\n## One-time setup\n🚨 Do not do this in the production Snowflake account unless you want to tear down and re-create the entire content universe!\n\n### Warehouse\nSnowflake setup queries are in [this directory](./snowflake_ops/).\n\n### dbt setup\n\n#### dbt init\n\n🚨 Do not do this unless you want to re-create the dbt project from scratch! \nOtherwise, the `ob_content_universe` directory is already populated.\n\nRun this command to create the `ob_content_universe` directory for dbt:\n```\ndbt init\n```\nAfter you answer the prompts, dbt appends an entry like the following to your `profiles.yml` file:\n```\nob_content_universe:\n  outputs:\n    dev:\n      account: ...\n      database: ob_content_universe_db\n      password: ...\n      role: ob_content_universe_dbt_role\n      schema: ...\n      threads: 8\n      type: snowflake\n      user: ...\n      warehouse: ...\n  target: dev\n```\n\n#### Install dbt packages\nRun this command to install the dbt packages the project depends on:\n```\ndbt deps\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fouterbounds%2Fob-content-universe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fouterbounds%2Fob-content-universe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fouterbounds%2Fob-content-universe/lists"}