{"id":19426603,"url":"https://github.com/oscaro/ds-test-tools","last_synced_at":"2025-04-24T17:31:10.261Z","repository":{"id":62431910,"uuid":"180535748","full_name":"oscaro/ds-test-tools","owner":"oscaro","description":"Small library to help test Datasplash pipelines.","archived":false,"fork":false,"pushed_at":"2024-10-04T12:38:00.000Z","size":40,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-04-15T10:24:07.803Z","etag":null,"topics":["clojure-library","dataflow"],"latest_commit_sha":null,"homepage":"https://cljdoc.org/d/com.oscaro/ds-test-tools/0.1.1/doc/readme","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"epl-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oscaro.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-04-10T08:20:56.000Z","updated_at":"2024-10-04T12:37:41.000Z","dependencies_parsed_at":"2022-11-01T21:01:13.387Z","dependency_job_id":"77382048-a0ca-4644-9e24-ae203e50e936","html_url":"https://github.com/oscaro/ds-test-tools","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oscaro%2Fds-test-tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oscaro%2Fds-test-tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oscaro%2Fds-test-tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oscaro%2Fds-test-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oscaro","download_url":"ht
tps://codeload.github.com/oscaro/ds-test-tools/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250674278,"owners_count":21469190,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure-library","dataflow"],"created_at":"2024-11-10T14:08:18.545Z","updated_at":"2025-04-24T17:31:09.926Z","avatar_url":"https://github.com/oscaro.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ds-test-tools [![Clojure CI](https://github.com/oscaro/ds-test-tools/actions/workflows/clojure.yml/badge.svg)](https://github.com/oscaro/ds-test-tools/actions/workflows/clojure.yml) [![Clojars Project](https://img.shields.io/clojars/v/com.oscaro/ds-test-tools.svg)](https://clojars.org/com.oscaro/ds-test-tools)\n\n\n`ds-test-tools` is a small library to help test [Datasplash][] pipelines.\n\n[Datasplash]: https://github.com/ngrunwald/datasplash\n\n## Usage\n\n```clojure\n[com.oscaro/ds-test-tools \"0.2.1\"]\n```\n\nThen:\n\n```clojure\n(ns your.project\n  (:require [ds-test-tools.core :as dt]))\n```\n\nThe only function you need is `dt/run-pipeline`.\n\nIt takes inputs as Clojure data, mapping of keys to result files, and a\nfunction to call in order to build the pipeline. 
It dumps the input data in the\nappropriate files; builds a configuration map; passes it to your function; runs\nthe pipeline; collects the results; and returns them to you.\n\n### Simple Usage\n\n```clojure\n;; Your pipeline\n(defn my-job [conf p]\n  (-\u003e\u003e p\n    (ds/read-edn-file (:numbers conf))\n    (ds/map inc)\n    (ds/write-edn-file (str (:output conf) \"/higher.edn\"))))\n\n\n(let [{:keys [result]} (dt/run-pipeline\n                         {:numbers [1 2 3 4]}\n                         {:result \"higher\"}\n                         my-job)]\n  (println (sort result))) ; '(2 3 4 5)\n```\n\n#### Specifying inputs\n\nThe inputs config map uses the same format as your configuration map. Your\nbuild function should take a map of keywords to file paths:\n\n```clojure\n(defn my-job [{:keys [people houses output]} p]\n  (let [people (ds/read-edn-file people p)\n        houses (ds/read-edn-file houses p)]\n    (-\u003e\u003e (ds/join-by (fn [p h] [p :lives-in h])\n                     [[people :house-id {:type :required}]\n                      [houses :id {:type :required}]])\n         (ds/write-edn-file (tio/join-path output \"housing.edn\")))))\n```\n\nThe pipeline above would use the following inputs config map:\n\n```clojure\n{:people [{:name \"John\" :house-id 1} {:name \"Jane\" :house-id 2} ...]\n :houses [{:id 1 :name \"Red House\"} {:id 2 :name \"Green House\"} ...]}\n```\n\n#### Changing the input/output format\n\nBy default, it assumes you use EDN as inputs and outputs. 
You can change that\nby setting the `:reader` (used to read outputs) and `:writer` (used to write\ninputs) keys in the optional options map:\n\n```clojure\n(dt/run-pipeline\n    {:reader :jsons  ; or :edns (the default)\n     :writer :jsons} ; or :edns (the default)\n    inputs-config\n    outputs-config)\n```\n\nIf you have mixed formats or something other than EDN/JSONS, you can also\nprovide a function of two (readers) or three (writers) arguments:\n\n```clojure\n(dt/run-pipeline\n    {:reader (fn [k filename] ; k is the key in your inputs-config map\n               (case k\n                 :my-jsons-output (tio/read-jsons-file filename)\n                 :my-edns-output (tio/read-edns-file filename)))\n\n     :writer (fn [k filename data]\n               (case k\n                 :my-jsons-input (tio/write-jsons-file filename data)\n                 :my-csv-input (tio/write-csv-file filename data)\n                 :my-text-input (tio/write-text-file filename data)))}\n    inputs-config\n    outputs-config)\n```\n\n## License\n\nCopyright © 2018-2019 Oscaro\n\nDistributed under the Eclipse Public License either version 1.0 or (at your\noption) any later version.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foscaro%2Fds-test-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foscaro%2Fds-test-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foscaro%2Fds-test-tools/lists"}