{"id":20359582,"url":"https://github.com/flaviuvadan/pipe-flow","last_synced_at":"2026-06-01T02:31:42.684Z","repository":{"id":119803256,"uuid":"250435841","full_name":"flaviuvadan/pipe-flow","owner":"flaviuvadan","description":"A data processing pipeline library with a common vocabulary API","archived":false,"fork":false,"pushed_at":"2020-04-11T17:59:10.000Z","size":164,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-12-06T02:42:36.068Z","etag":null,"topics":["dataprocessing","golang","pipeline"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flaviuvadan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-03-27T03:58:55.000Z","updated_at":"2020-04-11T17:59:12.000Z","dependencies_parsed_at":"2023-06-18T23:05:13.409Z","dependency_job_id":null,"html_url":"https://github.com/flaviuvadan/pipe-flow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/flaviuvadan/pipe-flow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flaviuvadan%2Fpipe-flow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flaviuvadan%2Fpipe-flow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flaviuvadan%2Fpipe-flow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flaviuvadan%2Fpipe-flow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/owners/flaviuvadan","download_url":"https://codeload.github.com/flaviuvadan/pipe-flow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flaviuvadan%2Fpipe-flow/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33757790,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-01T02:00:06.963Z","response_time":115,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataprocessing","golang","pipeline"],"created_at":"2024-11-14T23:35:02.716Z","updated_at":"2026-06-01T02:31:42.665Z","avatar_url":"https://github.com/flaviuvadan.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pipe-flow\nA data processing library that allows the creation of parallel pipelines that end in a common point.\n\n## Build status\n[![\u003cflaviuvadan\u003e](https://circleci.com/gh/flaviuvadan/pipe-flow.svg?style=svg)](https://app.circleci.com/pipelines/github/flaviuvadan/pipe-flow)\n\n## Flow diagram\n\n![](diagram.png)\n\n## Source\nA data source that holds data that will be passed through pipelines. For now, it is limited to taking in a CSV \nformatted file. The CSV is read and a pipeline is created for each column. 
The user is responsible for creating\nthe function that runs on a specific column of the CSV file.\n\n## Pipe\nThe structure through which data flows. The pipeline either applies the specified user function to each data point\nindependently or performs an aggregation of all the data points to create a common summary. Data passes straight through\nthe pipeline, which offers the option to report progress as data is processed.\n\n## Sink\nThe sink is a data repository that aggregates all the data that pipeline operations were performed on and creates a new\nCSV file that holds the results. The results may not be structured the same way as the input CSV because of the \ndifferent pipeline functions that can be created. For example, a CSV column may end with a summary statistic while \nanother may end with independently modified values.\n\n## Structure\nA concept that holds and coordinates calls to flow data through pipes, and makes the sink dump its data once\neverything is done.\n\n### Code examples\nSee `main.go` for an example.\n\n## Test and build\nRun: \n```\n# build the project files\ngo build .\n# test all the files of the project, including sub-directories\ngo test ./...\n```\n\n## TODO\n\n1. Make pipes run in parallel\n1. Add pipe ability to report progress\n1. Make structure allow the user to specify whether to report progress or not\n1. Other TODOs left in the code\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflaviuvadan%2Fpipe-flow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflaviuvadan%2Fpipe-flow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflaviuvadan%2Fpipe-flow/lists"}