{"id":15481731,"url":"https://github.com/juliaaplavin/datapipes.jl","last_synced_at":"2025-04-22T16:26:28.163Z","repository":{"id":238906046,"uuid":"759481250","full_name":"JuliaAPlavin/DataPipes.jl","owner":"JuliaAPlavin","description":"The most convenient piping syntax for generic data manipulation in Julia.","archived":false,"fork":false,"pushed_at":"2025-03-18T16:58:40.000Z","size":1157,"stargazers_count":10,"open_issues_count":3,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-17T07:16:49.921Z","etag":null,"topics":["data-manipulation","macro"],"latest_commit_sha":null,"homepage":"","language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JuliaAPlavin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-02-18T17:56:15.000Z","updated_at":"2025-03-18T16:58:50.000Z","dependencies_parsed_at":"2025-04-16T22:01:02.143Z","dependency_job_id":"09e4307d-622e-43af-865b-16707832d52e","html_url":"https://github.com/JuliaAPlavin/DataPipes.jl","commit_stats":null,"previous_names":["juliaaplavin/datapipes.jl"],"tags_count":51,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaAPlavin%2FDataPipes.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaAPlavin%2FDataPipes.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaAPlavin%2FDataPipes.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaAPlavin%2FDat
aPipes.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JuliaAPlavin","download_url":"https://codeload.github.com/JuliaAPlavin/DataPipes.jl/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250276171,"owners_count":21403812,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-manipulation","macro"],"created_at":"2024-10-02T05:05:49.993Z","updated_at":"2025-04-22T16:26:28.135Z","avatar_url":"https://github.com/JuliaAPlavin.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DataPipes.jl \u003c|\u003e\n\nFunction piping with a focus on making general data processing boilerplate-free.\n\n![](https://img.shields.io/badge/tests-passing-brightgreen?logo=github) `DataPipes.jl` is extensively tested with full coverage and more test lines than the actual code.\n\n_Questions other than direct bug reports are best asked in the [![](https://img.shields.io/badge/discourse-topic-brightgreen?logo=discourse)](https://discourse.julialang.org/t/ann-datapipes-jl/60734)._\n\n# Design\n\n![](https://img.shields.io/badge/motivation-why%3F-brightgreen) There are multiple implementations of the piping concept in Julia: [1](https://github.com/c42f/Underscores.jl), [2](https://github.com/jkrumbiegel/Chain.jl), [3](https://github.com/FNj/Hose.jl), [4](https://github.com/oxinabox/Pipe.jl), maybe even more. The design of `DataPipes` focuses on common data processing and analysis tasks. 
What makes `DataPipes` distinct from other packages is that it ticks all these points:\n\n✅ Gets rid of basically all boilerplate for common data processing functions:\n```julia\n@p tbl |\u003e filter(_.a \u003e 5) |\u003e map(_.b + _.c)\n```\n✅ Can be inserted as a step in a vanilla Julia pipeline without modifying the latter:\n```julia\ntbl |\u003e sum  # before\ntbl |\u003e @f(map(_ ^ 2) |\u003e filter(_ \u003e 5)) |\u003e sum  # after\n```\n✅ Can define a function transforming the data instead of immediately applying it:\n```julia\nfunc = @f map(_ ^ 2) |\u003e filter(_ \u003e 5) |\u003e sum  # define func\nfunc(tbl)  # apply it\n```\n✅ Supports easily exporting the result of an intermediate pipeline step:\n```julia\n@p let\n    tbl\n    @export tbl_filt = filter(_.a \u003e 5)  # export a single intermediate result\n    map(_.b + _.c)\nend\n\n@p begin  # use begin instead of let to make all intermediate results available afterwards\n    tbl\n    tbl_filt = filter(_.a \u003e 5)\n    map(_.b + _.c)\nend\n\n# tbl_filt is available here\n```\n✅ Provides no-boilerplate nesting:\n```julia\n@p let\n\t\"a=1 b=2 c=3\"\n\tsplit\n\tmap() do __  # `__` turns the inner function into a pipeline\n\t\tsplit(__, '=')\n\t\tSymbol(__[1]) =\u003e parse(Int, __[2])\n\tend\n\tNamedTuple\nend  # == (a = 1, b = 2, c = 3)\n```\n\n\nAs demonstrated, `DataPipes` tries to minimally modify regular Julia syntax and stays fully composable both with other tools _(vanilla pipelines)_ and with itself _(nested pipes)_.\n\n# Examples\n\nThese design decisions make `DataPipes` convenient both for working with flat tabular data and for processing nested structures. 
An example of the former:\n```julia\n@p begin\n    tbl\n    filter(!any(ismissing, _))\n    filter(_.id \u003e 6)\n    groupview(_.group)\n    map(sum(_.age))\nend\n```\n_(adapted from the Chain.jl README; all DataFrames-specific operations replaced with general functions)_\n\n\n![](https://img.shields.io/badge/docs-examples-brightgreen?logo=julia) See [the Pluto notebook](https://aplavin.github.io/DataPipes.jl/examples/notebook.html) for more examples and a more extensive description of the `DataPipes` syntax.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliaaplavin%2Fdatapipes.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjuliaaplavin%2Fdatapipes.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliaaplavin%2Fdatapipes.jl/lists"}