{"id":17658664,"url":"https://github.com/ruuda/dfc","last_synced_at":"2025-07-12T14:34:49.570Z","repository":{"id":66191226,"uuid":"160945704","full_name":"ruuda/dfc","owner":"ruuda","description":"Dataflow compiler experiment","archived":false,"fork":false,"pushed_at":"2018-12-08T14:20:10.000Z","size":45,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-30T10:30:22.063Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Haskell","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ruuda.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-12-08T14:12:42.000Z","updated_at":"2020-07-04T11:35:06.000Z","dependencies_parsed_at":"2023-03-07T10:16:01.288Z","dependency_job_id":null,"html_url":"https://github.com/ruuda/dfc","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ruuda/dfc","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ruuda%2Fdfc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ruuda%2Fdfc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ruuda%2Fdfc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ruuda%2Fdfc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ruuda","download_url":"https://codeload.github.com/ruuda/dfc/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/ruuda%2Fdfc/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265003809,"owners_count":23696292,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-23T15:28:10.020Z","updated_at":"2025-07-12T14:34:49.535Z","avatar_url":"https://github.com/ruuda.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Dataflow Compiler\n\nA proof of concept optimizing compiler for a dataflow-based intermediate stream\nprocessing language. The compiler takes as input a program that, for every element in\nthe input stream, yields zero or more elements. (Think list comprehensions in\nPython, `for` comprehensions in Scala, or LINQ in C#.) It optimizes the program\nby analyzing the data flow, taking advantage of purity, and of the limited\nopportunity for control flow in such programs.\n\nMy goal is to target a strict language, which is an interesting code generation\nproblem because the data flow leaves a lot of freedom for scheduling operations.\nWe need to infer the control flow from the data flow, partly to avoid\ndoing useless work, but also for correctness: a conditional division that\nverifies that the denominator is nonzero should not compile to a program that\ndivides by zero anyway. Targeting a lazy language is simpler in this regard,\nbecause dependencies are tracked dynamically at runtime, rather than statically\nat compile time. For example, if a value is only used conditionally, we should\nonly compute it in the branch where it is used. 
In a lazy language we could get\naway with unconditionally producing a thunk, as it would only be forced inside\nthe branch.\n\n## Implementation Notes\n\n * The variable and expression types use GADTs for type safety. An optimization\n   pass that would change the type of a value would not typecheck.\n * The use of `PatternSynonyms` and `ViewPatterns` makes for quite readable\n   peephole optimization passes.\n * I started out allowing both variables and constants in expressions, but\n   allowing only variables (and making constants expressions) makes writing\n   optimizations more uniform.\n * Having an identity expression is useful to modularize optimization passes.\n   One pass would rewrite `$2 = $1 + 0` to `$2 = $1`, and it does not need to\n   be cluttered by anything else. Another pass would then replace references to\n   `$2` with references to `$1`, at which point `$2` becomes dead code.\n\n## Building\n\n    stack build\n    stack exec dfc\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fruuda%2Fdfc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fruuda%2Fdfc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fruuda%2Fdfc/lists"}