{"id":16348869,"url":"https://github.com/alt-romes/hegg","last_synced_at":"2025-08-21T00:32:00.833Z","repository":{"id":43441296,"uuid":"509998976","full_name":"alt-romes/hegg","owner":"alt-romes","description":"Fast equality saturation in Haskell","archived":false,"fork":false,"pushed_at":"2025-07-13T20:39:45.000Z","size":1100,"stargazers_count":86,"open_issues_count":10,"forks_count":8,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-08-18T16:43:58.178Z","etag":null,"topics":["egraphs","equality-saturation","haskell"],"latest_commit_sha":null,"homepage":"https://hackage.haskell.org/package/hegg","language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alt-romes.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-07-03T11:06:04.000Z","updated_at":"2025-07-19T14:04:12.000Z","dependencies_parsed_at":"2023-01-25T21:01:16.307Z","dependency_job_id":"d60b8ccb-0b5d-40ab-980a-d28421b99f24","html_url":"https://github.com/alt-romes/hegg","commit_stats":{"total_commits":175,"total_committers":2,"mean_commits":87.5,"dds":0.005714285714285672,"last_synced_commit":"52c0143b217059f5686e45a2a4da03a8546bf205"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/alt-romes/hegg","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alt-romes%2Fhegg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alt-romes%2Fhegg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
alt-romes%2Fhegg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alt-romes%2Fhegg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alt-romes","download_url":"https://codeload.github.com/alt-romes/hegg/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alt-romes%2Fhegg/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271409446,"owners_count":24754715,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-20T02:00:09.606Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["egraphs","equality-saturation","haskell"],"created_at":"2024-10-11T00:55:30.388Z","updated_at":"2025-08-21T00:32:00.827Z","avatar_url":"https://github.com/alt-romes.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"## hegg\n\nFast equality saturation in Haskell\n\nBased on [*egg: Fast and Extensible Equality Saturation*](https://arxiv.org/pdf/2004.03082.pdf), [*Relational E-matching*](https://arxiv.org/pdf/2108.02290.pdf) and the [rust implementation](https://github.com/egraphs-good/egg).\n\n### Equality Saturation and E-graphs\n\nSuggested material on equality saturation and e-graphs for beginners\n* (tutorial) https://docs.rs/egg/latest/egg/tutorials/_01_background/index.html\n* (5m video) 
https://www.youtube.com/watch?v=ap29SzDAzP0\n\n## Equality saturation in Haskell\n\nTo get a feel for how we can use `hegg` and do equality saturation in Haskell,\nwe'll write a simple numeric *symbolic* manipulation library that can simplify expressions\naccording to a set of rewrite rules by leveraging equality saturation.\n\nIf you've never heard of symbolic mathematics you might get some intuition from\nreading [Let’s Program a Calculus\nStudent](https://iagoleal.com/posts/calculus-symbolic/) first.\n\n### Syntax\n\nWe'll start by defining the abstract syntax tree for our simple symbolic expressions:\n```hs\ndata SymExpr = Const Double\n             | Symbol String\n             | SymExpr :+: SymExpr\n             | SymExpr :*: SymExpr\n             | SymExpr :/: SymExpr\ninfix 6 :+:\ninfix 7 :*:, :/:\n\ne1 :: SymExpr\ne1 = (Symbol \"x\" :*: Const 2) :/: (Const 2) -- (x*2)/2\n```\n\nYou might notice that `(x*2)/2` is the same as just `x`. Our goal is to get\nequality saturation to do that for us.\n\nOur second step is to instance `Language` for our `SymExpr`.\n\n### Language\n\n`Language` is the required constraint on *expressions* that are to be\nrepresented in an e-graph and on which equality saturation can be run:\n\n```hs\ntype Language l = (Traversable l, ∀ a. Ord a =\u003e Ord (l a))\n```\n\nTo declare a `Language` we must write the \"base functor\" of `SymExpr` (i.e. use\na type parameter where the recursion points used to be in the original\n`SymExpr`), then instance `Traversable l`, `∀ a. 
Ord a =\u003e Ord (l a)` (we can do\nit automatically through deriving), and write an `Analysis` instance for it (see\nnext section).\n\n```hs\ndata SymExpr a = Const Double\n               | Symbol String\n               | a :+: a\n               | a :*: a\n               | a :/: a\n               deriving (Eq, Ord, Show, Functor, Foldable, Traversable)\ninfix 6 :+:\ninfix 7 :*:, :/:\n```\n\nSuggested reading on defining recursive data types in their parametrized\nversion: [Introduction To Recursion\nSchemes](https://blog.sumtypeofway.com/posts/introduction-to-recursion-schemes.html)\n\nIf we now wanted to represent an expression, we'd write it in its\nfixed-point form:\n\n```hs\ne1 :: Fix SymExpr\ne1 = Fix (Fix (Fix (Symbol \"x\") :*: Fix (Const 2)) :/: (Fix (Const 2))) -- (x*2)/2\n```\nThen, we define an `Analysis` for our `SymExpr`.\n\n### Analysis\n\nE-class analysis is first described in [*egg: Fast and Extensible Equality\nSaturation*](https://arxiv.org/pdf/2004.03082.pdf) as a way to make equality\nsaturation more *extensible*.\n\nWith it, we can attach *analysis data* from a semilattice to each e-class. More\ncan be read about e-class analysis in the [`Data.Equality.Analysis`]() module and\nin the paper.\n\nWe can easily define constant folding (`2+2` being simplified to `4`) through\nan `Analysis` instance.\n\nAn `Analysis` is defined over a `domain` and a `language`. 
To define constant\nfolding, we'll say the domain is `Maybe Double` to attach a value of that type to\neach e-class, where `Nothing` indicates the e-class does not currently have a\nconstant value and `Just i` means the e-class has constant value `i`.\n\n```hs\ninstance Analysis (Maybe Double) SymExpr where\n  makeA = ...\n  joinA = ...\n  modifyA = ...\n```\n\nLet's now understand and implement the three methods of the analysis instance we want.\n\n`makeA` is called when a new e-node is added to a new e-class, and constructs\nfor the new e-class a new value of the domain to be associated with it, always\nby accessing the data associated with the node's children.  Its type is `l\ndomain -\u003e domain`, so note that the data associated with the e-node's children is\ndirectly available in place of the actual children.\n\nWe want to associate constant data to the e-class, so we must find if the\ne-node has a constant value or otherwise return `Nothing`:\n\n```hs\nmakeA :: SymExpr (Maybe Double) -\u003e Maybe Double\nmakeA = \\case\n  Const x -\u003e Just x\n  Symbol _ -\u003e Nothing\n  x :+: y -\u003e (+) \u003c$\u003e x \u003c*\u003e y\n  x :*: y -\u003e (*) \u003c$\u003e x \u003c*\u003e y\n  x :/: y -\u003e (/) \u003c$\u003e x \u003c*\u003e y\n```\n\n`joinA` is called when e-classes c1 and c2 are being merged into c. In this case, we\nmust join the e-class data from both classes to form the e-class data to be\nassociated with the new e-class c. Its type is `domain -\u003e domain -\u003e domain`.  
In our\ncase, to merge `Just _` with `Nothing` we simply take the `Just`, and if we\nmerge two e-classes with a constant value (that is, both are `Just`), then the\nconstant value is the same (or something went very wrong) and we just keep it.\n\n```hs\njoinA :: Maybe Double -\u003e Maybe Double -\u003e Maybe Double\njoinA Nothing (Just x) = Just x\njoinA (Just x) Nothing = Just x\njoinA Nothing Nothing  = Nothing\njoinA (Just x) (Just y) = if x == y then Just x else error \"ouch, that shouldn't have happened\"\n```\n\nFinally, `modifyA` describes how an e-class should (optionally) be modified\naccording to the e-class data and what new language expressions are to be added\nto the e-class, also w.r.t. the e-class data.\nIts type is `ClassId -\u003e EGraph domain l -\u003e EGraph domain l`, where the first argument\nis the id of the class to modify (the class which prompted the modification);\nthe function then takes an e-graph and returns it with that e-class\nmodified.  For our example, if the e-class has a constant value associated with\nit, we want to create a new e-class with that constant value and merge it with\nthis e-class.\n\n```hs\n-- import Data.Equality.Graph.Lens ((^.), _class, _data)\nmodifyA :: ClassId -\u003e EGraph (Maybe Double) SymExpr -\u003e EGraph (Maybe Double) SymExpr\nmodifyA c egr\n    = case egr ^._class c._data of\n        Nothing -\u003e egr\n        Just i -\u003e\n          let (c', egr') = represent (Fix (Const i)) egr\n           in snd $ merge c c' egr'\n```\n\n`modifyA` is a bit trickier than the other methods, but it allows our e-graph to\nchange based on the e-class analysis data. Note that the method is optional and\nthere's a default implementation for it which neither changes the e-class nor adds\nanything to it. Analysis data can be otherwise used, e.g., to inform rewrite\nconditions.\n\nBy instancing this e-class analysis, all e-classes that have a constant value\nassociated with them will also have an e-node with a constant value. 
This is great\nfor our simple symbolic library because it means if we ever find e.g. an\nexpression equal to `3+1`, we'll also know it to be equal to `4`, which is a\nbetter result than `3+1` (we've then successfully implemented constant folding).\n\nIf, otherwise, we didn't want to use an analysis, we could specify the analysis\ndomain as `()` which will make the analysis do nothing, because there's an\ninstance polymorphic over `lang` for `()` that looks like this:\n\n```hs\ninstance Analysis () lang where\n  makeA _ = ()\n  joinA _ _ = ()\n```\n\n### Equality saturation\n\nEquality saturation is defined as the function\n```hs\nequalitySaturation :: forall l. Language l\n                   =\u003e Fix l             -- ^ Expression to run equality saturation on\n                   -\u003e [Rewrite l]       -- ^ List of rewrite rules\n                   -\u003e CostFunction l    -- ^ Cost function to extract the best equivalent representation\n                   -\u003e (Fix l, EGraph l) -- ^ Best equivalent expression and resulting e-graph\n```\n\nTo recap, our goal is to reach `x` starting from `(x*2)/2` by means of equality\nsaturation.\n\nWe already have a starting expression, so we're missing a list of rewrite rules\n(`[Rewrite l]`) and a cost function (`CostFunction`).\n\n### Cost function\n\nPicking up the easy one first:\n```hs\ntype CostFunction l cost = l cost -\u003e cost\n```\n\nA cost function is used to attribute a cost to representations in the e-graph and to extract the best one.\nThe first type parameter `l` is the language we're going to attribute a cost to, and\nthe second type parameter `cost` is the type with which we will model cost. 
For\nthe cost function to be valid, `cost` must instance `Ord`.\n\nWe'll say `Const`s and `Symbol`s are the cheapest and then in increasing cost we\nhave `:+:`, `:*:` and `:/:`, and model cost with the `Int` type.\n```hs\ncost :: CostFunction SymExpr Int\ncost = \\case\n  Const  x -\u003e 1\n  Symbol x -\u003e 1\n  c1 :+: c2 -\u003e c1 + c2 + 2\n  c1 :*: c2 -\u003e c1 + c2 + 3\n  c1 :/: c2 -\u003e c1 + c2 + 4\n```\n\n### Rewrite rules\n\nRewrite rules are transformations applied to matching expressions represented in\nan e-graph.\n\nWe can write simple rewrite rules and conditional rewrite rules, but we'll only look at the simple ones.\n\nA simple rewrite consists of a left hand side and a right hand side. When the\nleft hand side is matched in the e-graph, the right hand side is added to the\ne-class where the left hand side was found.\n```hs\ndata Rewrite lang = Pattern lang := Pattern lang          -- Simple rewrite rule\n                  | Rewrite lang :| RewriteCondition lang -- Conditional rewrite rule\n```\n\nA `Pattern` is basically an expression that might contain variables and which can be matched against actual expressions.\n```hs\ndata Pattern lang\n    = NonVariablePattern (lang (Pattern lang))\n    | VariablePattern Var\n```\nA pattern is defined by its non-variable and variable parts, and can be\nconstructed directly or using the helper function `pat` and using\n`OverloadedStrings` for the variables, where `pat` is just a synonym for\n`NonVariablePattern` and a string literal `\"abc\"` is turned into a `Pattern`\nconstructed with `VariablePattern`.\n\nWe can then write the following very specific set of rewrite rules to simplify\nour simple symbolic expressions.\n```hs\nrewrites :: [Rewrite SymExpr]\nrewrites =\n  [ pat (pat (\"a\" :*: \"b\") :/: \"c\") := pat (\"a\" :*: pat (\"b\" :/: \"c\"))\n  , pat (\"x\" :/: \"x\")               := pat (Const 1)\n  , pat (\"x\" :*: (pat (Const 1)))   := \"x\"\n  ]\n```\n### Equality saturation, again\n\nWe 
can now run equality saturation on our expression!\n\n```hs\nlet expr = fst (equalitySaturation e1 rewrites cost)\n```\nAnd upon printing we'd see `expr = Symbol \"x\"`!\n\nIf we had instead `e2 = Fix (Fix (Fix (Symbol \"x\") :/: Fix (Symbol \"x\")) :+:\n(Fix (Const 3))) -- (x/x)+3`, we'd get `expr = Const 4` because of our rewrite\nrules put together with our constant folding!\n\nThis was a first introduction which skipped over some details but that tried to\nwalk through fundamental concepts for using e-graphs and equality saturation\nwith this library.\n\nThe final code for this tutorial is available under `test/SimpleSym.hs`\n\nA more complicated symbolic rewrite system which simplifies some derivatives and\nintegrals was written for the testsuite. It can be found at `test/Sym.hs`.\n\nThis library could also be used not only for equality-saturation but also for\nthe equality-graphs and other equality-things (such as e-matching) available.\nFor example, using just the e-graphs from `Data.Equality.Graph` to improve GHC's\npattern match checker (https://gitlab.haskell.org/ghc/ghc/-/issues/19272).\n\n## Debugging Rewrite Rules\n\nTo debug rewrite rules when doing equality saturation, wrap the `Scheduler`\nwith `TracingScheduler`. The tracing scheduler will use the underlying\nscheduler but log all the rules matched beforehand. 
Seeing all the rules which\nfire makes it easy to debug the set of rewrite rules, especially when it loops.\n\n## Profiling\n\nNotes on profiling for development.\n\nFor producing the info table, ghc-options must include `-finfo-table-map\n-fdistinct-constructor-tables`\n\n```\ncabal run --enable-profiling hegg-test -- +RTS -p -s -hi -l-agu\nghc-prof-flamegraph hegg-test.prof\neventlog2html hegg-test.eventlog\nopen hegg-test.svg\nopen hegg-test.eventlog.html\n```\n\n## Coverage\n\n```\ncabal test hegg-test --enable-coverage --enable-library-coverage\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falt-romes%2Fhegg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falt-romes%2Fhegg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falt-romes%2Fhegg/lists"}