{"id":16445997,"url":"https://github.com/leahvelleman/fsmcontainers","last_synced_at":"2025-07-28T01:38:28.823Z","repository":{"id":95349735,"uuid":"99703573","full_name":"leahvelleman/fsmcontainers","owner":"leahvelleman","description":"A Pythonic container interface for finite state machines.","archived":false,"fork":false,"pushed_at":"2017-09-30T15:10:07.000Z","size":373,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-26T10:15:26.260Z","etag":null,"topics":["dictionaries","finite-state-automata","finite-state-machines","finite-state-transducers","python-interface","sets"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/leahvelleman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-08-08T14:44:26.000Z","updated_at":"2018-11-12T06:51:24.000Z","dependencies_parsed_at":"2023-03-10T06:15:33.103Z","dependency_job_id":null,"html_url":"https://github.com/leahvelleman/fsmcontainers","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/leahvelleman/fsmcontainers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leahvelleman%2Ffsmcontainers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leahvelleman%2Ffsmcontainers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leahvelleman%2Ffsmcontainers/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/leahvelleman%2Ffsmcontainers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/leahvelleman","download_url":"https://codeload.github.com/leahvelleman/fsmcontainers/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leahvelleman%2Ffsmcontainers/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267451232,"owners_count":24089298,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-27T02:00:11.917Z","response_time":82,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dictionaries","finite-state-automata","finite-state-machines","finite-state-transducers","python-interface","sets"],"created_at":"2024-10-11T09:46:05.423Z","updated_at":"2025-07-28T01:38:28.777Z","avatar_url":"https://github.com/leahvelleman.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# fsmcontainers\n\nA Pythonic container interface for finite state machines.\n\nRequires [Pynini](http://www.openfst.org/twiki/bin/view/GRM/Pynini), which itself requires [OpenFst](http://openfst.org), [re2](http://github.com/google/re2), Python 2.7, and a C++11 compiler to build. \n\n# Overview\n\nThis package provides two classes:\n\n* `fsmdict`: A finite state transducer that you interact with like a (potentially infinite, reversible) mapping. 
Supports all `dict` methods and operators, plus inversion, composition, concatenation, and others.\n* `fsmset`: A finite state acceptor that you interact with like a (potentially infinite) set. Supports all `set` methods and operators, plus composition with `fsmdict`s. \n\n**These classes have very different time behavior from standard `dict` and `set`.** Creation, \nmutation, and `len()` are not guaranteed to be fast, and neither is composition. \nTheir advantage is that inversion and lookup remain very fast even after a long chain of transducers has been composed together. This\nproperty is useful in natural language processing, among other places. \n\n(But if you don't need composition and just want a fast reversible mapping, you want [bidict](https://bidict.readthedocs.io/en/latest/basic-usage.html) instead.) \n\nKeys and values in an fsmdict, and items in an fsmset, must either be strings or come with a converter that converts them to \nand from strings. A converter for tuples of strings is included.\n\n# fsmdict\n\nAn fsmdict is an immutable Python mapping that is\nbacked by a finite state transducer. 
Like regular FSTs, fsmdicts can be inverted efficiently, \nand can be composed, concatenated, and so on.\n\n```python\n\u003e\u003e\u003e m = fsmdict({\"a\": \"b\"})\n\u003e\u003e\u003e m[\"a\"]\n\"b\"\n\u003e\u003e\u003e m.inv[\"b\"]\n\"a\"\n\u003e\u003e\u003e n = fsmdict({\"b\": \"c\"})\n\u003e\u003e\u003e p = m * n\n\u003e\u003e\u003e p[\"a\"]\n\"c\"\n\u003e\u003e\u003e q = m + n + m + n\n\u003e\u003e\u003e q[\"abab\"]\n\"bcbc\"\n```\n\n## Construction\n\nfsmdicts can be constructed directly from Pynini transducers, as well as from dictionaries or from \nany combination of arguments that can be passed to the `dict()` constructor.\n\n```python\n\u003e\u003e\u003e import pynini\n\u003e\u003e\u003e m2 = fsmdict(pynini.transducer(\"a\", \"b\").closure())\n\u003e\u003e\u003e m2[\"aaa\"]\n\"bbb\"\n\u003e\u003e\u003e m3 = fsmdict({\"a\": \"c\", \"b\": \"c\"})\n\u003e\u003e\u003e m3[\"a\"]\n\"c\"\n\u003e\u003e\u003e m4 = fsmdict(a=\"c\", b=\"c\")\n\u003e\u003e\u003e m3 == m4\nTrue\n```\n\n## Reversible, composable, concatenable, intersectable...\n\nUnlike regular dictionaries, fsmdicts are bidirectional. The inverse of an fsmdict can be accessed using\nthe `.inv` property or the `~` operator (the same conventions used in\n[bidict](https://bidict.readthedocs.io/en/latest/basic-usage.html)).\n\n```python\n\u003e\u003e\u003e m.inv[\"b\"]\n\"a\"\n\u003e\u003e\u003e (~m)[\"b\"]\n\"a\"\n\u003e\u003e\u003e (~m)[\"a\"]\nKeyError\n\u003e\u003e\u003e ~~m == m\nTrue\n```\n\nOther common FST operators are supported as well: `*` for composition, `+` for concatenation, and `|` for union. \n\n```python\n\u003e\u003e\u003e n = fsmdict({\"b\": \"c\"})\n\u003e\u003e\u003e (m * n)[\"a\"]\n\"c\"\n\u003e\u003e\u003e (m + n)[\"ab\"]\n\"bc\"\n\u003e\u003e\u003e (m | n)[\"a\"]\n\"b\"\n\u003e\u003e\u003e (m | n)[\"b\"]\n\"c\"\n```\n\nNote that composition follows the convention that `(m * n)[k] == n[m[k]]` --- the *first* fsmdict given is the *innermost*. 
\nThis convention is used in other FST libraries (e.g. Pynini, XFST, foma), and makes sense if `m` and `n` are thought of as sound \nchanges, rewrite rules, or other transformations: `m*n` can be read in chronological order as \"first apply `m`, then apply `n`.\" \nBut it differs from the math convention where `(f ∘ g)(x) == f(g(x))`, or the Haskell convention where `(f . g) x == f (g x)`,\nboth readable as \"first apply `g`, then apply `f`.\"\n\n## One-to-many\n\nUnlike regular dictionaries, fsmdicts can be one-to-many. When a key is mapped to many values, the values are returned as an\nfsmset, which offers the same interface as an ordinary set and can be compared with ordinary sets for equality.\n\n```python\n\u003e\u003e\u003e m3 = fsmdict({\"a\": \"c\", \"b\": \"c\"})\n\u003e\u003e\u003e (~m3)[\"c\"] == {\"a\", \"b\"}\nTrue\n```\n\nfsmsets or ordinary sets can also be used as keys. `m[{a,b, ... ,z}]` is syntactic sugar for \n`fsmset(m[a]) + fsmset(m[b]) + ... + fsmset(m[z])`. \nThis gives the convenient properties that any value in `m` can serve as a key for `~m` and that `~m * m` maps every value \nin `m.values()` to itself. It also preserves the identity `(m * n)[a] == n[m[a]]`.\n\n```python\n\u003e\u003e\u003e m3[{\"a\", \"b\"}]\n\"c\"\n\u003e\u003e\u003e (~m3 * m3)[\"c\"]\n\"c\"\n```\n\n`len(m)` returns the number of individual (non-set) keys in `m`, and `m.keys()` returns an iterator over the individual\n(non-set) keys. This means that `len(m)` is equal to `len(m.keys())` but may not \nbe equal to `len(m.values())`, that `len(m)` and `len(~m)` may not be equal, and that if `s` is a set then\n`m[s]` is well-defined even though `s not in m.keys()`. \n\n(There is some Pythonic precedent for the last behavior\nin slice objects: if `l` is an ordinary Python list and `s` is a slice, then `l[s]` is well-defined even though \n`s not in range(len(l))`.)\n\n## Potentially infinite\n\nfsmdicts can be based on cyclic FSTs. 
This means that, unlike dictionaries, they can support mappings that take\na (theoretically) infinite number of keys, or return a (theoretically) infinite number of different values, or both. \nFor instance, this transducer maps a string of `a`s of any length to a string of `b`s of equal length.\n\n```python\n\u003e\u003e\u003e m2 = fsmdict(pynini.transducer(\"a\", \"b\").closure())\n\u003e\u003e\u003e m2[\"aaaaa\"]\n\"bbbbb\"\n\u003e\u003e\u003e m2[\"a\" * 100000]\n\"bbb ... b\"\n```\n\nIf `m` is cyclic on its key side, `len(m)`, `m.keys()`, `m.values()`, `m.items()`, and `for k in m` raise errors. If `m` is cyclic on its value\nside and maps `k` to an infinite number of values, then `m[k]` and `m.get(k)` also raise errors.\n\nAn fsmdict created using the `limit()` method behaves differently. If `m = fsmdict( ... ).limit(n)` then \n`len(m)` is `n`; `len(m[k])` is at most `n` for any `k`; and `m.keys()`, `m.values()`, and `m.items()` yield at most `n` items\nbefore stopping. There is no guarantee that `m.keys()`, `m.values()`, and `m.items()` on a limited fsmdict will yield \n*corresponding* keys and values: there may be some `k` in `m.keys()` such that `m[k]` is *not* in `m.values()`.\n\n## Goals\n\n* Python 2/3 support\n* FstSet class for infinite set support, so that `m[k]` can be well-defined even if `k*m` is cyclic on the output side\n* Priority union and priority composition operators (`/` and `%`?) for rule-based linguistic applications\n* Methods for taking closure, etc., without calling Pynini functions\n* Automagic handling of sigmas in rewrite rules\n* Mutability?\n* Support for more FST libraries?\n* Conversion between FstSets and re2 regexes?\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleahvelleman%2Ffsmcontainers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fleahvelleman%2Ffsmcontainers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleahvelleman%2Ffsmcontainers/lists"}