{"id":34030916,"url":"https://github.com/gening/seq_re","last_synced_at":"2026-04-07T07:03:35.664Z","repository":{"id":57465724,"uuid":"87981938","full_name":"gening/seq_re","owner":"gening","description":"2-dimensional Sequence Regular Expression (SEQ RE)","archived":false,"fork":false,"pushed_at":"2020-03-07T17:32:00.000Z","size":540,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-03-13T01:57:47.051Z","etag":null,"topics":["2d-array","matrix","n-tuple","n-vector","regex","regular-expression","seq","sequence","token"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gening.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-04-11T21:41:37.000Z","updated_at":"2019-11-19T22:59:31.000Z","dependencies_parsed_at":"2022-09-17T18:10:13.126Z","dependency_job_id":null,"html_url":"https://github.com/gening/seq_re","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/gening/seq_re","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gening%2Fseq_re","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gening%2Fseq_re/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gening%2Fseq_re/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gening%2Fseq_re/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gening","download_url":"https://codeload.github.com/gening/seq_re/tar.gz/refs/heads/master","sbom
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gening%2Fseq_re/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31503394,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T03:10:19.677Z","status":"ssl_error","status_checked_at":"2026-04-07T03:10:13.982Z","response_time":105,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["2d-array","matrix","n-tuple","n-vector","regex","regular-expression","seq","sequence","token"],"created_at":"2025-12-13T18:04:16.652Z","updated_at":"2026-04-07T07:03:35.659Z","avatar_url":"https://github.com/gening.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"2-dimensional Sequence Regular Expression (SEQ RE)\n==================================================\n\nThis module provides regular expression matching operations over a sequence of tuples\n(or a sequence of sequence) data structure. It looks like the following::\n\n    seq_m_n = [[str_11, str_12, ... str_1n],\n               [str_21, str_22, ... str_2n],\n                ...,\n               [str_m1, str_m2, ... 
str_mn]]\n\nThe sequence is a homogeneous 2D array, that is, a matrix with m rows and n columns.\nIn practice, m may vary from sequence to sequence, while n is usually fixed.\n\nAn element of a tuple in the sequence can be a string, a word, a phrase,\na char, a flag, a token or a tag; a set of tags or values (multi-values) may be supported in the future.\n\nTo match a pattern over a sequence of tuples,\na SEQ RE pattern is written like one of the following examples::\n\n    ([;;PERSON]+) [was|has been] [an]? .{0,3} ([^painter|drawing artist|画家])\n\n    (?P\u003cname@0,1,2\u003e[;;PERSON]) [;VERB be;] [born] [on] (?P\u003cbirthday@0:3\u003e([;;NUMBER|MONTH]|[-]){2,3})\n\n\n1. The syntax of SEQ RE pattern\n-------------------------------\n\nA SEQ RE pattern is very similar to an ordinary regular expression (RE) in Python,\nexcept that the delimiters ``[...]`` indicate a tuple -- the second dimension of the sequence.\n\n1.1 Inside ``[...]``\n++++++++++++++++++++\n\n- ``[`` and ``]``\n\n  are the beginning and end delimiters of a tuple, e.g. ``[...]``.\n\n- ``;``\n\n  separates the elements of the tuple.\n  Consecutive ``;`` at the tail can be omitted,\n  e.g. ``[A|B;X;;]`` can be written as ``[A|B;X]``.\n\n- ``|``\n\n  separates the alternative values of one element, e.g. ``A|B``.\n  These values form a set, and any string in the set will be matched,\n  e.g. 
``A|B`` will match ``A`` or ``B``.\n\n- ``^``\n\n  negates an element when it is the first character of that element:\n  every string that is not in the value set of the element will be matched.\n  ``^`` has no special meaning if it is not the first character of the element.\n  If ``^`` is the first character of an element but is part of a literal string,\n  ``\\^`` should be used to escape it.\n\n- The priority of the above-mentioned operations:\n\n  ``[`` ``]`` \u003c ``;`` \u003c ``^`` (not literal) \u003c ``|`` \u003c ``^`` (literal).\n\n- ``\\``\n\n  escapes the aforementioned special characters.\n  Characters other than ``]``, ``;`` or ``\\`` lose their special meaning inside ``[...]``.\n  To express a literal ``]``, ``;`` or ``|``, add ``\\`` before it.\n  Likewise, to represent a literal backslash ``\\`` before ``]``, ``;`` or ``|``,\n  ``\\\\`` should be used in the plain text;\n  that is to say, ``'\\\\\\\\'`` must be used in Python code.\n\n1.2 Outside ``[...]``\n+++++++++++++++++++++\n\n- Special characters keep the meaning they have in ordinary RE,\n  with the limitations listed below.\n\n  1. ``[`` and ``]`` are **not** supported as special characters that indicate a set of characters.\n\n  2. The following escaped special characters are **not** supported:\n     ``\\number``, ``\\A``, ``\\b``, ``\\B``, ``\\d``, ``\\D``, ``\\s``, ``\\S``,\n     ``\\w``, ``\\W``, ``\\Z``, ``\\a``, ``\\b``, ``\\f``, ``\\n``, ``\\r``, ``\\t``, ``\\v``,\n     ``\\x``.\n\n  3. Ranges of characters are **not** supported,\n     such as ``[0-9A-Za-z]``, ``[\\u4E00-\\u9FBB\\u3007]`` (Unihan and Chinese character ``〇``)\n     used in ordinary RE.\n\n  4. 
Whitespace and non-special characters are ignored.\n\n- ``.`` is an abbreviation for an arbitrary tuple, i.e. ``[]`` or ``[;]``.\n\n- Named groups in the pattern are very useful.\n  As an extension, a format string starting with ``@`` can follow the group name\n  to describe which elements of the tuples belonging to this group will be output as the result.\n  For example: ``(?P\u003cname@d1,d2:d3\u003e...)``,\n  in which ``d1``, ``d2`` and ``d3`` are all 0-based position indexes of elements in the tuple.\n\n  1. ``@0,2:4`` means that only the 0th element\n     and the 2nd to 3rd elements of the tuples in the matched result will be output.\n\n  2. ``@@`` means the pattern of the group itself will be output rather than the matched result.\n     One can choose whether or not to include the group name and parentheses.\n\n  3. ``@`` means all elements of tuples in the matched result will be output.\n\n1.3 Boolean logic in the ``[...]``\n++++++++++++++++++++++++++++++++++\n\nGiven a sequence of 3-tuples ``[[s1, s2, s3], ... ]``,\n\n- AND\n\n  ``[X;;Y]`` will match ``s1`` == ``X`` \u0026\u0026 ``s3`` == ``Y``.\n  Its behavior looks like the ordinary RE pattern ``(?:X.Y)``.\n\n- OR\n\n  ``[X;;]|[;;Y]`` will match ``s1`` == ``X`` || ``s3`` == ``Y``.\n  Its behavior looks like the ordinary RE pattern ``(?:X..)|(?:..Y)``.\n\n- NOT\n\n  ``[;^P;]`` will match ``s2`` != ``P``.\n  Its behavior looks like the ordinary RE pattern ``(?:.[^P].)``.\n\n  We can also use a negative lookahead assertion from ordinary RE\n  to negate the pattern that follows it,\n  e.g. ``(?![;P;][Q])[;;][;;]`` \u003c==\u003e ``[;^P;][^Q;;]``,\n  whose behavior looks like the ordinary RE pattern ``(?!(?:.P.)(?:Q..))...``.\n\n2. 
Notes\n--------\n\nNumeric comparison of element values is **not** supported.\n\nMultiple values for one element are not supported yet, but this feature may be added in the future.\n\nAlthough SEQ RE has sufficient ability to express a pattern over sequences of tuples,\nit does not provide cascaded regular expressions (see also: `Stanford TokensRegex\n\u003chttps://nlp.stanford.edu/software/tokensregex.html\u003e`_).\n\n\n3. Examples\n-----------\n\nUsage of the seq_re module::\n\n    from __future__ import print_function\n    import seq_re\n\n    n = 3\n    pattern = ('(?P\u003cname@0\u003e[;;PERSON]+) [is|was|has been] [a|an]? '\n               '(?P\u003cattrib@0,1\u003e.{0,3}) ([artist])')\n    seq = [['Vincent van Gogh', 'NNP', 'PERSON'],\n           ['was', 'VBD', 'O'],\n           ['a', 'DT', 'O'],\n           ['Dutch', 'JJ', 'O'],\n           ['Post-Impressionist', 'NN', 'O'],\n           ['painter', 'NN', 'OCCUPATION'],\n           ['who', 'WP', 'O'],\n           ['is', 'VBZ', 'O'],\n           ['among', 'IN', 'O'],\n           ['the', 'DT', 'O'],\n           ['most', 'RBS', 'O'],\n           ['famous', 'JJ', 'O'],\n           ['and', 'CC', 'O'],\n           ['influential', 'JJ', 'O'],\n           ['figures', 'NNS', 'O'],\n           ['in', 'IN', 'O'],\n           ['the', 'DT', 'O'],\n           ['history', 'NN', 'O'],\n           ['of', 'IN', 'O'],\n           ['Western art', 'NNP', 'DOMAIN'],\n           ['.', '.', 'O']]\n    placeholder_dict = {'artist': ['painter', 'drawing artist']}\n\n    sr = seq_re.SeqRegex(n).compile(pattern, **placeholder_dict)\n    match = sr.search(seq)\n    if match:\n        for g in match.group_list:\n            print(' '.join(['`'.join(tup) for tup in g[1]]))\n        for name in sorted(match.named_group_dict,\n                           key=lambda gn: match.named_group_dict[gn][0]):\n            print(name, match.format_group_to_str(name, 
True))\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgening%2Fseq_re","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgening%2Fseq_re","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgening%2Fseq_re/lists"}