{"id":18451178,"url":"https://github.com/vityok/cl-string-match","last_synced_at":"2026-01-24T03:13:24.641Z","repository":{"id":151247813,"uuid":"204561677","full_name":"vityok/cl-string-match","owner":"vityok","description":"Implementation of a number of string search algorithms in Common Lisp","archived":false,"fork":false,"pushed_at":"2019-08-26T21:08:20.000Z","size":338,"stargazers_count":3,"open_issues_count":4,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-19T09:29:25.951Z","etag":null,"topics":["aho-corasick","boyer-moore-horspool","brute-force-algorithm","knuth-morris-pratt","rabin-karp","string-matching","string-search","trie"],"latest_commit_sha":null,"homepage":"","language":"Common Lisp","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vityok.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-26T21:06:13.000Z","updated_at":"2025-03-09T11:52:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"d1ea3c34-94e8-4f4b-b86d-a4a95b497fc9","html_url":"https://github.com/vityok/cl-string-match","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vityok%2Fcl-string-match","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vityok%2Fcl-string-match/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vityok%2Fcl-string-match/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/vityok%2Fcl-string-match/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vityok","download_url":"https://codeload.github.com/vityok/cl-string-match/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249692311,"owners_count":21311396,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aho-corasick","boyer-moore-horspool","brute-force-algorithm","knuth-morris-pratt","rabin-karp","string-matching","string-search","trie"],"created_at":"2024-11-06T07:27:47.884Z","updated_at":"2026-01-24T03:13:24.585Z","avatar_url":"https://github.com/vityok.png","language":"Common Lisp","readme":"CL-STRING-MATCH [![Quickdocs](http://quickdocs.org/badge/cl-string-match.svg)](http://quickdocs.org/cl-string-match/) aims at providing robust implementations of string\nmatching algorithms. 
These algorithms are also called \"[substring\nsearch](http://en.wikipedia.org/wiki/String_searching_algorithm)\"\nor \"subsequence search\" algorithms.\n\nCurrently it provides implementations of the following string matching\nalgorithms (see [wiki for details](https://bitbucket.org/vityok/cl-string-match/wiki/Manual)):\n\n* Brute-force (also known as the naïve algorithm)\n* Boyer-Moore (with mismatched character heuristic and good suffix shift)\n* Boyer-Moore-Horspool algorithm\n* Knuth-Morris-Pratt algorithm\n* Rabin-Karp algorithm\n* Shift-OR algorithm (single pattern)\n* Aho-Corasick algorithm (with finite set of patterns, forward\n  transition and fail function)\n* A simple backtracking regular expression engine\n  [re](https://github.com/massung/re) similar to that of the Lua\n  programming language. At the moment it significantly underperforms\n  compared to CL-PPCRE.\n\nSome string processing algorithms are also implemented:\n\n* Simple (naïve) suffix tree construction algorithm\n* Ukkonen's suffix tree construction algorithm\n\nData structures:\n\n* Prefix trie\n* Suffix tree\n\nUtilities:\n\n* Testing whether a string has the given suffix or prefix (starts with\n  or ends with the pattern)\n\nSome algorithms (Brute-force, Boyer-Moore-Horspool) have parametric\nimplementations (code templates) making it possible to declare\nspecific implementations for application-specific custom data types\nand data structures.\n\nThis library is routinely tested on Steel Bank CL, Clozure CL,\nEmbeddable CL and Armed Bear CL. 
Chances are really high that it will\nwork on other platforms without problems (check its status on\n[CL-TEST-GRID](https://common-lisp.net/project/cl-test-grid/library/cl-string-match.html)).\n\nCheck the [API Reference](http://quickdocs.org/cl-string-match/) for more details.\n\nAdditional resources:\n\n* [Project home page](https://bitbucket.org/vityok/cl-string-match)\n* Also take a look at the [project Wiki](https://bitbucket.org/vityok/cl-string-match/wiki/Home)\n* [A mirror on SourceForge](http://clstringmatch.sourceforge.net/)\n\n\nRATIONALE\n=========\n\nSince the standard `search` function works fine, one might ask:\nwhy do we need yet another implementation? The answer is simple:\nadvanced algorithms offer different benefits compared to the standard\nimplementation that is based on the brute-force algorithm.\n\n[Benchmarks](https://bitbucket.org/vityok/cl-string-match/wiki/Benchmarks)\nshow that depending on environment and pattern of application, a\nBoyer-Moore-Horspool algorithm implementation can outperform the standard\n`search` function in SBCL by almost 18 times! Check the code in the\n`bench` folder for further details.\n\n\nUSAGE\n=====\n\nCL-STRING-MATCH [![Quickdocs](http://quickdocs.org/badge/cl-string-match.svg)](http://quickdocs.org/cl-string-match/) is supported by Quicklisp and is known by its system name:\n\n```lisp\n(ql:quickload :cl-string-match)\n```\n\nCL-STRING-MATCH exports functions from the `cl-string-match` package (also\nnicknamed `sm`).\n\nShortcut functions search for the given pattern `pat` in the text `txt`. 
They are\nusually much slower (because they build index structures every time\nthey are called) but are easier to use:\n\n* `string-contains-brute` *pat* *txt* — Brute-force\n* `string-contains-bm` *pat* *txt* — Boyer-Moore\n* `string-contains-bmh` *pat* *txt* — Boyer-Moore-Horspool\n* `string-contains-kmp` *pat* *txt* — Knuth-Morris-Pratt\n* `string-contains-ac` *pat* *txt* — Aho-Corasick\n* `string-contains-rk` *pat* *txt* — Rabin-Karp\n\nA more efficient approach is to pre-calculate the index data once with an\n`initialize` function and pass it to the matching `search` function:\n\n* `initialize-bm` *pat* and `search-bm` *bm* *txt*\n* `initialize-bmh` *pat* and `search-bmh` *bm* *txt*\n* `initialize-bmh8` *pat* and `search-bmh8` *bm* *txt*\n* `initialize-rk` *pat* and `search-rk` *rk* *txt*\n* `initialize-kmp` *pat* and `search-kmp` *kmp* *txt*\n* `initialize-ac` *pat* and `search-ac` *ac* *txt*. `initialize-ac`\n  can accept a list of patterns that are compiled into a trie.\n\nThe brute-force algorithm does not use pre-calculated data and has no\n\"initialize\" function.\n\nThe Boyer-Moore-Horspool implementation (the `-BMH` and `-BMH8` functions)\nalso accepts `:start2` and `:end2` keywords for the \"search\" and\n\"contains\" functions.\n\nThe following example looks for a given substring *pat* in a given line of\ntext *txt* using the Boyer-Moore-Horspool algorithm implementation:\n\n```lisp\n(let ((idx (sm:initialize-bmh \"abc\")))\n  (sm:search-bmh idx \"ababcfbgsldkj\"))\n```\n\nCounting all matches of a given pattern in a string:\n\n```lisp\n(loop with str = \"____abc____abc____ab\"\n      with pat = \"abc\"\n      with idx = (sm:initialize-bmh8 pat)\n      with z = 0 with s = 0 while s do\n       (when (setf s (sm:search-bmh8 idx str :start2 s))\n\t (incf z) (incf s (length pat)))\n     finally (return z))\n```\n\nNote that the Boyer-Moore-Horspool (`bmh`) implementation\ncan offer an order-of-magnitude performance boost compared to the\nstandard `search` function.\n\nHowever, 
some implementations create a \"jump table\" that can be the\nsize of the alphabet (over 1M CHAR-CODE-LIMIT on implementations\nsupporting Unicode) and thus consume a significant chunk of\nmemory. There are different solutions to this problem and at the\nmoment a version for ASCII strings is offered: `initialize-bmh8`\n*pat* and `search-bmh8` *bm* *txt* as well as `string-contains-bmh8`\n*pat* *txt* work for strings with characters inside the 256 char code\nlimit.\n\nCONTRIB\n=======\n\nThis project also contains code that is not directly involved with the\npattern search algorithms but nevertheless might be found useful for\ntext handling/processing. Check the contrib folder in the repository\nfor more details. Currently it contains:\n\n* `ascii-strings.lisp` aims to provide single-byte string\nfunctionality for Unicode-enabled Common Lisp implementations. Another\ngoal is to reduce the memory footprint and boost the performance of\nstring-processing tasks, e.g. `read-line`.\n\n* `simple-scanf` implements a subset of the original POSIX standard\n`scanf(3)` function features.\n\n\nTODO\n====\n\nThe project still lacks some important features and is under constant\ndevelopment. Any kind of contribution or feedback is welcome.\n\nPlease take a look at the [list of open issues](https://bitbucket.org/vityok/cl-string-match/issues?status=new\u0026status=open) or the [Project Roadmap](https://bitbucket.org/vityok/cl-string-match/wiki/Project%20Roadmap).\n\nVisit [project Wiki](https://bitbucket.org/vityok/cl-string-match/wiki/Home) for additional information.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvityok%2Fcl-string-match","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvityok%2Fcl-string-match","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvityok%2Fcl-string-match/lists"}