{"id":13420939,"url":"https://github.com/zvelo/libstemmer","last_synced_at":"2026-01-25T06:05:17.153Z","repository":{"id":3982290,"uuid":"5078070","full_name":"zvelo/libstemmer","owner":"zvelo","description":"Snowball Stemming Algorithms","archived":false,"fork":false,"pushed_at":"2017-03-02T05:53:04.000Z","size":103,"stargazers_count":22,"open_issues_count":0,"forks_count":5,"subscribers_count":49,"default_branch":"master","last_synced_at":"2024-07-31T22:57:06.715Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://snowball.tartarus.org/","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zvelo.png","metadata":{"files":{"readme":"README","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-07-17T06:24:12.000Z","updated_at":"2024-06-13T17:59:51.000Z","dependencies_parsed_at":"2022-09-12T11:52:18.871Z","dependency_job_id":null,"html_url":"https://github.com/zvelo/libstemmer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zvelo%2Flibstemmer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zvelo%2Flibstemmer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zvelo%2Flibstemmer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zvelo%2Flibstemmer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zvelo","download_url":"https://codeload.github.com/zvelo/libstemmer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243701276,"owners_count":20333615,"ic
on_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T22:01:44.561Z","updated_at":"2026-01-25T06:05:17.088Z","avatar_url":"https://github.com/zvelo.png","language":"C","funding_links":[],"categories":["TODO scan for Android support in followings"],"sub_categories":[],"readme":"libstemmer_c\n============\n\nThis document pertains to the C version of the libstemmer distribution,\navailable for download from:\n\nhttp://snowball.tartarus.org/dist/libstemmer_c.tgz\n\n\nCompiling the library\n=====================\n\nA simple makefile is provided for Unix style systems.  On such systems, it\nshould be possible simply to run \"make\", and the file \"libstemmer.o\"\nand the example program \"stemwords\" will be generated.\n\nIf this doesn't work on your system, you need to write your own build\nsystem (or call the compiler directly).  The files to compile are\nall contained in the \"libstemmer\", \"runtime\" and \"src_c\" directories,\nand the public header file is contained in the \"include\" directory.\n\nThe library comes in two flavours; UTF-8 only, and UTF-8 plus other character\nsets.  To use the utf-8 only flavour, compile \"libstemmer_utf8.c\" instead of\n\"libstemmer.c\".\n\nFor convenience \"mkinc.mak\" is a makefile fragment listing the source files and\nheader files used to compile the standard version of the library.\n\"mkinc_utf8.mak\" is a comparable makefile fragment listing just the source\nfiles for the UTF-8 only version of the library.\n\n\nUsing the library\n=================\n\nThe library provides a simple C API.  
Essentially, a new stemmer can\nbe obtained by using \"sb_stemmer_new\".  \"sb_stemmer_stem\" is then\nused to stem a word, \"sb_stemmer_length\" returns the stemmed\nlength of the last word processed, and \"sb_stemmer_delete\" is\nused to delete a stemmer.\n\nCreating a stemmer is a relatively expensive operation - the expected\nusage pattern is that a new stemmer is created when needed, used\nto stem many words, and deleted after some time.\n\nStemmers are re-entrant, but not thread-safe.  In other words, if\nyou wish to access the same stemmer object from multiple threads,\nyou must ensure that all access is protected by a mutex or similar\ndevice.\n\nlibstemmer does not currently incorporate any mechanism for caching the results\nof stemming operations.  Such caching can greatly increase the performance of a\nstemmer in certain situations, so suitable patches will be considered for\ninclusion.\n\nThe standard libstemmer sources contain an algorithm for each of the supported\nlanguages.  The algorithm may be selected using the English name of the\nlanguage, or using the 2 or 3 letter ISO 639 language codes.  In addition,\nthe traditional \"Porter\" stemming algorithm for English is included for\nbackwards compatibility purposes, but we recommend use of the \"English\"\nstemmer in preference for new projects.\n\n(Some minor algorithms which are included only as curiosities on the Snowball\nwebsite, such as the Lovins stemmer and the Kraaij Pohlmann stemmer, are not\nincluded in the standard libstemmer sources.  These are not really supported by\nthe Snowball project, but it would be possible to compile a modified libstemmer\nlibrary containing these if desired.)\n\n\nThe stemwords example\n=====================\n\nThe stemwords example program allows you to run any of the stemmers\ncompiled into the libstemmer library on a sample vocabulary.  
For\ndetails on how to use it, run it with the \"-h\" command line option.\n\n\nUsing the library in a larger system\n====================================\n\nIf you are incorporating the library into the build system of a larger\nprogram, I recommend copying the unpacked tarball without modification into\na subdirectory of the sources of your program.  Future versions of the\nlibrary are intended to keep the same structure, so this will keep the\nwork required to move to a new version of the library to a minimum.\n\nAs an additional convenience, the list of source and header files used\nin the library is detailed in mkinc.mak - a file which is in a suitable\nformat for inclusion by a Makefile.  By including this file in your build\nsystem, you can link the snowball system into your program with a few\nextra rules.\n\nUsing the library in a system using GNU autotools\n=================================================\n\nThe libstemmer_c library can be integrated into a larger system which uses the\nGNU autotool framework (and in particular, automake and autoconf) as follows:\n\n1) Unpack libstemmer_c.tgz in the top level project directory so that there is\n   a libstemmer_c subdirectory of the top level directory of the project.\n\n2) Add a file \"Makefile.am\" to the unpacked libstemmer_c folder, containing:\n   \nnoinst_LTLIBRARIES = libstemmer.la\ninclude $(srcdir)/mkinc.mak\nnoinst_HEADERS = $(snowball_headers)\nlibstemmer_la_SOURCES = $(snowball_sources) \n\n(You may also need to add other lines to this, for example, if you are using\ncompiler options which are not compatible with compiling the libstemmer\nlibrary.)\n\n3) Add libstemmer_c to the AC_CONFIG_FILES declaration in the project's\n   configure.ac file.\n\n4) Add to the top level makefile the following lines (or modify existing\n   assignments to these variables appropriately):\n\nAUTOMAKE_OPTIONS = subdir-objects\nAM_CPPFLAGS = 
-I$(top_srcdir)/libstemmer_c/include\nSUBDIRS=libstemmer_c\n\u003cname\u003e_LIBADD = libstemmer_c/libstemmer.la\n\n(Where \u003cname\u003e is the name of the library or executable which links against\nlibstemmer.) \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzvelo%2Flibstemmer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzvelo%2Flibstemmer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzvelo%2Flibstemmer/lists"}