https://github.com/gagolews/stringi
Fast and portable character string processing in R (with the Unicode ICU)
- Host: GitHub
- URL: https://github.com/gagolews/stringi
- Owner: gagolews
- License: other
- Created: 2013-01-05T15:54:29.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2023-12-10T08:12:09.000Z (7 months ago)
- Last Synced: 2024-02-01T11:11:39.888Z (5 months ago)
- Topics: icu, icu4c, natural-language-processing, nlp, r, regex, regexp, string-manipulation, stringi, stringr, text, text-processing, tidy-data, unicode
- Language: C++
- Homepage: https://stringi.gagolewski.com/
- Size: 210 MB
- Stars: 286
- Watchers: 21
- Forks: 47
- Open Issues: 42
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff
Lists
- awesome-R - stringi - ICU based string processing package. (Data Manipulation)
- awesome-R - stringi - ICU based string processing package. (Data Manipulation)
- jimsghstars - gagolews/stringi - Fast and portable character string processing in R (with the Unicode ICU) (C++)
- awesome-stars - gagolews/stringi - THE String Processing Package for R (with ICU) (C++)
- awesome_R - stringi - ICU based string processing package. (Data Manipulation)
- fucking-awesome-R - stringi - ICU based string processing package. (Data Manipulation)
README
# [**`stringi`**](https://stringi.gagolewski.com/)

### Fast and Portable Character String Processing in R (with the Unicode ICU)

![Build Status](https://github.com/gagolews/stringi/workflows/stringi%20for%20R/badge.svg)
![RStudio CRAN mirror downloads](http://cranlogs.r-pkg.org/badges/grand-total/stringi)
![RStudio CRAN mirror downloads](http://cranlogs.r-pkg.org/badges/last-month/stringi)
![RStudio CRAN mirror downloads](http://cranlogs.r-pkg.org/badges/last-day/stringi)

> A comprehensive tutorial and reference manual is available
> at <https://stringi.gagolewski.com/>.
>
> Check out [**`stringx`**](https://stringx.gagolewski.com/) for a set of wrappers
> around **`stringi`** with a base R-compatible API.
>
> To learn more about R, check out Marek's open-access (free!) textbook
> [Deep R Programming](https://deepr.gagolewski.com/).

**`stringi`** (pronounced “stringy”, IPA [strinɡi])
is THE *R* package for string/text/natural language processing.
It is very fast, consistent, convenient, and, thanks to the
[ICU – International Components for Unicode](https://icu.unicode.org/)
library, portable across all locales and platforms.

Available features include:

* string concatenation, padding, wrapping,
* substring extraction,
* pattern searching (e.g., with Java-like regular expressions),
* collation and sorting,
* random string generation,
* case mapping and folding,
* string transliteration,
* Unicode normalisation,
* date-time formatting and parsing,

and many more.
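
A few of the features listed above can be tried out with the short sketch below. It assumes that **`stringi`** is already installed (e.g., from CRAN); the input strings are made up for illustration, and the `en_US` locale is an arbitrary choice.

```r
library(stringi)

# String concatenation and padding
joined <- stri_join("stringi", "ICU", sep = " + ")   # "stringi + ICU"
padded <- stri_pad_left("7", width = 3, pad = "0")   # "007"

# Substring extraction (characters 1 to 6)
prefix <- stri_sub("stringi", 1, 6)                  # "string"

# Pattern searching with ICU (Java-like) regular expressions
starts_with_a <- stri_detect_regex(c("apple", "kiwi"), "^a")    # TRUE FALSE
digits <- stri_extract_all_regex("a1b22c333", "[0-9]+")[[1]]    # "1" "22" "333"

# Locale-aware collation and sorting
sorted <- stri_sort(c("banana", "Apple", "cherry"), locale = "en_US")

# Case mapping
upper <- stri_trans_toupper("stringi")               # "STRINGI"

# Transliteration: "gro\u00df" is "gross" spelled with a sharp s
ascii <- stri_trans_general("gro\u00df", "Latin-ASCII")   # "gross"

# Unicode normalisation: "a" + combining acute accent composes to U+00E1
nfc <- stri_trans_nfc("a\u0301")

# Random string generation: one string of 8 characters
rnd <- stri_rand_strings(1, 8)
```

Most functions are vectorised over their arguments, so the same calls also work on whole character vectors rather than single strings.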
**Package Maintainer**: [Marek Gagolewski](https://www.gagolewski.com/)

**Authors and Contributors**: [Marek Gagolewski](https://www.gagolewski.com/),
with contributions from Bartłomiej Tartanus and many others.

The package's API was inspired by that of the early (pre-tidyverse; v0.6.2)
version of Hadley Wickham's
[**`stringr`**](https://cran.r-project.org/web/packages/stringr/)
package (since v1.0.0, released in 2015, **`stringr`** has been powered by **`stringi`**).

**Homepage**: https://stringi.gagolewski.com/

**Citation**: Gagolewski M.,
**`stringi`**: Fast and portable character string processing in R,
*Journal of Statistical Software* **103**(2), 2022, 1–59.

**CRAN Entry**: https://CRAN.R-project.org/package=stringi

**System Requirements**: *R >= 3.4*, *ICU4C >= 61* (refer to the
[INSTALL](https://raw.githubusercontent.com/gagolews/stringi/master/INSTALL)
file for more details)

**License**: **`stringi`**'s source code is distributed under the open source
BSD-3-clause license. For more details, see
[LICENSE](https://raw.githubusercontent.com/gagolews/stringi/master/LICENSE).

This *git* repository also contains a custom subset of *ICU4C* source code which
is copyrighted by Unicode, Inc. and others. A binary version of the Unicode
Character Database is included. For more details on copyright holders, see
[LICENSE](https://raw.githubusercontent.com/gagolews/stringi/master/LICENSE).
The *ICU* project is covered by the
[Unicode license](https://github.com/unicode-org/icu/blob/main/icu4c/LICENSE) —
a simple, permissive non-copyleft free software license, compatible with
the GNU GPL. The *ICU* license
is [intended](https://unicode-org.github.io/icu/userguide/icu4c/faq.html)
to allow *ICU* to be included in free software projects as well as
in proprietary or commercial products.

**Changes**: see the
[NEWS](https://raw.githubusercontent.com/gagolews/stringi/master/NEWS) file.

[How to access the stringi C++ API from within an Rcpp-based R package](https://github.com/gagolews/ExampleRcppStringi)