[![tic](https://github.com/mlampros/textTinyR/workflows/tic/badge.svg?branch=master)](https://github.com/mlampros/textTinyR/actions)
[![codecov.io](https://codecov.io/github/mlampros/textTinyR/coverage.svg?branch=master)](https://codecov.io/github/mlampros/textTinyR?branch=master)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/textTinyR)](http://cran.r-project.org/package=textTinyR)
[![Downloads](http://cranlogs.r-pkg.org/badges/grand-total/textTinyR?color=blue)](http://www.r-pkg.org/pkg/textTinyR)
[![Dependencies](https://tinyverse.netlify.com/badge/textTinyR)](https://cran.r-project.org/package=textTinyR)
[![](https://img.shields.io/docker/automated/mlampros/texttinyr.svg)](https://hub.docker.com/r/mlampros/texttinyr)

## textTinyR

The *textTinyR* package provides text processing functions for small or big data files. More details on the functionality of textTinyR can be found in [blog-post1](http://mlampros.github.io/2017/01/05/textTinyR_package/) and [blog-post2](http://mlampros.github.io/2018/04/04/extending_textTinyR_package/). The package can be installed on Linux, macOS and Windows. There is one limitation: *Chinese*, *Japanese*, *Korean*, *Thai* and other *languages with ambiguous word boundaries* are not supported.


**UPDATE 01-04-2018** : *boost-locale* is no longer a system requirement for the textTinyR package.


### **Installation of the textTinyR package (CRAN, GitHub)**


To install the package from CRAN use,

```R

install.packages('textTinyR')

```

and to install the latest version from GitHub use the *install_github* function of the devtools package,


```R

devtools::install_github(repo = 'mlampros/textTinyR')

```


Use the following link to report bugs/issues,

[https://github.com/mlampros/textTinyR/issues](https://github.com/mlampros/textTinyR/issues)



**UPDATE 06-02-2020**


**Docker images** of the *textTinyR* package are available to download from my [dockerhub](https://hub.docker.com/r/mlampros/texttinyr) account. The images come with *RStudio* and the *R-development* version (latest) installed. The whole process was tested on Ubuntu 18.04. To **pull** & **run** the image use the following commands,


```shell

docker pull mlampros/texttinyr:rstudiodev

docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=give_here_your_password --rm -p 8787:8787 mlampros/texttinyr:rstudiodev

```


The user can also **bind** a home directory / folder to the container, so that its files are available inside it, by specifying the **-v** option,


```shell

docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=give_here_your_password --rm -p 8787:8787 -v /home/YOUR_DIR:/home/rstudio/YOUR_DIR mlampros/texttinyr:rstudiodev

```


In the latter case you might first have to grant write access to the **YOUR_DIR** directory (this is not always necessary) using,


```shell

chmod -R 777 /home/YOUR_DIR

```
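
The bind-mount command above can be wrapped in a small shell function, so that the password and the host directory are passed as arguments rather than edited by hand. The sketch below is only an illustration: `texttinyr_run_cmd` is a hypothetical helper name (it is not part of textTinyR or Docker), and it assumes that neither argument contains whitespace. It prints the command instead of running it, so that it can be inspected first,

```shell
# Hypothetical helper (not part of textTinyR or Docker): prints the
# 'docker run' command for the rstudiodev image with a host directory
# bound under /home/rstudio inside the container.
# Assumption: neither argument contains whitespace.
texttinyr_run_cmd() {
  password="$1"                        # RStudio password of your choice
  host_dir="$2"                        # absolute path of the host directory
  dir_name="$(basename "$host_dir")"   # last path component, e.g. YOUR_DIR
  printf 'docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=%s --rm -p 8787:8787 -v %s:/home/rstudio/%s mlampros/texttinyr:rstudiodev\n' \
    "$password" "$host_dir" "$dir_name"
}

# Print the command for the placeholder values used in this README.
texttinyr_run_cmd 'give_here_your_password' '/home/YOUR_DIR'
```

Once the printed command looks right, it can be run by piping the output to `sh`.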


The **USER** defaults to *rstudio* but you have to set a **PASSWORD** of your choice (see [https://rocker-project.org/](https://rocker-project.org/) for more information).


Open your web browser and, depending on where the docker image was *built / run*, enter,


**1st. Option** on your personal computer,


```text
http://0.0.0.0:8787

```


**2nd. Option** on a cloud instance,


```text
http://<Public DNS>:8787

```


to access the RStudio console, where you enter your username and password.


### **Citation:**

If you use the code of this repository in your paper or research please cite both **textTinyR** and the **original software** [https://CRAN.R-project.org/package=textTinyR/citation.html](https://CRAN.R-project.org/package=textTinyR/citation.html):


```bibtex
@Manual{,
title = {{textTinyR}: Text Processing for Small or Big Data Files},
author = {Lampros Mouselimis},
year = {2021},
note = {R package version 1.1.8},
url = {https://CRAN.R-project.org/package=textTinyR},
}
```