https://github.com/mlampros/fuzzywuzzyr
fuzzy string matching in R
- Host: GitHub
- URL: https://github.com/mlampros/fuzzywuzzyr
- Owner: mlampros
- Created: 2017-04-13T16:33:31.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2023-05-27T04:24:39.000Z (over 1 year ago)
- Last Synced: 2024-10-12T14:39:00.701Z (26 days ago)
- Topics: fuzzywuzzy, matching, python, r, reticulate, string
- Language: R
- Homepage: https://mlampros.github.io/fuzzywuzzyR/
- Size: 335 KB
- Stars: 37
- Watchers: 5
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
[![tic](https://github.com/mlampros/fuzzywuzzyR/workflows/tic/badge.svg?branch=master)](https://github.com/mlampros/fuzzywuzzyR/actions)
[![codecov.io](https://codecov.io/github/mlampros/fuzzywuzzyR/coverage.svg?branch=master)](https://codecov.io/github/mlampros/fuzzywuzzyR?branch=master)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/fuzzywuzzyR)](http://cran.r-project.org/package=fuzzywuzzyR)
[![Downloads](http://cranlogs.r-pkg.org/badges/grand-total/fuzzywuzzyR?color=blue)](http://www.r-pkg.org/pkg/fuzzywuzzyR)
[![Dependencies](https://tinyverse.netlify.com/badge/fuzzywuzzyR)](https://cran.r-project.org/package=fuzzywuzzyR)

## fuzzywuzzyR
The **fuzzywuzzyR** package brings the fuzzy string matching of the [fuzzywuzzy](https://github.com/seatgeek/fuzzywuzzy) Python package to R. It uses the [Levenshtein Distance](https://en.wikipedia.org/wiki/Levenshtein_distance) to calculate the differences between sequences. More details on the functionality of fuzzywuzzyR can be found in the [blog post](http://mlampros.github.io/2017/04/13/fuzzywuzzyR_package/) and in the package vignette.
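
To make the distance measure concrete, here is a minimal Python sketch of the classic dynamic-programming Levenshtein distance. It is written for illustration only and is not the code that fuzzywuzzy or fuzzywuzzyR runs internally; the function name `levenshtein` is a hypothetical helper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short

    # previous[j] holds the distance between the empty string and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A smaller distance means the two strings are more alike; identical strings have distance 0.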
**UPDATE 26-07-2018**: A [Singularity image file](http://mlampros.github.io/2018/07/26/singularity_containers/) is available for anyone who wants to run *fuzzywuzzyR* on Ubuntu Linux (locally or in a cloud instance) with all package requirements pre-installed. This lets the user work with the *fuzzywuzzyR* package without spending time on the installation process.
### **System Requirements**
* Python (>= 2.4)
* difflib
* fuzzywuzzy ( >=0.15.0 )
* [python-Levenshtein](https://github.com/ztane/python-Levenshtein/) ( >=0.12.0, optional; provides a 4-10x speedup in string matching, though results may differ in certain cases)
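
The list above names both difflib and the optional python-Levenshtein, and results can differ depending on which one does the scoring. As a rough illustration of the difflib side only, the following sketch scales `difflib.SequenceMatcher.ratio()` to a 0-100 score. The helper name `simple_ratio` is hypothetical, and the assumption that this resembles a basic similarity ratio in fuzzywuzzy is mine, not something this README states.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Similarity of two strings as an integer between 0 and 100,
    based only on the standard-library difflib module."""
    matcher = SequenceMatcher(None, s1, s2)  # None: no characters treated as junk
    return int(round(100 * matcher.ratio()))


print(simple_ratio("abcd", "bcde"))  # 75
print(simple_ratio("same", "same"))  # 100
```

Because difflib scores matching blocks rather than counting edits, a Levenshtein-based ratio can give a slightly different number for the same pair of strings.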
Before installing any Python modules, check the Python configuration with:
```R
reticulate::py_config()
```
All modules should be installed in the default Python configuration (the one that the R session displays as default); otherwise errors will occur during package installation.
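
To confirm that a given interpreter can actually see the required modules, a short check like the one below can be run with that interpreter (ideally the one reported by `reticulate::py_config()`). This is a sketch under assumptions: the import names `fuzzywuzzy` and `Levenshtein` are what the pip packages are expected to provide, and `check_modules` is a hypothetical helper.

```python
import importlib.util
import sys

# Assumed import names: the pip package "python-Levenshtein" is expected
# to be importable as "Levenshtein"; difflib ships with Python itself.
MODULES = ["difflib", "fuzzywuzzy", "Levenshtein"]


def check_modules(names):
    """Map each top-level module name to True if it can be located
    by the running interpreter, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    for name, found in check_modules(MODULES).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a module shows as missing here but was installed with pip, the pip command most likely belonged to a different Python installation than the one the R session uses.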
#### **Debian/Ubuntu/Fedora**
**Python2**
```sh
sudo apt-get install python-pip
sudo pip install --upgrade pip
pip install fuzzywuzzy
pip install python-Levenshtein
```

**Python 3**
```sh
sudo apt-get install python3-pip
sudo pip3 install --upgrade pip
pip3 install fuzzywuzzy
pip3 install python-Levenshtein
```

#### **Macintosh OSX**
```sh
sudo easy_install pip
sudo pip install fuzzywuzzy
sudo pip install python-Levenshtein
```

#### **Windows OS**
* Download [get-pip.py](https://bootstrap.pypa.io/get-pip.py)
* Update the environment variables ( Control Panel >> System and Security >> System >> Advanced system settings >> Environment variables >> System variables >> Path >> Edit ) by adding, for instance in the case of Python 2.7:
```text
C:\Python27;C:\Python27\Scripts
```
* Install the [Build Tools for Visual Studio](https://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2017)
* Open the *Command prompt* and use the following commands:
```sh
pip install fuzzywuzzy
pip install python-Levenshtein
```
### **Installation of the fuzzywuzzyR package**
To install the package from CRAN, use:
```R
install.packages('fuzzywuzzyR')
```
To download the latest version from GitHub, use the *install_github* function of the devtools package:
```R
devtools::install_github(repo = 'mlampros/fuzzywuzzyR')
```
Use the following link to report bugs/issues: [https://github.com/mlampros/fuzzywuzzyR/issues](https://github.com/mlampros/fuzzywuzzyR/issues)
### **Citation:**
If you use the code of this repository in your paper or research please cite both **fuzzywuzzyR** and the **original software** [https://CRAN.R-project.org/package=fuzzywuzzyR/citation.html](https://CRAN.R-project.org/package=fuzzywuzzyR/citation.html):
```R
@Manual{,
title = {{fuzzywuzzyR}: Fuzzy String Matching in R},
author = {Lampros Mouselimis},
year = {2021},
note = {R package version 1.0.5},
url = {https://CRAN.R-project.org/package=fuzzywuzzyR},
}
```