Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/miserman/lingmatch
An all-in-one R package for the assessment of linguistic similarity
https://github.com/miserman/lingmatch
nlp r rcpp text-analysis
Last synced: 1 day ago
JSON representation
An all-in-one R package for the assessment of linguistic similarity
- Host: GitHub
- URL: https://github.com/miserman/lingmatch
- Owner: miserman
- Created: 2017-09-25T03:55:10.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2024-11-08T20:51:07.000Z (12 days ago)
- Last Synced: 2024-11-08T21:33:58.568Z (12 days ago)
- Topics: nlp, r, rcpp, text-analysis
- Language: R
- Homepage: https://miserman.github.io/lingmatch
- Size: 29.6 MB
- Stars: 11
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# lingmatch
An all-in-one R package for the assessment of linguistic matching and/or accommodation.## features
* Input raw text, a document-term matrix (DTM), or LIWC output.
* Apply various weighting functions to a DTM.
* Measure similarity and/or accommodation with various metrics.
* Calculate standard forms of Language Style Matching (LSM) and Latent Semantic Similarity (LSS).## resources
* Documentation and guides: [miserman.github.io/lingmatch](https://miserman.github.io/lingmatch/)
* [Quick Start](https://miserman.github.io/lingmatch/articles/quickstart.html)
* [Comparison Specification](https://miserman.github.io/lingmatch/articles/groups.html)
* [Introduction to Text Analysis](https://miserman.github.io/lingmatch/articles/introduction.html)
* [Word Vectors](https://miserman.github.io/lingmatch/articles/word_vectors.html)
* [Text Classification](https://miserman.github.io/lingmatch/articles/text_classification.html)
* [Dictionary Creation](https://miserman.github.io/lingmatch/articles/dictionary_creation.html)
* Dictionary repository: [osf.io/y6g5b](https://osf.io/y6g5b/wiki/home/)
* Latent semantic space repository: [osf.io/489he](https://osf.io/489he/wiki/home/)
* Dictionary builder: [miserman.github.io/dictionary_builder](https://miserman.github.io/dictionary_builder/)## installation
Download R from [r-project.org](https://www.r-project.org/), then install the package from an R console:Release ([version 1.0.7](https://CRAN.R-project.org/package=lingmatch))
```R
install.packages("lingmatch")
```
Development (version 1.0.8)
```R
# install.packages("remotes")
remotes::install_github("miserman/lingmatch")
```And load the package:
```R
library(lingmatch)
```
## examples
Can make a quick comparison between two bits of text; by default this will give the cosine similarity between raw
word-count vectors:
```R
lingmatch("First text to look at.", "Text to compare that text with.")
```Or, given a vector of texts:
```R
text = c(
"Why, hello there! How are you this evening?",
"I am well, thank you for your inquiry!",
"You are a most good at social interactions person!",
"Why, thank you! You're not all bad yourself!"
)
```
Process the texts in one step:
```R
# with a dictionary
inquirer_cats = lma_process(text, dict = "inquirer", dir = "~/Dictionaries")# with a latent semantic space
glove_vectors = lma_process(text, space = "glove", dir = "~/Latent Semantic Spaces")
```Or process the texts step by step, then measure similarity between each:
```R
dtm = lma_dtm(text)
dtm_weighted = lma_weight(dtm)
dtm_categorized = lma_termcat(dtm_weighted, lma_dict(1:9))
similarity = lma_simets(dtm_categorized, metric = "canberra")
```Or do that within a single function call:
```R
similarity = lingmatch(
text, weight = "frequency", dict = lma_dict(1:9), metric = "canberra"
)$sim
```Or, if you want a standard form (as in this example), specify a default:
```R
similarity = lingmatch(text, type = "lsm")$sim
```