https://github.com/mateuszk098/text-mining-and-data-analysis
Examples of text mining and statistical data analysis with R.
- Host: GitHub
- URL: https://github.com/mateuszk098/text-mining-and-data-analysis
- Owner: mateuszk098
- License: unlicense
- Created: 2021-10-23T12:37:31.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-25T14:24:25.000Z (about 2 years ago)
- Last Synced: 2024-11-07T10:32:53.963Z (about 2 months ago)
- Topics: classifier, data-exploration, lda-model, natural-language-processing, nlp, online, r, rstudio, sentiment-analysis, svm-classifier, text-mining
- Language: R
- Homepage:
- Size: 333 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# **Text Mining and Data Analysis**
![GitHub last commit](https://img.shields.io/github/last-commit/mateuszk098/text_mining_and_data_analysis)
## **Examples of text mining in R. Topics covered:**
- Text representation and representation models (bag-of-words, tf-idf, etc.).
- Statistical laws of language (Zipf's law, Heaps' law).
- Natural language processing (NLP) and its applications.
- Measures of complexity (Herdan index, Gunning fog readability index, etc.) and text similarity (cosine similarity).
- Sentiment analysis (dictionary and data-driven classifiers).
- Topic models.
- Online data analysis.

## **Examples of data exploration methods in R. Topics covered:**
- Linear and quadratic discriminant analysis, classifier evaluation.
- Naive Bayes classifier.
- Nearest neighbour method.
- Classification trees.
- Classifier ensembles.
- Cluster analysis.
- Principal component analysis.
- Multidimensional scaling.
- Factor analysis.
- Support vector machines.
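
As a rough illustration of two of the data exploration topics listed above (principal component analysis and multidimensional scaling), here is a minimal sketch. It is not taken from the repository: it assumes the built-in `iris` data set and base R functions only (`scale`, `prcomp`, `dist`, `cmdscale`), and the variable names are made up for this example.

```r
# Illustrative sketch only; not code from the repository.
# Assumes the built-in iris data set and base R (stats) functions.
features <- scale(iris[, 1:4])  # centre and scale the four numeric columns

# Principal component analysis on the standardised features
pca <- prcomp(features)
explained <- pca$sdev^2 / sum(pca$sdev^2)  # share of variance per component

# Classical (metric) multidimensional scaling on Euclidean distances
mds <- cmdscale(dist(features), k = 2)

# With Euclidean distances, classical MDS should reproduce the PCA
# scores up to a sign flip, so the absolute correlation is close to 1
agreement <- abs(cor(pca$x[, 1], mds[, 1]))

print(round(explained, 3))
print(agreement)
```

Plotting `pca$x[, 1:2]` coloured by `iris$Species` would be one way to inspect the result visually. The other topics in the list (discriminant analysis, support vector machines and so on) usually need add-on packages and are not covered by this sketch.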