https://github.com/chainsawriot/hkrug_tm

# hkrug_tm

Hong Kong R User Group Pre-hackathon workshop on Text Mining.

This pre-hackathon workshop is designed for those who want to get their hands dirty and build something. This is not a chit-chat session (in Cantonese: 吹水). YOU NEED TO CODE DURING THE WORKSHOP.

# prerequisite

You need to know some R and basic statistics (regression and cluster analysis). You should be able to explain what the following two R snippets do.

```{r}
# R-code snippet 1: logistic regression of vs on mpg and cyl
require(MASS)
summary(glm(as.factor(vs) ~ mpg + cyl, data = mtcars, family = binomial))
predict(glm(as.factor(vs) ~ mpg + cyl, data = mtcars, family = binomial), mtcars)
# R-code snippet 2: k-means clustering of the iris measurements with k = 3
kmeans(iris[, 1:4], 3)
kmeans(iris[, 1:4], 3)$cluster
table(kmeans(iris[, 1:4], 3)$cluster, iris[, 5])
```
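
Two details of these snippets are easy to miss: `predict()` on a `glm` fit returns values on the link (log-odds) scale by default, and `kmeans()` starts from randomly chosen centers, so separate calls may number the clusters differently. The following is a minimal sketch of a variant, assuming you want predicted probabilities and a reproducible clustering. The seed value `42` and the names `fit`, `prob`, and `km` are arbitrary illustrative choices, not part of the workshop material.

```{r}
# Fit the logistic regression once and reuse the fitted model
fit <- glm(as.factor(vs) ~ mpg + cyl, data = mtcars, family = binomial)
# type = "response" returns predicted probabilities instead of log-odds
prob <- predict(fit, mtcars, type = "response")
head(prob)

# Fix the random seed (arbitrary value) so k-means gives the same result on every run
set.seed(42)
km <- kmeans(iris[, 1:4], centers = 3)
# Cross-tabulate the cluster assignments against the known species
table(km$cluster, iris[, 5])
```

Storing the result in `km` means the cluster vector and the cross-tabulation come from the same run, which the three separate `kmeans()` calls above do not guarantee.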

# preparation

You should be running R > 3.0 with your preferred editor (RStudio, Emacs, Vim, or Sublime Text). Please install the required R packages:

```{r}
install.packages(c("tm", "stringr", "plyr", "topicmodels", "SnowballC", "magrittr", "klaR", "e1071"))
```
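
To confirm that the installation worked before the workshop, a check along the following lines may help. This is only a sketch: the variable names `required` and `not_installed` are illustrative, and the package list is copied from the install command above.

```{r}
# Show the running R version so it can be compared against the requirement
print(R.version.string)

# Packages listed in the install step
required <- c("tm", "stringr", "plyr", "topicmodels",
              "SnowballC", "magrittr", "klaR", "e1071")

# installed.packages() returns a matrix whose row names are the installed package names
not_installed <- setdiff(required, rownames(installed.packages()))

if (length(not_installed) > 0) {
  message("Still missing: ", paste(not_installed, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

If any names are reported as missing, rerun `install.packages()` for just those packages and read the error output.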

If you are using Linux, you need to install the GNU Scientific Library (GSL) in order to compile the "topicmodels" package. On Debian/Ubuntu, you can install it with:

```
sudo apt-get install libgsl0ldbl libgsl0-dev
```