{"id":16274045,"url":"https://github.com/trinker/stansent","last_synced_at":"2025-03-20T00:32:07.854Z","repository":{"id":36764906,"uuid":"41071593","full_name":"trinker/stansent","owner":"trinker","description":null,"archived":false,"fork":false,"pushed_at":"2021-10-07T14:09:27.000Z","size":372,"stargazers_count":16,"open_issues_count":7,"forks_count":3,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-02-28T21:57:22.689Z","etag":null,"topics":["sentiment","sentiment-analysis","stanford-nlp"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/trinker.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-08-20T02:43:35.000Z","updated_at":"2021-10-10T16:22:12.000Z","dependencies_parsed_at":"2022-09-23T04:13:32.395Z","dependency_job_id":null,"html_url":"https://github.com/trinker/stansent","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fstansent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fstansent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fstansent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fstansent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/trinker","download_url":"https://codeload.github.com/trinker/stansent/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244031028,"owners_count":2038653
4,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["sentiment","sentiment-analysis","stanford-nlp"],"created_at":"2024-10-10T18:26:52.163Z","updated_at":"2025-03-20T00:32:07.539Z","avatar_url":"https://github.com/trinker.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\ntitle: \"stansent\"\ndate: \"`r format(Sys.time(), '%d %B, %Y')`\"\noutput:\n  md_document:\n    toc: true      \n---\n\n```{r, echo=FALSE}\ndesc \u003c- suppressWarnings(readLines(\"DESCRIPTION\"))\nregex \u003c- \"(^Version:\\\\s+)(\\\\d+\\\\.\\\\d+\\\\.\\\\d+)\"\nloc \u003c- grep(regex, desc)\nver \u003c- gsub(regex, \"\\\\2\", desc[loc])\nverbadge \u003c- sprintf('\u003ca href=\"https://img.shields.io/badge/Version-%s-orange.svg\"\u003e\u003cimg src=\"https://img.shields.io/badge/Version-%s-orange.svg\" alt=\"Version\"/\u003e\u003c/a\u003e\u003c/p\u003e', ver, ver)\n#verbadge \u003c- \"\"\n````\n[![Project Status: Inactive – The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.](https://www.repostatus.org/badges/latest/inactive.svg)](https://www.repostatus.org/#inactive)\n[![Build Status](https://travis-ci.org/trinker/stansent.svg?branch=master)](https://travis-ci.org/trinker/stansent)\n[![Coverage Status](https://coveralls.io/repos/trinker/stansent/badge.svg?branch=master)](https://coveralls.io/r/trinker/stansent?branch=master)\n`r verbadge`\n\n```{r, echo=FALSE}\nlibrary(knitr)\nknit_hooks$set(htmlcap = function(before, options, envir) {\n  if(!before) {\n    paste('\u003cp 
class=\"caption\"\u003e\u003cb\u003e\u003cem\u003e',options$htmlcap,\"\u003c/em\u003e\u003c/b\u003e\u003c/p\u003e\",sep=\"\")\n    }\n    })\nknitr::opts_knit$set(self.contained = TRUE, cache = FALSE)\nknitr::opts_chunk$set(fig.path = \"tools/figure/\")\n```\n\n![](tools/stansent_logo/core-nlp.jpg)  \n\n**stansent** wraps [Stanford's coreNLP sentiment tagger](http://nlp.stanford.edu/sentiment/) in a way that makes it easier to set up.  The output is designed to look and behave like the objects from the [**sentimentr**](https://github.com/trinker/sentimentr) package.  Plotting and the `sentimentr::highlight` functionality work much as they do for the `sentiment`/`sentiment_by` objects from **sentimentr**, so there is less to learn when moving between the two packages.  \n\nIn addition to **sentimentr** and **stansent**, Matthew Jockers has created the [**syuzhet**](http://www.matthewjockers.net/2015/02/02/syuzhet/) package that utilizes dictionary lookups for the Bing, NRC, and Afinn methods.  Similarly, Subhasree Bose has contributed [RSentiment](https://CRAN.R-project.org/package=RSentiment), which utilizes a dictionary lookup that attempts to address negation and sarcasm.  [Click here for a comparison](https://github.com/trinker/sentimentr#comparing-sentimentr-syuzhet-rsentiment-and-stanford) between **stansent**, **sentimentr**, **syuzhet**, and **RSentiment**.  
Note the accuracy and run times of the packages.\n\n# Installation\n\nTo download the development version of **stansent**:\n\nDownload the [zip ball](https://github.com/trinker/stansent/zipball/master) or [tar ball](https://github.com/trinker/stansent/tarball/master), decompress and run `R CMD INSTALL` on it, or use the **pacman** package to install the development version:\n\n```r\nif (!require(\"pacman\")) install.packages(\"pacman\")\npacman::p_load_gh(\"trinker/coreNLPsetup\", \"trinker/stansent\")\n```\n\nAfter installing, run the following to check that Java and coreNLP are installed correctly:\n\n```\ncheck_setup()\n```\n\nThis verifies that your Java version is the right one and that [coreNLP](http://nlp.stanford.edu/software/corenlp.shtml) is set up in the right location.\n\n\n# Functions\n\nThere are two main functions in **stansent** with a few helper functions.  The functions, their task categories, and descriptions are summarized in the table below:\n\n\n| Function                |  Category  |  Description                            |\n|-------------------------|------------|-----------------------------------------|\n| `sentiment_stanford`    | sentiment  | Sentiment at the sentence level         |\n| `sentiment_stanford_by` | sentiment  | Aggregated sentiment by group(s)        |\n| `uncombine`             | reshaping  | Extract sentence level sentiment from  `sentiment_by` |\n| `get_sentences`         | reshaping  | Regex based string to sentence parser (or get sentences from `sentiment`/`sentiment_by`)|\n| `highlight`             |            | Highlight positive/negative sentences as an HTML document |\n| `check_setup`           | initial set-up | Make sure Java and coreNLP are set up correctly     |\n\n\n\n# Contact\n\nYou are welcome to:\n* submit suggestions and bug-reports at: \u003chttps://github.com/trinker/stansent/issues\u003e\n* send a pull request on: \u003chttps://github.com/trinker/stansent/\u003e\n* compose a friendly e-mail to: 
\u003ctyler.rinker@gmail.com\u003e\n\n# Demonstration\n\n## Load the Packages/Data\n\n```{r, message=FALSE}\nif (!require(\"pacman\")) install.packages(\"pacman\")\npacman::p_load_gh(c(\"trinker/stansent\", \"trinker/sentimentr\"))\npacman::p_load(dplyr)\n\nmytext \u003c- c(\n    'do you like it?  But I hate really bad dogs',\n    'I am the best friend.',\n    'Do you really like it?  I\\'m not a fan'\n)\n\ndata(presidential_debates_2012, cannon_reviews)\nset.seed(100)\ndat \u003c- presidential_debates_2012[sample(1:nrow(presidential_debates_2012), 100), ]\n\n```\n\n## `sentiment_stanford`\n\n```{r}\nout1 \u003c- sentiment_stanford(mytext) \nout1[[\"text\"]] \u003c- unlist(get_sentences(out1))\nout1\n```\n\n## `sentiment_stanford_by`: Aggregation\n\nTo aggregate by element (column cell or vector element) use `sentiment_stanford_by` with `by = NULL`.\n\n```{r}\nout2 \u003c- sentiment_stanford_by(mytext) \nout2[[\"text\"]] \u003c- mytext\nout2\n```\n\nTo aggregate by grouping variables use `sentiment_stanford_by` with the `by` argument.\n\n```{r, echo=FALSE, results=\"hide\"}\ntic \u003c- Sys.time()\nout3 \u003c- with(dat, sentiment_stanford_by(dialogue, list(person, time)))\ntoc \u003c-  round(as.numeric(difftime(Sys.time(), tic, units = 'secs')), 1)\n```\n\n\n```{r, eval=FALSE}\n(out3 \u003c- with(dat, sentiment_stanford_by(dialogue, list(person, time))))\n```\n\n```{r, echo=FALSE}\nout3\n```\n\n## Recycling\n\nNote that the Stanford coreNLP functionality takes considerable time to compute (~`r toc` seconds to compute `out3` above).  The output from `sentiment_stanford`/`sentiment_stanford_by` can be recycled inside of `sentiment_stanford_by`, reusing the raw scores and avoiding a new call to Java.\n\n\n```{r}\nwith(dat, sentiment_stanford_by(out3, list(role, time)))\n```\n\n## Plotting \n\n### Plotting at Aggregated Sentiment\n\nThe possible sentiment values in the output are {`r paste(seq(-1, 1, by = .5), collapse = \", \")`}.  
The raw number of occurrences at each sentiment level is plotted as a bubble version of [Cleveland's dot plot](https://en.wikipedia.org/wiki/Dot_plot_(statistics)).  The red cross represents the mean sentiment score (by default, grouping variables are ordered by this mean).  \n\n\n```{r, warning=FALSE}\nplot(out3)\n```\n\n### Plotting at the Sentence Level\n\nThe `plot` method for the class `sentiment` uses **syuzhet**'s `get_transformed_values` combined with **ggplot2** to make a reasonable, smoothed plot for the duration of the text based on percentage, allowing for comparison between plots of different texts.  This plot gives the overall shape of the text's sentiment.  See `syuzhet::get_transformed_values` for more details.\n\n```{r}\nplot(uncombine(out3))\n```\n\n\n## Text Highlighting\n\nThe user may wish to see the output from `sentiment_stanford_by` line by line with positive/negative sentences highlighted.  The `sentimentr::highlight` function wraps a `sentiment_by` output to produce a highlighted HTML file (positive = green; negative = pink).  Here we look at three random reviews from Hu and Liu's (2004) Canon G3 camera Amazon product reviews.  \n\n```{r, eval=FALSE}\nset.seed(2)\nhighlight(with(subset(cannon_reviews, number %in% sample(unique(number), 3)), sentiment_stanford_by(review, number)))\n```\n\n![](tools/figure/highlight.png)\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinker%2Fstansent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftrinker%2Fstansent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinker%2Fstansent/lists"}