{"id":15044218,"url":"https://github.com/apanimesh061/vadersentimentjava","last_synced_at":"2025-04-10T00:43:02.054Z","repository":{"id":45930294,"uuid":"55928989","full_name":"apanimesh061/VaderSentimentJava","owner":"apanimesh061","description":"Java port of Python NLTK Vader Sentiment Analyzer. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.","archived":false,"fork":false,"pushed_at":"2023-01-16T23:09:50.000Z","size":3047,"stargazers_count":64,"open_issues_count":1,"forks_count":29,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-10T00:42:54.730Z","etag":null,"topics":["java-8","nltk","sentiment-analysis","vader-sentiment-analysis"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apanimesh061.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-04-11T00:02:43.000Z","updated_at":"2025-03-08T10:56:57.000Z","dependencies_parsed_at":"2023-02-10T06:46:13.838Z","dependency_job_id":null,"html_url":"https://github.com/apanimesh061/VaderSentimentJava","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apanimesh061%2FVaderSentimentJava","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apanimesh061%2FVaderSentimentJava/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apanimesh061%2FVaderSentimentJava/releases",
"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apanimesh061%2FVaderSentimentJava/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apanimesh061","download_url":"https://codeload.github.com/apanimesh061/VaderSentimentJava/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248137998,"owners_count":21053775,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["java-8","nltk","sentiment-analysis","vader-sentiment-analysis"],"created_at":"2024-09-24T20:50:18.082Z","updated_at":"2025-04-10T00:43:02.034Z","avatar_url":"https://github.com/apanimesh061.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"## VADER-Sentiment-Analysis in Java\r\n\r\n[![Build Status](https://travis-ci.org/apanimesh061/VaderSentimentJava.svg?branch=master)](https://travis-ci.org/apanimesh061/VaderSentimentJava)\r\n\r\nVADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is _specifically attuned to sentiments expressed in social media_. It is fully open-sourced under the [MIT License](http://choosealicense.com/) (we sincerely appreciate all attributions and readily accept most contributions, but please don't hold us liable).\r\n\r\nThis is a JAVA port of the NLTK VADER sentiment analysis originally written in Python.\r\n\r\n - The [Original](https://github.com/cjhutto/vaderSentiment) python module by the paper's author C.J. 
Hutto\r\n - The [NLTK](http://www.nltk.org/_modules/nltk/sentiment/vader.html) source\r\n\r\nFor testing, I compared the results of the NLTK module with those of this Java port.\r\n\r\n### Update (Oct 2021)\r\n- - -\r\nReleasing `v1.1.1`.\r\n\r\nThanks to @ArjohnKampman for helping to optimize some parts of the code. Since I was touching this repo after a long time, I noticed that a lot of the Maven dependencies and plugins were outdated, so I have updated them. `mvn package` still works so it should be fine.\r\n\r\nI also noticed a lot of comments on not being able to use the library from Maven. I did upload a Jar to Nexus a long time back, but I had trouble doing that again since I think I've lost the pass-phrases needed to sign and upload the Jar to Nexus. Luckily, I found a new solution [here](https://stackoverflow.com/a/28483461) which suggests using https://jitpack.io/ for public GitHub repositories. It turns out to be super simple to use it and get the package from GitHub. 
I wanted to make sure I unblock anyone who wants to use this package.\r\n\r\nI created a test Maven project `test-mvn-pkg1` locally and added the following to its `pom.xml`:\r\n\r\n```\r\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\r\n\u003cproject xmlns=\"http://maven.apache.org/POM/4.0.0\"\r\n         xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\r\n         xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd\"\u003e\r\n    \u003cmodelVersion\u003e4.0.0\u003c/modelVersion\u003e\r\n\r\n    \u003cgroupId\u003eorg.example\u003c/groupId\u003e\r\n    \u003cartifactId\u003etest-mvn-pkg1\u003c/artifactId\u003e\r\n    \u003cversion\u003e1.0-SNAPSHOT\u003c/version\u003e\r\n\r\n    \u003crepositories\u003e\r\n        \u003crepository\u003e\r\n            \u003cid\u003ejitpack.io\u003c/id\u003e\r\n            \u003curl\u003ehttps://jitpack.io\u003c/url\u003e\r\n        \u003c/repository\u003e\r\n    \u003c/repositories\u003e\r\n\r\n    \u003cdependencies\u003e\r\n        \u003cdependency\u003e\r\n            \u003cgroupId\u003ecom.github.apanimesh061\u003c/groupId\u003e\r\n            \u003cartifactId\u003eVaderSentimentJava\u003c/artifactId\u003e\r\n            \u003cversion\u003ev1.1.1\u003c/version\u003e\r\n        \u003c/dependency\u003e\r\n    \u003c/dependencies\u003e\r\n\r\n\u003c/project\u003e\r\n```\r\nOnce Maven downloads the dependencies, you can easily use it in your code like:\r\n\r\n```\r\npackage org.example;\r\n\r\nimport com.vader.sentiment.analyzer.SentimentAnalyzer;\r\nimport com.vader.sentiment.analyzer.SentimentPolarities;\r\n\r\npublic class Test {\r\n    public static void main(String[] args) {\r\n        final SentimentPolarities sentimentPolarities =\r\n            SentimentAnalyzer.getScoresFor(\"that's a rare and valuable feature.\");\r\n        System.out.println(sentimentPolarities);\r\n\t// SentimentPolarities{positivePolarity=0.437, negativePolarity=0.0, neutralPolarity=0.563, 
compoundPolarity=0.4767}\r\n    }\r\n}\r\n```\r\n\r\nI'll try the Nexus upload and figure out if I can create a new Maven repo altogether. Meanwhile, `jitpack` should work for anyone wanting to use the package.\r\n\r\n\r\n### Update (Jan 2018)\r\n\r\n- - -\r\nBased on a recommendation from @alexpetlenko, I uploaded the jar to Nexus as `vader-sentiment-analyzer-1.0`.\r\n\r\nYou can download the jar by adding the following to your `pom.xml`:\r\n```xml\r\n\u003cdependency\u003e\r\n  \u003cgroupId\u003ecom.github.apanimesh061\u003c/groupId\u003e\r\n  \u003cartifactId\u003evader-sentiment-analyzer\u003c/artifactId\u003e\r\n  \u003cversion\u003e1.0\u003c/version\u003e\r\n\u003c/dependency\u003e\r\n```\r\n\r\nPath to Jar: [vader-sentiment-analyzer-1.0.jar](https://oss.sonatype.org/service/local/repositories/releases/content/com/github/apanimesh061/vader-sentiment-analyzer/1.0/vader-sentiment-analyzer-1.0.jar)\r\n\r\n### Update (May 2017)\r\n\r\n- - -\r\nMajor design refactorings resulting from the addition of `checkstyle` to the project.\r\n\r\nAlso added JavaDocs to the project.\r\n\r\n### Update (Jan 2017)\r\n\r\n- - -\r\n\r\nI have corrected a few bugs that I encountered when I was adding more tests.\r\n\r\nThe details are [here](https://github.com/apanimesh061/VaderSentimentJava/commit/d1d30c4ceeb356ec838f8abac70514bd21a92b4b).\r\n\r\nThis project now includes tests on text from:\r\n\r\n1. Amazon Reviews\r\n2. Movie Reviews\r\n3. New York Times editorial snippets\r\n\r\n### Introduction\r\n- - -\r\n\r\nThis README file describes the dataset of the paper:\r\n\r\n  **VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text** \u003cbr /\u003e\r\n  (by C.J. Hutto and Eric Gilbert) \u003cbr /\u003e\r\n  Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014. \u003cbr /\u003e\r\n\r\nFor questions, please contact: \u003cbr /\u003e\r\n\r\nC.J. 
Hutto \u003cbr /\u003e\r\nGeorgia Institute of Technology, Atlanta, GA 30032  \u003cbr /\u003e\r\ncjhutto [at] gatech [dot] edu \u003cbr /\u003e\r\n\r\n### Citation Information\r\n- - -\r\n\r\nIf you use either the dataset or any of the VADER sentiment analysis tools (VADER sentiment lexicon or Python code for rule-based sentiment analysis engine) in your research, please cite the above paper. For example:  \u003cbr /\u003e\r\n\r\n  \u003e \u003csmall\u003e **Hutto, C.J. \u0026 Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.** \u003c/small\u003e\u003cbr /\u003e\r\n\r\n### Resources and Dataset Descriptions\r\n- - -\r\n\r\nThe compressed .tar.gz package includes **PRIMARY RESOURCES** (items 1-3) as well as additional **DATASETS AND TESTING RESOURCES** (items 4-12):\r\n\r\n1. [VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text](http://comp.social.gatech.edu/papers/icwsm14.vader.hutto.pdf) \u003cbr /\u003e\r\n    The original paper for the data set; see the citation information above.\r\n\r\n2. vader_sentiment_lexicon.txt \u003cbr /\u003e\r\n       Empirically validated by multiple independent human judges, VADER incorporates a \"gold-standard\" sentiment lexicon that is especially attuned to microblog-like contexts.  \u003cbr /\u003e\r\n    The VADER sentiment lexicon is sensitive to both the **polarity** and the **intensity** of sentiments\r\n\texpressed in social media contexts, and is also generally applicable to sentiment analysis\r\n\tin other domains. \u003cbr /\u003e\r\n\t   Manually creating (much less, validating) a comprehensive sentiment lexicon is\r\n\ta labor-intensive and sometimes error-prone process, so it is no wonder that many\r\n\topinion mining researchers and practitioners rely so heavily on existing lexicons\r\n\tas primary resources. 
We are pleased to offer ours as a new resource. \u003cbr /\u003e\r\n\t   We begin by constructing a list inspired by examining existing well-established\r\n\tsentiment word-banks (LIWC, ANEW, and GI). To this, we next incorporate numerous\r\n\tlexical features common to sentiment expression in microblogs, including\r\n\t - a full list of Western-style emoticons (for example, :-) denotes a smiley face\r\n\t   and generally indicates positive sentiment)\r\n\t - sentiment-related acronyms and initialisms (e.g., LOL and WTF are both examples of\r\n\t   sentiment-laden initialisms)\r\n\t - commonly used slang with sentiment value (e.g., nah, meh and giggly).\r\n\r\n\tThis process provided us with over 9,000 lexical feature candidates. Next, we assessed\r\n\tthe general applicability of each feature candidate to sentiment expressions. We\r\n\tused a wisdom-of-the-crowd (WotC) approach (Surowiecki, 2004) to acquire a valid\r\n\tpoint estimate for the sentiment valence (intensity) of each context-free candidate\r\n\tfeature. We collected intensity ratings on each of our candidate lexical features\r\n\tfrom ten independent human raters (for a total of 90,000+ ratings). Features were\r\n\trated on a scale from \"[–4] Extremely Negative\" to \"[4] Extremely Positive\", with\r\n\tallowance for \"[0] Neutral (or Neither, N/A)\".  \u003cbr /\u003e\r\n\t   We kept every lexical feature that had a non-zero mean rating, and whose standard\r\n\tdeviation was less than 2.5 as determined by the aggregate of ten independent raters.\r\n\tThis left us with just over 7,500 lexical features with validated valence scores that\r\n\tindicated both the sentiment polarity (positive/negative), and the sentiment intensity\r\n\ton a scale from –4 to +4. 
For example, the word \"okay\" has a positive valence of 0.9,\r\n\t\"good\" is 1.9, and \"great\" is 3.1, whereas \"horrible\" is –2.5, the frowning emoticon :(\r\n\tis –2.2, and \"sucks\" and its slang derivative \"sux\" are both –1.5.\r\n\r\n3. vaderSentiment.py \u003cbr /\u003e\r\n    The Python code for the rule-based sentiment analysis engine. Implements the\r\n\tgrammatical and syntactical rules described in the paper, incorporating empirically\r\n\tderived quantifications for the impact of each rule on the perceived intensity of\r\n\tsentiment in sentence-level text. Importantly, these heuristics go beyond what would\r\n\tnormally be captured in a typical bag-of-words model. They incorporate **word-order\r\n\tsensitive relationships** between terms. For example, degree modifiers (also called\r\n\tintensifiers, booster words, or degree adverbs) impact sentiment intensity by either\r\n\tincreasing or decreasing the intensity. Consider these examples: \u003cbr /\u003e\r\n\t   (a) \"The service here is extremely good\"  \u003cbr /\u003e\r\n\t   (b) \"The service here is good\" \u003cbr /\u003e\r\n\t   (c) \"The service here is marginally good\" \u003cbr /\u003e\r\n\tFrom Table 3 in the paper, we see that for 95% of the data, using a degree modifier\r\n    increases the positive sentiment intensity of example (a) by 0.227 to 0.36, with a\r\n\tmean difference of 0.293 on a rating scale from 1 to 4. Likewise, example (c) reduces\r\n\tthe perceived sentiment intensity by 0.293, on average.\r\n\r\n4. tweets_GroundTruth.txt \u003cbr /\u003e\r\n    **NOTE**: This Java module uses this file for testing. 
\u003cbr /\u003e\r\n\tFORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, and TWEET-TEXT \u003cbr /\u003e\r\n    DESCRIPTION: includes \"tweet-like\" text as inspired by 4,000 tweets pulled from Twitter’s public timeline, plus 200 completely contrived tweet-like texts intended to specifically test syntactical and grammatical conventions of conveying differences in sentiment intensity. The \"tweet-like\" texts incorporate a fictitious username (@anonymous) in places where a username might typically appear, along with a fake URL ( http://url_removed ) in places where a URL might typically appear, as inspired by the original tweets. The ID and MEAN-SENTIMENT-RATING correspond to the raw sentiment rating data provided in 'tweets_anonDataRatings.txt' (described below).\r\n\r\n5. tweets_anonDataRatings.txt \u003cbr /\u003e\r\n    FORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, STANDARD DEVIATION, and RAW-SENTIMENT-RATINGS \u003cbr /\u003e\r\n\tDESCRIPTION: Sentiment ratings from a minimum of 20 independent human raters (all pre-screened, trained, and quality checked for optimal inter-rater reliability).\r\n\r\n6. nytEditorialSnippets_GroundTruth.txt \u003cbr /\u003e\r\n\tFORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, and TEXT-SNIPPET \u003cbr /\u003e\r\n    DESCRIPTION: includes 5,190 sentence-level snippets from 500 New York Times opinion news editorials/articles; we used the NLTK tokenizer to segment the articles into sentence phrases, and added sentiment intensity ratings. The ID and MEAN-SENTIMENT-RATING correspond to the raw sentiment rating data provided in 'nytEditorialSnippets_anonDataRatings.txt' (described below).\r\n\r\n7. 
nytEditorialSnippets_anonDataRatings.txt \u003cbr /\u003e\r\n\tFORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, STANDARD DEVIATION, and RAW-SENTIMENT-RATINGS \u003cbr /\u003e\r\n    DESCRIPTION: Sentiment ratings from a minimum of 20 independent human raters (all pre-screened, trained, and quality checked for optimal inter-rater reliability).\r\n\r\n8. movieReviewSnippets_GroundTruth.txt \u003cbr /\u003e\r\n\tFORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, and TEXT-SNIPPET \u003cbr /\u003e\r\n    DESCRIPTION: includes 10,605 sentence-level snippets from rotten.tomatoes.com. The snippets were derived from an original set of 2000 movie reviews (1000 positive and 1000 negative) in Pang \u0026 Lee (2004); we used the NLTK tokenizer to segment the reviews into sentence phrases, and added sentiment intensity ratings. The ID and MEAN-SENTIMENT-RATING correspond to the raw sentiment rating data provided in 'movieReviewSnippets_anonDataRatings.txt' (described below).\r\n\r\n9. movieReviewSnippets_anonDataRatings.txt \u003cbr /\u003e\r\n\tFORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, STANDARD DEVIATION, and RAW-SENTIMENT-RATINGS \u003cbr /\u003e\r\n    DESCRIPTION: Sentiment ratings from a minimum of 20 independent human raters (all pre-screened, trained, and quality checked for optimal inter-rater reliability).\r\n\r\n10. amazonReviewSnippets_GroundTruth.txt \u003cbr /\u003e\r\n\t FORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, and TEXT-SNIPPET \u003cbr /\u003e\r\n     DESCRIPTION: includes 3,708 sentence-level snippets from 309 customer reviews on 5 different products. The reviews were originally used in Hu \u0026 Liu (2004); we added sentiment intensity ratings. The ID and MEAN-SENTIMENT-RATING correspond to the raw sentiment rating data provided in 'amazonReviewSnippets_anonDataRatings.txt' (described below).\r\n\r\n11. 
amazonReviewSnippets_anonDataRatings.txt \u003cbr /\u003e\r\n\t FORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, STANDARD DEVIATION, and RAW-SENTIMENT-RATINGS \u003cbr /\u003e\r\n     DESCRIPTION: Sentiment ratings from a minimum of 20 independent human raters (all pre-screened, trained, and quality checked for optimal inter-rater reliability).\r\n\r\n12. Comp.Social website with more papers/research: [Comp.Social](http://comp.social.gatech.edu/papers/)\r\n\t \r\n13. vader_sentiment_comparison_online_weblink \u003cbr /\u003e\r\n     A short-cut hyperlinked to the online (web-based) sentiment comparison using a \"light\" version of VADER. http://www.socialai.gatech.edu/apps/sentiment.html .\r\n\r\n\r\n## Java Code EXAMPLE:\r\n\r\n```\r\npublic static void main(String[] args) throws IOException {\r\n    ArrayList\u003cString\u003e sentences = new ArrayList\u003cString\u003e() {{\r\n        add(\"VADER is smart, handsome, and funny.\");\r\n        add(\"VADER is smart, handsome, and funny!\");\r\n        add(\"VADER is very smart, handsome, and funny.\");\r\n        add(\"VADER is VERY SMART, handsome, and FUNNY.\");\r\n        add(\"VADER is VERY SMART, handsome, and FUNNY!!!\");\r\n        add(\"VADER is VERY SMART, really handsome, and INCREDIBLY FUNNY!!!\");\r\n        add(\"The book was good.\");\r\n        add(\"The book was kind of good.\");\r\n        add(\"The plot was good, but the characters are uncompelling and the dialog is not great.\");\r\n        add(\"A really bad, horrible book.\");\r\n        add(\"At least it isn't a horrible book.\");\r\n        add(\":) and :D\");\r\n        add(\"\");\r\n        add(\"Today sux\");\r\n        add(\"Today sux!\");\r\n        add(\"Today SUX!\");\r\n        add(\"Today kinda sux! 
But I'll get by, lol\");\r\n    }};\r\n\r\n    for (String sentence : sentences) {\r\n        System.out.println(sentence);\r\n        final SentimentPolarities sentimentPolarities =\r\n\t\t\tSentimentAnalyzer.getScoresFor(sentence);\r\n        System.out.println(sentimentPolarities);\r\n    }\r\n}\r\n```\r\n\r\n### Online (web-based) Sentiment Comparison using VADER\r\n\r\nhttp://www.socialai.gatech.edu/apps/sentiment.html .\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapanimesh061%2Fvadersentimentjava","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapanimesh061%2Fvadersentimentjava","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapanimesh061%2Fvadersentimentjava/lists"}