https://github.com/brooksian/twittersentimentsparkcorenlp
Twitter Sentiment Analysis Using Spark CoreNLP
Twitter Sentiment Analysis Using Spark CoreNLP
- Host: GitHub
- URL: https://github.com/brooksian/twittersentimentsparkcorenlp
- Owner: BrooksIan
- Created: 2018-05-23T16:11:36.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2020-10-20T16:59:30.000Z (over 4 years ago)
- Last Synced: 2023-07-19T18:49:09.574Z (over 1 year ago)
- Topics: nlp-machine-learning, spark, sparksql, zeppelin-notebook
- Homepage:
- Size: 225 KB
- Stars: 1
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# TwitterSentimentSparkCoreNLP
## Spark Core NLP on Tweets
**Language**: Scala
**Requirements**:
- [HDP 2.6.X]
- Spark 2.x

**Author**: Ian Brooks \
**Follow**: [LinkedIn - Ian Brooks PhD](https://www.linkedin.com/in/ianrbrooksphd/) \
**HCC Article**: [Link](https://community.hortonworks.com/articles/192368/spark-core-nlp-in-apache-zeppelin.html)

Instructions:
1. Please follow this [tutorial](https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.html) to build the Solr collection 'tweets'.
2. Upload the notebook (JSON file) to Apache Zeppelin.
3. Match the Spark version to a compatible version of the Spark-Solr connector. The version compatibility list is in the [spark-solr repository](https://github.com/lucidworks/spark-solr).
4. Review the [API](https://github.com/databricks/spark-corenlp) of Spark CoreNLP, which provides a Spark wrapper around the [Stanford CoreNLP](https://stanfordnlp.github.io/CoreNLP/) library.
5. In the Stanford CoreNLP download (http://nlp.stanford.edu/software/stanford-corenlp-full-2018-02-27.zip), find the stanford-corenlp-*-models.jar file and copy it to the /tmp directory. Then, in Zeppelin's interpreter configuration for Spark, include the following artifact: /tmp/stanford-corenlp-full-2018-02-27/stanford-corenlp-3.9.1-models.jar
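
Once the interpreter is configured, a Zeppelin paragraph along the following lines can serve as a quick check that the Spark CoreNLP wrapper and the models jar load correctly. This is a minimal sketch, not the notebook's actual code: the sample tweets, the column names (`id`, `text`, `sentence`, `sentiment`), and the application name are assumptions made for illustration, and it uses an in-memory DataFrame rather than the Solr collection. It relies only on the `ssplit` and `sentiment` functions from `com.databricks.spark.corenlp.functions`, and it assumes the spark-corenlp package and the CoreNLP models jar are on the interpreter's classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}
import com.databricks.spark.corenlp.functions.{sentiment, ssplit}

// In Zeppelin or spark-shell a session already exists, so getOrCreate() returns it.
// Outside those environments a master (e.g. .master("local[*]")) would also be needed.
val session = SparkSession.builder().appName("TweetSentimentSketch").getOrCreate()
import session.implicits._

// Hypothetical sample tweets standing in for rows from the Solr collection 'tweets'.
val tweets = Seq(
  (1L, "I love the new release. It works great!"),
  (2L, "This update is terrible and keeps crashing.")
).toDF("id", "text")

// Split each tweet into sentences, then score every sentence.
// sentiment yields an integer from 0 (strong negative) to 4 (strong positive).
val scored = tweets
  .select(col("id"), explode(ssplit(col("text"))).as("sentence"))
  .withColumn("sentiment", sentiment(col("sentence")))

scored.show(truncate = false)
```

Scoring per sentence rather than per tweet follows the pattern shown in the spark-corenlp README; if one score per tweet is needed, the sentence scores could be aggregated afterwards, for example by averaging them per `id`.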