https://github.com/interactivetech/ted_talk_sentiment_prediction
We explore several language models (Doc2Vec, CNNTextClassifier) and evaluate how they perform on text classification.
- Host: GitHub
- URL: https://github.com/interactivetech/ted_talk_sentiment_prediction
- Owner: interactivetech
- Created: 2018-03-27T02:15:11.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-05-07T23:29:17.000Z (about 8 years ago)
- Last Synced: 2025-04-04T20:46:39.581Z (about 1 year ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 413 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Ted Talk Sentiment Prediction
We explore several language models (LSTM, Doc2Vec, CNNTextClassifier) and evaluate how they perform on text classification.
We examine sentiment analysis on the Ted Talk dataset.
The contributions of this paper are:
(1) We conduct basic prior analysis of the dataset.
(2) We experiment with different preprocessing techniques for the dataset, such as BoW, TF-IDF, Word2Vec, and Doc2Vec.
(3) We establish the baseline score using the most frequent tag.
(4) We compare the results of Logistic Regression, Support Vector Machines, and CNN to find the best model.
(5) We experiment with using higher-level features in addition to the transcripts.
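
To make items (2) and (3) concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names `most_frequent_baseline` and `tfidf`, the toy tags, and the toy documents are all assumptions made for illustration, and the TF-IDF weighting shown (relative term frequency times `log(N / df)`) is only one common variant.

```python
import math
from collections import Counter


def most_frequent_baseline(train_labels, test_labels):
    """Predict the most common training tag for every test example.

    Returns the majority tag and the accuracy it achieves on test_labels.
    """
    majority_tag, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_tag)
    return majority_tag, correct / len(test_labels)


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Hypothetical toy labels, not taken from the TED Talk dataset.
train_labels = ["inspiring", "funny", "inspiring", "informative", "inspiring"]
test_labels = ["inspiring", "funny", "inspiring", "inspiring"]
majority_tag, baseline_accuracy = most_frequent_baseline(train_labels, test_labels)

# Hypothetical toy documents standing in for transcripts.
docs = ["ideas worth spreading", "funny ideas", "worth a laugh"]
vectors = tfidf(docs)
```

Any trained model (Logistic Regression, SVM, CNN) should beat `baseline_accuracy` to be considered useful. In `vectors`, terms that occur in fewer documents receive a higher weight than terms shared across many documents.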