Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/swamikannan/sentiment-analysis-using-huggingface-transformers

Building a Sentiment Analysis Machine usine Transformers
https://github.com/swamikannan/sentiment-analysis-using-huggingface-transformers

bert-embeddings bert-model eda exploratory-data-analysis kaggle keras rotten-tomatoes sentiment-analysis tensorflow2 tokenizer

Last synced: 24 days ago
JSON representation

Building a Sentiment Analysis Machine usine Transformers

Awesome Lists containing this project

README

        

# Transformers
Transformers have been exciting development in Deep Learning starting with the "Attention is all you need" paper by Ashish Vaswani, et. al. It maximally exploits any set of data where there are correlations between two data points such as sequence models and vision

# Sentiment Analysis on Movie Reviews



## Data:
"There's a thin line between likably old-fashioned and fuddy-duddy, and The Count of Monte Cristo ... never quite settles on either side."
The Rotten Tomatoes movie review dataset is a corpus of movie reviews used for sentiment analysis, originally collected by Pang and Lee. In their work on sentiment treebanks, Socher et al. used Amazon's Mechanical Turk to create fine-grained labels for all parsed phrases in the corpus. This competition presents a chance to benchmark your sentiment-analysis ideas on the Rotten Tomatoes dataset. You are asked to label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive. Obstacles like sentence negation, sarcasm, terseness, language ambiguity, and many others make this task very challenging.

## Data reference:
https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews/
## Attributes :
The dataset is comprised of tab-separated files with phrases from the Rotten Tomatoes dataset. The train/test split has been preserved for the purposes of benchmarking, but the sentences have been shuffled from their original order. Each Sentence has been parsed into many phrases by the Stanford parser. Each phrase has a PhraseId. Each sentence has a SentenceId. Phrases that are repeated (such as short/common words) are only included once in the data.
train.tsv contains the phrases and their associated sentiment labels. We have additionally provided a SentenceId so that you can track which phrases belong to a single sentence.
test.tsv contains just phrases. You must assign a sentiment label to each phrase.
The sentiment labels are:
0 - negative
1 - somewhat negative
2 - neutral
3 - somewhat positive
4 – positive

## Key asks:
• Assign a sentiment label to each phrase in the test.tsv file

## Model


## Scores:
### Batch of 16



### Batch of 32



Credit for images:

List icons created by Freepik - Flaticon

Funnel icons created by Freepik - Flaticon

Message icons created by Freepik - Flaticon

Feedback icons created by Freepik - Flaticon

Complain icons created by Freepik - Flaticon

Adaptive icons created by IconMarketPK - Flaticon

Smartphone icons created by Nikita Golubev - Flaticon

Review icons created by Freepik - Flaticon

Customer experience icons created by juicy_fish - Flaticon