https://github.com/fusi3/natural_language_coursework

Assessing the impact of different pre-processing techniques for classifying the sentiment of movie reviews

bag-of-words latent-semantic-analysis lemmatization multilayer-perceptron nlp sentiment-analysis stemming support-vector-machines tfidf

Host: GitHub
URL: https://github.com/fusi3/natural_language_coursework
Owner: fusi3
Created: 2024-07-12T16:16:46.000Z (10 months ago)
Default Branch: main
Last Pushed: 2024-07-12T20:38:55.000Z (10 months ago)
Last Synced: 2025-01-24T17:36:30.996Z (4 months ago)
Topics: bag-of-words, latent-semantic-analysis, lemmatization, multilayer-perceptron, nlp, sentiment-analysis, stemming, support-vector-machines, tfidf
Language: Jupyter Notebook
Homepage:
Size: 17.1 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# natural_language_coursework

This work tested various preprocessing techniques to measure their impact on sentiment classification of movie reviews. The preprocessing was evaluated with Multilayer Perceptrons and Support Vector Machines. It covered several n-gram levels with Bag-of-Words (BoW) and TF-IDF features, each combined with stemming and lemmatization. The effect of Latent Semantic Analysis was also assessed; however, the best performance appeared to come from stemming combined with unigram TF-IDF. Please see the other README for instructions on running the notebooks.
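To make the best-performing combination concrete, here is a minimal, dependency-free sketch of stemming followed by n-gram Bag-of-Words and TF-IDF weighting. It is not taken from the notebooks: the function names (`crude_stem`, `tokenize`, `bag_of_words`, `tf_idf`), the toy suffix-stripping stemmer, the whitespace tokenizer, and the unsmoothed `log(N / df)` IDF are all assumptions made for illustration. A real experiment would more likely rely on an established stemmer and vectorizer.

```python
import math
from collections import Counter


def crude_stem(token):
    """Toy suffix stripper (assumption): a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text, n=1, stem=True):
    """Lowercase, split on whitespace, optionally stem, then build n-grams."""
    tokens = text.lower().split()
    if stem:
        tokens = [crude_stem(t) for t in tokens]
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(docs, n=1, stem=True):
    """One raw term-count mapping per document."""
    return [Counter(tokenize(doc, n, stem)) for doc in docs]


def tf_idf(docs, n=1, stem=True):
    """Weight each term by (relative frequency) * log(N / document frequency)."""
    counts = bag_of_words(docs, n, stem)
    num_docs = len(counts)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            term: (freq / total) * math.log(num_docs / doc_freq[term])
            for term, freq in c.items()
        })
    return vectors


# Hypothetical two-review corpus, used only to exercise the functions.
reviews = ["loved the acting", "hated the acting"]
unigram_vectors = tf_idf(reviews, n=1)
bigram_vectors = tf_idf(reviews, n=2)
```

Terms shared by every review (here `the` and the stem `act`) receive a weight of zero, while the stems that separate the two reviews keep a positive weight. Passing `n=2` switches to bigrams, and `stem=False` gives an unstemmed baseline for comparison. The resulting sparse vectors are the kind of features that could then be passed to an SVM or MLP classifier.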
