https://github.com/fusi3/natural_language_coursework
Assessing the impact of different pre-processing techniques for classifying the sentiment of movie reviews
- Host: GitHub
- URL: https://github.com/fusi3/natural_language_coursework
- Owner: fusi3
- Created: 2024-07-12T16:16:46.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-07-12T20:38:55.000Z (10 months ago)
- Last Synced: 2025-01-24T17:36:30.996Z (4 months ago)
- Topics: bag-of-words, latent-semantic-analysis, lemmatization, multilayer-perceptron, nlp, sentiment-analysis, stemming, support-vector-machines, tfidf
- Language: Jupyter Notebook
- Homepage:
- Size: 17.1 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# natural_language_coursework
This work tested various kinds of preprocessing to see what impact they have on classifying the sentiment of movie reviews. Each preprocessing variant was evaluated with Multilayer Perceptrons and Support Vector Machines. The experiments covered several n-gram levels with Bag-of-Words (BoW) and TF-IDF representations, each applied to stemmed and lemmatized text. The effect of Latent Semantic Analysis was also assessed; however, the best performance appeared to come from stemming combined with unigram TF-IDF. Please read the other README for instructions on running the notebooks.
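
To make the stemming plus unigram TF-IDF combination concrete, here is a minimal standard-library sketch. It is not the notebooks' code: `crude_stem`, `tokenize`, `tfidf_vectors`, and the sample reviews are hypothetical names and data invented for illustration, the suffix stripper is a toy stand-in for a real stemmer such as Porter's, and the smoothed IDF formula is just one common variant.

```python
import math
import re
from collections import Counter

# Assumption: a toy suffix list standing in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def crude_stem(token):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and stem each unigram."""
    return [crude_stem(t) for t in re.findall(r"[a-z]+", text.lower())]


def tfidf_vectors(docs):
    """Return (vocabulary, one unigram TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical sample reviews, not taken from the coursework data set.
reviews = ["I loved the acting", "I hated the acting", "Loved it"]
vocab, vectors = tfidf_vectors(reviews)
```

Each resulting vector could then be passed to a classifier such as an SVM or MLP. Stemming maps "loved" and "loving" to the same feature, which shrinks the vocabulary, and TF-IDF gives more weight to terms that appear in fewer reviews.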