https://github.com/ekramasif/pre-processing-and-feature-engineering-with-svm
- Host: GitHub
- URL: https://github.com/ekramasif/pre-processing-and-feature-engineering-with-svm
- Owner: ekramasif
- License: MIT
- Created: 2021-09-19T14:56:49.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-02-13T10:34:22.000Z (over 3 years ago)
- Last Synced: 2025-01-10T10:29:52.806Z (4 months ago)
- Language: Jupyter Notebook
- Size: 3.76 MB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Pre-Processing-and-Feature-Engineering-with-SVM
# General remarks
This assignment is meant to get you acquainted with Support Vector Machines (SVM). You
will also have to explore pre-processing and feature contribution quite a bit further (as
explained throughout the lab classes). Additionally, you're encouraged to experiment a
little further with report writing, by making your own choices and presenting it in a
paper-like format if possible.
For the practical parts, you are asked to train and test a few models using SVM and produce
what you think the best settings are for the data we're using. Use the dataset (corona_data)
that we have used during our lab classes. Note that you have to show the results based on
the big dataset: train your model using "train.tsv" and evaluate your final model on
"test.tsv". However, for tweaking your models, you may use the small datasets.
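As a point of reference, a minimal loading sketch in Python; the exact column names in the corona_data files are not spelled out here, so the `text` and `label` fields below are assumptions you would adjust to the real headers:

```python
import pandas as pd

# Load the tab-separated train/test splits.
# NOTE: the "text" and "label" column names are assumptions;
# replace them with the actual headers in corona_data.
train_df = pd.read_csv("train.tsv", sep="\t")
test_df = pd.read_csv("test.tsv", sep="\t")

X_train, y_train = train_df["text"], train_df["label"]
X_test, y_test = test_df["text"], test_df["label"]
```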
# Default settings
Run a support vector machine with a linear kernel on the multiclass classification task
(`cls = svm.SVC(kernel='linear', C=1.0)`). Use the default settings and report results on
the held-out portion of the test/development set. Just make sure you document what you have
done along with the results.
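A minimal baseline sketch, assuming TF-IDF vectorization as the pre-processing step (the assignment leaves the featurization choice open) and the `X_train`/`y_train` variables from the loading sketch above:

```python
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# TF-IDF features feeding the linear-kernel SVM, all at default settings.
cls = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("svm", svm.SVC(kernel="linear", C=1.0)),
])
cls.fit(X_train, y_train)
predictions = cls.predict(X_test)
```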
# Reporting the Results (Evaluation Metrics)
For evaluation, you will have to report the overall accuracy as well as the class-wise
precision, recall, and F1 score. Additionally, include a confusion matrix for each of your
experiments. Please do not just put the results in the report; try to explain a bit about
why you think the results have improved or degraded compared to the previous experiment(s).
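One way to produce all of these with scikit-learn, continuing the sketch above (`predictions` comes from the fitted pipeline):

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

# Overall accuracy, class-wise precision/recall/F1, and the confusion matrix.
print("SVM Accuracy =", accuracy_score(y_test, predictions))
print(classification_report(y_test, predictions, digits=3))
print(confusion_matrix(y_test, predictions))
```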
# Output
```
Training the Classifier...

SVM Accuracy = 0.4

                    precision    recall  f1-score   support

Extremely Negative      0.479     0.240     0.320       146
Extremely Positive      0.591     0.340     0.431       162
          Negative      0.381     0.483     0.426       302
           Neutral      0.453     0.414     0.433       152
          Positive      0.324     0.424     0.367       238

          accuracy                          0.400      1000
         macro avg      0.446     0.380     0.396      1000
      weighted avg      0.427     0.400     0.399      1000

Naive Bayes Accuracy = 0.269

                    precision    recall  f1-score   support

Extremely Negative      0.296     0.199     0.238       146
Extremely Positive      0.272     0.136     0.181       162
          Negative      0.335     0.255     0.289       302
           Neutral      0.246     0.474     0.324       152
          Positive      0.232     0.290     0.257       238

          accuracy                          0.269      1000
         macro avg      0.276     0.271     0.258      1000
      weighted avg      0.281     0.269     0.262      1000
```