Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/hivana98/datascience-binaracademy

Data Analysis, Data Science, Machine & Deep Learning
https://github.com/hivana98/datascience-binaracademy

data-science deep-learning deep-neural-networks machine-learning

Last synced: 19 days ago
JSON representation

Data Analysis, Data Science, Machine & Deep Learning

Host: GitHub
URL: https://github.com/hivana98/datascience-binaracademy
Owner: HivanA98
Created: 2024-05-28T09:08:36.000Z (8 months ago)
Default Branch: main
Last Pushed: 2024-07-18T06:14:59.000Z (7 months ago)
Last Synced: 2024-11-16T23:12:06.470Z (3 months ago)
Topics: data-science, deep-learning, deep-neural-networks, machine-learning
Language: Jupyter Notebook
Homepage: https://storage.googleapis.com/binar-academy-prd/[email protected]_-_Data_Science_-_18.pdf
Size: 16.5 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# DataScience-BinarAcademy
My Learning Journey as Student Data Science at Binar Academy
##
Gold Challange:
- Due to my previous @HIvanA98 account was hacked, I reupload it.
Final Version feature:
- "gold draft done cleaning.ipynb" --> Cleansing;
- "Gold draft Visualisasi.ipynb" --> Visualizing;
- "Draft Flask.py" --> Final Task, included clansing, flask, and swagger
##
Platinum Challange:
- V.0.0.1 First Commit
- V.0.0.2 Cleansing Update
- V.0.0.3 Final Cleansing & Preprocess
- V.0.0.4 Final Version with Team Merge Repository

Project Timeline:
##

V.0.0.1 First Commit
28 Mey 2024
- Making 2 Feature Extraction using TFIDF without Cleansing, and BOW without cleansing. And found and Choose the best accurate with good F1 Score is TFIDF without Cleansing

V.0.0.2 Cleansing Update

29 Mey 2024
- I made Cleansing unit test KFold for every each of LowerCase, Demoji, ChangeAlay, RemoveChars, and NLTK. With Result:
- Not any changes at Lowercase and Demoji
- NLTK and Stopwords made F1 Score be worsen
- A small improvment F1 using Removechars, Loop 3 the neutral from 78% -> 79%
- An improvment and deterioration F1 using Changealay:
- Loops 1 Negative 77% -> 76%, Neutral 79% -> 78%
- Loops 2 Neutral 74% ->72%
- Loops 4 Neutral 76% -> 78%
- Loops 5 Negative 79% -> 80%
- With Combinize Removechars and ChangeAlay, there are an improvment F1 score, but there is a big problem because when testing using predict, from Positive become Negative Labels
array([[0.07515561, 0.06784903, 0.85699536]]) -> array([[0.42839065, 0.17934933, 0.39226002]])
So finally I decided to use only Removechars for Cleansing. = TFIDF-Cleansing.jpynb
##

V.0.0.3
30 Mey 2024
- There is a total change and the final version for Cleansing and Preprocess:
- Cleansing now using both RemoveChars and ChangeAlay
- I made my own Dictonary for ChangeAlay
- I update Remove chars, So only word is accepted. Any Number, Unique Word, Single Letter wil delete
- I still not using stopwords, but a words that not have any tendency and always show in all labels will be delete in RemoveChars
- The reason why I put stopwords word in remove chars is to easy for any update

Result:
Awesome Improvement of F1 Score:
- Negative from 77% to--> 79% T-3,
- Neutral 81% from 78% to--> 81% T-3,
- Positive 81% from 88% to--> 89% T-1,
- Accuracy from: 0.8445454545454545 --> 0.8463636363636363.
Note:
- The improvement ran without any deterioration of each traning sesion, so it's awesome and perfect improvment.
##

V.0.0.4
3 June 2024
- Combinations from member team in to one repository,
- Where it's include Neural Network, LSTM, Flask and Swagger, and the PPT for presentation.

The original source is here: https://github.com/Ridzan12/24001074-18-Team_4-Analisis_Data_berdasarkan_Sentimen-Platinum

##
Library:
Pandas, Numpy, Regex, Flask, Flagger, Pickle, SKlearn

Cleansing Step:
Lowercase, Demoji, Change informal to formal word/ChangeAlay, censor