https://github.com/rlvtick/surabaya-rent-classification
Data Mining: Finding hidden trends and information behind the rent prices in Surabaya using SVM and Random Forest Algorithm.
data-science datamining random-forest scraping svm
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/rlvtick/surabaya-rent-classification
- Owner: Rlvtick
- Created: 2023-12-08T11:50:24.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-08T12:04:21.000Z (over 1 year ago)
- Last Synced: 2025-01-06T07:48:31.910Z (5 months ago)
- Topics: data-science, datamining, random-forest, scraping, svm
- Language: Jupyter Notebook
- Homepage:
- Size: 488 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Surabaya Rent Classification Using SVM and Random Forest
This collaborative data mining project analyzes a dataset of rent prices for boarding houses and apartments across Surabaya, using the SVM and Random Forest algorithms to uncover hidden trends and information behind those prices. It aims to give prospective renters structured, organized information. It can also help property owners understand market preferences and needs, so they can optimize future marketing and property development strategies.
The project uses two machine learning algorithms, Support Vector Machine (SVM) and Random Forest, to discern patterns and relationships between various features and the target variable, rent prices. The quality and performance of each model are evaluated with the AUC-ROC metric (Area Under the Receiver Operating Characteristic Curve).
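As a rough illustration of the AUC metric, the sketch below computes it for a binary problem as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. It is not the notebook's code: the function name `roc_auc` and the toy labels and scores are hypothetical, and a multi-class rent-category problem would typically need a one-vs-rest extension (for example via scikit-learn's `roc_auc_score`).

```python
def roc_auc(labels, scores):
    """Binary AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = positive class, 0 = negative class.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 1.0 means every positive outranks every negative, while 0.5 corresponds to random ranking.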
The SVM model achieved an AUC of 0.93 (93%) and the Random Forest model an AUC of 0.94 (94%). The reported values place the Random Forest model marginally ahead, although both AUC values are high, indicating that both models can effectively differentiate between the rent price categories.
In addition to AUC, accuracy, precision, recall, and F1-score were used to assess each model. The results for each model are presented below:
Support Vector Machine (SVM):
1. Accuracy : 0.89830
2. Precision : 0.91581
3. Recall : 0.90623
4. F1-Score : 0.90919

Random Forest:
1. Accuracy : 0.90482
2. Precision : 0.90688
3. Recall : 0.92117
4. F1-Score : 0.91349
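
The four metrics listed above can be computed from true and predicted labels as in the minimal sketch below. It assumes macro averaging across classes, which the README does not confirm. The function name `macro_metrics` and the category labels `low`, `mid`, and `high` are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical rent price categories, for illustration only.
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "mid", "high", "low"]
metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

With averaged metrics, the F1-score is the mean of the per-class F1 values, so it generally does not equal the harmonic mean of the averaged precision and recall.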