https://github.com/rlvtick/surabaya-rent-classification

Data Mining: Finding hidden trends and information behind the rent prices in Surabaya using SVM and Random Forest Algorithm.

data-science datamining random-forest scraping svm


# Surabaya Rent Classification Using SVM and Random Forest

This collaborative data mining project analyzes a dataset of rent prices for boarding houses and apartments across Surabaya, using Support Vector Machine (SVM) and Random Forest algorithms to uncover hidden trends behind those prices. It aims to give prospective renters structured, organized information. It can also help property owners understand market preferences and needs, so they can refine their marketing and property development strategies.

The project uses two machine learning algorithms, Support Vector Machine (SVM) and Random Forest, to learn patterns and relationships between the features and the target variable, rent price. The quality and performance of each model are measured with the area under the receiver operating characteristic curve (AUC-ROC).
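As a rough illustration of the AUC metric, the sketch below computes a one-vs-rest AUC per class and averages the results. It is not the project's code: the category names (`low`, `mid`, `high`), the scores, the helper names, and the choice of macro one-vs-rest averaging are all assumptions made for the example.

```python
from itertools import product


def binary_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count pairwise wins of positives over negatives; a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, class_scores):
    """One-vs-rest AUC for each class, averaged with equal weight per class."""
    aucs = []
    for cls, scores in class_scores.items():
        labels = [1 if y == cls else 0 for y in y_true]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Hypothetical rent categories and per-class model scores, for illustration only.
y_true = ["low", "low", "mid", "mid", "high", "high"]
class_scores = {
    "low": [0.8, 0.6, 0.3, 0.2, 0.1, 0.1],
    "mid": [0.1, 0.3, 0.5, 0.2, 0.3, 0.2],
    "high": [0.1, 0.1, 0.2, 0.6, 0.6, 0.7],
}
print(round(macro_ovr_auc(y_true, class_scores), 3))
```

An AUC of 1.0 means every positive example is scored above every negative one, while 0.5 corresponds to random ranking.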

The results show that the Random Forest model achieved a slightly higher AUC of 0.94 (94%), compared with 0.93 (93%) for the SVM model. Both values are high, indicating that each model can effectively differentiate between the rent price categories.

In addition to AUC, accuracy, precision, recall, and F1-score were used to assess each model's performance. The results for each model are listed below:

Support Vector Machine (SVM):
1. Accuracy : 0.89830
2. Precision : 0.91581
3. Recall : 0.90623
4. F1-Score : 0.90919

Random Forest:
1. Accuracy : 0.90482
2. Precision : 0.90688
3. Recall : 0.92117
4. F1-Score : 0.91349
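
The four metrics listed above can be computed from true and predicted labels as in the minimal sketch below. The README does not state how the scores were averaged across rent categories, so macro averaging is an assumption here, and the labels and the `macro_metrics` helper are hypothetical. One hint that some per-class averaging was used: the reported F1-scores are not exactly the harmonic mean of the reported precision and recall.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for cls in classes:
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        # Guard against division by zero when a class is never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical true and predicted rent categories, for illustration only.
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "mid", "high", "low"]
metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

With macro averaging, each rent category contributes equally regardless of how many listings it contains, which is why the averaged F1-score can differ from the harmonic mean of the averaged precision and recall.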