{"id":18317711,"url":"https://github.com/datarohit/machineknight-hackathon","last_synced_at":"2025-04-09T13:49:09.315Z","repository":{"id":167551796,"uuid":"532314226","full_name":"DataRohit/Machineknight-Hackathon","owner":"DataRohit","description":"This repo contains files for a house price prediction project built for MachineKnight Hackathon that was hosted on unstop.","archived":false,"fork":false,"pushed_at":"2022-09-04T08:54:44.000Z","size":19636,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-15T07:47:31.529Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DataRohit.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-03T16:27:10.000Z","updated_at":"2023-03-20T17:49:28.000Z","dependencies_parsed_at":"2023-05-23T01:30:23.691Z","dependency_job_id":null,"html_url":"https://github.com/DataRohit/Machineknight-Hackathon","commit_stats":null,"previous_names":["datarohit/machineknight-hackathon"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataRohit%2FMachineknight-Hackathon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataRohit%2FMachineknight-Hackathon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataRohit%2FMachineknight-Hackathon/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/DataRohit%2FMachineknight-Hackathon/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DataRohit","download_url":"https://codeload.github.com/DataRohit/Machineknight-Hackathon/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248054218,"owners_count":21039951,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T18:07:12.409Z","updated_at":"2025-04-09T13:49:09.308Z","avatar_url":"https://github.com/DataRohit.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Machineknight Hackathon Project\n\n#### GitHub Repo Link (For Complete Files and Folders) - [Fly to Repo](https://github.com/DataRohit/Machineknight-Hackathon)\n\n## Problem Statement\nYour friend is going to start a real estate business and asks for your help to **predict** the **house rents** in his region. He gave you **housing data** to work on. You decided to build a **machine learning model** that can predict the rent of a house.\n\u003cbr/\u003e\nYour friend has no idea about ML or how to make predictions using an ML model, so you have to build an **API-hosted front-end web app** that your friend can easily operate.\n\n## Task\nYou are given a **dataset of housing properties**. Your task is to create an ML model that can predict the rent of a house based on the given properties. Serve that **ML model using a REST API**. 
You have to integrate both the backend and the frontend.\n\u003cbr/\u003e\n**Train your model using the train data and make predictions on the test data**\n\n## Installation\nThis project has two folders\n - One for Data Processing and Model Training and Testing\n - Second for API\n\nEach folder has its own **requirements.txt** file.\n\n#### Change directory to each of these folders and run the following commands step by step\n```bat\npython -m venv venv\nvenv\\Scripts\\activate\npip install -r requirements.txt\n```\n#### Note: Commands for command prompt\n\n# Usage\nOnce all the packages are installed, you are ready to go.\n - To run the .ipynb file, use Google Colab\n   - The auto-sklearn package is used for model training and testing. auto-sklearn only supports Linux-based systems, so use Google Colab.\n   - Upload the data files to the Colab runtime.\n   - First run the code cell that installs the auto-sklearn package, because auto-sklearn asks for a restart of the runtime.\n   - Once done, you can run all the cells and get the output.\n - To run the API\n   - The same problem applies here: auto-sklearn does not support Windows.\n   - The other endpoints can still be tested.\n   - Change directory to the API folder.\n   - Then run the following command ```uvicorn main:app --reload``` in the terminal to start the API.\n\n**Solution to the Auto-Sklearn Problem - Luckily, Google Colab and Heroku (used for API hosting) both run on Linux-based servers.**\n\n## The API for the model has been hosted using Heroku.\u003cbr/\u003eView the Interactive API playground by clicking the link - [Machineknight-house-price](https://machineknight-house-price.herokuapp.com/)\n\n# Requirements.txt\n## Data Processing \u0026 Model Training\n```python\npandas==1.4.4\nauto-sklearn==0.14.7\n```\n## API\n```python\nfastapi==0.81.0\npandas==1.4.4\nuvicorn==0.18.3\nauto-sklearn==0.14.7\ngunicorn==20.1.0\n```\n\n# Data Processing\n### The 
following steps were used to shape the data into the desired format, followed by training and testing of the model:\n 1. Dropping the `id`, `activation_data` and `locality` columns from the data, as they did not have much effect on the rent. \u003cbr/\u003eThe model was trained and tested with and without the `locality` column, and its performance did not seem to be affected.\n 2. The `amenities` column is stored as a stringified dictionary. Extracting that data and adding it to the original dataset.\n 3. Dropping the repeated columns. Some columns present in the data also appear in `amenities`, so they are dropped.\n 4. All the features extracted from `amenities` are binary. Assuming that more amenities mean a higher price, all the binary amenity features are summed into one single feature.\n 5. Dropping the ineffective columns and preparing the data for encoding of categorical columns.\n 6. Separating the target and features from the data.\n 7. Label encoding the categorical columns and storing the trained encoder for each feature to be used in the API.\n 8. Splitting the data into training and testing sets for ML model training and testing.\n 9. Initializing and training the AutoSklearn regressor. The same model was trained for 1 hour and for 2 hours. The 2-hour model performed slightly better and was therefore used for making the predictions.\n 10. Writing a custom function to calculate the `RMSE`, `R2` and `Adjusted R2` scores to check the performance of the model.\n 11. Saving the trained model and trained encoders.\n\n# API - Working\n#### Test the API (Open in Browser)  - https://machineknight-house-price.herokuapp.com/docs\n### Base Path (https://machineknight-house-price.herokuapp.com/)\nThis base path supports GET requests, but it redirects to the `/docs` path, which provides a GUI to test and play with the API. 
This path returns the HTML of the \"/docs\" page, which renders properly when the link is opened in a browser.\n### Get Object Features (https://machineknight-house-price.herokuapp.com/get_features/object_features)\nThis path supports GET requests and returns the object / categorical columns that are expected as input, along with their expected values as a list.\n### Get Number Features (https://machineknight-house-price.herokuapp.com/get_features/numerical_features)\nThis path supports GET requests and returns the numerical columns that are expected as input, along with their expected values as a list giving the range: the first value is the minimum and the second is the maximum. The minimum and maximum values were calculated from the complete data that was used for training and testing the model.\n### Predict House Rent (https://machineknight-house-price.herokuapp.com/predict/rent)\nThis path supports POST requests and expects 18 different values as query parameters. When the request succeeds without errors or invalid values, the API returns a dictionary of the values that were given as input and the predicted house rent.\n\n# Screenshots\n|    |    |\n| ---| ---|\n|![Screenshot](/images/FastAPI-Swagger-UI-1.png)| ![Screenshot](/images/FastAPI-Swagger-UI-2.png) |\n|![Screenshot](/images/FastAPI-Swagger-UI-3.png)| ![Screenshot](/images/FastAPI-Swagger-UI-4.png) |","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatarohit%2Fmachineknight-hackathon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatarohit%2Fmachineknight-hackathon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatarohit%2Fmachineknight-hackathon/lists"}