{"id":16471100,"url":"https://github.com/kshru9/decision-trees-fromscratch","last_synced_at":"2026-06-08T18:32:46.336Z","repository":{"id":67398273,"uuid":"347955805","full_name":"kshru9/Decision-Trees-fromScratch","owner":"kshru9","description":"A complete implementation of Decision Trees and ensemble methods: bagging, random forest and Adaboost with all the necessary plots.","archived":false,"fork":false,"pushed_at":"2021-03-15T12:16:27.000Z","size":3091,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-28T08:57:27.042Z","etag":null,"topics":["adaboost","decision-tree-classifier","decision-tree-regression","decision-trees","ensemble-learning","iris-dataset","machine-learning-algorithms","random-forest","realestate","scratch","scratch-implementation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kshru9.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-15T12:13:32.000Z","updated_at":"2021-03-15T12:22:02.000Z","dependencies_parsed_at":"2023-05-01T16:46:35.691Z","dependency_job_id":null,"html_url":"https://github.com/kshru9/Decision-Trees-fromScratch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kshru9/Decision-Trees-fromScratch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kshru9%2FDecision-Trees-fromScratch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kshru9%2F
Decision-Trees-fromScratch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kshru9%2FDecision-Trees-fromScratch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kshru9%2FDecision-Trees-fromScratch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kshru9","download_url":"https://codeload.github.com/kshru9/Decision-Trees-fromScratch/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kshru9%2FDecision-Trees-fromScratch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34075956,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adaboost","decision-tree-classifier","decision-tree-regression","decision-trees","ensemble-learning","iris-dataset","machine-learning-algorithms","random-forest","realestate","scratch","scratch-implementation"],"created_at":"2024-10-11T12:12:38.783Z","updated_at":"2026-06-08T18:32:46.331Z","avatar_url":"https://github.com/kshru9.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### Questions\n\n1. Complete the decision tree implementation in tree/base.py. 
**[5 marks]**\nThe code should be written in Python and not use existing libraries other than the ones already imported in the code. Your decision tree should work for four cases: i) discrete features, discrete output; ii) discrete features, real output; iii) real features, discrete output; iv) real features, real output. Your decision tree should be able to use GiniIndex or InformationGain as the criteria for splitting. Your code should also be able to plot/display the decision tree. \n\n    \u003e You should be editing the following files.\n  \n    - `metrics.py`: Complete the performance metrics functions in this file. \n\n    - `usage.py`: Run this file to check your solutions.\n\n    - tree (Directory): Module for decision tree.\n      - `base.py` : Complete Decision Tree Class.\n      - `utils.py`: Complete all utility functions.\n      - `__init__.py`: **Do not edit this**\n\n    \u003e You should run _usage.py_ to check your solutions. \n\n2. \n    a) Show the usage of *your decision tree* on the [IRIS](https://archive.ics.uci.edu/ml/datasets/Iris) dataset. The first 70% of the data should be used for training purposes and the remaining 30% for test purposes. Show the accuracy, per-class precision and recall of the decision tree you implemented on the test dataset. **[1 mark]**\n\n    b) Use 5-fold cross-validation on the dataset. Using nested cross-validation, find the optimum depth of the tree. **[2 marks]**\n    \n    \u003e You should be editing `iris-experiments.py` for the code containing the experiments.\n\n3. \n    a) Show the usage of your decision tree for the [real estate price prediction regression](https://archive.ics.uci.edu/ml/datasets/Real+estate+valuation+data+set) problem. **[1 mark]**\n    \n    b) Compare the performance of your model with the decision tree module from scikit-learn. **[1 mark]**\n    \n   \u003e You should be editing `estate-experiments.py` for the code containing the experiments.\n    \n4. 
Create some fake data to do some experiments on the runtime complexity of your decision tree algorithm. Create a dataset with N samples and M binary features. Vary M and N to plot the time taken for: 1) learning the tree, 2) predicting for test data. How do these results compare with the theoretical time complexity for decision tree creation and prediction? You should do the comparison for all four cases of decision trees. **[2 marks]**\t\n\n    \u003e You should be editing `experiments.py` for the code containing the experiments. \n\n5. \n    a) Implement Adaboost on Decision Stump (depth-1 tree). You could use the Decision Tree learnt in assignment #1 or the sklearn decision tree, and solve it for the case of real input and discrete output. Edit `ensemble/ADABoost.py` **[2 marks]**\n\n    b) Implement AdaBoostClassifier on Iris data set. Fix a random seed of 42. Shuffle the dataset according to this random seed. Use the first 60% of the data for training and last 40% of the data set for testing. Using sepal width and petal width as the two features, plot the decision surfaces as done for Q1a) and compare the accuracy of AdaBoostClassifier using 3 estimators over decision stump. Include your code in `q5_ADABoost.py`. [*We will be solving the problem in a 2-class setting. The two classes are: virginica and not virginica.*] **[2 marks]**\n\n6.\n    a) Implement Bagging(BaseModel, num_estimators): where the base model is the DecisionTree you have implemented (or the sklearn decision tree). In a later assignment, you would have to implement the above over LinearRegression() also. Edit `ensemble/bagging.py`. Use `q6_Bagging.py` for testing. [*We will be implementing only for DecisionTrees*] [*2 marks*]\n\n    \n7. \n    a) Implement RandomForestClassifier() and RandomForestRegressor() classes in `tree/randomForest.py`. Use `q7_RandomForest.py` for testing. [*2 marks*]\n\n     b) Generate the plots for Iris data set. Fix a random seed of 42. Shuffle the dataset according to this random seed. 
Use the first 60% of the data for training and last 40% of the data set for testing. Use sepal width and petal width as the two features. Include your code in `random_forest_iris.py` [*2 marks*]\n\n\nYou can answer the subjective questions (timing analysis, displaying plots) by creating `assignment_q\u003cquestion-number\u003e_subjective_answers.md`\n\nDoubts about the assignment may be clarified here: https://iitgnacin-my.sharepoint.com/:w:/g/personal/nipun_batra_iitgn_ac_in/EZOsxJwGPFFLhJ-9XmPL0IEBEvpkVz935Bd-nblaVkEzOQ?e=MuXphv\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkshru9%2Fdecision-trees-fromscratch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkshru9%2Fdecision-trees-fromscratch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkshru9%2Fdecision-trees-fromscratch/lists"}