{"id":22357237,"url":"https://github.com/mdh266/randomforests","last_synced_at":"2026-03-10T10:32:13.304Z","repository":{"id":49272813,"uuid":"80865207","full_name":"mdh266/RandomForests","owner":"mdh266","description":"Random Forest Library In Python Compatible with Scikit-Learn","archived":false,"fork":false,"pushed_at":"2021-06-21T22:47:05.000Z","size":355,"stargazers_count":14,"open_issues_count":1,"forks_count":6,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-30T10:52:16.036Z","etag":null,"topics":["classification","data-science","decision-tree","ensemble-learning","machine-learning","machine-learning-algorithms","pandas","python","random-forest","regression","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mdh266.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-03T20:09:46.000Z","updated_at":"2024-11-18T14:05:26.000Z","dependencies_parsed_at":"2022-09-17T08:00:33.668Z","dependency_job_id":null,"html_url":"https://github.com/mdh266/RandomForests","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mdh266/RandomForests","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdh266%2FRandomForests","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdh266%2FRandomForests/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdh266%2FRandomForests/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdh266%2FRandomForests/manifests","owner_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/owners/mdh266","download_url":"https://codeload.github.com/mdh266/RandomForests/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdh266%2FRandomForests/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001506,"owners_count":26083118,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","data-science","decision-tree","ensemble-learning","machine-learning","machine-learning-algorithms","pandas","python","random-forest","regression","scikit-learn"],"created_at":"2024-12-04T14:13:39.167Z","updated_at":"2025-10-09T14:44:22.685Z","avatar_url":"https://github.com/mdh266.png","language":"Python","readme":"[![Build Status](https://travis-ci.com/mdh266/RandomForests.svg?branch=master)](https://travis-ci.com/mdh266/RandomForests)\n[![codecov](https://codecov.io/gh/mdh266/RandomForests/branch/master/graph/badge.svg)](https://codecov.io/gh/mdh266/RandomForests)\n[![made-with-python](https://img.shields.io/badge/Made%20with-Python-1f425f.svg)](https://www.python.org/)\n\n# Random Forests In Python\n--------------\n\n## Introduction\n-------------\nI started this project to better understand the way [decision trees](https://en.wikipedia.org/wiki/Decision_tree) and [random 
forests](https://en.wikipedia.org/wiki/Random_forest) work. At this point the classifiers are based only on the Gini index, and the regression models are based only on the mean squared error. Both the classifiers and regression models are built to work with [Pandas](http://pandas.pydata.org) and [Scikit-Learn](https://scikit-learn.org/).\n\n## Examples\n\nBasic classification example using Scikit-learn:\n\n    from randomforests import RandomForestClassifier\n    import pandas as pd\n    from sklearn.model_selection import train_test_split\n    from sklearn.model_selection import GridSearchCV\n    from sklearn.pipeline import Pipeline\n    from sklearn.metrics import accuracy_score\n    from sklearn.datasets import load_breast_cancer\n\n    dataset = load_breast_cancer()\n\n    # Use only the first four features\n    cols = [dataset.data[:,i] for i in range(4)]\n\n    X = pd.DataFrame({k:v for k,v in zip(dataset.feature_names,cols)})\n    y = pd.Series(dataset.target)\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=24)\n\n    pipe   = Pipeline([(\"forest\", RandomForestClassifier())])\n\n    # Tune the tree depth with 5-fold cross-validation\n    params = {\"forest__max_depth\": [1,2,3]}\n\n    grid   = GridSearchCV(pipe, params, cv=5, n_jobs=-1)\n    model  = grid.fit(X_train,y_train)\n\n    preds  = model.predict(X_test)\n\n    print(\"Accuracy: \", accuracy_score(y_test, preds))\n\n    \u003e\u003e Accuracy:  0.9020979020979021\n\n\nBasic regression example using Scikit-learn:\n\n    from randomforests import RandomForestRegressor\n    import pandas as pd\n    from sklearn.model_selection import train_test_split\n    from sklearn.model_selection import GridSearchCV\n    from sklearn.pipeline import Pipeline\n    from sklearn.metrics import r2_score\n    from sklearn.datasets import load_boston\n\n    dataset = load_boston()\n\n    # Use only the first four features\n    cols = [dataset.data[:,i] for i in range(4)]\n\n    X = pd.DataFrame({k:v for k,v in zip(dataset.feature_names,cols)})\n    y = pd.Series(dataset.target)\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=24)\n\n    pipe   = Pipeline([(\"forest\", RandomForestRegressor())])\n\n    params = {\"forest__max_depth\": [1,2,3]}\n\n    grid   = 
GridSearchCV(pipe, params, cv=5, n_jobs=-1)\n    model  = grid.fit(X,y)\n\n    preds  = model.predict(X_test)\n\n    print(\"R^2 : \", r2_score(y_test,preds))\n\n    \u003e\u003e R^2 : 0.37948488681649484\n\n## Installing\n-----------------\n\nThe project uses the `setup.py` generated by [PyScaffold](https://pypi.org/project/PyScaffold/). To install the library, run:\n\n    python setup.py install\n\n\n## Test\n-----------------\nThe tests run through the same `setup.py` generated by [PyScaffold](https://pypi.org/project/PyScaffold/):\n\n    python setup.py test\n\n## Dependencies\n--------------\nDependencies are minimal:\n\n- Python (\u003e= 3.6)\n- [Scikit-Learn](https://scikit-learn.org/stable/) (\u003e= 0.23)\n- [Pandas](https://pandas.pydata.org/) (\u003e= 1.0)\n\n\n## References\n---------------\n- [An Introduction To Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/)\n\n- [Elements Of Statistical Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/)\n\n- [Scikit-learn Ensemble Methods](http://scikit-learn.org/stable/auto_examples/index.html#ensemble-methods)\n\n- [Scikit-Learn Custom Estimators](https://scikit-learn.org/dev/developers/develop.html)\n\n- [How to Implement Random Forest From Scratch In Python](http://machinelearningmastery.com/implement-random-forest-scratch-python/)\n\n- [How To Implement A Decision Tree From Scratch In Python](http://machinelearningmastery.com/implement-decision-tree-algorithm-scratch-python)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmdh266%2Frandomforests","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmdh266%2Frandomforests","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmdh266%2Frandomforests/lists"}