https://github.com/jhylin/ml2-2_random_forest
Machine learning series 2.2 on random forest
cheminformatics machine-learning random-forest-classifier random-forest-regression
Last synced: over 1 year ago
- Host: GitHub
- URL: https://github.com/jhylin/ml2-2_random_forest
- Owner: jhylin
- License: mit
- Created: 2023-09-29T06:44:58.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-05-03T01:35:51.000Z (about 2 years ago)
- Last Synced: 2025-01-29T02:48:28.221Z (over 1 year ago)
- Topics: cheminformatics, machine-learning, random-forest-classifier, random-forest-regression
- Language: Jupyter Notebook
- Homepage:
- Size: 2.82 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
#### **Machine learning series 2.2 - Random forest**
This repository holds most of the files used in this random forest (RF) project. RF is another commonly used machine learning (ML) algorithm in drug discovery. I've attempted a deeper-dive style in the two posts listed below, but there was a lot more that could have been covered, so these posts are not comprehensive. They are snapshots of how I approached the topic before their publication dates.
There are currently two posts on RF:
1. [Random forest](https://jhylin.github.io/Data_in_life_blog/posts/17_ML2-2_Random_forest/1_random_forest.html), which covers an RF regressor even though the post is titled "Random forest" only. Its Jupyter notebook version is available above (1_random_forest.ipynb). This work was run in a Python 3.10 virtual environment.
2. [Random forest classifier](https://jhylin.github.io/Data_in_life_blog/posts/17_ML2-2_Random_forest/2_random_forest_classifier.html). Its Jupyter notebook version is available above (2_random_forest_classifier.ipynb). This work was also run in a Python 3.10 virtual environment.
Both posts can be reached from my [blog](https://jhylin.github.io/Data_in_life_blog/) as well.
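For readers who want a quick feel for the difference between the two posts, here is a minimal sketch of an RF regressor next to an RF classifier. It assumes scikit-learn and uses synthetic data from `make_regression` and `make_classification` as stand-ins; the helper `fit_and_score`, the hyperparameters, and the data are illustrative assumptions only, not the code or datasets used in the posts.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the posts use ChEMBL-derived data instead.
X_reg, y_reg = make_regression(n_samples=200, n_features=8, noise=0.1, random_state=0)
X_clf, y_clf = make_classification(n_samples=200, n_features=8, random_state=0)


def fit_and_score(model, X, y):
    """Hypothetical helper: split the data, fit the model, return it with its test score."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    model.fit(X_train, y_train)
    # score() is R^2 for regressors and mean accuracy for classifiers.
    return model, model.score(X_test, y_test)


# Regressor: predicts a continuous target by averaging the trees' outputs.
regressor, r2 = fit_and_score(
    RandomForestRegressor(n_estimators=100, random_state=0), X_reg, y_reg
)

# Classifier: predicts a class label by aggregating the trees' predictions.
classifier, accuracy = fit_and_score(
    RandomForestClassifier(n_estimators=100, random_state=0), X_clf, y_clf
)

print(f"regressor R^2 on held-out data: {r2:.2f}")
print(f"classifier accuracy on held-out data: {accuracy:.2f}")
```

Both estimators share the same `fit`/`score` interface, so the main practical difference in a sketch like this is the type of target and the meaning of the score.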
##### **Datasets**
The first post used the same dataset derived in ML series 2.1 (decision tree). For details about how I derived this dataset, please visit series 2.1 posts [1](https://jhylin.github.io/Data_in_life_blog/posts/16_ML2-1_Decision_tree/1_data_col_prep.html) and [2](https://jhylin.github.io/Data_in_life_blog/posts/16_ML2-1_Decision_tree/2_data_prep_tran.html).
The second post used [chembl_downloader](https://github.com/cthoyt/chembl-downloader) to obtain its data instead. For details, please visit the post.