{"id":17998795,"url":"https://github.com/kingfengji/mgbdt","last_synced_at":"2025-03-26T06:31:19.972Z","repository":{"id":83527728,"uuid":"154836868","full_name":"kingfengji/mGBDT","owner":"kingfengji","description":"This is the official clone for the implementation of the NIPS18 paper  Multi-Layered Gradient Boosting Decision Trees (mGBDT) .","archived":false,"fork":false,"pushed_at":"2018-11-19T07:28:10.000Z","size":1074,"stargazers_count":103,"open_issues_count":4,"forks_count":26,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-21T08:48:31.687Z","etag":null,"topics":["gbdt","gradient-boosting-decision-trees","mgbdt","representation-learning","target-propagation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kingfengji.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-10-26T13:12:25.000Z","updated_at":"2025-02-26T00:46:00.000Z","dependencies_parsed_at":null,"dependency_job_id":"e1569aa7-8911-4f78-bdfb-3e8e1e4e88dc","html_url":"https://github.com/kingfengji/mGBDT","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingfengji%2FmGBDT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingfengji%2FmGBDT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingfengji%2FmGBDT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingfengji%2FmGBDT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/
kingfengji","download_url":"https://codeload.github.com/kingfengji/mGBDT/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245603657,"owners_count":20642862,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gbdt","gradient-boosting-decision-trees","mgbdt","representation-learning","target-propagation"],"created_at":"2024-10-29T22:07:34.334Z","updated_at":"2025-03-26T06:31:19.179Z","avatar_url":"https://github.com/kingfengji.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Multi-Layered Gradient Boosting Decision Trees\n\nThis is the official clone for the implementation of mGBDT.\n\nPackage Official Website: http://lamda.nju.edu.cn/code_mGBDT.ashx\n\nThis package is provided \"AS IS\" and free for academic usage. You can run it at your own risk. For other purposes, please contact Prof. Zhi-Hua Zhou (zhouzh@lamda.nju.edu.cn).\n\nDescription: A Python implementation of mGBDT, proposed in [1].\nIt contains a demo implementation of the mGBDT library as well as some demo client scripts that demonstrate how to use the code.\nThe implementation is flexible enough to modify the model or to fit your own datasets.\n\n**Reference: [1] J. Feng, Y. Yu, and Z.-H. Zhou. [Multi-Layered Gradient Boosting Decision Trees](http://lamda.nju.edu.cn/fengj/paper/mGBDT.pdf). 
In: Advances in Neural Information Processing Systems 31 (NIPS'18), Montreal, Canada, 2018.**\n\nATTN: This package was developed and is maintained by Mr. Ji Feng (http://lamda.nju.edu.cn/fengj/). For any problem concerning the code, please feel free to contact Mr. Feng (fengj@lamda.nju.edu.cn) or open an issue here.\n\n# Environments\n- The code was developed under Python 3.5, so first create a Python 3.5 environment using Anaconda\n```\nconda create -n mgbdt python=3.5 anaconda\n```\n- Install the dependent packages\n```\nsource activate mgbdt\nconda install pytorch=0.1.12 cuda80 -c pytorch\npip install -r requirements.txt\n```\n\n# Demo Code\n\n```\nfrom sklearn import datasets\nfrom sklearn.model_selection import train_test_split\n\n# To use the mgbdt library, add the library directory to your Python path.\n# From this repository's root directory, you can do so with the following lines\nimport sys\nsys.path.insert(0, \"lib\")\n\nfrom mgbdt import MGBDT, MultiXGBModel\n\n# make a synthetic circle dataset using sklearn\nn_samples = 15000\nx_all, y_all = datasets.make_circles(n_samples=n_samples, factor=.5, noise=.04, random_state=0)\nx_train, x_test, y_train, y_test = train_test_split(x_all, y_all, test_size=0.3, random_state=0, stratify=y_all)\n\n# create a multi-layered GBDT\nnet = MGBDT(loss=\"CrossEntropyLoss\", target_lr=1.0, epsilon=0.1)\n\n# add several target-propagation layers\n# F and G are the forward and inverse mappings (in the paper, both are gradient boosting decision trees)\nnet.add_layer(\"tp_layer\",\n    F=MultiXGBModel(input_size=2, output_size=5, learning_rate=0.1, max_depth=5, num_boost_round=5),\n    G=None)\nnet.add_layer(\"tp_layer\",\n    F=MultiXGBModel(input_size=5, output_size=3, learning_rate=0.1, max_depth=5, num_boost_round=5),\n    G=MultiXGBModel(input_size=3, output_size=5, learning_rate=0.1, max_depth=5, num_boost_round=5))\nnet.add_layer(\"tp_layer\",\n    F=MultiXGBModel(input_size=3, 
output_size=2, learning_rate=0.1, max_depth=5, num_boost_round=5),\n    G=MultiXGBModel(input_size=2, output_size=3, learning_rate=0.1, max_depth=5, num_boost_round=5))\n\n# init the forward mapping\nnet.init(x_train, n_rounds=5)\n\n# fit the dataset\nnet.fit(x_train, y_train, n_epochs=50, eval_sets=[(x_test, y_test)], eval_metric=\"accuracy\")\n\n# prediction\ny_pred = net.forward(x_test)\n\n# get the hidden outputs\n# hiddens[0] is the input data\n# hiddens[1] is the output of the first layer\n# hiddens[2] is the output of the second layer\n# hiddens[3] is the output of the final layer (same as y_pred)\nhiddens = net.get_hiddens(x_test)\n```\n\n# Experiments\n\n## circle dataset\nRunning the following script will:\n- train a multi-layered GBDT with structure (input - 5 - 3 - output) on the synthetic circle dataset\n- save a visualization of the input (which is 2D) to outputs/circle/input.jpg (as shown below)\n- save a visualization of the second layer's output (which is 3D) to outputs/circle/pred2.jpg (as shown below)\n```\npython exp/circle.py\n```\n\nInput                          |  Transformed\n:-----------------------------:|:------------------------------:\n![](figures/circle/input.jpg) |  ![](figures/circle/pred2.jpg)\n\n## scurve dataset\nRunning the following script will:\n- train an autoencoder using a multi-layered GBDT with structure (input - 5 - output) on the synthetic scurve dataset\n- save a visualization of the input (which is 3D) to outputs/scurve/input.jpg (as shown below)\n- save a visualization of the reconstructed result (which is 3D) to outputs/scurve/pred2.jpg (as shown below)\n```\npython exp/scurve.py\n```\n\nInput                             |  Reconstructed\n:--------------------------------:|:----------------------------------:\n![](figures/scurve/input.jpg)    |  ![](figures/scurve/pred2.jpg)\n\n- The visualization of the encoding will also be saved. Since 
the 5D encodings are impossible to visualize directly, here we visualize every pair of the 5 dimensions in 2D space.\n- The visualization of the $i-th and $j-th dimensions will be saved in outputs/scurve/pred1.$i_$j.jpg (as shown below)\n\nDimension 1 and 2                 |  Dimension 1 and 5\n:--------------------------------:|:----------------------------------:\n![](figures/scurve/pred1.1_2.jpg) |  ![](figures/scurve/pred1.1_5.jpg)\n\n\n## [UCI Adult](https://archive.ics.uci.edu/ml/datasets/adult)\nFirst, download the dataset by running the following commands:\n```Shell\ncd dataset/uci_adult\nsh get_data.sh\n```\nThen, running the following script will:\n- train a multi-layered GBDT with structure (input - 128 - 128 - output)\n- log the accuracy for each epoch\n```\npython exp/uci_adult.py\n```\n\n## [UCI Yeast](https://archive.ics.uci.edu/ml/datasets/Yeast)\nRunning the following script will:\n- train a multi-layered GBDT with structure (input - 16 - 16 - output)\n- use 10-fold cross-validation\n- log the accuracy for each epoch and each fold\n```\npython exp/uci_yeast.py\n```\nHappy hacking.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingfengji%2Fmgbdt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkingfengji%2Fmgbdt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingfengji%2Fmgbdt/lists"}