{"id":13701215,"url":"https://github.com/wepe/tgboost","last_synced_at":"2025-10-29T07:14:19.616Z","repository":{"id":87470036,"uuid":"75713433","full_name":"wepe/tgboost","owner":"wepe","description":"Tiny Gradient Boosting Tree","archived":false,"fork":false,"pushed_at":"2019-06-13T14:18:08.000Z","size":22844,"stargazers_count":321,"open_issues_count":5,"forks_count":103,"subscribers_count":19,"default_branch":"master","last_synced_at":"2025-04-02T03:31:06.764Z","etag":null,"topics":["boosted-trees","gradient-boosting-machine","machine-learning","sliq","xgboost"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wepe.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-12-06T08:59:49.000Z","updated_at":"2025-01-14T06:15:35.000Z","dependencies_parsed_at":null,"dependency_job_id":"53c21436-743b-4cd6-8147-b183c0dae5fa","html_url":"https://github.com/wepe/tgboost","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wepe%2Ftgboost","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wepe%2Ftgboost/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wepe%2Ftgboost/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wepe%2Ftgboost/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wepe","download_url":"https://codeload.github.com/wepe/tgboost/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"git
hub","repositories_count":248008631,"owners_count":21032556,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["boosted-trees","gradient-boosting-machine","machine-learning","sliq","xgboost"],"created_at":"2024-08-02T20:01:22.533Z","updated_at":"2025-10-29T07:14:19.520Z","avatar_url":"https://github.com/wepe.png","language":"Java","funding_links":[],"categories":["Java","人工智能"],"sub_categories":["机器学习"],"readme":"## What is TGBoost\n\nTGBoost is a **T**iny implementation of **G**radient **Boost**ing trees, based on XGBoost's scoring function and SLIQ's efficient tree-building algorithm. TGBoost builds each tree level-wise, as in SLIQ, by constructing an attribute list and a class list. Currently, TGBoost supports parallel learning on a single machine, and its speed and memory consumption are comparable to XGBoost's.\n\n\nTGBoost supports most of the features found in other libraries:\n\n- **Built-in loss**: squared error loss for regression tasks, logistic loss for classification tasks\n\n- **Early stopping**: evaluate on a validation set and stop early\n\n- **Feature importance**: output feature importance after training\n\n- **Regularization**: lambda, gamma\n\n- **Randomness**: subsample, colsample\n\n- **Weighted loss function**: assign a weight to each sample\n\n\nTwo other features are novel:\n\n- **Handling missing values**: XGBoost learns a default direction (left or right) for samples with a missing value. TGBoost takes a different approach: it tries sending missing values to the left child, to the right child, and to a separate missing-value child, then chooses the best option. 
So TGBoost uses a ternary tree.\n\n- **Handling categorical features**: TGBoost orders the categories of a categorical feature by their statistic (Gradient_sum / Hessian_sum) at each tree node, then finds splits as it would for a numeric feature.\n\n\n## Installation\n\nThe current version is implemented in pure Java, so to use TGBoost you should first install the [JDK](http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html). For Python users, a Python binding is also provided:\n\n```\ngit clone git@github.com:wepe/tgboost.git\ncd tgboost/python-package\nsudo python setup.py install\n```\n\n## To Understand TGBoost\n\nFor those who want to understand how TGBoost works and dive into gradient boosting machines, please refer to the Python implementation of TGBoost: [tgboost-python](https://github.com/wepe/tgboost/tree/tgboost-python). The Python source code is relatively easy to follow.\n\n\n## Example\n\nHere is an example. Download the data [here](https://pan.baidu.com/s/1dGDr7pR).\n\n```python\n\nimport tgboost as tgb\n\n# training phase\nftrain = \"data/train.csv\"\nfval = \"data/val.csv\"\nparams = {'categorical_features': [\"PRI_jet_num\"],\n          'early_stopping_rounds': 10,\n          'maximize': True,\n          'eval_metric': 'auc',\n          'loss': 'logloss',\n          'eta': 0.3,\n          'num_boost_round': 20,\n          'max_depth': 7,\n          'scale_pos_weight': 1.,\n          'subsample': 0.8,\n          'colsample': 0.8,\n          'min_child_weight': 1.,\n          'min_sample_split': 5,\n          'reg_lambda': 1.,\n          'gamma': 0.,\n          'num_thread': -1\n          }\n\nmodel = tgb.train(ftrain, fval, params)\n\n# testing phase\nftest = \"data/test.csv\"\nfoutput = \"data/test_preds.csv\"\nmodel.predict(ftest, foutput)\n\n# save the model\nmodel.save('./tgb.model')\n\n# load model and predict\nmodel = tgb.load_model('./tgb.model')\nmodel.predict(ftest, foutput)\n\n```\n\n\n## Reference\n\n- [XGBoost: A Scalable Tree Boosting 
System](https://arxiv.org/abs/1603.02754)\n- [SLIQ: A Fast Scalable Classifier for Data Mining](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.7734\u0026rep=rep1\u0026type=pdf)\n- [GBDT算法原理与系统设计简介 (A Brief Introduction to GBDT Algorithm Principles and System Design, in Chinese)](http://wepon.me/files/gbdt.pdf)\n- [efficient-decision-tree-notes (Chinese)](https://github.com/wepe/efficient-decision-tree-notes)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwepe%2Ftgboost","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwepe%2Ftgboost","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwepe%2Ftgboost/lists"}