{"id":41947897,"url":"https://github.com/s-marton/GRANDE","last_synced_at":"2026-02-04T17:01:30.711Z","repository":{"id":196835205,"uuid":"697243463","full_name":"s-marton/GRANDE","owner":"s-marton","description":"(ICLR 2024) GRANDE: Gradient-Based Decision Tree Ensembles","archived":false,"fork":false,"pushed_at":"2025-12-16T17:32:55.000Z","size":10375,"stargazers_count":98,"open_issues_count":0,"forks_count":11,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-12-20T08:29:17.200Z","etag":null,"topics":["decision-tree","decision-trees","gradient-descent","tabular-data","tree-ensemble","tree-ensembles"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2309.17130","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/s-marton.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-09-27T10:40:41.000Z","updated_at":"2025-12-17T07:17:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/s-marton/GRANDE","commit_stats":null,"previous_names":["s-marton/grande"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/s-marton/GRANDE","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-marton%2FGRANDE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-marton%2FGRANDE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-marton%2FGRANDE/releases","mani
fests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-marton%2FGRANDE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/s-marton","download_url":"https://codeload.github.com/s-marton/GRANDE/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-marton%2FGRANDE/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29091317,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-04T03:31:03.593Z","status":"ssl_error","status_checked_at":"2026-02-04T03:29:50.742Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["decision-tree","decision-trees","gradient-descent","tabular-data","tree-ensemble","tree-ensembles"],"created_at":"2026-01-25T20:00:26.261Z","updated_at":"2026-02-04T17:01:30.706Z","avatar_url":"https://github.com/s-marton.png","language":"Jupyter Notebook","funding_links":[],"categories":["Implementations"],"sub_categories":["Other Frameworks"],"readme":"# 🌳 GRANDE: Gradient-Based Decision Tree Ensembles 🌳\n\n[![PyPI version](https://img.shields.io/pypi/v/GRANDE)](https://pypi.org/project/GRANDE/) [![OpenReview](https://img.shields.io/badge/OpenReview-XEFWBxi075-blue)](https://openreview.net/forum?id=XEFWBxi075) 
[![arXiv](https://img.shields.io/badge/arXiv-2309.17130-b31b1b.svg)](https://arxiv.org/abs/2309.17130)\n\n\n\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"figures/tabarena_leaderboard.jpg\" alt=\"TabArena Leaderboard\" width=\"60%\"/\u003e\n\n\u003cp\u003e\u003cstrong\u003eTabArena.\u003c/strong\u003e The updated PyTorch GRANDE has been evaluated on TabArena and achieved strong results.\u003c/p\u003e\n\n\u003c/div\u003e\n\n🔍 What's new?\n- PyTorch-native implementation for seamless integration; TensorFlow is maintained as a legacy version.\n- Strong results on TabArena for binary classification and regression. Multi-class results are weaker and drag down the overall performance; we hope to address this in a future release.\n- Method updates for improved performance, including optional categorical and numerical embeddings.\n- Training improvements (optimizers, schedulers, early stopping, optional SWA).\n- Enhanced preprocessing pipeline with optional frequency encoding and robust normalization.\n\n\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"figures/grande.jpg\" alt=\"GRANDE Overview\" width=\"50%\"/\u003e\n\n\u003cp\u003e\u003cstrong\u003eFigure 1: Overview of GRANDE.\u003c/strong\u003e GRANDE learns hard, axis-aligned trees end-to-end via gradient descent and uses dynamic instance-wise leaf weighting to combine estimators into a strong ensemble.\u003c/p\u003e\n\u003c/div\u003e\n\n\n🌳 GRANDE is a gradient-based decision tree ensemble for tabular data.\nGRANDE trains ensembles of hard, axis-aligned decision trees end-to-end with gradient descent. Each estimator contributes via instance-wise leaf weights that are learned jointly with split locations and leaf values. This combines the strong inductive bias of trees with the flexibility of neural optimization. 
The PyTorch version optionally augments inputs with learnable categorical and numerical embeddings, improving representation capacity while preserving interpretability of splits.\n\n📝 More details in the paper: https://openreview.net/forum?id=XEFWBxi075\n\n\n## Cite us\n```text\n@inproceedings{\nmarton2024grande,\ntitle={{GRANDE}: Gradient-Based Decision Tree Ensembles},\nauthor={Sascha Marton and Stefan L{\\\"u}dtke and Christian Bartelt and Heiner Stuckenschmidt},\nbooktitle={The Twelfth International Conference on Learning Representations},\nyear={2024},\nurl={https://openreview.net/forum?id=XEFWBxi075}\n}\n```\n\n## Installation\nTo install the latest version directly from GitHub:\n```bash\npip install git+https://github.com/s-marton/GRANDE.git\n```\n\n## Dependencies\nInstall the core runtime requirements (and the optional notebook/example dependencies) via:\n\n```bash\npip install -r requirements.txt\n```\n\nNotes:\n- `requirements.txt` contains a **core** section (library runtime dependencies) and a **notebook/example-only** section (OpenML/XGBoost/CatBoost).\n\n## Usage (PyTorch)\nThe example below follows the attached notebook (binary classification, OpenML dataset 46915). 
GPU is recommended.\n\n```python\n# Enable GPU (optional)\nimport os\nos.environ['CUDA_VISIBLE_DEVICES'] = '0'\n\n# Load data\nfrom sklearn.model_selection import train_test_split\nimport openml\nimport numpy as np\nimport sklearn\n\ndataset = openml.datasets.get_dataset(46915, download_data=True, download_qualities=True, download_features_meta_data=True)\nX, y, categorical_indicator, attribute_names = dataset.get_data(target=dataset.default_target_attribute)\n\nX_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\nX_train, X_valid, y_train, y_valid = train_test_split(X_temp, y_temp, test_size=0.2, random_state=42)\n\n# GRANDE (PyTorch)\nfrom GRANDE import GRANDE\n\nparams = {\n    'depth': 5,\n    'n_estimators': 1024,\n\n    'learning_rate_weights': 0.001,\n    'learning_rate_index': 0.01,\n    'learning_rate_values': 0.05,\n    'learning_rate_leaf': 0.05,\n    'learning_rate_embedding': 0.02,  # used if embeddings are enabled\n\n    # Embeddings (set True to enable)\n    'use_category_embeddings': False,  # True to enable\n    'embedding_dim_cat': 8,\n    'use_numeric_embeddings': False,   # True to enable\n    'embedding_dim_num': 8,\n    'embedding_threshold': 1,          # low-cardinality split for categorical embeddings\n    'loo_cardinality': 10,             # high-cardinality split for encoders\n\n    'dropout': 0.2,\n    'selected_variables': 0.8,\n    'data_subset_fraction': 1.0,\n    'bootstrap': False,\n    'missing_values': False,\n\n    'optimizer': 'adam',               # options: nadam, radam, adamw, adam\n    'cosine_decay_restarts': False,\n    'reduce_on_plateau_scheduler': True,\n    'label_smoothing': 0.0,\n    'use_class_weights': False,\n    'focal_loss': False,\n    'swa': False,\n    'es_metric': True,  # AUC for binary, MSE for regression, val_loss for multiclass\n\n    'epochs': 250,\n    'batch_size': 256,\n    'early_stopping_epochs': 50,\n\n    'use_freq_enc': False,\n    
'use_robust_scale_smoothing': False,\n\n    # Important: use problem_type, not objective\n    'problem_type': 'binary',  # {'binary', 'multiclass', 'regression'}\n\n    'random_seed': 42,\n    'verbose': 2,\n}\n\nmodel_grande = GRANDE(params=params)\nmodel_grande.fit(X=X_train, y=y_train, X_val=X_valid, y_val=y_valid)\n\n# Predict class probabilities (column 1 is the positive class)\npreds_grande = model_grande.predict_proba(X_test)\n\n# Evaluate (binary); rounding thresholds the probabilities at 0.5\naccuracy = sklearn.metrics.accuracy_score(y_test, np.round(preds_grande[:, 1]))\nf1 = sklearn.metrics.f1_score(y_test, np.round(preds_grande[:, 1]), average='macro')\nroc_auc = sklearn.metrics.roc_auc_score(y_test, preds_grande[:, 1], average='macro')\n\nprint('Accuracy GRANDE:', accuracy)\nprint('F1 Score GRANDE:', f1)\nprint('ROC AUC GRANDE:', roc_auc)\n```\n\nNotes:\n- Set `use_category_embeddings` and/or `use_numeric_embeddings` to `True` to enable embeddings.\n- For multiclass, use `problem_type='multiclass'`. For regression, use `problem_type='regression'`.\n- TensorFlow is supported as a legacy version; the PyTorch path is the recommended default.\n\n## More\nThis is an experimental implementation. If you run into problems or unexpected behavior, please open an issue.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs-marton%2FGRANDE","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fs-marton%2FGRANDE","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs-marton%2FGRANDE/lists"}