{"id":16778639,"url":"https://github.com/ritvik19/pyradox-tabular","last_synced_at":"2025-04-10T20:51:31.057Z","repository":{"id":104587599,"uuid":"453356323","full_name":"Ritvik19/pyradox-tabular","owner":"Ritvik19","description":"State of the Art Neural Networks for Tabular Deep Learning","archived":false,"fork":false,"pushed_at":"2022-02-21T16:15:56.000Z","size":80,"stargazers_count":9,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-24T18:21:20.266Z","etag":null,"topics":["deep-learning","deep-neural-networks","tabular-data"],"latest_commit_sha":null,"homepage":"https://ritvik19.github.io/pyradox-tabular/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ritvik19.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-01-29T09:44:13.000Z","updated_at":"2024-12-26T10:11:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"b6689c11-62fc-4b66-8657-d0f18c257109","html_url":"https://github.com/Ritvik19/pyradox-tabular","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2Fpyradox-tabular","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2Fpyradox-tabular/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2Fpyradox-tabular/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2Fpyradox-tabular/manifests","owner_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Ritvik19","download_url":"https://codeload.github.com/Ritvik19/pyradox-tabular/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248296679,"owners_count":21080302,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","deep-neural-networks","tabular-data"],"created_at":"2024-10-13T07:28:18.024Z","updated_at":"2025-04-10T20:51:31.047Z","avatar_url":"https://github.com/Ritvik19.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [pyradox-tabular](https://github.com/Ritvik19/pyradox-tabular)\n\nState of the Art Neural Networks for Tabular Deep Learning\n\n[![Downloads](https://pepy.tech/badge/pyradox-tabular)](https://pepy.tech/project/pyradox-tabular)\n[![Downloads](https://pepy.tech/badge/pyradox-tabular/month)](https://pepy.tech/project/pyradox-tabular)\n[![Downloads](https://pepy.tech/badge/pyradox-tabular/week)](https://pepy.tech/project/pyradox-tabular)\n\n---\n\n## Table of Contents\n\n- [pyradox-tabular](#pyradox-tabular)\n  - [Table of Contents](#table-of-contents)\n  - [Installation](#installation)\n  - [Usage](#usage)\n    - [Data Preparation](#data-preparation)\n    - [Deep Tabular Network](#deep-tabular-network)\n    - [Wide and Deep Tabular Network](#wide-and-deep-tabular-network)\n    - [Deep and Cross Tabular Network](#deep-and-cross-tabular-network)\n    - [TabTansformer](#tabtansformer)\n    - [TabNet](#tabnet)\n    - [Deep Neural Decision Tree](#deep-neural-decision-tree)\n    - [Deep Neural 
Decision Forest](#deep-neural-decision-forest)\n    - [Neural Oblivious Decision Tree](#neural-oblivious-decision-tree)\n    - [Neural Oblivious Decision Ensemble](#neural-oblivious-decision-ensemble)\n    - [Feature Tokenizer Transformer](#feature-tokenizer-transformer)\n    - [Tabular ResNet](#tabular-resnet)\n  - [References](#references)\n\n---\n\n## Installation\n\n```bash\npip install pyradox-tabular\n```\n\n---\n\n## Usage\n\n### Data Preparation\n\npyradox-tabular comes with its own `DataLoader` class, which can be used to load data from a pandas `DataFrame`.\nWe provide a utility `DataConfig` class, which stores the data configuration that the model requires for feature preprocessing.\nWe also provide separate `ModelConfig` classes for the different models, which store the model hyperparameters.\n\n```python\nfrom pyradox_tabular.data import DataLoader\nfrom pyradox_tabular.data_config import DataConfig\n\ndata_config = DataConfig(\n    numeric_feature_names=[\"numerical\", \"column\", \"names\"],\n    categorical_features_with_vocabulary={\n        \"column\": [\"label\", \"encoded\", \"unique\", \"values\", \"as\", \"strings\"],\n    },\n)\n\ndata_train = DataLoader.from_df(x_train, y_train, batch_size=1024)\ndata_valid = DataLoader.from_df(x_valid, y_valid, batch_size=1024)\ndata_test = DataLoader.from_df(x_test, batch_size=1024)\n```\n\nThis library provides the implementations of the following tabular deep learning models:\n\n### Deep Tabular Network\n\nIn principle, a neural network can approximate any continuous or piecewise continuous function. 
However, it is not suited to approximating arbitrary non-continuous functions, as it assumes a certain level of continuity in its general form.\n\nUnlike unstructured data found in nature, structured data with categorical features may not have continuity at all, and even if it does, it may not be obvious.\n\nDeep Tabular Networks use the entity embedding method to automatically learn representations of categorical features in multi-dimensional spaces, which reveals the intrinsic continuity of the data and helps neural networks solve the problem.\n\n```python\nfrom pyradox_tabular.model_config import DeepNetworkConfig\nfrom pyradox_tabular.nn import DeepTabularNetwork\n\nmodel_config = DeepNetworkConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64])\nmodel = DeepTabularNetwork.from_config(data_config, model_config, name=\"deep_network\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Wide and Deep Tabular Network\n\nThe human brain is a sophisticated learning machine, forming rules by memorizing everyday events and generalizing those learnings to apply to things we haven't seen before. 
Perhaps more powerfully, memorization also allows us to further refine our generalized rules with exceptions.\n\nBy jointly training a wide linear model (for memorization) alongside a deep neural network (for generalization), Wide and Deep Tabular Networks combine the strengths of both to bring us one step closer to teaching computers to learn like humans do.\n\n```python\nfrom pyradox_tabular.model_config import WideAndDeepNetworkConfig\nfrom pyradox_tabular.nn import WideAndDeepTabularNetwork\n\nmodel_config = WideAndDeepNetworkConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64])\nmodel = WideAndDeepTabularNetwork.from_config(data_config, model_config, name=\"wide_deep_network\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Deep and Cross Tabular Network\n\nFeature engineering has been the key to the success of many prediction models. However, the process is nontrivial and often requires manual feature engineering or exhaustive searching. 
DNNs are able to automatically learn feature interactions; however, they generate all the interactions implicitly, and are not necessarily efficient in learning all types of cross features.\n\nDeep and Cross Tabular Network explicitly applies feature crossing at each layer, requires no manual feature engineering, and adds negligible extra complexity to the DNN model.\n\n```python\nfrom pyradox_tabular.model_config import DeepAndCrossNetworkConfig\nfrom pyradox_tabular.nn import DeepAndCrossTabularNetwork\n\nmodel_config = DeepAndCrossNetworkConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64], n_cross=2)\nmodel = DeepAndCrossTabularNetwork.from_config(data_config, model_config, name=\"deep_cross_network\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### TabTansformer\n\nTabTransformer is built upon self-attention based on Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy.\n\nThe contextual embeddings learned from TabTransformer are highly robust against both missing and noisy data features, and provide better interpretability.\n\n```python\nfrom pyradox_tabular.model_config import TabTransformerConfig\nfrom pyradox_tabular.nn import TabTransformer\n\nmodel_config = TabTransformerConfig(num_outputs=1, out_activation='sigmoid', num_transformer_blocks=3, num_heads=4, mlp_hidden_units_factors=[2, 1])\nmodel = TabTransformer.from_config(data_config, model_config, name=\"tab_transformer\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### TabNet\n\nTabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and better learning as the learning capacity is used for 
the most salient features.\n\nIt employs a single deep learning architecture for feature selection and reasoning.\n\n```python\nfrom pyradox_tabular.model_config import TabNetConfig\nfrom pyradox_tabular.nn import TabNet\n\nmodel_config = TabNetConfig(num_outputs=1, out_activation='sigmoid', feature_dim=16, output_dim=12, num_decision_steps=5)\nmodel = TabNet.from_config(data_config, model_config, name=\"tabnet\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Deep Neural Decision Tree\n\nDeep Neural Decision Trees unify classification trees with the representation learning functionality known from deep convolutional networks. They are essentially stochastic, differentiable decision tree models.\n\n```python\nfrom pyradox_tabular.model_config import NeuralDecisionTreeConfig\nfrom pyradox_tabular.nn import NeuralDecisionTree\n\nmodel_config = NeuralDecisionTreeConfig(depth=2, used_features_rate=1, num_classes=2)\nmodel = NeuralDecisionTree.from_config(data_config, model_config, name=\"deep_neural_decision_tree\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Deep Neural Decision Forest\n\nA Deep Neural Decision Forest is a bagging ensemble of Deep Neural Decision Trees.\n\n```python\nfrom pyradox_tabular.model_config import NeuralDecisionForestConfig\nfrom pyradox_tabular.nn import NeuralDecisionForest\n\nmodel_config = NeuralDecisionForestConfig(num_trees=10, depth=2, used_features_rate=0.8, num_classes=2)\nmodel = NeuralDecisionForest.from_config(data_config, model_config, name=\"deep_neural_decision_forest\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Neural Oblivious Decision Tree\n\n```python\nfrom 
pyradox_tabular.model_config import NeuralObliviousDecisionTreeConfig\nfrom pyradox_tabular.nn import NeuralObliviousDecisionTree\n\nmodel_config = NeuralObliviousDecisionTreeConfig()\nmodel = NeuralObliviousDecisionTree.from_config(data_config, model_config, name=\"neural_oblivious_decision_tree\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Neural Oblivious Decision Ensemble\n\nNODE architecture generalizes ensembles of oblivious decision trees, but benefits from both end-to-end gradient-based optimization and the power of multi-layer hierarchical representation learning.\n\n```python\nfrom pyradox_tabular.model_config import NeuralObliviousDecisionEnsembleConfig\nfrom pyradox_tabular.nn import NeuralObliviousDecisionEnsemble\n\nmodel_config = NeuralObliviousDecisionEnsembleConfig()\nmodel = NeuralObliviousDecisionEnsemble.from_config(data_config, model_config, name=\"neural_oblivious_decision_ensemble\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Feature Tokenizer Transformer\n\nIt is a simple adaptation of the Transformer architecture for the tabular domain. 
In a nutshell, Feature Tokenizer Transformer transforms all features (categorical and numerical) to embeddings and applies a stack of Transformer layers to the embeddings.\n\nThus, every Transformer layer operates on the feature level of one object.\n\n```python\nfrom pyradox_tabular.model_config import FeatureTokenizerTransformerConfig\nfrom pyradox_tabular.nn import FeatureTokenizerTransformer\n\nmodel_config = FeatureTokenizerTransformerConfig(num_outputs=1, out_activation='sigmoid', num_transformer_blocks=2, num_heads=8, embedding_dim=32, dense_dim=16)\nmodel = FeatureTokenizerTransformer.from_config(data_config, model_config, name=\"feature_tokenizer_transformer\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n### Tabular ResNet\n\nTabular ResNet is a ResNet-like architecture with skip connections, but it is built from linear layers instead of convolutional layers.\n\n```python\nfrom pyradox_tabular.model_config import TabularResNetConfig\nfrom pyradox_tabular.nn import TabularResNet\n\nmodel_config = TabularResNetConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64])\nmodel = TabularResNet.from_config(data_config, model_config, name=\"deep_network\")\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\nmodel.fit(data_train, validation_data=data_valid)\npreds = model.predict(data_test)\n```\n\n---\n\n## References\n\n- [Entity Embeddings of Categorical Variables (2016, April)](https://arxiv.org/abs/1604.06737)\n- [Wide \u0026 Deep Learning: Better Together with TensorFlow (2016, June)](https://ai.googleblog.com/2016/06/wide-deep-learning-better-together-with.html)\n- [Deep \u0026 Cross Network for Ad Click Predictions (2017, August)](https://arxiv.org/pdf/1708.05123.pdf)\n- [TabTransformer: Tabular Data Modeling Using Contextual Embeddings (2020, December)](https://arxiv.org/pdf/2012.06678.pdf)\n- [TabNet: Attentive 
Interpretable Tabular Learning (2020, December)](https://arxiv.org/pdf/1908.07442.pdf)\n- [Deep Neural Decision Forests (2015, December)](https://ieeexplore.ieee.org/document/7410529)\n- [Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data (2019, September)](https://arxiv.org/pdf/1909.06312.pdf)\n- [Revisiting Deep Learning Models for Tabular Data (2021, June)](https://arxiv.org/abs/2106.11959)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fritvik19%2Fpyradox-tabular","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fritvik19%2Fpyradox-tabular","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fritvik19%2Fpyradox-tabular/lists"}