{"id":18858957,"url":"https://github.com/kohlerhector/interpreter-py","last_synced_at":"2026-03-09T06:32:41.513Z","repository":{"id":248242313,"uuid":"823730221","full_name":"KohlerHECTOR/interpreter-py","owner":"KohlerHECTOR","description":"Implementation of Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning  (Kohler, Delfosse, et. al. 2024).","archived":false,"fork":false,"pushed_at":"2024-09-10T12:05:31.000Z","size":1690,"stargazers_count":13,"open_issues_count":4,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-14T12:13:28.085Z","etag":null,"topics":["code-generation","explainability","explainable-ai","imitation-learning","interpretability","mujoco","program-generation","programmatic","reinforcement-learning"],"latest_commit_sha":null,"homepage":"https://kohlerhector.github.io/interpreter-py/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KohlerHECTOR.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-03T15:37:36.000Z","updated_at":"2025-01-16T19:54:54.000Z","dependencies_parsed_at":"2024-11-08T04:15:32.635Z","dependency_job_id":"6337b32d-8046-4fa3-b634-33272dc12c7a","html_url":"https://github.com/KohlerHECTOR/interpreter-py","commit_stats":null,"previous_names":["kohlerhector/interpreter-py"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/KohlerHECTOR/interpreter-py","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KohlerHECTOR%2Finterpreter-py","tags_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/KohlerHECTOR%2Finterpreter-py/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KohlerHECTOR%2Finterpreter-py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KohlerHECTOR%2Finterpreter-py/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KohlerHECTOR","download_url":"https://codeload.github.com/KohlerHECTOR/interpreter-py/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KohlerHECTOR%2Finterpreter-py/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30284776,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-09T02:57:19.223Z","status":"ssl_error","status_checked_at":"2026-03-09T02:56:26.373Z","response_time":61,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["code-generation","explainability","explainable-ai","imitation-learning","interpretability","mujoco","program-generation","programmatic","reinforcement-learning"],"created_at":"2024-11-08T04:15:18.018Z","updated_at":"2026-03-09T06:32:41.475Z","avatar_url":"https://github.com/KohlerHECTOR.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning\n## Imitation Learning Context\nIn imitation 
learning, the goal is to train a policy (in this case, a decision tree) that mimics the behavior of an expert policy (typically a neural network). The expert policy provides demonstrations (state-action pairs), which the imitator uses to learn how to act in the environment.\n\n## Traditional Decision Trees\nTraditional decision trees split the data on a single feature at a time. For instance, a split might test whether feature_1 \u003e threshold. This can be limiting, because each split considers the value of only one feature in isolation.\n\n## Oblique Decision Trees\nOblique decision trees, on the other hand, split on linear combinations of multiple features. A decision rule in an oblique decision tree might look like a1 * feature_1 + a2 * feature_2 + ... + an * feature_n \u003e threshold, where a1, a2, ..., an are coefficients. This allows the tree to create more complex, non-axis-aligned decision boundaries, which can capture interactions between features.\n\n## Oblique Data Generation in Imitation Learning\nTo train an oblique decision tree, the feature space is often transformed to include additional features that represent linear combinations or interactions between the original features. 
This enriched feature space can help the tree model more complex patterns in the data, similar to those captured by a neural network.\n\n## ObliqueDTPolicy Class\nIn the provided `ObliqueDTPolicy` class, the method `get_oblique_data` generates this enriched feature space by including pairwise differences between features.\n\n\n# Usage\n```bash\npip install git+https://github.com/KohlerHECTOR/interpreter-py.git@v0.3.2\n```\n\n```python\nfrom interpreter import Interpreter\nfrom interpreter import ObliqueDTPolicy, SB3Policy, DTPolicy\n\nfrom stable_baselines3 import SAC\nfrom stable_baselines3.common.evaluation import evaluate_policy\nfrom stable_baselines3.common.monitor import Monitor\n\nimport gymnasium as gym\nfrom sklearn.tree import DecisionTreeRegressor\nfrom huggingface_sb3 import load_from_hub\n\nfrom pickle import dump, load\n\n# Download a policy from the stable-baselines3 zoo\ncheckpoint = load_from_hub(\n    repo_id=\"sb3/sac-HalfCheetah-v3\", filename=\"sac-HalfCheetah-v3.zip\"\n)\n\n# Load the oracle policy\nenv = gym.make(\"HalfCheetah-v4\")\nmodel = SAC.load(checkpoint)\noracle = SB3Policy(model.policy)\n\n# Get oracle performance\nprint(evaluate_policy(oracle, Monitor(env))[0])\n\n# Instantiate the decision tree class (here a regression tree with at most 32 leaves)\nclf = DecisionTreeRegressor(\n    max_leaf_nodes=32\n)  # Change to DecisionTreeClassifier for discrete actions.\nlearner = ObliqueDTPolicy(clf, env)\n# Replace with DTPolicy(clf, env) for interpretable axis-parallel DTs.\n\n# Start the imitation learning\ninterpret = Interpreter(oracle, learner, env)\ninterpret.fit(5e4)\n\n# Evaluate and save the best tree\nfinal_tree_reward, _ = evaluate_policy(interpret._policy, env=env, n_eval_episodes=10)\nprint(final_tree_reward)\n# Here you can replace pickle with joblib or cloudpickle\nwith open(\"tree_halfcheetah.pkl\", \"wb\") as f:\n    dump(interpret._policy.clf, f)\n\nwith open(\"tree_halfcheetah.pkl\", \"rb\") as f:\n    clf = 
load(f)\n\n```\n\n# Cite\n```bibtex\n@inproceedings{\nkohler2024interpretable,\ntitle={Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning},\nauthor={Hector Kohler and Quentin Delfosse and Riad Akrour and Kristian Kersting and Philippe Preux},\nbooktitle={Seventeenth European Workshop on Reinforcement Learning},\nyear={2024},\nurl={https://openreview.net/forum?id=yDicN3WVZ2}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkohlerhector%2Finterpreter-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkohlerhector%2Finterpreter-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkohlerhector%2Finterpreter-py/lists"}