{"id":29861022,"url":"https://github.com/rebelosa/random-subgroups","last_synced_at":"2025-07-30T04:10:15.397Z","repository":{"id":57459908,"uuid":"288425267","full_name":"rebelosa/random-subgroups","owner":"rebelosa","description":"A machine learning python package for learning ensembles of subgroups for predictive tasks.","archived":false,"fork":false,"pushed_at":"2021-02-14T13:02:05.000Z","size":104,"stargazers_count":7,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-27T13:01:00.853Z","etag":null,"topics":["interpretability","interpretable-machine-learning","pysubgroup","python","python-package","python3","random-forest","scikit-learn","subgroup-discovery","subgroups"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rebelosa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-08-18T10:30:40.000Z","updated_at":"2024-11-06T14:44:46.000Z","dependencies_parsed_at":"2022-09-10T05:50:20.268Z","dependency_job_id":null,"html_url":"https://github.com/rebelosa/random-subgroups","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/rebelosa/random-subgroups","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rebelosa%2Frandom-subgroups","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rebelosa%2Frandom-subgroups/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rebelosa%2Frandom-subgroups/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/rebelosa%2Frandom-subgroups/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rebelosa","download_url":"https://codeload.github.com/rebelosa/random-subgroups/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rebelosa%2Frandom-subgroups/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267808215,"owners_count":24147388,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-30T02:00:09.044Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["interpretability","interpretable-machine-learning","pysubgroup","python","python-package","python3","random-forest","scikit-learn","subgroup-discovery","subgroups"],"created_at":"2025-07-30T04:10:11.798Z","updated_at":"2025-07-30T04:10:15.353Z","avatar_url":"https://github.com/rebelosa.png","language":"Python","readme":"# Random Subgroups python package\n\n## Making predictions with subgroups\n\n**random-subgroups** is a machine learning package compatible with [scikit-learn](https://scikit-learn.org).\n\nIt uses ensembles of weak estimators, as in random forests, for classification and\nregression tasks. 
The main difference from the random forests algorithm is that\nit uses **subgroups** as estimators.\n\nThe subgroup discovery implementation of this package is built on top of the\n[pysubgroup](https://github.com/flemmerich/pysubgroup/) package. It uses many of the features\nfrom **pysubgroup** and extends them with different quality\nmeasures (more suitable for prediction) and different search strategies.\n\n### Installation:\n```commandline\npip install random-subgroups\n```\n\n\n### Example of the classifier:\n```python\nfrom randomsubgroups import RandomSubgroupClassifier\nfrom sklearn import datasets\n\ndata = datasets.load_breast_cancer()\ny = data.target\nX = data.data\n\nsg_classifier = RandomSubgroupClassifier(n_estimators=30)\n\nsg_classifier.fit(X, y)\n```\n\n```python\n\u003e\u003e\u003e classifiers_df = sg_classifier.show_models()\n\nTarget: 0; Model: Col26\u003e=0.27 AND Col7\u003e=0.06\nTarget: 0; Model: Col3\u003e=435.17 AND Col6\u003e=0.11\nTarget: 0; Model: Col20\u003e=18.22 AND Col3\u003e=806.60\nTarget: 0; Model: Col16\u003e=0.02 AND Col20\u003e=15.87\nTarget: 0; Model: Col17\u003e=0.01 AND Col20\u003e=17.91\nTarget: 0; Model: Col20\u003e=17.50 AND Col22\u003e=118.60\nTarget: 0; Model: Col23\u003e=1004.60 AND Col7\u003e=0.05\nTarget: 0; Model: Col0\u003e=15.33 AND Col13\u003e=21.73\nTarget: 0; Model: Col22\u003e=124.16\nTarget: 0; Model: Col13\u003e=18.88 AND Col3\u003e=716.60\nTarget: 0; Model: Col12\u003e=1.39 AND Col22\u003e=123.11\nTarget: 0; Model: Col23\u003e=1030.0 AND Col6\u003e=0.05\nTarget: 0; Model: Col27\u003e=0.15 AND Col3\u003e=358.90\nTarget: 0; Model: Col15\u003e=0.01 AND Col23\u003e=883.99\nTarget: 0; Model: Col0\u003e=10.98 AND Col27\u003e=0.16\nTarget: 1; Model: Col22\u003c105.0 AND Col27\u003c0.16\nTarget: 1; Model: Col20\u003c15.53 AND Col26\u003c0.35\nTarget: 1; Model: Col7\u003c0.05\nTarget: 1; Model: Col13\u003c42.86 AND Col27\u003c0.12\nTarget: 1; Model: Col23\u003c771.47 AND Col27\u003c0.12\nTarget: 1; Model: 
Col20\u003c17.79 AND Col25\u003c0.20\nTarget: 1; Model: Col20\u003c15.49 AND Col27\u003c0.15\nTarget: 1; Model: Col13\u003c29.40 AND Col1\u003c19.98\nTarget: 1; Model: Col20\u003c15.75 AND Col6\u003c0.08\nTarget: 1; Model: Col20\u003c15.63 AND Col27\u003c0.20\nTarget: 1; Model: Col22\u003c104.79 AND Col29\u003c0.10\nTarget: 1; Model: Col20\u003c14.69 AND Col6\u003c0.12\nTarget: 1; Model: Col27\u003c0.12 AND Col3\u003c693.70\nTarget: 1; Model: Col20\u003c16.00 AND Col6\u003c0.09\nTarget: 1; Model: Col27\u003c0.11 AND Col7\u003c0.06\n```\n\n```python\n\u003e\u003e\u003e sg_classifier.show_decision(X[5])\n\nThe predicted value is: 0\nFrom a total of 6 estimators.\n\nThe subgroups used in the prediction are:\n\n Predicting target 0\nCol0\u003e=10.98 AND Col27\u003e=0.16 ---\u003e 0\nCol26\u003e=0.27 AND Col7\u003e=0.06 ---\u003e 0\nCol27\u003e=0.15 AND Col3\u003e=358.90 ---\u003e 0\nCol3\u003e=435.17 AND Col6\u003e=0.11 ---\u003e 0\n\n Predicting target 1\nCol13\u003c29.40 AND Col1\u003c19.98 ---\u003e 1\nCol20\u003c15.63 AND Col27\u003c0.20 ---\u003e 1\n\nThe targets of the subgroups used in the prediction have the following distribution:\n```\n![output](example_classification.png)\n\n\n### Example of the regressor:\n```python\nfrom randomsubgroups import RandomSubgroupRegressor\nfrom sklearn import datasets\n\ndata = datasets.load_diabetes()\ny = data.target\nX = data.data\n\nsg_regressor = RandomSubgroupRegressor(n_estimators=30)\n\nsg_regressor.fit(X, y)\n```\n\n```python\n\u003e\u003e\u003e regressors_df = sg_regressor.show_models()\n\nTarget: 98.35; Model: Col2\u003c-0.03 AND Col5\u003c0.05\nTarget: 104.24844720496894; Model: Col6\u003e=-0.01 AND Col7\u003c-0.00\nTarget: 107.71686746987952; Model: Col6\u003e=-0.02 AND Col7\u003c-0.00\nTarget: 109.73033707865169; Model: Col3\u003c0.06 AND Col8\u003c-0.01\nTarget: 191.60625; Model: Col2\u003e=0.00 AND Col3\u003e=-0.03\nTarget: 192.41304347826087; Model: Col3\u003e=-0.02 AND Col7\u003e=0.03\nTarget: 
199.28795811518324; Model: Col8\u003e=0.01\nTarget: 202.17094017094018; Model: Col3\u003e=0.04 AND Col4\u003e=-0.05\nTarget: 206.8709677419355; Model: Col2\u003e=0.02 AND Col7\u003e=-0.04\nTarget: 211.1290322580645; Model: Col2\u003e=-0.02 AND Col8\u003e=0.01\nTarget: 212.44036697247705; Model: Col4\u003e=-0.01 AND Col8\u003e=0.03\nTarget: 212.8655462184874; Model: Col7\u003e=-0.00 AND Col8\u003e=0.02\nTarget: 213.66935483870967; Model: Col7\u003e=-0.01 AND Col8\u003e=0.03\nTarget: 216.0079365079365; Model: Col3\u003e=-0.02 AND Col6\u003c-0.01\nTarget: 218.92233009708738; Model: Col0\u003e=-0.03 AND Col8\u003e=0.03\nTarget: 219.56435643564356; Model: Col2\u003e=0.02 AND Col7\u003e=-0.00\nTarget: 220.40740740740742; Model: Col2\u003e=0.02 AND Col6\u003c-0.02\nTarget: 220.46153846153845; Model: Col3\u003e=-0.04 AND Col8\u003e=0.03\nTarget: 222.0222222222222; Model: Col8\u003e=0.02 AND Col9\u003e=0.01\nTarget: 222.92592592592592; Model: Col2\u003e=0.00 AND Col3\u003e=0.03\nTarget: 224.375; Model: Col6\u003c0.00 AND Col9\u003e=0.02\nTarget: 224.3939393939394; Model: Col2\u003e=0.02 AND Col7\u003e=-0.00\nTarget: 224.70833333333334; Model: Col3\u003e=0.02 AND Col8\u003e=0.01\nTarget: 226.5257731958763; Model: Col2\u003e=-0.00 AND Col8\u003e=0.02\nTarget: 233.0185185185185; Model: Col2\u003e=0.02 AND Col7\u003e=-0.00\nTarget: 239.25882352941176; Model: Col2\u003e=0.00 AND Col9\u003e=0.02\nTarget: 243.9375; Model: Col2\u003e=0.03 AND Col3\u003e=0.01\nTarget: 247.63492063492063; Model: Col2\u003e=-0.01 AND Col9\u003e=0.05\nTarget: 248.56756756756758; Model: Col2\u003e=0.03 AND Col9\u003e=0.02\nTarget: 260.29411764705884; Model: Col2\u003e=0.06 AND Col8\u003e=-0.01\n```\n\n```python\n\u003e\u003e\u003e sg_regressor.show_decision(X[0])\n\nThe predicted value is: 220.93552644658507\nFrom a total of 12 estimators.\n\nThe subgroups used in the prediction are:\nCol2\u003e=0.00 AND Col3\u003e=-0.03 ---\u003e 191.60625\nCol8\u003e=0.01 ---\u003e 199.28795811518324\nCol2\u003e=0.02 
AND Col7\u003e=-0.04 ---\u003e 206.8709677419355\nCol2\u003e=-0.02 AND Col8\u003e=0.01 ---\u003e 211.1290322580645\nCol3\u003e=-0.02 AND Col6\u003c-0.01 ---\u003e 216.0079365079365\nCol2\u003e=0.02 AND Col7\u003e=-0.00 ---\u003e 219.56435643564356\nCol2\u003e=0.02 AND Col6\u003c-0.02 ---\u003e 220.40740740740742\nCol2\u003e=0.02 AND Col7\u003e=-0.00 ---\u003e 224.3939393939394\nCol3\u003e=0.02 AND Col8\u003e=0.01 ---\u003e 224.70833333333334\nCol2\u003e=0.02 AND Col7\u003e=-0.00 ---\u003e 233.0185185185185\nCol2\u003e=0.03 AND Col3\u003e=0.01 ---\u003e 243.9375\nCol2\u003e=0.06 AND Col8\u003e=-0.01 ---\u003e 260.29411764705884\n\nThe targets of the subgroups used in the prediction have the following distribution:\n```\n![output](example_regression.png)","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frebelosa%2Frandom-subgroups","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frebelosa%2Frandom-subgroups","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frebelosa%2Frandom-subgroups/lists"}