{"id":15063981,"url":"https://github.com/abess-team/abess","last_synced_at":"2025-05-15T17:04:58.794Z","repository":{"id":37007179,"uuid":"323082026","full_name":"abess-team/abess","owner":"abess-team","description":"Fast Best-Subset Selection Library","archived":false,"fork":false,"pushed_at":"2024-09-14T08:09:15.000Z","size":218772,"stargazers_count":482,"open_issues_count":12,"forks_count":42,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-11T23:57:11.259Z","etag":null,"topics":["best-subset-selection","classification-algorithm","cox-regression","feature-selection","high-dimensional-data","linear-regression","logistic-regression","machine-learning","multitask-learning","ordinal-regression","poisson-regression","polynomial-algorithm","principal-component-analysis","python","r","robust-principal-component-analysis","scikit-learn","sparse-principal-component-analysis","sure-independence-screening"],"latest_commit_sha":null,"homepage":"https://abess.readthedocs.io/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/abess-team.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/Contributing/AfterCodeDeveloping.rst","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-12-20T13:47:31.000Z","updated_at":"2025-03-17T13:49:21.000Z","dependencies_parsed_at":"2023-02-15T23:46:08.728Z","dependency_job_id":"b3fc045d-d500-4238-a13f-524a0de54ed5","html_url":"https://github.com/abess-team/abess","commit_stats":{"total_commits":1985,"total_committers":28,"mean_commits":70.89285714285714,"dds":0.6453400503778337,"last_synced_commit":"2724e
4f1b237d392e307284c633f1a6e55688120"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abess-team%2Fabess","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abess-team%2Fabess/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abess-team%2Fabess/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abess-team%2Fabess/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/abess-team","download_url":"https://codeload.github.com/abess-team/abess/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254384987,"owners_count":22062422,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["best-subset-selection","classification-algorithm","cox-regression","feature-selection","high-dimensional-data","linear-regression","logistic-regression","machine-learning","multitask-learning","ordinal-regression","poisson-regression","polynomial-algorithm","principal-component-analysis","python","r","robust-principal-component-analysis","scikit-learn","sparse-principal-component-analysis","sure-independence-screening"],"created_at":"2024-09-25T00:09:40.356Z","updated_at":"2025-05-15T17:04:53.778Z","avatar_url":"https://github.com/abess-team.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src='https://raw.githubusercontent.com/abess-team/abess/master/docs/image/icon_long.png' 
align=\"center\"/\u003e\u003c/a\u003e\n\n# abess: Fast Best-Subset Selection in Python and R\n\n[![Python Build](https://github.com/abess-team/abess/actions/workflows/python_test.yml/badge.svg)](https://github.com/abess-team/abess/actions/workflows/python_test.yml)\n[![R Build](https://github.com/abess-team/abess/actions/workflows/r_test.yml/badge.svg)](https://github.com/abess-team/abess/actions/workflows/r_test.yml)\n[![codecov](https://codecov.io/gh/abess-team/abess/branch/master/graph/badge.svg?token=LK56LHXV00)](https://codecov.io/gh/abess-team/abess)\n[![docs](https://readthedocs.org/projects/abess/badge/?version=latest)](https://abess.readthedocs.io/en/latest/?badge=latest)\n[![R docs](https://github.com/abess-team/abess/actions/workflows/r_website.yml/badge.svg)](https://abess-team.github.io/abess/)\n[![cran](https://img.shields.io/cran/v/abess?logo=R)](https://cran.r-project.org/package=abess)\n[![pypi](https://img.shields.io/pypi/v/abess?logo=Pypi)](https://pypi.org/project/abess)\n[![Conda version](https://img.shields.io/conda/vn/conda-forge/abess.svg?logo=condaforge)](https://anaconda.org/conda-forge/abess)\n[![pyversions](https://img.shields.io/pypi/pyversions/abess)](https://img.shields.io/pypi/pyversions/abess)\n[![License](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](http://www.gnu.org/licenses/gpl-3.0)\n[![Codacy Badge](https://app.codacy.com/project/badge/Grade/3f6e60a3a3e44699a033159633981b76)](https://www.codacy.com/gh/abess-team/abess/dashboard?utm_source=github.com\u0026utm_medium=referral\u0026utm_content=abess-team/abess\u0026utm_campaign=Badge_Grade)\n[![CodeFactor](https://www.codefactor.io/repository/github/abess-team/abess/badge)](https://www.codefactor.io/repository/github/abess-team/abess)\n[![Platform](https://anaconda.org/conda-forge/abess/badges/platforms.svg)](https://anaconda.org/conda-forge/abess)\n[![Downloads](https://pepy.tech/badge/abess)](https://pepy.tech/project/abess)\n\n\u003c!-- [![Build 
Status](https://travis-ci.com/abess-team/abess.svg?branch=master)](https://travis-ci.com/abess-team/abess) --\u003e\n\n## Overview\n\nThe `abess` (Adaptive BEst Subset Selection) library aims to solve the general best subset selection problem, i.e.,\nto find a small subset of predictors such that the resulting model is expected to have the highest accuracy.\nBest subset selection is valuable in both scientific research and practical applications.\nFor example, clinicians want to know whether a patient is healthy or not based on the expression levels of a few important genes.\n\nThis library implements a generic algorithm framework to find the optimal solution very quickly.\nThis framework now supports the detection of the best subset under:\n[linear regression](https://abess.readthedocs.io/en/latest/auto_gallery/1-glm/plot_1_LinearRegression.html),\n[classification (binary or multi-class)](https://abess.readthedocs.io/en/latest/auto_gallery/1-glm/plot_2_LogisticRegression.html),\n[count-response modeling](https://abess.readthedocs.io/en/latest/auto_gallery/1-glm/plot_5_PossionGammaRegression.html),\n[censored-response modeling](https://abess.readthedocs.io/en/latest/auto_gallery/1-glm/plot_4_CoxRegression.html#sphx-glr-auto-gallery-1-glm-plot-4-coxregression-py),\n[multi-response modeling (multi-task learning)](https://abess.readthedocs.io/en/latest/auto_gallery/1-glm/plot_3_MultiTaskLearning.html), etc.\nIt also supports variants of best subset selection such as\n[group best subset selection](https://abess.readthedocs.io/en/latest/auto_gallery/3-advanced-features/plot_best_group.html) and\n[nuisance penalized regression](https://abess.readthedocs.io/en/latest/auto_gallery/3-advanced-features/plot_best_nuisance.html).\nIn particular, the time complexity of (group) best subset selection for linear regression is certifiably polynomial.\n\n## Quick start\n\nThe `abess` software has both Python and R interfaces. 
A quick start is given here; for more details, see [Installation](https://abess.readthedocs.io/en/latest/Installation.html).\n\n### Python package\n\nInstall the stable version of the Python package from [PyPI](https://pypi.org/project/abess/):\n\n```shell\n$ pip install abess\n```\n\nor from [conda-forge](https://anaconda.org/conda-forge/abess):\n\n```shell\n$ conda install -c conda-forge abess\n```\n\nBest subset selection for linear regression on a simulated dataset in Python:\n\n```python\nfrom abess.linear import LinearRegression\nfrom abess.datasets import make_glm_data\n\n# Simulate 300 samples with 1000 predictors, 10 of which have nonzero coefficients\nsim_dat = make_glm_data(n=300, p=1000, k=10, family=\"gaussian\")\n\n# Fit the best-subset linear regression model\nmodel = LinearRegression()\nmodel.fit(sim_dat.x, sim_dat.y)\n```\n\nSee more examples analyzed with Python in the [Python tutorials](https://abess.readthedocs.io/en/latest/auto_gallery/index.html).\n\n### R package\n\nInstall the stable version of the R package from [CRAN](https://cran.r-project.org/web/packages/abess) with:\n\n```r\ninstall.packages(\"abess\")\n```\n\nBest subset selection for linear regression on a simulated dataset in R:\n\n```r\nlibrary(abess)\nsim_dat \u003c- generate.data(n = 300, p = 1000)\nabess(x = sim_dat[[\"x\"]], y = sim_dat[[\"y\"]])\n```\n\nSee more examples analyzed with R in the [R tutorials](https://abess-team.github.io/abess/articles/).\n\n## Runtime Performance\n\nTo demonstrate the computational efficiency of `abess`, we measure its CPU execution time (in seconds) on synthetic datasets and compare it with state-of-the-art variable selection methods. The variable selection and estimation results are deferred to [Python performance](https://abess.readthedocs.io/en/latest/auto_gallery/1-glm/plot_a1_power_of_abess.html) and [R performance](https://abess-team.github.io/abess/articles/v11-power-of-abess.html). 
All computations are conducted on an Ubuntu platform with Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz and 48 RAM.\n\n### Python package\n\nWe compare the `abess` Python package with `scikit-learn` on linear regression and logistic regression. The results are presented in the figure below:\n\n![](./docs/image/timings.png)\n\nIt can be seen that `abess` uses the least runtime to find the solution. These results can be reproduced by running the following command in a shell:\n\n```shell\n$ python abess/docs/simulation/Python/timings.py\n```\n\n### R package\n\nWe compare the `abess` R package with three widely used R packages: `glmnet`, `ncvreg`, and `L0Learn`.\nThe runtime comparison results are shown below:\n\n![](docs/image/r_runtime.png)\n\nCompared with the other packages,\n`abess` shows competitive computational efficiency,\nand is the most efficient when variables are highly correlated.\n\nThe results above can be reproduced by running the following command in a shell:\n\n```shell\n$ Rscript abess/docs/simulation/R/timings.R\n```\n\n## Open source software\n\n`abess` is free software, and its source code is publicly available on [GitHub](https://github.com/abess-team/abess). The core framework is programmed in C++, and user-friendly R and Python interfaces are offered. You can redistribute it and/or modify it under the terms of the [GPL-v3 License](https://www.gnu.org/licenses/gpl-3.0.html). We welcome contributions to `abess`, especially those extending `abess` to other best subset selection problems.\n\n## What's new\n\nNew features in version `0.4.7`:\n\n- Support limiting beta to a range via a clipping method. One application is to perform non-negative fitting.\n- Support no-intercept model for most regressors in ``abess.linear`` with argument ``fit_intercept=False``. 
We assume that the data has been centered for these models.\n- Support AUC criterion for Logistic and Multinomial Regression.\n\n\nNew features in version `0.4.6`:\n\n- Support no-intercept model for most regressors in `abess.linear` with argument `fit_intercept=False`. We assume that the data has been centered for these models. (Python)\n- `abess` can be used via `mlr3extralearners` as learners `regr.abess` and `classif.abess`. (R)\n- Use [CMake](https://cmake.org/) for compilation to improve scalability.\n- Support score functions for all GLM models. (Python)\n- Rearrange some arguments in the Python package to improve readability. Please check the latest [API document](https://abess.readthedocs.io/en/latest/Python-package/index.html). (Python)\n\n## Citation\n\nIf you use `abess` or reference our tutorials in a presentation or publication, we would appreciate citations of our library.\n\n\u003e Jin Zhu, Xueqin Wang, Liyuan Hu, Junhao Huang, Kangkang Jiang, Yanhang Zhang, Shiyun Lin, and Junxian Zhu. \"abess: A Fast Best-Subset Selection Library in Python and R.\" Journal of Machine Learning Research 23, no. 202 (2022): 1-7.\n\nThe corresponding BibTeX entry:\n\n```\n@article{JMLR:v23:21-1060,\n  author  = {Jin Zhu and Xueqin Wang and Liyuan Hu and Junhao Huang and Kangkang Jiang and Yanhang Zhang and Shiyun Lin and Junxian Zhu},\n  title   = {abess: A Fast Best-Subset Selection Library in Python and R},\n  journal = {Journal of Machine Learning Research},\n  year    = {2022},\n  volume  = {23},\n  number  = {202},\n  pages   = {1--7},\n  url     = {http://jmlr.org/papers/v23/21-1060.html}\n}\n```\n\n## References\n\n- Junxian Zhu, Canhong Wen, Jin Zhu, Heping Zhang, and Xueqin Wang (2020). A polynomial algorithm for best-subset selection problem. Proceedings of the National Academy of Sciences, 117(52):33117-33123.\n- Pölsterl, S (2020). scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn. J. Mach. Learn. 
Res., 21(212), 1-6.\n- Yanhang Zhang, Junxian Zhu, Jin Zhu, and Xueqin Wang. A splicing approach to best subset of groups selection. INFORMS Journal on Computing, 35(1):104–119, 2023. doi: 10.1287/ijoc.2022.1241.\n- Qiang Sun and Heping Zhang (2020). Targeted Inference Involving High-Dimensional Data Using Nuisance Penalized Regression, Journal of the American Statistical Association, DOI: 10.1080/01621459.2020.1737079.\n- Jin Zhu, Xueqin Wang, Liyuan Hu, Junhao Huang, Kangkang Jiang, Yanhang Zhang, Shiyun Lin, and Junxian Zhu. \"abess: A Fast Best-Subset Selection Library in Python and R.\" Journal of Machine Learning Research 23, no. 202 (2022): 1-7.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabess-team%2Fabess","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fabess-team%2Fabess","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabess-team%2Fabess/lists"}