{"id":15922059,"url":"https://github.com/georgedouzas/imbalanced-learn-extra","last_synced_at":"2026-02-06T00:33:17.290Z","repository":{"id":43221886,"uuid":"193223262","full_name":"georgedouzas/imbalanced-learn-extra","owner":"georgedouzas","description":"Implementation of novel oversampling algorithms.","archived":false,"fork":false,"pushed_at":"2025-02-05T17:05:31.000Z","size":1352,"stargazers_count":34,"open_issues_count":0,"forks_count":16,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-24T03:36:46.735Z","etag":null,"topics":["clustering-based-oversampling","data-science","g-somo","geometric-smote","imbalanced-learning","kmeans-smote","machine-learning","oversampling","python","scikit-learn","smote"],"latest_commit_sha":null,"homepage":"https://georgedouzas.github.io/imbalanced-learn-extra/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/georgedouzas.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["georgedouzas"]}},"created_at":"2019-06-22T10:51:42.000Z","updated_at":"2025-03-23T11:11:15.000Z","dependencies_parsed_at":"2025-04-20T00:34:24.474Z","dependency_job_id":"02e01f2c-f323-4584-ad97-4f008d3cc93e","html_url":"https://github.com/georgedouzas/imbalanced-learn-extra","commit_stats":null,"previous_names":["algowit/geometric-smote","georgedouzas/geometric-smote","georgedouzas/imbalanced-learn-extra","nova-ims-innovation-and-analytics-lab/geometric-smote"],"tags_count":13,"template":false,"template_full_name"
:null,"purl":"pkg:github/georgedouzas/imbalanced-learn-extra","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georgedouzas%2Fimbalanced-learn-extra","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georgedouzas%2Fimbalanced-learn-extra/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georgedouzas%2Fimbalanced-learn-extra/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georgedouzas%2Fimbalanced-learn-extra/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/georgedouzas","download_url":"https://codeload.github.com/georgedouzas/imbalanced-learn-extra/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georgedouzas%2Fimbalanced-learn-extra/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265815107,"owners_count":23832838,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering-based-oversampling","data-science","g-somo","geometric-smote","imbalanced-learning","kmeans-smote","machine-learning","oversampling","python","scikit-learn","smote"],"created_at":"2024-10-06T20:04:31.084Z","updated_at":"2026-02-06T00:33:17.277Z","avatar_url":"https://github.com/georgedouzas.png","language":"Python","funding_links":["https://github.com/sponsors/georgedouzas"],"categories":[],"sub_categories":[],"readme":"[scikit-learn]: \u003chttp://scikit-learn.org/stable/\u003e\n[imbalanced-learn]: \u003chttp://imbalanced-learn.org/stable/\u003e\n[SOMO]: 
\u003chttps://www.sciencedirect.com/science/article/abs/pii/S0957417417302324\u003e\n[KMeans-SMOTE]: \u003chttps://www.sciencedirect.com/science/article/abs/pii/S0020025518304997\u003e\n[G-SOMO]: \u003chttps://www.sciencedirect.com/science/article/abs/pii/S095741742100662X\u003e\n[black badge]: \u003chttps://img.shields.io/badge/%20style-black-000000.svg\u003e\n[black]: \u003chttps://github.com/psf/black\u003e\n[docformatter badge]: \u003chttps://img.shields.io/badge/%20formatter-docformatter-fedcba.svg\u003e\n[docformatter]: \u003chttps://github.com/PyCQA/docformatter\u003e\n[ruff badge]: \u003chttps://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v1.json\u003e\n[ruff]: \u003chttps://github.com/charliermarsh/ruff\u003e\n[mypy badge]: \u003chttp://www.mypy-lang.org/static/mypy_badge.svg\u003e\n[mypy]: \u003chttp://mypy-lang.org\u003e\n[mkdocs badge]: \u003chttps://img.shields.io/badge/docs-mkdocs%20material-blue.svg?style=flat\u003e\n[mkdocs]: \u003chttps://squidfunk.github.io/mkdocs-material\u003e\n[version badge]: \u003chttps://img.shields.io/pypi/v/imbalanced-learn-extra.svg\u003e\n[pythonversion badge]: \u003chttps://img.shields.io/pypi/pyversions/imbalanced-learn-extra.svg\u003e\n[downloads badge]: \u003chttps://img.shields.io/pypi/dd/imbalanced-learn-extra\u003e\n[gitter]: \u003chttps://gitter.im/imbalanced-learn-extra/community\u003e\n[gitter badge]: \u003chttps://badges.gitter.im/join%20chat.svg\u003e\n[discussions]: \u003chttps://github.com/georgedouzas/imbalanced-learn-extra/discussions\u003e\n[discussions badge]: \u003chttps://img.shields.io/github/discussions/georgedouzas/imbalanced-learn-extra\u003e\n[ci]: \u003chttps://github.com/georgedouzas/imbalanced-learn-extra/actions?query=workflow\u003e\n[ci badge]: \u003chttps://github.com/georgedouzas/imbalanced-learn-extra/actions/workflows/ci.yml/badge.svg?branch=main\u003e\n[doc]: 
\u003chttps://github.com/georgedouzas/imbalanced-learn-extra/actions?query=workflow\u003e\n[doc badge]: \u003chttps://github.com/georgedouzas/imbalanced-learn-extra/actions/workflows/doc.yml/badge.svg?branch=main\u003e\n\n# imbalanced-learn-extra\n\n[![ci][ci badge]][ci] [![doc][doc badge]][doc]\n\n| Category          | Tools    |\n| ------------------| -------- |\n| **Development**   | [![black][black badge]][black] [![ruff][ruff badge]][ruff] [![mypy][mypy badge]][mypy] [![docformatter][docformatter badge]][docformatter] |\n| **Package**       | ![version][version badge] ![pythonversion][pythonversion badge] ![downloads][downloads badge] |\n| **Documentation** | [![mkdocs][mkdocs badge]][mkdocs]|\n| **Communication** | [![gitter][gitter badge]][gitter] [![discussions][discussions badge]][discussions] |\n\n## Introduction\n\n`imbalanced-learn-extra` is a Python package that extends [imbalanced-learn]. It implements algorithms that are not included in\n[imbalanced-learn] due to their novelty or lower citation number. The current version includes the following:\n\n- A general interface for clustering-based oversampling algorithms.\n\n- The Geometric SMOTE algorithm. 
It is a geometrically enhanced drop-in replacement for SMOTE that handles numerical as well as\ncategorical features.\n\n## Installation\n\nFor user installation, `imbalanced-learn-extra` is available on PyPI, and you can\ninstall it via `pip`:\n\n```bash\npip install imbalanced-learn-extra\n```\n\nDevelopment installation requires cloning the repository and then using [PDM](https://github.com/pdm-project/pdm) to install the\nproject as well as the main and development dependencies:\n\n```bash\ngit clone https://github.com/georgedouzas/imbalanced-learn-extra.git\ncd imbalanced-learn-extra\npdm install\n```\n\nThe SOM clusterer requires optional dependencies:\n\n```bash\npip install imbalanced-learn-extra[som]\n```\n\n## Usage\n\nAll the classes included in `imbalanced-learn-extra` follow the [imbalanced-learn] API using the functionality of the base\noversampler. Following the [scikit-learn] convention, the data are represented as follows:\n\n- Input data `X`: 2D array-like or sparse matrix.\n- Targets `y`: 1D array-like.\n\nThe oversamplers implement a `fit` method to learn from `X` and `y`:\n\n```python\noversampler.fit(X, y)\n```\n\nThey also implement a `fit_resample` method to resample `X` and `y`:\n\n```python\nX_resampled, y_resampled = oversampler.fit_resample(X, y)\n```\n\n## Citing `imbalanced-learn-extra`\n\nPublications using clustering-based oversampling:\n\n- [G. Douzas, F. Bacao, \"Self-Organizing Map Oversampling (SOMO) for imbalanced data set learning\", Expert Systems with\n    Applications, vol. 82, pp. 40-52, 2017.][SOMO]\n- [G. Douzas, F. Bacao, F. Last, \"Improving imbalanced learning through a heuristic oversampling method based on k-means and\n    SMOTE\", Information Sciences, vol. 465, pp. 1-20, 2018.][KMeans-SMOTE]\n- [G. Douzas, F. Bacao, F. Last, \"G-SOMO: An oversampling approach based on self-organized maps and geometric SMOTE\", Expert\n    Systems with Applications, vol. 
183, 115230, 2021.][G-SOMO]\n\nPublications using Geometric-SMOTE:\n\n- Douzas, G., Bacao, F. (2019). Geometric SMOTE: a geometrically enhanced\n  drop-in replacement for SMOTE. Information Sciences, 501, 118-135.\n  \u003chttps://doi.org/10.1016/j.ins.2019.06.007\u003e\n\n- Fonseca, J., Douzas, G., Bacao, F. (2021). Increasing the Effectiveness of\n  Active Learning: Introducing Artificial Data Generation in Active Learning\n  for Land Use/Land Cover Classification. Remote Sensing, 13(13), 2619.\n  \u003chttps://doi.org/10.3390/rs13132619\u003e\n\n- Douzas, G., Bacao, F., Fonseca, J., Khudinyan, M. (2019). Imbalanced\n  Learning in Land Cover Classification: Improving Minority Classes’\n  Prediction Accuracy Using the Geometric SMOTE Algorithm. Remote Sensing,\n  11(24), 3040. \u003chttps://doi.org/10.3390/rs11243040\u003e\n\n## User Support\n\nIf you encounter a bug, have a question, or would like to request a new feature, you can get support through the project’s GitHub\nissue tracker.\n\n- **Report a bug:** Open a [new issue](https://github.com/georgedouzas/imbalanced-learn-extra/issues/new)\n  and describe the problem, including steps to reproduce it and your environment details.\n- **Request a feature:** Open a [new issue](https://github.com/georgedouzas/imbalanced-learn-extra/issues/new)\n  describing the functionality you’d like to see added.\n- **Ask a question or request help:** Use the [Q\u0026A discussion board](https://github.com/georgedouzas/imbalanced-learn-extra/discussions)\n  for general usage questions or clarifications.\n\nBefore opening a new issue, please check the [existing issues](https://github.com/georgedouzas/imbalanced-learn-extra/issues)\nto see if your question or problem has already been 
addressed.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeorgedouzas%2Fimbalanced-learn-extra","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgeorgedouzas%2Fimbalanced-learn-extra","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeorgedouzas%2Fimbalanced-learn-extra/lists"}