https://github.com/scikit-learn-contrib/sklearn-ann
Integration with (approximate) nearest neighbors libraries for scikit-learn + clustering based on kNN-graphs.
- Host: GitHub
- URL: https://github.com/scikit-learn-contrib/sklearn-ann
- Owner: scikit-learn-contrib
- License: bsd-3-clause
- Created: 2020-12-30T14:08:33.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2024-10-30T09:15:29.000Z (8 days ago)
- Last Synced: 2024-10-30T10:23:18.522Z (8 days ago)
- Topics: approximate-nearest-neighbor-search, clustering, knn, knn-graphs, scikit-learn
- Language: Python
- Homepage: https://sklearn-ann.readthedocs.io/en/latest/
- Size: 194 KB
- Stars: 15
- Watchers: 4
- Forks: 5
- Open Issues: 9
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
.. -*- mode: rst -*-

|PyPI|_ |ReadTheDocs|_

.. |PyPI| image:: https://img.shields.io/pypi/v/sklearn-ann
.. _PyPI: https://pypi.org/project/sklearn-ann/

.. |ReadTheDocs| image:: https://readthedocs.org/projects/sklearn-ann/badge/?version=latest
.. _ReadTheDocs: https://sklearn-ann.readthedocs.io/en/latest/?badge=latest

sklearn-ann
===========

.. inclusion-marker-do-not-remove


**sklearn-ann** eases integration of approximate nearest neighbours
libraries such as annoy, nmslib and faiss into your sklearn
pipelines. It consists of:

* ``Transformers`` conforming to the same interface as
  ``KNeighborsTransformer`` which can be used to transform feature matrices
  into sparse distance matrices for use by any estimator that can deal with
  sparse distance matrices. Many, but not all, of scikit-learn's clustering and
  manifold learning algorithms can work with this kind of input.
* RNN-DBSCAN: a variant of DBSCAN based on reverse nearest
  neighbours.

Installation
============

To install the latest release from PyPI, run:

.. code-block:: bash

    pip install sklearn-ann

To install the latest development version from GitHub, run:

.. code-block:: bash

    pip install git+https://github.com/scikit-learn-contrib/sklearn-ann.git#egg=sklearn-ann

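
Once installed, the transformers described above are meant to slot into a
pipeline in the same place as scikit-learn's own ``KNeighborsTransformer``.
The following is only a minimal sketch of that pattern: it uses the exact
``KNeighborsTransformer`` from scikit-learn as a stand-in, because this README
does not show the import paths of the approximate transformers. The synthetic
dataset and the values ``n_neighbors=20``, ``eps=12.0`` and ``min_samples=5``
are illustrative assumptions, not recommendations.

.. code-block:: python

    from sklearn.cluster import DBSCAN
    from sklearn.datasets import make_blobs
    from sklearn.neighbors import KNeighborsTransformer
    from sklearn.pipeline import make_pipeline

    # Illustrative data (assumption): 300 points in 50 dimensions, 3 blobs.
    X, _ = make_blobs(n_samples=300, n_features=50, centers=3, random_state=0)

    # Stage 1: turn the feature matrix into a sparse distance matrix.
    # This is scikit-learn's exact transformer, used here as a stand-in;
    # an approximate transformer with the same interface would go here.
    graph_builder = KNeighborsTransformer(n_neighbors=20, mode="distance")

    # The intermediate result is a sparse matrix of shape (n_samples, n_samples).
    graph = graph_builder.fit_transform(X)

    # Stage 2: an estimator that accepts sparse precomputed distances.
    clusterer = DBSCAN(eps=12.0, min_samples=5, metric="precomputed")

    # Chain both stages; the clusterer only ever sees the sparse graph.
    pipeline = make_pipeline(graph_builder, clusterer)
    labels = pipeline.fit_predict(X)

Because the second stage only consumes the sparse distance matrix, replacing
the first stage with a different neighbours transformer should not require
changes to the clustering step, provided the replacement follows the same
interface.
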

Why? When do I want this?
=========================

The main scenario in which this is needed is performing
*clustering or manifold learning on high-dimensional data*. The
reason is that currently the only neighbourhood algorithms which are
built into scikit-learn are essentially the standard tree approaches
to space partitioning: the ball tree and the K-D tree. These do not
perform competitively in high-dimensional spaces.

Development
===========

This project is managed using Hatch_ and pre-commit_. To get started, run ``pre-commit
install`` and ``hatch env create``. Run all commands using ``hatch run python``,
which will ensure the environment is kept up to date. pre-commit_ comes into
play on every ``git commit`` after installation.

Consult ``pyproject.toml`` for which dependency groups and extras exist,
and the Hatch help or user guide for more info on what they are.

.. _Hatch: https://hatch.pypa.io/
.. _pre-commit: https://pre-commit.com/