https://github.com/scikit-learn-contrib/sklearn-ann

Integration with (approximate) nearest neighbors libraries for scikit-learn + clustering based on kNN graphs.

approximate-nearest-neighbor-search clustering knn knn-graphs scikit-learn


Host: GitHub
URL: https://github.com/scikit-learn-contrib/sklearn-ann
Owner: scikit-learn-contrib
License: bsd-3-clause
Created: 2020-12-30T14:08:33.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2025-04-08T08:51:51.000Z (3 months ago)
Last Synced: 2025-04-08T09:45:44.787Z (3 months ago)
Topics: approximate-nearest-neighbor-search, clustering, knn, knn-graphs, scikit-learn
Language: Python
Homepage: https://sklearn-ann.readthedocs.io/en/latest/
Size: 228 KB
Stars: 16
Watchers: 3
Forks: 6
Open Issues: 10
Metadata Files:
- Readme: README.rst
- License: LICENSE

README

.. -*- mode: rst -*-

|PyPI|_ |ReadTheDocs|_

.. |PyPI| image:: https://img.shields.io/pypi/v/sklearn-ann

.. _PyPI: https://pypi.org/project/sklearn-ann/

.. |ReadTheDocs| image:: https://readthedocs.org/projects/sklearn-ann/badge/?version=latest

.. _ReadTheDocs: https://sklearn-ann.readthedocs.io/en/latest/?badge=latest

sklearn-ann
===========

.. inclusion-marker-do-not-remove

**sklearn-ann** eases integration of approximate nearest neighbours
libraries such as annoy, nmslib and faiss into your scikit-learn
pipelines. It consists of:

* ``Transformers`` conforming to the same interface as
  ``KNeighborsTransformer`` which can be used to transform feature matrices
  into sparse distance matrices for use by any estimator that can deal with
  sparse distance matrices. Many, but not all, of scikit-learn's clustering and
  manifold learning algorithms can work with this kind of input.
* RNN-DBSCAN: a variant of DBSCAN based on reverse nearest
  neighbours.
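
As a rough illustration of the interface described above, the following sketch
feeds a sparse distance matrix into a clustering estimator. It deliberately
uses scikit-learn's own ``KNeighborsTransformer`` as a stand-in, so that it
does not assume any particular sklearn-ann class name; a transformer from this
package would, in principle, take its place as the first pipeline step. The
dataset and all parameter values are assumptions chosen only for the example.

.. code-block:: python

    from sklearn.cluster import DBSCAN
    from sklearn.datasets import make_blobs
    from sklearn.neighbors import KNeighborsTransformer
    from sklearn.pipeline import make_pipeline

    # Toy data (assumption): three blobs in a 20-dimensional space.
    X, _ = make_blobs(n_samples=300, centers=3, n_features=20, random_state=0)

    # The transformer alone: feature matrix in, sparse distance matrix out.
    # In "distance" mode each row stores n_neighbors + 1 entries, because
    # every training point is included as its own nearest neighbour.
    graph = KNeighborsTransformer(n_neighbors=10, mode="distance").fit_transform(X)
    print(graph.shape, graph.nnz)

    # The same step inside a pipeline. DBSCAN with metric="precomputed"
    # only treats entries stored in the sparse matrix as candidate
    # neighbours, so eps can only prune edges of the kNN graph.
    pipeline = make_pipeline(
        KNeighborsTransformer(n_neighbors=10, mode="distance"),
        DBSCAN(eps=10.0, min_samples=5, metric="precomputed"),
    )
    labels = pipeline.fit_predict(X)
    print("clusters found:", len(set(labels) - {-1}))

Because the downstream estimator only sees the precomputed graph, swapping the
neighbour search backend should not require changes to the clustering step.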

Installation
============

To install the latest release from PyPI, run:

.. code-block:: bash

    pip install sklearn-ann

To install the latest development version from GitHub, run:

.. code-block:: bash

    pip install git+https://github.com/scikit-learn-contrib/sklearn-ann.git#egg=sklearn-ann

Why? When do I want this?
=========================

The main scenario in which this is needed is
*clustering or manifold learning on high dimensional data*. The
reason is that currently the only neighbourhood algorithms which are
built into scikit-learn are essentially the standard tree approaches
to space partitioning: the ball tree and the K-D tree. These do not
perform competitively in high dimensional spaces.

Development
===========

This project is managed using Hatch_ and pre-commit_. To get started, run
``pre-commit install`` and ``hatch env create``. Run all commands using
``hatch run python``, which will ensure the environment is kept up to date.
pre-commit_ comes into play on every ``git commit`` after installation.

Consult ``pyproject.toml`` for which dependency groups and extras exist,
and the Hatch help or user guide for more info on what they are.

.. _Hatch: https://hatch.pypa.io/

.. _pre-commit: https://pre-commit.com/
