{"id":13532171,"url":"https://github.com/lmcinnes/pynndescent","last_synced_at":"2025-10-24T10:18:14.596Z","repository":{"id":38903664,"uuid":"120684701","full_name":"lmcinnes/pynndescent","owner":"lmcinnes","description":"A Python nearest neighbor descent for approximate nearest neighbors","archived":false,"fork":false,"pushed_at":"2024-11-10T14:49:37.000Z","size":10049,"stargazers_count":919,"open_issues_count":76,"forks_count":107,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-04-28T10:56:04.599Z","etag":null,"topics":["approximate-nearest-neighbor-search","knn-graphs","nearest-neighbor-search"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lmcinnes.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-02-07T23:23:54.000Z","updated_at":"2025-04-23T18:09:42.000Z","dependencies_parsed_at":"2023-02-01T09:17:02.655Z","dependency_job_id":"54be6aa7-022a-4eb1-b383-3f03d81ef901","html_url":"https://github.com/lmcinnes/pynndescent","commit_stats":{"total_commits":550,"total_committers":31,"mean_commits":"17.741935483870968","dds":0.3563636363636363,"last_synced_commit":"ab65f93d108406444932474bf981de7cee322cde"},"previous_names":[],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lmcinnes%2Fpynndescent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lmcinnes%2Fpynndescent/tags","releases_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/lmcinnes%2Fpynndescent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lmcinnes%2Fpynndescent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lmcinnes","download_url":"https://codeload.github.com/lmcinnes/pynndescent/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254020611,"owners_count":22000754,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["approximate-nearest-neighbor-search","knn-graphs","nearest-neighbor-search"],"created_at":"2024-08-01T07:01:08.750Z","updated_at":"2025-10-24T10:18:09.575Z","avatar_url":"https://github.com/lmcinnes.png","language":"Python","funding_links":[],"categories":["Libraries and Tools","Sdks \u0026 Libraries","Python","Machine Learning","向量相似度搜索（ANN）","Awesome Vector Search Engine"],"sub_categories":["2023","Unsupervised","Library"],"readme":".. image:: doc/pynndescent_logo.png\n  :width: 600\n  :align: center\n  :alt: PyNNDescent Logo\n\n.. image:: https://dev.azure.com/TutteInstitute/build-pipelines/_apis/build/status%2Flmcinnes.pynndescent?branchName=master\n    :target: https://dev.azure.com/TutteInstitute/build-pipelines/_build?definitionId=17\n    :alt: Azure Pipelines Build Status\n.. 
image:: https://readthedocs.org/projects/pynndescent/badge/?version=latest\n    :target: https://pynndescent.readthedocs.io/en/latest/?badge=latest\n    :alt: Documentation Status\n\n===========\nPyNNDescent\n===========\n\nPyNNDescent is a Python nearest neighbor descent for approximate nearest neighbors.\nIt provides a Python implementation of Nearest Neighbor\nDescent for k-neighbor-graph construction and approximate nearest neighbor\nsearch, as per the paper:\n\nDong, Wei, Moses Charikar, and Kai Li.\n*\"Efficient k-nearest neighbor graph construction for generic similarity\nmeasures.\"*\nProceedings of the 20th international conference on World wide web. ACM, 2011.\n\nThis library supplements that approach with the use of random projection trees for\ninitialisation. This can be particularly useful for the metrics that are\namenable to such approaches (euclidean, minkowski, angular, cosine, etc.). Graph\ndiversification is also performed, pruning the longest edges of any triangles in the\ngraph.\n\nCurrently this library targets relatively high accuracy\n(80%-100% accuracy rate) approximate nearest neighbor searches.\n\n--------------------\nWhy use PyNNDescent?\n--------------------\n\nPyNNDescent provides fast approximate nearest neighbor queries. The\n`ann-benchmarks \u003chttps://github.com/erikbern/ann-benchmarks\u003e`_ system puts it\nsolidly in the mix of top performing ANN libraries:\n\n**SIFT-128 Euclidean**\n\n.. image:: https://pynndescent.readthedocs.io/en/latest/_images/sift.png\n    :alt: ANN benchmark performance for SIFT 128 dataset\n\n**NYTimes-256 Angular**\n\n.. 
image:: https://pynndescent.readthedocs.io/en/latest/_images/nytimes.png\n    :alt: ANN benchmark performance for NYTimes 256 dataset\n\nWhile PyNNDescent is among the fastest ANN libraries, it is also easy to install (pip\nand conda installable) with no platform or compilation issues, and is very flexible,\nsupporting a wide variety of distance metrics by default:\n\n**Minkowski style metrics**\n\n- euclidean\n- manhattan\n- chebyshev\n- minkowski\n\n**Miscellaneous spatial metrics**\n\n- canberra\n- braycurtis\n- haversine\n\n**Normalized spatial metrics**\n\n- mahalanobis\n- wminkowski\n- seuclidean\n\n**Angular and correlation metrics**\n\n- cosine\n- dot\n- correlation\n- spearmanr\n- tsss\n- true_angular\n\n**Probability metrics**\n\n- hellinger\n- wasserstein\n\n**Metrics for binary data**\n\n- hamming\n- jaccard\n- dice\n- russelrao\n- kulsinski\n- rogerstanimoto\n- sokalmichener\n- sokalsneath\n- yule\n\nIt also supports custom user-defined distance metrics while still retaining performance.\n\nPyNNDescent also integrates well with Scikit-learn, including providing support\nfor the KNeighborsTransformer as a drop-in replacement for algorithms\nthat make use of nearest neighbor computations.\n\n----------------------\nHow to use PyNNDescent\n----------------------\n\nPyNNDescent aims to have a very simple interface. It is similar to (but more\nlimited than) KDTrees and BallTrees in ``sklearn``. In practice there are\nonly two operations -- index construction, and querying an index for nearest\nneighbors.\n\nTo build a new search index on some training data ``data`` you can do something\nlike\n\n.. code:: python\n\n    from pynndescent import NNDescent\n    index = NNDescent(data)\n\nYou can then use the index for searching (and can pickle it to disk if you\nwish). To search a pynndescent index for the 15 nearest neighbors of a test data\nset ``query_data`` you can do something like\n\n.. 
code:: python\n\n    index.query(query_data, k=15)\n\nand that is pretty much all there is to it. You can find more details in the\n`documentation \u003chttps://pynndescent.readthedocs.org\u003e`_.\n\n----------\nInstalling\n----------\n\nPyNNDescent is designed to be easy to install, being a pure Python module with\nrelatively light requirements:\n\n* numpy\n* scipy\n* scikit-learn \u003e= 0.22\n* numba \u003e= 0.51\n\nall of which should be pip or conda installable. The easiest way to install should be\nvia conda:\n\n.. code:: bash\n\n    conda install -c conda-forge pynndescent\n\nor via pip:\n\n.. code:: bash\n\n    pip install pynndescent\n\nTo manually install this package:\n\n.. code:: bash\n\n    wget https://github.com/lmcinnes/pynndescent/archive/master.zip\n    unzip master.zip\n    rm master.zip\n    cd pynndescent-master\n    python setup.py install\n\n----------------\nHelp and Support\n----------------\n\nThis project is still young. The documentation is still growing. In the meantime please\n`open an issue \u003chttps://github.com/lmcinnes/pynndescent/issues/new\u003e`_\nand I will try to provide any help and guidance that I can. Please also check\nthe docstrings on the code, which provide some descriptions of the parameters.\n\n-------\nLicense\n-------\n\nThe pynndescent package is 2-clause BSD licensed. Enjoy.\n\n------------\nContributing\n------------\n\nContributions are more than welcome! There are lots of opportunities\nfor potential projects, so please get in touch if you would like to\nhelp out. Code, notebooks,\nexamples and documentation are all *equally valuable*, so please don't feel\nyou can't contribute. To contribute please `fork the project \u003chttps://github.com/lmcinnes/pynndescent/issues#fork-destination-box\u003e`_, make your changes and\nsubmit a pull request. 
We will do our best to work through any issues with\nyou and get your code merged into the main branch.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flmcinnes%2Fpynndescent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flmcinnes%2Fpynndescent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flmcinnes%2Fpynndescent/lists"}