{"id":31027639,"url":"https://github.com/brj0/nndescent","last_synced_at":"2025-09-13T19:50:16.567Z","repository":{"id":170745324,"uuid":"646919067","full_name":"brj0/nndescent","owner":"brj0","description":"C++/Python implementation of Nearest Neighbor Descent for efficient approximate nearest neighbor search","archived":false,"fork":false,"pushed_at":"2024-07-29T07:45:01.000Z","size":6139,"stargazers_count":21,"open_issues_count":2,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-09-11T08:08:49.978Z","etag":null,"topics":["approximate-nearest-neighbor-search","cpp","knn","knn-graphs","nearest-neighbor-search","nearest-neighbors","nearest-neighbours-classifier","python"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brj0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-29T16:34:03.000Z","updated_at":"2025-08-05T23:46:51.000Z","dependencies_parsed_at":"2024-07-29T09:10:19.438Z","dependency_job_id":null,"html_url":"https://github.com/brj0/nndescent","commit_stats":null,"previous_names":["brj0/nndescent"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/brj0/nndescent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brj0%2Fnndescent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brj0%2Fnndescent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brj0%2Fnndescent/releases","manifests_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/brj0%2Fnndescent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/brj0","download_url":"https://codeload.github.com/brj0/nndescent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brj0%2Fnndescent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274997547,"owners_count":25387934,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-13T02:00:10.085Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["approximate-nearest-neighbor-search","cpp","knn","knn-graphs","nearest-neighbor-search","nearest-neighbors","nearest-neighbours-classifier","python"],"created_at":"2025-09-13T19:50:15.835Z","updated_at":"2025-09-13T19:50:16.551Z","avatar_url":"https://github.com/brj0.png","language":"C++","readme":"# Nearest Neighbor Descent (nndescent)\n\nNearest Neighbor Descent (nndescent) is a C++ implementation of the nearest neighbor descent algorithm, designed for efficient and accurate approximate nearest neighbor search. With seamless integration into Python, it offers a powerful solution for constructing k-nearest neighbor graphs. 
This implementation is based on the [pynndescent library](https://github.com/lmcinnes/pynndescent).\n\n\n## Features\n\n- Seamless integration into Python and effortless installation using `pip`.\n- Usage is very similar to that of pynndescent.\n- Pure C++11 implementation utilizing OpenMP for parallel computation. No other external libraries are needed.\n- Currently tested only on Linux.\n- Both dense and sparse matrices are supported.\n- Implementation of multiple distance functions, namely:\n    - Bray-Curtis\n    - Canberra\n    - Chebyshev\n    - Circular Kantorovich (no sparse version)\n    - Correlation\n    - Cosine\n    - Dice\n    - Dot\n    - Euclidean\n    - Hamming\n    - Haversine\n    - Hellinger\n    - Jaccard\n    - Jensen-Shannon\n    - Kulsinski\n    - Manhattan\n    - Matching\n    - Minkowski\n    - Rogers-Tanimoto\n    - Russell-Rao\n    - Sokal-Michener\n    - Sokal-Sneath\n    - Spearman's Rank Correlation (no sparse version)\n    - Symmetric KL Divergence\n    - TSSS\n    - True Angular\n    - Wasserstein 1D (no sparse version)\n    - Yule\n\nPlease note that not all distances have been thoroughly tested, so use them with caution.\n\n\n## Installation\n\n### From PyPI\n\nYou can install nndescent directly from PyPI using pip:\n\n```sh\npip install nndescent\n```\n\nIf you want to run the examples in `tests`, additional packages are needed. You can install them manually or install nndescent with the full option:\n\n```sh\npip install nndescent[full]\n```\n\n### From Source\n\n1. Clone the repository:\n\n```sh\ngit clone https://github.com/brj0/nndescent.git\ncd nndescent\n```\n\n2. Build and install the package:\n\n```sh\npip install .\n```\n\nIf you want to run the examples in `tests`, additional packages are needed. You can install them manually or install nndescent with the full option:\n\n```sh\npip install .[full]\n```\n\n3. 
To run the examples in `tests` you must first download the datasets:\n\n```sh\npython tests/make_test_data.py\n```\n\n## Usage\n\nIn Python you can utilize the nndescent library in the following way:\n\n```python\nimport numpy as np\nimport nndescent\n\n# Data must be a 2D numpy array of dtype 'float32'.\ndata = np.random.randint(50, size=(20,3)).astype(np.float32)\n\n# Run NND algorithm\nnnd = nndescent.NNDescent(data, n_neighbors=4)\n\n# Get result\nnn_indices, nn_distances = nnd.neighbor_graph\n\n# Query data must be a 2D numpy array of dtype 'float32'.\nquery_data = np.random.randint(50, size=(5,3)).astype(np.float32)\n\n# Calculate nearest neighbors for each query point\nnn_query_indices, nn_query_distances = nnd.query(query_data, k=6)\n```\n\nTo compile and run the C++ examples use the following commands within the project folder:\n\n```sh\nmkdir build\ncd build\ncmake ..\nmake\n./simple\n```\n\nFor detailed usage in C++ and for further Python/C++ examples please refer to the examples provided in the `tests` directory of the repository and the code documentation.\n\n\n## Performance\n\nOn my computer, the training phase of nndescent is approximately 15% faster than pynndescent for dense matrices, and 75% faster for sparse matrices. Furthermore, the search query phase shows a significant improvement, with \u003e70% faster execution time. Below is the output obtained from running `tests/benchmark.py`, an ad hoc benchmark test. 
In this test, both nndescent and pynndescent were executed with the same parameters using either 'euclidean' or 'dot' as metric:\n\n# Benchmark test pynndescent (py) vs nndescent (c)\nData set     | py train [ms] | c train [ms] | ratio | py vs c match | py test [ms] | c test [ms] | ratio | py accuracy | c accuracy\n-------------|---------------|--------------|-------|---------------|--------------|-------------|-------|-------------|-----------\nfaces        |         149.8 |        145.9 | 0.974 |         1.000 |       1663.7 |        18.4 | 0.011 |       1.000 |      0.999\nfmnist       |       11959.2 |      10768.7 | 0.900 |         0.997 |       5754.8 |      1334.1 | 0.232 |       0.978 |      0.978\nglove25      |      149754.2 |     101864.0 | 0.680 |         0.964 |      98740.6 |      9907.4 | 0.100 |       0.796 |      0.808\nglove50      |      192965.8 |     137171.8 | 0.711 |         0.885 |      99750.8 |     10647.7 | 0.107 |       0.705 |      0.743\nglove100     |      218202.9 |     180088.4 | 0.825 |         0.815 |      98770.2 |     12080.4 | 0.122 |       0.651 |      0.731\nglove200     |      287206.6 |     243466.6 | 0.848 |         0.772 |     101639.4 |     17615.6 | 0.173 |       0.622 |      0.773\nmnist        |       11319.7 |      10188.1 | 0.900 |         0.997 |       5725.9 |      1273.8 | 0.222 |       0.969 |      0.968\nnytimes      |       63323.8 |      55638.1 | 0.879 |         0.814 |      23632.1 |      7108.9 | 0.301 |       0.614 |      0.811\nsift         |      131711.4 |     105826.0 | 0.803 |         0.974 |      82503.7 |      7957.9 | 0.096 |       0.838 |      0.839\n20newsgroups |      107339.0 |      28339.7 | 0.264 |         0.922 |      67518.6 |     22513.1 | 0.333 |       0.858 |      0.929\n\nThe compilation time and the lengthy numba loading time during runtime and import for 'pynndescent' are not considered in this ad hoc benchmark test. 
An [ANN-Benchmarks](https://github.com/erikbern/ann-benchmarks/tree/main) wrapper is planned for the future.\n\n\n## Background\n\nNND is based on the following paper:\n\n- Dong, Wei, Moses Charikar, and Kai Li. [\"Efficient k-nearest neighbor graph construction for generic similarity measures.\"](https://www.cs.princeton.edu/cass/papers/www11.pdf) Proceedings of the 20th International Conference on World Wide Web. 2011.\n\nIn addition, the algorithm utilizes random projection trees for initializing\nthe nearest neighbor graph. The nndescent algorithm constructs a tree by\nrandomly selecting two points and splitting the data along a hyperplane passing\nthrough their midpoint. For more theoretical background, please refer to:\n\n- Dasgupta, Sanjoy, and Yoav Freund. [\"Random projection trees and low dimensional manifolds.\"](https://cseweb.ucsd.edu/~dasgupta/papers/rptree-stoc.pdf) Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing. 2008.\n\n\n## Contributing\n\nContributions are welcome! If you have any bug reports, feature requests, or suggestions, please open an issue or submit a pull request.\n\n\n## License\n\nThis project is licensed under the [BSD-2-Clause license](LICENSE).\n\n\n## Acknowledgements\n\nThis implementation is based on the original pynndescent library by Leland McInnes. I would like to acknowledge and appreciate his work as a source of inspiration for this project.\n\nFor more information, visit the [pynndescent GitHub repository](https://github.com/lmcinnes/pynndescent).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrj0%2Fnndescent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrj0%2Fnndescent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrj0%2Fnndescent/lists"}