{"id":17310976,"url":"https://github.com/d-chambers/dbscan1d","last_synced_at":"2025-07-01T16:05:57.479Z","repository":{"id":46110170,"uuid":"213299128","full_name":"d-chambers/dbscan1d","owner":"d-chambers","description":"An efficient 1D implementation of the DBSCAN clustering algorithm","archived":false,"fork":false,"pushed_at":"2024-10-09T20:38:49.000Z","size":149,"stargazers_count":24,"open_issues_count":0,"forks_count":6,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-12T18:41:37.683Z","etag":null,"topics":["clustering","dbscan-algorithm","machine-learning","python"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/d-chambers.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-10-07T05:04:44.000Z","updated_at":"2025-02-14T20:55:31.000Z","dependencies_parsed_at":"2024-11-15T02:03:56.535Z","dependency_job_id":null,"html_url":"https://github.com/d-chambers/dbscan1d","commit_stats":{"total_commits":13,"total_committers":3,"mean_commits":4.333333333333333,"dds":0.3846153846153846,"last_synced_commit":"ae953931bbc651016699e328eb039249ea7b30a2"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/d-chambers/dbscan1d","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-chambers%2Fdbscan1d","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-chambers%2Fdbscan1d/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-chambers%2F
dbscan1d/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-chambers%2Fdbscan1d/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/d-chambers","download_url":"https://codeload.github.com/d-chambers/dbscan1d/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-chambers%2Fdbscan1d/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262996340,"owners_count":23396902,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering","dbscan-algorithm","machine-learning","python"],"created_at":"2024-10-15T12:39:09.554Z","updated_at":"2025-07-01T16:05:57.433Z","avatar_url":"https://github.com/d-chambers.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DBSCAN1D\n\n[![Coverage](https://codecov.io/gh/d-chambers/dbscan1d/branch/master/graph/badge.svg)](https://codecov.io/gh/d-chambers/dbscan1d)\n[![Supported Versions](https://img.shields.io/pypi/pyversions/dbscan1d.svg)](https://pypi.python.org/pypi/dbscan1d)\n[![PyPI](https://pepy.tech/badge/dbscan1d)](https://pepy.tech/project/dbscan1d)\n[![Licence](https://www.gnu.org/graphics/lgplv3-88x31.png)](https://www.gnu.org/licenses/lgpl.html)\n\ndbscan1d is a 1D implementation of the [DBSCAN algorithm](https://en.wikipedia.org/wiki/DBSCAN). 
It was created to efficiently\nperform clustering on large 1D arrays.\n\n[scikit-learn's DBSCAN implementation](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html) does\nnot have a special case for 1D, where calculating the full distance matrix is wasteful. It is much better to simply sort\nthe input array and perform efficient bisects to find the closest points. Here are the results of running the simple\nprofile script included with the package. In every case profiled, DBSCAN1D is much faster than scikit-learn's implementation.\n\n![image](https://github.com/d-chambers/dbscan1d/raw/master/profile_results.png)\n\n## Installation\nSimply use pip to install dbscan1d:\n```bash\npip install dbscan1d\n```\nIt only requires numpy.\n\n## Quickstart\ndbscan1d is designed to be interchangeable with sklearn's implementation in almost\nall cases.\n\n```python\n# scikit-learn is only needed here to generate the example data\nfrom sklearn.datasets import make_blobs\n\nfrom dbscan1d.core import DBSCAN1D\n\n# make blobs to test clustering\nX = make_blobs(1_000_000, centers=2, n_features=1)[0]\n\n# init dbscan object\ndbs = DBSCAN1D(eps=.5, min_samples=4)\n\n# get labels for each point\nlabels = dbs.fit_predict(X)\n\n# show core point indices\ndbs.core_sample_indices_\n\n# get values of core points\ndbs.components_\n```\n\n## Notes\n\n- dbscan1d can return different group numbers than sklearn for non-core points that are within\neps distance of core points from two separate groups. For example:\n `--C1--C1--P--C2--C2`\nHere C1 and C2 are core points for group 1 and group 2, respectively. If P is within eps of both C1 and\nC2, dbscan1d will assign it the same label as the closest core point. sklearn does not always do this.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-chambers%2Fdbscan1d","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fd-chambers%2Fdbscan1d","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-chambers%2Fdbscan1d/lists"}