{"id":15049364,"url":"https://github.com/klimentlagrangiewicz/dbscan","last_synced_at":"2026-02-14T17:31:33.504Z","repository":{"id":163222551,"uuid":"602990920","full_name":"KlimentLagrangiewicz/DBSCAN","owner":"KlimentLagrangiewicz","description":"Implementation of DBSCAN clustering algorithm in C (standard C89/C90)","archived":false,"fork":false,"pushed_at":"2025-01-18T20:00:16.000Z","size":2609,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-14T01:16:22.125Z","etag":null,"topics":["ansi-c","c89","clustering","data-clustering","dbscan","noise-detection"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KlimentLagrangiewicz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-17T11:37:32.000Z","updated_at":"2025-01-18T20:00:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"efb312db-e670-43ee-b0ad-c436f4337ba4","html_url":"https://github.com/KlimentLagrangiewicz/DBSCAN","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KlimentLagrangiewicz%2FDBSCAN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KlimentLagrangiewicz%2FDBSCAN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KlimentLagrangiewicz%2FDBSCAN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KlimentLagrangiewicz%2FD
BSCAN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KlimentLagrangiewicz","download_url":"https://codeload.github.com/KlimentLagrangiewicz/DBSCAN/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243505960,"owners_count":20301619,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ansi-c","c89","clustering","data-clustering","dbscan","noise-detection"],"created_at":"2024-09-24T21:19:58.198Z","updated_at":"2026-02-14T17:31:33.476Z","avatar_url":"https://github.com/KlimentLagrangiewicz.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DBSCAN\nImplementation of the DBSCAN clustering algorithm in the C programming language (standard C89/C90)\n## About DBSCAN  \nDBSCAN (Density-based spatial clustering of applications with noise) is a non-parametric density-based clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996.  \nThe main idea of the algorithm is to assign closely located neighboring points to the same cluster.  
\n#### Input data:\n  +  $X=\\mathrm{x}_{i=1,j=1}^{n,m}$ — description of objects, where $n$ is the number of objects and $m$ is the number of attributes;  \n  +  $minPts \\in ℕ$ — minimum number of points required to form a dense region;  \n  +  $\\epsilon \u003e 0$ — concentration radius.\n#### Output data:   \n  +  $Y=\\left\\\\{y_i|y_i\\in ℤ,i=\\overline{\\left(1,n\\right)}\\right\\\\}$ — cluster labels and noise markers.\n#### Advantages of DBSCAN:\n  +  Easy to implement;  \n  +  Low algorithmic complexity;  \n  +  The possibility of identifying clusters of complex shapes;  \n  +  DBSCAN can find noise;\n  +  Presence of many modifications.  \n#### Disadvantages of DBSCAN:   \n  +  DBSCAN performs poorly on clusters of differing density;  \n  +  DBSCAN does not work well on clusters that have a small intersection with each other. For example, DBSCAN will assign the points of two touching circles to the same cluster;  \n  +  DBSCAN cannot process big data;  \n  +  Accuracy of clustering with DBSCAN strongly depends on the value of $\\epsilon$.\n#### Definitions\nDef.1. A point p is a core point if at least minPts points are within distance $\\epsilon$ of it (including p).  \nDef.2. A point q is directly reachable from p if p is a core point and q is within distance $\\epsilon$ of p: $\\rho\\left(q,p\\right)\\leqslant\\epsilon$.  \nDef.3. A point q is reachable from p if there is a path $p_1,\\ldots,p_n$ with $p_1 = p$ and $p_n = q$, where each $p_{i+1}$ is directly reachable from $p_i$.\n### Steps of DBSCAN\nStep 1. Data preparation (autoscaling): $x_{i,j}=\\frac{x_{i,j}-\\mathrm{E_{X^{j}}}}{\\sigma_{X^{j}}}$;  \nStep 2. Mark all points as unselected;  \nStep 3. Select a point from the unselected core points;  \nStep 4. Mark that point and the points that are reachable from it;  \nStep 5. Repeat steps 3 and 4 until all core points have been marked;  \nStep 6. 
Mark the unmarked points as noise.\n## Example of usage\nCloning the project and changing the current directory:\n```\ngit clone https://github.com/KlimentLagrangiewicz/DBSCAN\ncd DBSCAN\n```\nBuilding from source (Linux):\n```\nmake\n```\nBuilding from source (Windows):\n```\nmake windows\n```\nIf the build was successful, you can find the executable file in the `bin` subdirectory\n```\n./bin/dbscan ./datasets/iris/data.txt 150 4 3 1.1 ./datasets/iris/new_result.txt\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fklimentlagrangiewicz%2Fdbscan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fklimentlagrangiewicz%2Fdbscan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fklimentlagrangiewicz%2Fdbscan/lists"}