{"id":38666449,"url":"https://github.com/tahiri-lab/phydbscan","last_synced_at":"2026-01-17T09:48:51.412Z","repository":{"id":142814893,"uuid":"555979643","full_name":"tahiri-lab/PhyDBSCAN","owner":"tahiri-lab","description":"phyDBSCAN: Building alternative phylogenetic trees using DBSCAN and Robinson and Foulds distance","archived":false,"fork":false,"pushed_at":"2025-08-04T18:33:15.000Z","size":4742,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-06T01:46:39.480Z","etag":null,"topics":["bioinformatics","classification","clustering","consensus-tree","dbscan","phylogeny","robinson-foulds","supertree"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tahiri-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-22T19:42:20.000Z","updated_at":"2025-09-16T02:41:17.000Z","dependencies_parsed_at":"2023-12-30T00:22:53.922Z","dependency_job_id":"9ec43eb8-59ca-4cf7-a35a-6dd4a2336023","html_url":"https://github.com/tahiri-lab/PhyDBSCAN","commit_stats":{"total_commits":43,"total_committers":2,"mean_commits":21.5,"dds":0.4418604651162791,"last_synced_commit":"e7e01ab0aef542a70c4ac86598d463126219aa9e"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/tahiri-lab/PhyDBSCAN","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tahiri-lab%2FPhyDBSCAN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tahiri-lab%2FPh
yDBSCAN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tahiri-lab%2FPhyDBSCAN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tahiri-lab%2FPhyDBSCAN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tahiri-lab","download_url":"https://codeload.github.com/tahiri-lab/PhyDBSCAN/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tahiri-lab%2FPhyDBSCAN/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28505565,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T06:57:29.758Z","status":"ssl_error","status_checked_at":"2026-01-17T06:56:03.931Z","response_time":85,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","classification","clustering","consensus-tree","dbscan","phylogeny","robinson-foulds","supertree"],"created_at":"2026-01-17T09:48:51.253Z","updated_at":"2026-01-17T09:48:51.359Z","avatar_url":"https://github.com/tahiri-lab.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"﻿﻿﻿﻿﻿﻿﻿﻿\u003ch1  align=\"center\"\u003e phyDBSCAN \u003cp align='center'\u003e \n        [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) \n        
[![Contributions](https://img.shields.io/badge/contributions-welcome-blue.svg)](https://devdocs.io/cpp/)\n        ![GitHub release](https://img.shields.io/github/v/release/tahiri-lab/phyDBSCAN?include_prereleases\u0026label=pre-release\u0026logo=github) \n        \u003c/p\u003e\n\n\n\u003ch2  align=\"center\"\u003eBuilding alternative phylogenetic trees using DBSCAN\u003c/h2\u003e\n\n\n\n\u003cdetails open\u003e\n  \u003csummary\u003eTable of Contents\u003c/summary\u003e\n  \u003col\u003e\n    \u003cli\u003e\n      \u003ca href=\"#about-the-project\"\u003eAbout the project\u003c/a\u003e\n    \u003c/li\u003e\n    \u003cli\u003e\n      \u003ca href=\"#Installation\"\u003eInstallation\u003c/a\u003e\n    \u003c/li\u003e\n    \u003cli\u003e\n      \u003ca href=\"#Examples\"\u003eExamples\u003c/a\u003e\n    \u003c/li\u003e\n    \u003cli\u003e\n      \u003ca href=\"#contact\"\u003eContact\u003c/a\u003e\n    \u003c/li\u003e\n  \u003c/ol\u003e\n\u003c/details\u003e\n\n\n\n# About the project\n\nThis project aims to perform tree classification using the DBSCAN algorithm. Instead of using traditional coordinates, \ndistances between points are employed for the classification.\n\nIf you would like to find out more about the project, the ideas for improvement, the difficulties encountered and \nthe changes to be made, please read the \"phyDBSCAN_Project_Report.pdf\" in attachment.\n\n# Installation\n\nInsert your dataset matrix in the \"resources/input_data.txt\" file, then use one of the two compilation methods.\n\n### Using Makefile:\n\nUse the provided Makefile to install the project:\n\n```\nmake\n```\n\nTo run the project, execute:\n\n```\n./phyDBSCAN input.txt output.csv\n```\n\nTo clean the project, execute:\n\n```\nmake clean\n```\n\n### Using CMakeLists:\n\nAlternatively, if you are using Clion IDE, you can use CMake for building the project. Here are the steps:\n\n1. Run Clion IDE \u0026 Open the project\n2. Go to Run -\u003e Edit Configurations\n3. 
Click on the \"+\" button and select \"CMake\"\n4. In the \"Name\" field, enter \"phyDBSCAN\" and fill information like in the following image:\n   ![CMakeLists.png](https://github.com/tahiri-lab/phyDBSCAN/blob/main/img/CMakeLists.png)\n5. Click on \"Apply\" and \"OK\" and run the project\n\n# Examples of use\n\nTo test, we took a matrix from the \"resources/input_simulation_dataset.txt\" file\n\nInput Data Set used in this example (distance matrix) we put in the file \"resources/input_data.txt\":\n\n```\n0\t0.4\t0.4\t0.4\t0.4\t1\t1\t1\t1\t1\t0.8\t1\t1\t1\t1\t0.8\t0.8\t0.6\t0.8\t0.8\n0.4\t0\t0.4\t0.8\t0.8\t0.8\t0.8\t0.8\t0.8\t0.8\t1\t0.8\t1\t1\t0.8\t0.8\t0.8\t0.8\t0.8\t0.8\n0.4\t0.4\t0\t0.8\t0.8\t1\t1\t1\t1\t1\t1\t1\t0.8\t0.8\t1\t0.8\t0.8\t0.8\t0.8\t0.8\n0.4\t0.8\t0.8\t0\t0.6\t1\t1\t1\t1\t1\t0.8\t1\t1\t1\t1\t0.6\t0.6\t0.4\t0.6\t0.6\n0.4\t0.8\t0.8\t0.6\t0\t1\t1\t1\t1\t1\t0.6\t0.8\t0.8\t0.8\t0.8\t1\t1\t0.8\t1\t1\n1\t0.8\t1\t1\t1\t0\t0.4\t0.4\t0.4\t0.4\t1\t0.8\t1\t1\t0.8\t1\t1\t0.8\t1\t1\n1\t0.8\t1\t1\t1\t0.4\t0\t0.6\t0.4\t0.6\t1\t0.6\t1\t1\t0.6\t1\t1\t0.8\t1\t1\n1\t0.8\t1\t1\t1\t0.4\t0.6\t0\t0.6\t0.6\t1\t0.8\t1\t1\t0.8\t1\t1\t0.8\t1\t1\n1\t0.8\t1\t1\t1\t0.4\t0.4\t0.6\t0\t0.6\t1\t0.8\t1\t1\t0.8\t1\t1\t0.8\t1\t1\n1\t0.8\t1\t1\t1\t0.4\t0.6\t0.6\t0.6\t0\t1\t0.8\t1\t1\t0.8\t1\t1\t0.8\t1\t1\n0.8\t1\t1\t0.8\t0.6\t1\t1\t1\t1\t1\t0\t0.4\t0.4\t0.4\t0.4\t1\t1\t0.8\t1\t1\n1\t0.8\t1\t1\t0.8\t0.8\t0.6\t0.8\t0.8\t0.8\t0.4\t0\t0.4\t0.4\t0\t1\t1\t1\t1\t1\n1\t1\t0.8\t1\t0.8\t1\t1\t1\t1\t1\t0.4\t0.4\t0\t0\t0.4\t1\t1\t1\t1\t1\n1\t1\t0.8\t1\t0.8\t1\t1\t1\t1\t1\t0.4\t0.4\t0\t0\t0.4\t1\t1\t1\t1\t1\n1\t0.8\t1\t1\t0.8\t0.8\t0.6\t0.8\t0.8\t0.8\t0.4\t0\t0.4\t0.4\t0\t1\t1\t1\t1\t1\n0.8\t0.8\t0.8\t0.6\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t0\t0.4\t0.4\t0.4\t0.4\n0.8\t0.8\t0.8\t0.6\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t0.4\t0\t0.4\t0.6\t0.6\n0.6\t0.8\t0.8\t0.4\t0.8\t0.8\t0.8\t0.8\t0.8\t0.8\t0.8\t1\t1\t1\t1\t0.4\t0.4\t0\t0.6\t0.6\n0.8\t0.8\t0.8\t0.6\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t0.4\t0.6\t0.6\t0\t0.6\n0
.8\t0.8\t0.8\t0.6\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t1\t0.4\t0.6\t0.6\t0.6\t0\n```\n\nIn the \"input_simulated_data.txt\" file, the first line of this dataset is the following:\n20\t8\t4\t0\t50\n\nThe first number (20) is the number of points in the dataset, and the third number (4) is the expected number of clusters, which is used to calculate the ARI (Adjusted Rand Index).\n\nThe output of the program will be stored in the output.csv file as follows:\n```\nDBSCAN;0.490000;3;20;8;4;50;1.000000;(1\u003c\u003e1\u003c\u003e1\u003c\u003e1\u003c\u003e1\u003c\u003e2\u003c\u003e2\u003c\u003e2\u003c\u003e2\u003c\u003e2\u003c\u003e3\u003c\u003e3\u003c\u003e3\u003c\u003e3\u003c\u003e3\u003c\u003e4\u003c\u003e4\u003c\u003e4\u003c\u003e4\u003c\u003e4);462\n```\nDBSCAN : method used for the clustering\n\n0.490000 : value of epsilon\n\n3 : minimum number of points\n\n20 : number of trees in the matrix\n\n8 : number of leaves in each tree\n\n4 : number of clusters we expect to find\n\n50 : noise (differences between the trees within a cluster)\n\n1.000000 : ARI\n\n(\u003c\u003e\u003c\u003e\u003c\u003e) : partition\n\n462 : time it took the program to calculate the clusters and ARI for the matrix\n\n# Contact\nPlease email us at \u003cNadia.Tahiri@USherbrooke.ca\u003e or \u003cThibaut.Leval@USherbrooke.ca\u003e with any questions or feedback.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftahiri-lab%2Fphydbscan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftahiri-lab%2Fphydbscan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftahiri-lab%2Fphydbscan/lists"}