{"id":16140474,"url":"https://github.com/mreliptik/dmfinalproject","last_synced_at":"2026-01-20T18:56:40.450Z","repository":{"id":35177916,"uuid":"164938139","full_name":"MrEliptik/DMFinalProject","owner":"MrEliptik","description":"Final project for Data Mining course : Using OPTICS on 2 datasets","archived":false,"fork":false,"pushed_at":"2022-12-08T01:32:28.000Z","size":96791,"stargazers_count":1,"open_issues_count":5,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-02T05:09:32.374Z","etag":null,"topics":["clustering","datamining","optics-clustering","python"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MrEliptik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-01-09T21:07:25.000Z","updated_at":"2022-12-08T08:24:18.000Z","dependencies_parsed_at":"2023-01-15T15:30:59.511Z","dependency_job_id":null,"html_url":"https://github.com/MrEliptik/DMFinalProject","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MrEliptik%2FDMFinalProject","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MrEliptik%2FDMFinalProject/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MrEliptik%2FDMFinalProject/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MrEliptik%2FDMFinalProject/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MrEliptik","download_url":"https://codeload.github.com/MrEliptik/DMFina
lProject/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247703572,"owners_count":20982284,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering","datamining","optics-clustering","python"],"created_at":"2024-10-09T23:52:45.316Z","updated_at":"2026-01-20T18:56:40.411Z","avatar_url":"https://github.com/MrEliptik.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DMFinalProject\n\nThis project uses the OPTICS clustering algorithm to cluster footballers' faces and diabetic patients' data.\n\n## Requirements : \n\nInstall scikit-learn==0.21dev0 with:\n\n    pip install git+https://github.com/scikit-learn/scikit-learn.git \n    \nunless OPTICS is now part of the stable release.\n\nInstall the rest of the requirements with:\n\n    pip install -r requirements.txt\n\n## Getting Started\n\n### File structure\n\n**Datasets**\n\n    *dataset_diabetes*\n\n        - diabetic_data.csv : data from 130 US hospitals, 1999 to 2008 [1](https://archive.ics.uci.edu/ml/datasets/Diabetes+130-US+hospitals+for+years+1999-2008)\n\n        - IDs_mapping.csv   : mapping for admission type, discharge disposition and admission source\n\n    *footballers*   : 124 photos of footballers (Neymar Jr., Lionel Messi, Cristiano Ronaldo, Luis Suarez, and Mohamed Salah)\n        *Predict*   : 5 photos of footballers to use for prediction (Neymar Jr., Lionel Messi, Cristiano Ronaldo, Luis Suarez, and Mohamed Salah)\n        \n**Ressources**\n\n    - footballers_encodings.pickle          : Encodings of the 
footballers after using *encode_faces.py*\n\n    - footballers_predict_encodings.pickle  : Encodings of the footballers for prediction after using *encode_faces.py*\n\n    - GUFD_encodings.pickle                 : Encodings of the GUFD photos after using *encode_faces.py*\n\n    - shape_predictor_68_face_landmarks.dat : Used to extract facial features in *encode_features.py*\n\n- *encode_faces.py*           : used to encode the face in an image as a 128-d vector\n- *encode_features.py*       : used to encode the facial features of a face as a 7-d vector\n- *faces_clustering.py*       : used to cluster the footballers' faces\n- *diabetic_clustering.py*    : used to cluster the diabetic patients' data\n- *similarity_clustering.py*  : used to cluster the GUFD dataset based on facial feature similarities\n- *optics.py*                 : contains the clustering and predicting methods\n\n- *requirements.txt*          : file listing all the Python package requirements\n\n## Authors\n\n* **Victor MEUNIER** - *DMFinalProject* - [MrEliptik](https://github.com/MrEliptik)\n\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md) file for details","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmreliptik%2Fdmfinalproject","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmreliptik%2Fdmfinalproject","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmreliptik%2Fdmfinalproject/lists"}