{"id":25370025,"url":"https://github.com/tschechlovdev/diss_evaluation","last_synced_at":"2025-04-09T07:18:54.147Z","repository":{"id":251723661,"uuid":"694545274","full_name":"tschechlovdev/Diss_Evaluation","owner":"tschechlovdev","description":null,"archived":false,"fork":false,"pushed_at":"2024-09-23T07:05:21.000Z","size":26694,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-15T01:38:21.173Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tschechlovdev.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-21T08:02:05.000Z","updated_at":"2024-09-23T07:05:24.000Z","dependencies_parsed_at":"2024-08-05T10:58:33.759Z","dependency_job_id":null,"html_url":"https://github.com/tschechlovdev/Diss_Evaluation","commit_stats":null,"previous_names":["tschechlovdev/diss_evaluation"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2FDiss_Evaluation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2FDiss_Evaluation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2FDiss_Evaluation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2FDiss_Evaluation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tschechlovdev","downl
oad_url":"https://codeload.github.com/tschechlovdev/Diss_Evaluation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247994122,"owners_count":21030051,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-15T01:38:36.418Z","updated_at":"2025-04-09T07:18:54.118Z","avatar_url":"https://github.com/tschechlovdev.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Overall Evaluation of Democratizing Clustering Analyses: AutoML, Meta-Learning, and Ensemble Clustering to Support Novice Analysts\n\nPrototypical Implementation in Python of the submitted Dissertation \"Democratizing Clustering Analyses: AutoML, Meta-Learning, and Ensemble Clustering to Support Novice Analysts\" at the University of Stuttgart.\n\n## Overview\n\nThe main code is in the \"src\" folder. 
It contains the following modules:\n\n- ``automlclustering``: Contains the code for AutoML4Clust [1] (Chapter 3) and ML2DAC [2] (Chapter 4).\n- ``effens``: Code for EffEns - Efficient Ensemble Clustering [3] (Chapter 5).\n- ``overall_evaluation``: Code for the overall evaluation and comparison of the three approaches AutoML4Clust, ML2DAC, and EffEns (Chapter 6).\n- ``datagen_classification``: Code for the data generator and the evaluation of the three clustering approaches for subsequent classification [4] (Chapter 7).\n\n## Installation\n\nOur implementation requires Python 3.9.\nFurthermore, as SMAC only runs on Linux, we also require a Linux system.\nWe have tested on Ubuntu 20.04.\n\nBefore installing EffEns, first install the following packages, which some of the libraries require:\n- ``sudo apt-get install build-essential``\n- ``sudo apt-get install gcc``\n\nThe easiest way of installing EffEns is to use Anaconda. Follow https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html\nto install Anaconda.\nThen create the prepared Python 3.9 environment:\n- ``conda env create -f environment.yml``\n\nThis should create a conda environment with the name \"automated_ensemble_clustering\".\nThen install ib_base manually, as it is not available as a package:\n\n```\ngit clone https://collaborating.tuhh.de/cip3725/ib_base.git\ncd ib_base\npython setup.py install\ncd ..\n```\n\nAfter this, add the \"src\" folder of DissEval and the path to \"ib_base\" to your PYTHONPATH.\nYou may also have to add them to the conda path file of your environment:\n``gedit ~/miniconda3/envs/automated_ensemble_clustering_39/lib/python3.9/site-packages/conda.pth``\n(or the corresponding anaconda path instead of miniconda).\n\nNow everything should be set up and you can try to run ``python src/Experiments/SyntehticData/EffEns_Experiment_synthetic.py``.\nThis should run without any errors.\n\n## References\n\n[1] Fritz, M., Tschechlov, D., \u0026 Schwarz, 
H. (2021). Efficient Exploratory Clustering Analyses with Qualitative Approximations. Extending Database Technology (EDBT), 337–342.\n\n[2] Treder-Tschechlov, D., Fritz, M., Schwarz, H., \u0026 Mitschang, B. (2023). ML2DAC: Meta-Learning to Democratize AutoML for Clustering Analysis. Proceedings of the ACM on Management of Data (PACMMOD), 1(2), 1–26.\n\n[3] Treder-Tschechlov, D., Fritz, M., Schwarz, H., \u0026 Mitschang, B. (2024). Efficient Ensemble Clustering based on Meta-Learning and Hyperparameter Optimization. To appear in Proc. VLDB Endow. 17(11).\n\n[4] Treder-Tschechlov, D., Reimann, P., Schwarz, H., \u0026 Mitschang, B. (2023). Approach to synthetic data generation for imbalanced multiclass problems with heterogeneous groups. In: BTW 2023.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftschechlovdev%2Fdiss_evaluation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftschechlovdev%2Fdiss_evaluation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftschechlovdev%2Fdiss_evaluation/lists"}