An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with cluster-analysis

A curated list of projects in awesome lists tagged with cluster-analysis.

https://github.com/kubesphere/kubeeye

KubeEye aims to find various problems on Kubernetes, such as application misconfiguration, unhealthy cluster components and node problems.

cluster-analysis k8s kubeeye kubernetes observability

Last synced: 14 May 2025

https://github.com/erda-project/kubeprober

Large-scale Kubernetes cluster diagnostic tool.

cluster-analysis docker go golang k8s kubernetes observability

Last synced: 05 Apr 2025

https://github.com/thejj/ceph-balancer

Efficient Ceph placement optimization, aiming for maximum storage capacity through equal OSD utilization.

ceph ceph-balancer cluster-analysis optimization-problem python

Last synced: 05 Apr 2025

https://github.com/luisscoccola/persistable

Density-based clustering for exploratory data analysis based on multi-parameter persistence

cluster-analysis clustering clustering-algorithm machine-learning machine-learning-algorithms topological-data-analysis unsupervised-learning

Last synced: 30 Oct 2025

https://github.com/lucko515/clustering-python

Different clustering approaches applied to different problem sets

cluster-analysis clustering clustering-algorithm custom-algorithm kmeans machine-learning

Last synced: 21 Jun 2025

https://github.com/niekdt/latrend

An R package for clustering longitudinal datasets in a standardized way, providing interfaces to various R packages for longitudinal clustering, and facilitating the rapid implementation and evaluation of new methods

cluster-analysis clustering-evaluation clustering-methods data-science longitudinal-clustering longitudinal-data mixture-models r r-package time-series-analysis

Last synced: 03 Jul 2025

https://github.com/philips-software/latrend

An R package for clustering longitudinal datasets in a standardized way, providing interfaces to various R packages for longitudinal clustering, and facilitating the rapid implementation and evaluation of new methods

cluster-analysis clustering-evaluation clustering-methods data-science longitudinal-clustering longitudinal-data mixture-models r r-package time-series-analysis

Last synced: 13 Apr 2025

https://github.com/epigen/unsupervised_analysis

A general purpose Snakemake workflow and MrBiomics module to perform unsupervised analyses (dimensionality reduction & cluster analysis) and visualizations of high-dimensional data.

cluster-analysis cluster-validation clustering clustering-algorithm clustree data-science data-visualization densmap dimensionality-reduction heatmap high-dimensional-data leiden-algorithm pca principal-component-analysis snakemake umap unsupervised-learning visualization workflow

Last synced: 15 Apr 2025

https://github.com/mlr-org/mlr3cluster

Cluster analysis for mlr3

cluster-analysis clustering mlr3 r r-package

Last synced: 28 Oct 2025

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 14 Jul 2025

https://github.com/instamatic-dev/edtools

Collection of tools for automated processing and clustering of electron diffraction data

cluster-analysis electron-diffraction xds

Last synced: 14 Dec 2025

https://github.com/philips-labs/demo-clustering-longitudinal-data

Supplementary materials for the manuscript "Clustering of longitudinal data: A tutorial on a variety of approaches" by N. G. P. Den Teuling, S.C. Pauws, and E.R. van den Heuvel (2021)

cluster-analysis longitudinal-analysis r tutorial

Last synced: 08 Jul 2025

https://github.com/nunofachada/amvidc

Data clustering algorithm based on agglomerative hierarchical clustering (AHC) which uses minimum volume increase (MVI) and minimum direction change (MDC) clustering criteria.

algorithm cluster-analysis clustering clustering-algorithm clustering-criteria convex-hull convexhull data-clustering-algorithm fscore matlab matlab-toolbox minimum-direction-change minimum-volume-increase pddp principal-components volume

Last synced: 20 Mar 2025

https://github.com/acabassi/coca

R package for COCA: Cluster-of-Clusters Analysis

cluster-analysis cluster-of-clusters clustering coca genomics integrative-clustering multi-omics

Last synced: 22 Oct 2025

https://github.com/nafisalawalidris/911-call-analysis

The 911 Call Analysis project explores and visualises emergency call data to uncover patterns and trends. It includes data preparation, exploratory analysis, visualising call volume and reasons, and generating heatmaps. Users can customize the code for their dataset. The project relies on libraries like Pandas, NumPy, Matplotlib, Seaborn, and SciPy.

cluster-analysis data-analysis data-visualization decision-making emergency-calls emergency-services exploratory-data-analysis heatmaps matplotlib numpy pandas patterns-and-trends resource-allocation scipy seaborn

Last synced: 26 Jun 2025

https://github.com/myntra/analyse-redis-cluster-nodes

Tired of analysing a Redis cluster with the `cluster nodes` command? Try this simple shell script.

cluster-analysis redis redis-cluster

Last synced: 14 Apr 2025

https://github.com/nredell/rari

A Python package that implements a distance-based extension of the adjusted Rand index for the supervised validation of two cluster analysis solutions

adjusted-rand-index ari cluster-analysis cluster-validation cluster-validity-index ranked-adjusted-rand-index rari t-sne umap

Last synced: 13 Apr 2025

https://github.com/pnavaro/geometricclusteranalysis.jl

Geometric methods for Cluster Analysis

cluster-analysis julialang

Last synced: 26 Jul 2025

https://github.com/philips-labs/comparison-clustering-longitudinal-data

Supplementary materials for the manuscript "A comparison of methods for clustering longitudinal data with slowly changing trends" by N. G. P. Den Teuling, S.C. Pauws, and E.R. van den Heuvel, published in Communications in Statistics - Simulation and Computation (2021).

cluster-analysis longitudinal-analysis r simulation-study

Last synced: 30 Apr 2025

https://github.com/xaheli/spectrums

Spectrums: Optimize GMM params, UI, export palettes, improve clustering accuracy, and add customizable color schemes.

cluster-analysis color-palette colors gaussian gaussian-mixture-models imageclassifier palette-colors

Last synced: 15 Mar 2025

https://github.com/gauravkoradiya/ticket-clustering

This repository contains various methodologies for clustering unstructured user tickets.

cluster-analysis clustering-algorithm deep-learning machine-learning ticket-management

Last synced: 15 Mar 2025

https://github.com/kskbhat/silhouette

Silhouette-Based Diagnostics for Standard, Soft, and Multi-Way Clustering

classification cluster-analysis clustering-algorithm membership-probability proximity-measure silhouette

Last synced: 22 Oct 2025

https://github.com/thennen/counting-molecules

Automates counting and categorization of molecules in scanning probe microscopy images

cluster-analysis computer-vision image-analysis microscopy molecules zernike-moments

Last synced: 11 Apr 2025

https://github.com/hetuvpatel/ml-diabetes-risk-progression-stage

Machine learning project analyzing diabetes risk progression using K-Means and Hierarchical clustering techniques on the Pima Indian Diabetes dataset. 🧠📊

cluster-analysis data-visualization heirarchical-clustering kmap kmeans machine-learning matplotlib sckit-learn seaborn

Last synced: 23 Sep 2025

https://github.com/schultzm/havic

Detect Hepatitis A Virus Infection Clusters

cluster-analysis phylogenomics-pipeline transmission viral-genomics

Last synced: 15 Jun 2025

https://github.com/salman-khan-mohammed/predicting-the-intent-of-online-shoppers

This project aims to predict online shoppers' purchase intentions using browsing history and user data from e-commerce sites. By analyzing clickstream and session information, the goal is to create a machine learning model that accurately forecasts customers' likelihood of making a purchase.

cluster-analysis data-analysis data-pre eda outliers prediction

Last synced: 31 Oct 2025

https://github.com/harmim/vut-izp-proj3

Introduction to Programming (Základy programování) - Project 3 - Simple cluster analysis

c cluster-analysis doxygen project vut

Last synced: 23 Aug 2025

https://github.com/annaanastasy/consumer-behavior-clustering

Segmented customer data into clusters using KMeans to uncover actionable insights into consumer behavior for targeted marketing strategies.

cluster-analysis clustering data-science exploratory-data-analysis kmeans-clustering machine-learning-algorithms python unsupervised-learning

Last synced: 01 Jul 2025

https://github.com/jansvabik/vutfit-izp-proj3

Simple cluster analysis implemented in C. Third VUT FIT IZP project.

brno-university-of-technology cluster-analysis fit project vut

Last synced: 11 Mar 2025

https://github.com/mansi-k/fifa_clustering

Performed KMeans, Agglomerative, Divisive, DBSCAN clustering on FIFA dataset along with outlier detection and cluster analysis

agglomerative-clustering cluster-analysis dbscan-clustering divisive-clustering fifa-clustering kmeans-clustering outlier-detection preprocessing visualization

Last synced: 15 Apr 2025

https://github.com/zmyzheng/stack_overflow_qa_assistant

Big Data Analysis project with recommendation, cluster analysis and graph database

big-data-analytics cluster-analysis data-visualization graph-database hadoop mahout recommendation-system

Last synced: 30 Mar 2025

https://github.com/zmyzheng/browserassistant

Big Data & Cloud Computing project for recommendation, cluster analysis, and data visualization with Hadoop and Spark, deployed in an auto-scaling cloud environment, YouTube link:

angular big-data-analytics cloud cluster-analysis data-visualization elasticsearch flask hadoop recommendation-system spark spring-boot

Last synced: 16 Jul 2025

https://github.com/deenuy/yorku-customer-segment-analysis

Repository for Customer Segment Analysis using Python & Shiny App Dashboard

alogrithmic-marketing apriori-algorithm cluster-analysis r shinydashboard york-university

Last synced: 17 Oct 2025

https://github.com/hazim-hf/business-analytics

This course introduces techniques to transform raw data into actionable insights for business analysis, covering customer, operation, and people analytics. Customer analytics examines and predicts customer behavior; operation analytics aligns supply with demand and optimizes decisions; people analytics uses data to manage the workforce effectively.

association-rules classification-analysis cluster-analysis linear-regression shiny-dashboard time-series

Last synced: 23 Sep 2025

https://github.com/chaganti-reddy/ai-prototype-customer-segmentation

Artificial intelligence prototype: a product-based model for customer segmentation in the e-commerce industry.

artificial-intelligence cluster-analysis customer-segmentation data-analysis machine-learning product-based prototype

Last synced: 13 Mar 2025

https://github.com/niteshchawla/clustering-ml

Analyzing the vast data of learners can uncover patterns in their professional backgrounds and preferences, allowing Scaler to make tailored content recommendations and provide specialized mentorship.

cluster-analysis clustering hierarchical-clustering k-means-clustering machine-learning numpy pca-analysis visualisation

Last synced: 27 Feb 2025

https://github.com/efthymioscosta/fullfactorialmixedclustering

This repository includes the R code used for the project "Mixed-type data clustering: a full factorial benchmarking study on distance-based clustering methods", written by Efthymios Costa. The project is supervised by Dr. Ioanna Papatsouma (Imperial College London) and co-supervised by Professor Alastair Young (Imperial College London).

benchmarking-study cluster-analysis clustering full-factorial mixed-data

Last synced: 13 Sep 2025

https://github.com/edisonslightbulbs/3dintactoolkit

3DINTACT: an open-source CXX_11 project for segmenting interaction regions on tabletop surfaces in near real time

3d-point-clouds cluster-analysis cxx11 libtorch object-detection opencv opengl real-time rendering segmentation toolkit

Last synced: 04 Mar 2025

https://github.com/mohdrasmil7/customer-insights-and-segmentation-with-machine-learning

Analyze customer data to segment and understand your ideal customers. This app helps businesses tailor products and marketing strategies for different customer segments using detailed analysis and clustering. 🚀

classification cluster-analysis jupyter-notebook machine-learning-algorithms python streamlit-webapp

Last synced: 16 Mar 2025

https://github.com/noorulhudaajmal/customer-segmentation-analysis

Customer segmentation and analysis of purchasing behaviour

cluster-analysis customer-segmentation data-analysis

Last synced: 07 Oct 2025

https://github.com/nghiant3110/customer_segment_9

This is a DA project about clustering, based on the Customer Mall datasets from Kaggle.

cluster-analysis ml python

Last synced: 17 Jun 2025

https://github.com/efthymioscosta/robustmodelclustering

This repository includes the R code used for the project "Investigating robust partitional clustering methods", written by Efthymios Costa. The project is supervised by Dr. Ioanna Papatsouma (Imperial College London) and co-supervised by Professor Alastair Young (Imperial College London).

cluster-analysis clustering full-factorial partitioning robustness

Last synced: 09 Apr 2025

https://github.com/cyberoctane29/penguins-data-analysis-and-modeling

This project applies statistical modeling, including single and multiple linear regression, using Python. It covers exploratory data analysis, data cleaning, and modeling with pandas, NumPy, statsmodels, and scikit-learn. Regression analyzes relationships, while clustering identifies patterns. Seaborn visualizations enhance interpretability.

cluster-analysis clustering-algorithm data-analytics eda kmeans-clustering machine-learning multiple-linear-regression penguins predictive-modeling regression-analysis simple-linear-regression statistical-modeling supervised-learning unsupervised-learning

Last synced: 02 Apr 2025

https://github.com/gaizkiaadeline/clustering-and-topic-extraction-of-twitter-x-responses-to-bsi-s-2023-ransomware-attack

A project analyzing user tweets about the 2023 BSI ransomware attack using clustering and topic extraction methods. Persona analysis is performed on both approaches, with a comparison of the results to extract key insights.

cluster-analysis lda mining nltk text topic-extraction topic-modeling

Last synced: 12 May 2025

https://github.com/akansharajput280799/data-driven-insights-into-job-satisfaction-and-compensation-trends

This project analyzes 2020 employee data to identify factors influencing job satisfaction, performance, and salary differences, offering insights for improving engagement and workplace strategies.

cluster-analysis colab-notebook data-cleaning descriptive-statistics factor-analysis hypothesis-testing jupyter-notebook matplotlib python scikit-learn seaborn t-test visualization

Last synced: 14 Oct 2025

https://github.com/marianamartiyns/rfm-cluster-analysis

Customer behavior and sales analysis, including data cleaning, RFM calculation, churn analysis and customer clustering.

cluster-analysis data-analysis data-cleaning data-visualization pyhton

Last synced: 16 Mar 2025

https://github.com/netcodez/nlp-and-clustering---movie-similarity

Using NLP and Clustering on movie plot summaries from IMDB and Wikipedia to quantify movie similarity

cluster-analysis clustering clustering-algorithm nlp-machine-learning nltk-library nltk-python nltk-tokenizer

Last synced: 05 Dec 2025

https://github.com/cyberoctane29/optimizing-k-in-k-means-a-visual-and-quantitative-exploration

Exploring K-means clustering through image color compression and high-dimensional data analysis. Learn how pixel grouping in RGB space builds intuition, while inertia/silhouette scores optimize clusters. Demonstrates K-means' power to reveal patterns in both visual and abstract data by optimizing groupings and selecting ideal k-values.

cluster-analysis clustering-algorithm inertia kmeans kmeans-clustering kmeans-image-clustering kmeans-image-compression kmeans-plus-plus silhouette-score

Last synced: 09 Jul 2025

https://github.com/adityakumarda/kmeans-web-analytics

Built with Python, Pandas, and Scikit-learn, this machine learning project uses K-Means to cluster website users by behavior. It reveals patterns in engagement and bounce, helping drive data-informed decisions.

cluster-analysis elbow-curves elbow-method elbow-plot jupyter-notebook kmeans-clustering machine-learning matplotlib numpy pandas python python3 relationship scikit-learn seaborn sklearn

Last synced: 30 Dec 2025

https://github.com/tnleite/real-estate-opportunities-analysis

This repository presents an analysis of opportunities in the real estate market, combining time series, clustering, and forecasting to identify the states with the greatest growth potential and to guide efficient expansion strategies.

catboostregressor cluster-analysis data-science kmeans-clustering lightgbm-regressor machine-learning-algorithms numpy regression-models scikit-learn xgboost-regression

Last synced: 10 Apr 2025

https://github.com/kardevroop/csci723graphdb

Implementation of the Label Propagation algorithm with a slight variation in the stopping criteria.

cluster-analysis cypher graph-database graphql neo4j

Last synced: 09 Oct 2025

https://github.com/tanyakuznetsova/word-embeddings-on-brown-corpus

This notebook explores how clustering semantically similar words can help make Natural Language Processing tasks easier.

brown-corpus cluster-analysis nlp nlp-machine-learning unsupervised-machine-learning word-embeddings

Last synced: 07 Sep 2025

https://github.com/sanikaptl/automotive-industry-analysis

Indian Car Market Analysis & Price Prediction: A Microsoft Engage Project Using Random Forest and K-Means Clustering

cluster-analysis indian-automotive-industry jupyter-notebook k-means-clustering knn-classification market-analysis microsoft-engage-2022 random-forest seaborn streamlit vscode

Last synced: 09 Oct 2025

https://github.com/lefteris-souflas/propensity-to-lapse-model-building-exercise

Analyzed customer churn using transaction data. Built ML model to predict lapses. Dataset includes customer status, collection/redemption info, and program tenure. Delivered business presentation outlining modeling approach, findings, and churn reduction strategies.

cluster-analysis data-driven-decisions data-preprocessing data-splitting decision-tree feature-engineering gradient-boosting logistic-regression model-interpretation model-optimization model-selection-and-evaluation neural-network random-forest sas-visual-analytics support-vector-machine

Last synced: 02 Mar 2025

https://github.com/oelin/nutshell-cluster-analysis

Semantic clustering of video titles from the YouTube channel *In A Nutshell*.

cluster-analysis clustering-algorithm data-science in-a-nutshell nlp statistics unsupervised-learning

Last synced: 05 Aug 2025

https://github.com/m30m/cluster_pack

A simple tool for visualizing clusters using D3.js

cluster-analysis clustering d3-js d3-visualization visual-analytics visualization

Last synced: 26 Jul 2025

https://github.com/johannaschmidle/ufo-project

Exploring the Relationship Between UFOs, Location, Time, and Human Emotion [SQL, Python]

cluster-analysis data-exploration eda k-means-clustering location-analysis nltk-python sentiment-analysis time-analysis ufo-sightings wordcloud

Last synced: 23 Aug 2025

https://github.com/paulj1989/player-similarities

Using FB Ref player data to measure player similarity within positions, using clustering methods

cluster-analysis dimensionality-reduction football positions python soccer sports-analytics

Last synced: 14 Oct 2025

https://github.com/pnavaro/juliaparis2023

Slides for "Journées Julia et Optimisation" Paris 04-06 October 2023

cluster-analysis julialang

Last synced: 03 Jan 2026