An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/hadeel-13/new_home

New Home is a Website for Buying and Selling Real Estate with user preferences, it is my Graduation project with a grade of 93%.

bootstrap5 chartjs css css3 data-analysis data-mining google-maps html html5 javascript jquery

Last synced: 12 Apr 2026

https://github.com/deypadma2020/dataanalysis-mlalgo

Practice repository for data analysis, feature engineering, statistics, web scraping, and building ML model pipelines in Python.

data-analysis eda feature-engineering machine-learning-algorithms ml-pipeline statistics web-scraping

Last synced: 30 May 2026

https://github.com/mahmoudwal27/sql-data-analysis

This project demonstrates SQL operations for managing student enrollments, including creating tables, inserting data, updating records, and running queries to analyze student and course information. It showcases skills in data manipulation, aggregation, and advanced query formulation.

analysis data-analysis sql sql-data-analysis sql-queries

Last synced: 13 Feb 2026

https://github.com/ot-code/sql-sabor-y-tradicion

A SQL-driven project that integrates menu and order data to reveal insights on dish performance, customer preferences, and spending trends. It informs pricing strategies, menu adjustments, and targeted promotions, ultimately enhancing the overall customer experience and driving business growth.

analytical-queries data data-aggregation data-analysis database-design join-queries mysql order-analytics relational-databases restaurant-data sql sql-script

Last synced: 08 Apr 2025

https://github.com/kathkoeh/pimaindian-kk

Logistic regression analysis of diabetes risk using the Pima Indians dataset. Includes prevalence analysis, modeling, ROC/AUC evaluation, and patient testing in Python.

data-analysis diabetes epidemiology logistic-regression machine-learning public-health python

Last synced: 28 Apr 2026

https://github.com/leosimoes/datascienceacademy-python

Atividades do curso Fundamentos de Linguagem Python Para Análise de Dados e Data Science (Com ChatGPT) da DataScienceAcademy.

chatgpt data-analysis data-science python

Last synced: 02 May 2026

https://github.com/guilherme-marcello/r-data-analysis-piechart

Reading RDS files, processing and presentation in pie charts

data-analysis data-visualization pie-chart r

Last synced: 13 Jul 2025

https://github.com/andersoncrs/analisis-de-texto-tweets

En este proyecto exploro el análisis de texto de tweets para descubrir tendencias, opiniones y temas relevantes en redes sociales. Usando herramientas de procesamiento de lenguaje natural, convierto grandes volúmenes de mensajes en información clara y visualmente atractiva.

data-analysis data-visualization eda text-mining

Last synced: 21 Jul 2025

https://github.com/artemzarubin/xml-document-processor

XML processing tool using the Strategy design pattern.

csharp data-analysis data-transformation design-patterns strategy xml

Last synced: 21 Jul 2025

https://github.com/BingyanStudio/github-analyzer

锐评一下你都在 GitHub 写了什么

data-analysis github llm reports selfhosted typescript

Last synced: 12 May 2025

https://github.com/1adityakadam/Carnegie_classifications_website

A comprehensive data analytics platform analyzing 50+ years of U.S. higher education trends through interactive visualizations and historical institution tracking.

css data-analysis html javascript python ui-design web-development

Last synced: 25 Jun 2025

https://github.com/nadamarei/data-analyzer

The Qualitative Data Analysis Tool is a powerful Streamlit application designed for researchers to analyze word frequencies in corporate documents. This tool processes PDF reports, identifies target words and their contextually relevant synonyms, and generates comprehensive reports with document statistics, summary analysis, and per-file breakdowns

data-analysis data-visualization python-3 streamlit

Last synced: 18 May 2026

https://github.com/rohansoni45/movie-recommendation-system

This project is a Content-Based Recommender System that suggests movies to users based on their preferences and watched history. The system leverages cosine similarity to find and recommend movies similar to a selected title. It is built using Python and libraries like Pandas, NumPy, and Scikit-learn.

content-based-filtering cosine-similarity data-analysis data-science machine-learning numpy pandas python recommender-system render scikit-learn

Last synced: 17 Apr 2026

https://github.com/ddihora1604/iitk_task

A comprehensive financial data analysis system that collects, processes, and analyzes data from approximately 500 tickers in the S&P Global Index. It provides detailed financial information, ESG metrics, and various financial statements for comprehensive market analysis.

beautifulsoup4 data-analysis data-visualization datamodelling dataset esg machine-learning python yahoo-finance

Last synced: 30 Jun 2026

https://github.com/namratagulati/tweets_analysis

This repository focuses on sentiment analysis of Twitter data using Python, Natural Language Processing (NLP), and the Natural Language Toolkit (NLTK). The goal is to extract valuable insights from social media discussions, such as word frequency, hashtag trends, and sentiment patterns.

analysis data-analysis natural-language-processing nlp-machine-learning nltk-corpus nltk-python sentiment-analysis twitter-sentiment-analysis

Last synced: 07 Aug 2025

https://github.com/pronzzz/diabetes-prediction

Diabetes prediction using a KNN model and Pima Indian Diabetes Dataset

data-analysis data-manipulation data-preprocessing data-visualization knn machine-learning outlier-detection seaborn

Last synced: 13 Apr 2025

https://github.com/jelhamm/model-ensembles-boosting-in-machine-learning

"This repository contains implementations of Boosting method, popular techniques in Model Ensembles, aimed at improving predictive performance by combining multiple models. by using titanic database."

boosting boosting-algorithms boosting-ensemble boosting-machine data-analysis database-analysis datamining datamining-algorithms jupyter-notebook machine-learning machine-learning-models machine-learning-projects matplotlib-python model-ensemble module numpy-library pandas-library python sklearn-library

Last synced: 16 May 2026

https://github.com/ireneflorez/exploration_r

Data exploration on the 'White Wine Quality' dataset using R

data-analysis data-visualization r

Last synced: 16 Jun 2026

https://github.com/jelhamm/singular-value-decomposition-data-mining

"This repository hosts an implementation of the Singular Value Decomposition (SVD) algorithm tailored for data mining tasks. SVD is utilized for efficient dimensionality reduction, aiding in the extraction of key patterns and features from large and complex datasets."

data-analysis dimension-reduction jyputer-notebook machine-learning matplotlib numpy-library pandas-library preprocessing python scipy-library singular-value-decomposition sklearn-library standardscaler svd svd-matrix-factorisation

Last synced: 18 May 2026

https://github.com/carmoreno/nobelprizes

Final project of Big Data Module.

data-analysis mongodb

Last synced: 29 Apr 2026

https://github.com/dineshdhamodharan24/singapore_flat_resale_

This project focuses on developing a machine learning model to predict the resale values of apartments in Singapore. The goal is to create a user-friendly online application that enables users to obtain accurate predictions for the resale values of specific properties.

data-analysis flat json numpy pandas pickle project python streamlit

Last synced: 07 Apr 2026

https://github.com/rugwiroparfait/alx_sql

This repo is where I save my queries and learning materials in Data Science program from ALX

anaconda data data-analysis jupyter-notebook sql

Last synced: 19 Aug 2025

https://github.com/dinamohsin/ai-job-market-analysis-using-sql-excel

This project explores a dataset of AI-related jobs to uncover insights about salary trends, in-demand skills, education levels, and remote work preferences. The analysis was done using SQL for querying and Excel for data cleaning and preparation.

data-analysis data-preprocessing excel functions query sql sql-server

Last synced: 25 Jun 2025

https://github.com/jhermienpaul/google-data-analytics-program

Hands-on learning materials from the 8-course Google Data Analytics Professional Certificate program, covering foundational data skills, tools, and real-world business problem-solving

bigquery dashboard data-analysis data-analytics data-modeling data-storytelling data-visualization data-wrangling descriptive-analytics diagnostic-analytics etl-pipeline r-programming rstudio sql tableau

Last synced: 13 Jul 2025

https://github.com/vbhvsingh0/nflteam_corr_population

The goal of this project is to find the correlation in between NFL teams' win and loss with the population of the city.

data-analysis data-cleaning-and-preprocessing data-manipulation-with-pandas numpy-library pandas-python pearson-correlation python3

Last synced: 29 Jun 2026

https://github.com/myktorijus/retention-cohort

Extracted cohort data using SQL in BigQuery focusing on weekly retention from week 0 to week 6

bigquery data-analysis data-visualization powerbi sql

Last synced: 13 Jul 2025

https://github.com/gappeah/credit-card-transactions-fraud-detection-project

The Credit Card Transactions Fraud Detection Project repository is designed to analyse and detect fraudulent transactions in credit card data.

data-analysis postgresql sql

Last synced: 12 Jul 2025

https://github.com/farzeennimran/fashion-mnist-dataset-classification-using-neural-network

Implementation of a Multi-layer Perceptron classifier with hyperparameter tuning and k-fold cross-validation employing GridSearchCV for classifying images on the Fashion MNIST dataset 👗👚👖

artificial-intelligence data-analysis data-mining data-science dataset deep-learning fashion-mnist-dataset gridsearchcv hyperparameter-tuning kfold-cross-validation machine-learning multilayer-perceptron-network neural-network numpy pandas python sklearn

Last synced: 03 Apr 2026

https://github.com/shubhammittal-data/sales-customer_dashboard_tableau

An interactive Tableau project showcasing advanced data visualization techniques for sales performance and customer analytics. This dashboard provides key business insights using KPIs, trend analysis, and customer segmentation. Designed for executives, sales managers, and marketing teams to drive data-driven decision-making.

customer-behavior-analysis customer-segmentation data-analysis data-visualization product-analytics sales-analysis tableau tableau-dashboards tableau-public

Last synced: 07 Mar 2026

https://github.com/jlee9503/defense-risk-prediction

Build a machine learning pipeline that ingests defense procurement data, identifies high-risk contracts, and visualizes the results in an interactive dashboard.

data-analysis data-visualization exploratory-data-analysis python

Last synced: 25 Jan 2026

https://github.com/harmanveer-2546/motor-vehicle-accidents-in-india

As per the report, a total of 4,61,312 road accidents have been reported by States and Union Territories (UTs) during the calendar year 2022, which claimed 1,68,491 lives and caused injuries to 4,43,366 persons.

accidents accidents-analysis darkgrid data-analysis eda exploratory-data-analysis indian-roads inline matplotlib motor-vehicles numpy pandas review seaborn visualization

Last synced: 19 Jan 2026

https://github.com/mrendiks/analyst-data-survey-monkey

Learn how to analyst data from dataset surver monkey using Excel and Python

data-analysis ipynb-jupyter-notebook python

Last synced: 07 Mar 2026

https://github.com/mituskillologies/aiml-dypiemr-sep24

Programs conducted at DYPIEMR, Pune in training on AIML during September 2024.

artificial-intelligence data-analysis data-science machine-learning matplotlib neural-network numpy pandas python3

Last synced: 05 Apr 2025

https://github.com/chahelgupta/fitness-data-analysis-r-project

This project focuses on analyzing fitness data collected from various tracking devices to gain insights into users' activity levels, sleep patterns, calorie expenditure, and heart rate. The dataset used in this project consists of multiple CSV files, each containing different aspects of fitness-related data.

data-analysis data-cleaning data-exploration data-science data-visualization r r-language r-programming r-studio

Last synced: 18 May 2026

https://github.com/jwt218/sinc

MATLAB Standardization and Isotope Normalization for CSIA (with integrated correction and uncertainty quantification)

data-analysis geochemistry isotopes matlab

Last synced: 23 Jun 2025

https://github.com/jonathancaleb/adap

📊🌱 Agricultural Data Analysis Platform 🌍🚜 A personal initiative to analyze coffee growth trends in Uganda using Python, data science, and machine learning. This project supports sustainable farming with predictive models and interactive visualizations. 🍃📈

data-analysis data-science python

Last synced: 18 May 2026

https://github.com/jofaval/boston-housing

Regression Analysis into the Boston Housing in-demand pricing in 1978

boston-housing data-analysis data-science data-visualization machine-learning python regression

Last synced: 16 May 2026

https://github.com/Fisseha-Estifanos/telecom

A showcase repository for a specific telecommunication company. Used to analyze several telecommunication data set features and generate useful insights accordingly. Insights generated could be seen at https://github.com/Fisseha-Estifanos/telecom-visualizer or at https://fisseha-estifanos-telecom-visualizer-home-huxgy0.streamlitapp.com/

data-analysis notebooks-jupyter python visual-studio-code visualization

Last synced: 11 Mar 2025

https://github.com/majajuri/text-classification-using-string-kernels

Projekt u sklopu predmeta Uvod u znanost o podacima

data-analysis string-kernel

Last synced: 05 Apr 2025

https://github.com/ashvinhandoo/bionic-lab-projects

Computational neurophysiology pipelines for analyzing astrocyte and vascular dynamics. Includes Python- and MATLAB-based analysis frameworks for modeling calcium, vasomotion, and pupil-linked activity, demonstrating advanced signal processing, transfer entropy estimation, and data visualization skills used in biomedical research.

biocomputation bioinformatics biomedical-engineering computational-biology data-analysis matlab neuroscience python signal-processing time-series

Last synced: 18 May 2026

https://github.com/v41bh4vr4jput/data-analysis-with-python

This repository is a comprehensive collection of data analysis projects and tutorials using Python's most powerful libraries: NumPy, Pandas, Seaborn, and Matplotlib. It is designed to help you explore, clean, visualize, and analyze data efficiently.

api data data-analysis data-visualization matplotlib numpy pandas python sakila-db seaborn

Last synced: 09 Apr 2026

https://github.com/martachesnova/python-apis

A weather analysis that randomly selects more than 500 cities across the globe, pulls data from the OpenWeatherMap API for each city. Analysis of the weather and perfect vacation spot is viewable on my Jupyter Notebook.

api data-analysis python

Last synced: 24 Feb 2025

https://github.com/martachesnova/python

Created a Python script to calculate and analyze financial records of a company. Created another Python script to do calculations and analysis of the voting process in a small town.

data-analysis python

Last synced: 24 Apr 2026

https://github.com/gmasson/datadash

DataDash é uma biblioteca JavaScript e CSS para criar dashboards interativos, para visualização de dados dinâmicos em páginas web.

dashboard dashboard-application dashboards data-analysis data-science data-visualization javascript

Last synced: 08 Aug 2025

https://github.com/nikbarb810/motif_detection_in_r

Motif Detection for TFBS in Glycolysis and Glyconeogenesis pathways

bioinformatics data-analysis null-hypothesis pwm r

Last synced: 23 Jun 2025

https://github.com/data-edd/e-commercestore_analysis

This project analyzes e-commerce data to provide insights into sales performance, profitability, and customer behavior using Power BI.

data-analysis powerbi powerbidashboard

Last synced: 02 Feb 2026

https://github.com/lparham2/factors-driving-ev-adoption-charging-station-deployment

This project explores factors driving EV adoption and charging station deployment using Python-based data analysis. It examines sales trends, infrastructure growth, and socioeconomic influences to uncover key insights. The goal is to aid policymakers and businesses in optimizing EV infrastructure and accelerating sustainable transportation.

data-analysis data-visualization electric-vehicle-charging-station electric-vehicles powerpoint-presentations python

Last synced: 18 May 2026

https://github.com/rociobenitez/airbnb-data-mining

Análisis detallado y modelado predictivo de alojamientos en Madrid utilizando técnicas de Big Data y estadística en R, enfocado en optimización de datos y predicción de características de propiedades.

airbnb data-analysis data-mining estadistica prediction-model predictive-analytics predictive-modeling qmd r rstudio

Last synced: 23 Jun 2025

https://github.com/antononcube/wl-mosaicplot-paclet

Wolfram Language (aka Mathematica) paclet for mosaic plots over datasets or lists of records.

data-analysis machine-learning mosaic mosaic-plots

Last synced: 16 Jan 2026

https://github.com/gui-sitton/games

Identify patterns that determine whether a game is successful or not. This will allow you to identify potential big winners and plan advertising campaigns.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 18 May 2026

https://github.com/wikidata/purdue-data-mine-2024

Program materials for WMDE's 2024 Purdue Data Mine project

analytics data-analysis data-quality data-science etl open-data python wikidata wikimedia

Last synced: 12 May 2025

https://github.com/xjwllmsx/profitable-app-profiles

Analyzes Google Play & App Store data to recommend profitable profiles for free, ad-supported mobile apps

data data-analysis data-cleaning jupyter pandas python

Last synced: 18 May 2026

https://github.com/pyramidheadshark/ai-mirea-sem1p

Completed set of all MIREA AI an DA practices (1 sem.)

beginner-friendly data-analysis data-science jupyter mirea

Last synced: 05 Apr 2025

https://github.com/jayita11/eda-student-exam-performance

This project performs Exploratory Data Analysis (EDA) and hypothesis testing on student performance data. It explores trends based on attributes like gender, race/ethnicity, parental education, lunch type, and test preparation course completion.

data-analysis eda hypothesis-testing matplotlib pandas python seaborn statsmodels student-performance-analysis

Last synced: 11 Jul 2025

https://github.com/derogative404/google_data_analytics_capstone

Capstone project part of the Google Data Analytics Certificate Program

data-analysis excel r tableau

Last synced: 26 Mar 2025

https://github.com/shubhamprajapati7748/end-to-end-house-price-prediction

A machine learning model that accurately predicts housing prices using the Boston Housing dataset by analyzing various house features, and it utilizes a CatBoost model to assist potential buyers or sellers in estimating housing prices.

boston-housing-price-prediction data-analysis data-science-projects machine-learning regression regression-models

Last synced: 30 Oct 2025

https://github.com/ddihora1604/social_media_analysis

A powerful, interactive dashboard for analyzing social media conversations, trends, and network dynamics. This tool allows researchers and analysts to explore patterns in social media data, identify key trends, and detect coordinated behavior.

aiml css data-analysis data-visualization html javascript python

Last synced: 01 Jul 2026

https://github.com/balajimohan18/loan-clustering-datascience-project

This project uses Machine Learning to Cluster loan together based on their similarities. The project uses a dataeset of loan application which includes information about the Loan amount and Balance. The project then use the clustering algorithm to group the loan together based on the similarities.

clustering-algorithm data-analysis data-science data-visualization eda kmeans-clustering machine-learning sql unsupervised-learning

Last synced: 27 Jul 2025

https://github.com/vedantshi/tableau-bike-data-dashboard

London Bike Rides Analysis explores bike usage patterns using data visualization and machine learning. It identifies trends through a dynamic moving average, analyzes weather impact with heatmaps, and provides actionable insights via an interactive Tableau dashboard. Tools: Python, Tableau.

data-analysis data-visualization python tableau weather-data

Last synced: 16 May 2026

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026

https://github.com/maxbiostat/diehl_ebola_cell_2016

supplementary code and data to Diehl et al, 2016 (Cell)

data-analysis data-visualization disease-spread ebola mutation

Last synced: 11 Jul 2025

https://github.com/adriangalvanzamora/ecommerce-analytics-olist

Data analysis project based on the Olist Brazilian E-Commerce dataset. Includes data cleaning, exploratory analysis, delivery performance metrics, customer satisfaction modeling, and geospatial insights. Built entirely in Python (Jupyter Notebook) using real-world data from Kaggle.

brazil customer-satisfaction data-analysis data-visualization ecommerce folium geospatial-analysis machine-learning matplotlib notebook pandas plotly python seaborn

Last synced: 06 May 2026

https://github.com/drisskhattabi6/meteo-data-mining

This repo contains using Data Mining Techniques to analyze meteorological (meteo) data. The objective is to extract meaningful insights and patterns from the data that can aid in understanding weather phenomena and predicting future weather conditions.

cart data-analysis data-mining data-visualization decision-making decision-tree extract-data extract-insights insights-analytics insights-data k-means knn machine-learning svm

Last synced: 21 Mar 2025

https://github.com/jagoda11/elastic-vision

This repository contains a full-stack application designed to explore data from ElasticSearch🧐indices and visualize it using charts and graphs. The backend is built using Node.js and the frontend is powered🚀 by React.

backend chartjs dashboard-development data-analysis data-visualization docker elasticsearch frontend fullstack javascript material-ui monorepo mui-x node pie-chart react restful-api tables

Last synced: 09 Apr 2026

https://github.com/sakan811/gachascope

Evaluate the cost-effectiveness of various in-app purchase bundles available in gacha games.

data data-analysis data-visualization game honkai honkai-star-rail honkai-starrail hoyoverse javascript nextjs tableau tableau-public typescript wutheringwaves

Last synced: 04 May 2026

https://github.com/adnanrahin/nlp-with-disaster-tweets

Kaggle Competition: Predict which Tweets are about real disasters and which ones are not. Natural Language Processing.

data-analysis data-science data-visualization kaggle-competition machine-learning natural-language-processing regular-expression tweets

Last synced: 21 Jun 2025

https://github.com/pkjjoshi/restaurants-analysis

Performed beginner-level EDA on a restaurant dataset using Python. Analyzed top cuisines, city-wise ratings, price ranges, and online delivery impact using Pandas and Matplotlib. Includes 4 well-structured notebooks with visual insights.

beginner-project data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas python restaurant-data seaborn

Last synced: 21 Jun 2025

https://github.com/teditae/data-analysis-with-pandas

Mini data science projects focused on Pandas-powered analysis.

data-analysis data-manipulation pandas python

Last synced: 30 Apr 2026

https://github.com/atharvkadammm/suicide-prediction-system

A machine learning project predicting suicide risk based on multiple socio-economic and environmental factors using data mining techniques.

csv data-analysis data-science data-visualization datamining exploratory-data-analysis feature-engineering machine-learnin matplotlib mental-health numpy pandas riskassesment seaborn sklearn suicide-prediction supervised-

Last synced: 01 Jul 2025

https://github.com/liebsen/overlemon

Overlemon institutional application

data-analysis design devops sysadmin webdev

Last synced: 21 Jul 2025

https://github.com/capjamesg/personal-notebooks

Notebooks for personal experiments with machine learning and computer vision.

data-analysis machine-learning notebooks

Last synced: 03 Apr 2025

https://github.com/bamresearch/utah-saxs-tools

The Utah SAXS Tools (USToo), adapted for Python 3, originally by David P. Goldenberg, 2009-2012

data-analysis saxs small-angle-scattering small-angle-xray-scattering

Last synced: 17 Jan 2026

https://github.com/atharvkadammm/calmlytic

An end-to-end machine learning project that predicts anxiety severity using classification models (Naive Bayes, Decision Tree, SVM, Logistic Regression, XGBoost), based on lifestyle, health, and behavioral features.

anxiety-prediction classification csv data-analysis data-preprocessing-and-cleaning data-science data-visualization ensemble-learning logistic-regression machine-learning-algorithms matplotlib mental-health numpy pandas python sci-kit-learn seaborn supervised-learning svm xgboost

Last synced: 21 Jun 2025

https://github.com/rezowanrahat/netflix_analysis

Data analysis of Netflix content using Python, Pandas, and Seaborn

data-analysis data-visualization netflix pandas python

Last synced: 07 May 2026

https://github.com/kushagrakumar04/visual-age-distribution

A Bar chart or histogram to visually depict the distribution of a categorical or continuous variable, such as the age distribution or gender composition within a population. This graphical representation provides a clear and insightful overview of the data's patterns and trends.

data-analysis data-science google-colab

Last synced: 21 Jun 2025

https://github.com/lavkalsi/tableau-project-stock-market-analysis

The Tableau Project: Stock Market Analysis features a dashboard that combines Descriptive, Diagnostic, Predictive, and Prescriptive analytics to provide insights into stock market trends. Using Python for data processing and an LSTM model for forecasting, this project visualizes historical and predicted stock prices, helping make informed decision.

dashboard data-analysis deep-learning lstm-model python tableau

Last synced: 18 May 2026

https://github.com/caprogs/paris-events-analyzer

A project to analyze events in Paris using open source data provided by the city.

data data-analysis data-platform dbt docker ingestion python streamlit transformation vizualisation

Last synced: 04 May 2026