An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance

Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.

bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse

Last synced: 12 Feb 2026

https://github.com/l1ght14/e-commerce-sales-analysis

Interactive Power BI dashboard analyzing e-commerce sales, profit trends, top products, and customer segments using the Sample Superstore dataset.

dashboard data-analysis powerbi

Last synced: 12 Feb 2026

https://github.com/yalai92/alfalfa_imp_exp_analysis

This repository covers data cleaning, analysis, and visualization of global alfalfa and pellet imports, focusing on trends from 2003 to 2023. It also includes a predictive analysis of global alfalfa demand for 2024-2029, using data science techniques to provide insights for stakeholders in the alfalfa industry.

data-analysis data-cleaning data-visualization matplotlib numpy pandas python sckiit-learn tableau

Last synced: 12 Feb 2026

https://github.com/ankit21111/carpredict

This project predicts car prices using machine learning models, including Simple and Multiple Linear Regression. It covers data acquisition, feature selection, and optimization techniques like Ridge Regression. The best model, Multiple Linear Regression, achieved an R² score of 0.84. Check out the full analysis in the repository!

data-analysis data-visualization matplotlib numpy pandas pyhton scipy seaborn sklearn

Last synced: 16 Apr 2026

https://github.com/martachesnova/big-data

Finding out whether reviews from Amazon's Vine program are trustworthy. Performed ETL process in the Cloud and uploaded a DataFrame to an RDS instance. Used PySpark and Spark SQL to perform a statistical analysis and uncover "hidden" insights.

big-data data-analysis dataset python spark sql

Last synced: 16 Apr 2026

https://github.com/edoaltamura/rotational-ksz-macsis

Repository for supplementary material from my publication on the rotational kinetic SZ effect in MACSIS

cosmology data-analysis galaxy-clusters high-performance-computing hydrodynamics

Last synced: 28 Feb 2026

https://github.com/kariemseiam/geoegy

An innovative and responsive dashboard to discover, filter, and analyze places across Egypt. Featuring advanced search, interactive maps with Leaflet.js, real-time analytics, dark mode, and seamless data export—all wrapped in a sleek, modern design with RTL support.

accessibility data-analysis data-visualization es6-modules geojson javascript leaflet mapping openstreetmap places-data responsive-design web-development

Last synced: 13 Feb 2026

https://github.com/secureauditx/ecommerce-user-behavior-analysis

E-commerce User Behavior Analysis with Streamlit Dashboard

customer-segmentation data-analysis ecommerce python streamlit

Last synced: 28 Feb 2026

https://github.com/m-ah07/text-sentiment-analysis-api

A lightweight Python project for analyzing the sentiment of textual data using the TextBlob library. This project provides a simple and effective way to measure the polarity and subjectivity of any given text.

data-analysis machine-learning python python-project sentiment-analysis text-analysis text-mining

Last synced: 14 Feb 2026

https://github.com/malakaburamila/power-bi-dashboards

A portfolio of interactive Power BI dashboards I developed, showcasing data visualization, analytics, and data-driven insights.

amazonsalesanalysis analytics dashboards data-analysis data-visualization datasets hranalytics power-bi

Last synced: 14 Feb 2026

https://github.com/kambleakash0/mubi_eda

Mini Project #1 for EAS503 course at SUNY Buffalo

data-analysis data-visualization eda

Last synced: 16 Apr 2026

https://github.com/chanmeng666/mnist-handwritten-digit-recognition-project

【Sprinkle some star dust on this repo! ⭐️ It's good karma!】A comprehensive implementation and analysis of handwritten digit recognition using multiple neural network architectures on the MNIST dataset. Features basic MLP, optimized feature-selected model, and deep CNN approaches with detailed performance comparisons and visualizations.

cnn computer-vision data-analysis data-visualization deep-learning feature-analysis handwritten-digit-recognition keras machine-learning mlp mnist model-optimization neural-networks python scikit-learn tensorflow

Last synced: 02 Apr 2026

https://github.com/mo-elshamy/machine-learning-practice

This repository collects my machine learning work and learning from my internship at Cellual-Technologies, including algorithm explanations, data preprocessing workflows, and two projects.

data-analysis data-science dbscan decision-trees eda gradient-boosting gxboost hierarchical-clustering kmeans-clustering knn-classification linear-regression logistic-regression machine-learning model pca polynomial-regression preprocessing random-forest support-vector-machines training

Last synced: 14 Feb 2026

https://github.com/balajimohan18/tableau-visualization-project

This repository contains visualization projects built with Tableau. The visualizations surface insights and strategies that can help a business grow and improve its profit margins, and in some cases they provide social value by helping to reduce damage from calamities.

data-analysis data-science data-visualization exploratory-data-analysis tableau tableau-public

Last synced: 19 Mar 2026

https://github.com/hlexnc/project-arepo

Data-driven stroke risk assessment & personalized recommendations, powered by machine-learning and an NLU-driven chatbot.

chatbot data-analysis docker docker-compose machine-learning nlu-chatbot python rasa scikit-learn sklearn streamlit

Last synced: 15 Feb 2026

https://github.com/risdorn/restaurant-delivery-platforms-analysis-bdm-project

This project analyzes restaurant delivery platforms to understand customer preferences, industry competition, and expansion opportunities. Conducted as part of the BDM project from IITM, it includes descriptive stats, distribution, correlation, regression, and geospatial analysis using multiple datasets.

data-analysis data-visualization jupyter-notebook kaggle

Last synced: 15 Feb 2026

https://github.com/nmelgar/marathons_data_viz

Data visualization project to analyze finishing times and other data.

csv csv-files data data-analysis data-insight data-visualization data-viz dataset tableau

Last synced: 15 Feb 2026

https://github.com/chandrashekhar-01/globalterrorism-analysis

A data mining and analytics project on the Global Terrorism Database (Kaggle) that explores worldwide terrorism trends through Python-based data visualization and statistical analysis.

data-analysis data-mining data-visualization exploratory-data-analysis

Last synced: 28 Feb 2026

https://github.com/siddhant2105s/bring-your-own-device-boyd-system

This repository contains the design and implementation of the Bring Your Own Device (BYOD) System for managing personal devices at Life Insurance Company. It includes an ERD diagram, MySQL scripts for database creation, data insertion, and queries, as well as detailed data definitions and system requirements documentation.

data-analysis database-design database-normalization entity-relationship-diagram entity-relationship-models my-sql relational-databases relational-model sql-queries

Last synced: 15 Feb 2026

https://github.com/swethajoseph/sales-eda-project

Performed an advanced Excel-based exploratory data analysis (EDA) of an E-Commerce sales dataset to create an interactive dashboard for uncovering key business insights.

advancedexcel data-analysis data-visualization datacleaning dataformatting exploratory-data-analysis msexcel pivot-tables

Last synced: 19 Mar 2026

https://github.com/k-bloch/car-theft-analysis

A dashboard created to inform the public about car theft, providing insights extracted from real-world police stats.

data-analysis maven-analytics tableau

Last synced: 19 Mar 2026

https://github.com/devexpress-examples/aspxpivotgrid-group-date-time-values

This example shows how to group date-time values in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 01 Mar 2026

https://github.com/tejas-130704/dataanalysis-hr-manager

Presence Insights of Employees: this project provides insightful data analysis on employee attendance and presence, including work-from-home (WFH) data, sick leave records, and presence excluding holidays. The analysis spans a three-month period and is visualized using Power BI to help HR managers understand trends and optimize workflow.

dashboard data-analysis data-visualization hr-manager power-bi

Last synced: 01 Mar 2026

https://github.com/nagar2nd/airbnb-property-management-optimization

This project aims to analyze Airbnb’s dataset to optimize rental strategies, enhance customer satisfaction, and maximize revenue for property owners. Using Tableau, the insights generated will help improve decision-making for both Airbnb and its hosts.

data-analysis data-visualization tableau

Last synced: 01 Mar 2026

https://github.com/grishmahat/discord-data-cli

A terminal UI tool to analyze your Discord data export, built in Rust

cli data-analysis discord discord-data ratatui rust terminal tui

Last synced: 01 Mar 2026

https://github.com/johannaschmidle/road-collisions-project

Analyzed road accident data in the UK from 2019 to 2022 to identify patterns and trends in road accidents, for Effective Road Management [Excel]

data-analysis data-visualization excel pivot-tables traffic-analysis

Last synced: 01 Mar 2026

https://github.com/aleks-andrs/bigdataanalytics

Public repository for CM3111: Big Data Analytics Coursework (Meteorite landings analysis)

data-analysis data-science machine-learning

Last synced: 02 Mar 2026

https://github.com/yash22222/pwc-power-bi-virtual-case-experience

The Power BI PwC Virtual Case Experience is an exciting and educational program designed to provide participants with hands-on exposure to Power BI, a prominent business intelligence and data visualization tool, within the context of consulting at PwC.

business-analyst business-analytics business-intelligence dashboard data-analysis data-analyst data-analytics dax microsoft-power-bi powerbi powerbi-dashboards powerbi-visuals pwc

Last synced: 02 Mar 2026

https://github.com/mayankyadav23/amazon-sales-data-analysis

Diving into Amazon sales data to uncover hidden gems! 📈 Analyzing iNeuron's dataset to optimize sales strategies and boost performance 💡 Driving business growth with data-driven decisions! 💻

amazon data-analysis data-visualization ineuron-ai internship-project

Last synced: 02 Mar 2026

https://github.com/badranalyst/covid-deaths-dashboard-with-tableau

This project showcases an interactive dashboard developed in Tableau to visualize COVID-19 deaths data. It provides insights into trends, geographical distributions, and key metrics related to mortality during the pandemic. The dashboard aims to enhance understanding of the data, supporting public health analysis and decision-making.

covid-19 dashboard data data-analysis data-visualization dataset tableau tableau-dashboards visualization

Last synced: 02 Mar 2026

https://github.com/madusales/powerbi-etl-elt

Through DIO's bootcamp on Data Analytics & Power BI, I have been studying the use of SQL to build BI solutions. This repository records what I have learned so far about what BI is, types of analysis, ETL, and ELT.

big-data business-intelligence data-analysis powerbi

Last synced: 19 Mar 2026

https://github.com/chaitanyaprasad60/sql-queries

This is a list of complex SQL Queries I have practiced.

data-analysis sql window-functions

Last synced: 03 Mar 2026

https://github.com/elrf3lipes/ramon-s_portfolio

I'm passionate about Cloud and DevOps, and for the moment I'm posting some of my work and personal projects here to showcase that. If it's useful to you, feel free to integrate or contribute!

api-integration biopython clinical-trials data-analysis data-extraction data-parsing django docker entrez ipython medline-xml pandas pubmed-parser requests rest-api

Last synced: 27 Mar 2026

https://github.com/mugambi645/exploring-ebay-car-sales-data

Exploring the eBay car sales dataset

car-sales data-analysis numpy pandas

Last synced: 16 Apr 2026

https://github.com/steno-aarhus/mediation-analysis-course

Modern mediation analysis for basic, clinical and epidemiological research in diabetes and endocrinology

data-analysis data-analysis-in-r diabetes diabetes-epidemiology mediation-analysis open-educational-resource

Last synced: 03 Mar 2026

https://github.com/grindelfp/logistic-regression-study

An example of logistic regression data analysis, with an exercise on it.

data-analysis ipynb logistic-regression python

Last synced: 03 Mar 2026

https://github.com/banner-19/extraction-and-analysis-of-text

The objective is to analyze text content from a list of URLs. This involves extracting article titles and text, then performing natural language processing to generate metrics like sentiment, readability, and word usage. Finally, the results are stored for further analysis or visualization.

data-analysis data-analytics data-science nlp nltk python3 text-analysis text-extraction

Last synced: 03 May 2026

https://github.com/jofaval/melbourne-housing

Data Analysis of the Housing Market in Melbourne, Australia in 2016-2017

data-analysis data-science data-visualization deep-learning google-colab kaggle machine-learning melbourne python xgboost

Last synced: 16 Apr 2026

https://github.com/edanur-y/bank-customer-churn-prediction-with-classification-models

Comparing the performance of multi-layer perceptron, decision tree, random forest, gradient boosting, and extreme gradient boosting classifiers on customer data to predict whether a customer will leave the bank.

data-analysis data-transformation hyperparameter-tuning python

Last synced: 16 Apr 2026

https://github.com/bishopce16/school_district_analysis

The school board requested an analysis on the various performance metrics for the school district.

data-analysis jupyter-notebook numpy pandas python visual-studio-code

Last synced: 16 Apr 2026

https://github.com/samuelson777/titanic-dataset-analysis

Exploratory data analysis of the Titanic dataset, uncovering insights on passenger survival rates based on gender, age, and class. Includes data cleaning, visualization, and findings.

data-analysis data-visualization exploratory-data-analysis kaggle machine-learning matplotlib pandas python seaborn titanic-dataset

Last synced: 16 Apr 2026

https://github.com/ronaessi-28/sales-data-analysis-visualization-project

A comprehensive data analysis and visualization project using Python, Pandas, Matplotlib, Seaborn, and Streamlit. The project explores Superstore sales data to uncover trends, region-wise performance, product category insights, and builds an interactive dashboard.

data-analysis data-visualization eda matplotlib pandas plotly python-project sales-dashboard seaborn streamlit

Last synced: 16 Apr 2026

https://github.com/akash-srm/user-engagement-analysis

Analyzed user engagement and feedback data to derive actionable insights for an online learning platform.

analytics-projects data-analysis data-cleaning eda jupyter-notebook pandas python seaborn student-engagement

Last synced: 16 Apr 2026

https://github.com/marben06/rent-in-germany

Interactive visualizations and maps depicting topics around rent prices and income in Germany built with Svelte.

charts d3 d3-visualization d3js data-analysis data-visualization gis gis-data infographic infographics map mapbox mapbox-gl mapbox-gl-js mapboxgl svelte

Last synced: 27 Apr 2026

https://github.com/danpoynor/omdb-api-data-analysis

Gathers data for Oscar-winning movies using their IMDB ids, saves the information to a CSV file, and answers a few data analysis questions about the movies using JupyterLab.

analytics csv data-analysis jupyter-notebook matplotlib omdb-api pandas-dataframe python-dotenv python3 seaborn-plots

Last synced: 16 Apr 2026

https://github.com/johannaschmidle/netflix-subscription-analysis

Examined Netflix subscription data to understand market behaviour, predict future trends, and identify consumer preferences. [SQL, Tableau]

data-analysis data-cleaning data-trend data-visualization netflix

Last synced: 05 Mar 2026

https://github.com/yasumorishima/yasumorishima

Manufacturing Engineer & Data Analyst. 17 years exp in MFG. Python, VBA, Automation Specialist. (盛島康徳 / Yasunori Morishima)

automation data-analysis manufacturing portfolio python vba

Last synced: 05 Mar 2026

https://github.com/satyacoder29/e-commerce-sales-analysis

Performed E-commerce Sales Analysis to identify trends, optimize sales, and improve decision-making. Analyzed customer patterns, seasonal trends, and product performance using Python, SQL, and Power BI. Delivered actionable insights to enhance revenue, streamline inventory management, and boost customer engagement.

data-analysis data-visualization datacleaning msexcel pivottables powerquerym visualisation vlookups

Last synced: 05 Mar 2026

https://github.com/dina-hosny/analyze-and-model-airline-system

Analyzing Airline System and Building Data Warehouse Model to Store the Data and Answer Some Business Questions

data-analysis data-modeling data-warehouse datawarehousing dwh plsql sql

Last synced: 05 Mar 2026

https://github.com/kheriberto/knn_project

This is a simple project that uses dummy data to practice and demonstrate my knowledge of the KNN algorithm.

data-analysis knn-classifier numpy python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/ruajean/netflixmoviescraper

🎬 A powerful tool for gathering movie data and user reviews from FilmAffinity's Netflix category. This script scrapes movie details and iterates through user reviews, saving structured information to a CSV file for analysis. Ideal for insights into user sentiments and movie popularity on FilmAffinity.

data-analysis data-visualization dataset jupyter-notebook python scraping

Last synced: 17 Apr 2026

https://github.com/ngangawairimu/linear-regression-

This project builds a linear regression model in Python to predict outcomes and derive insights from feature data. It covers data cleaning, feature analysis, and model evaluation, showcasing predictive modeling techniques using scikit-learn, pandas, and visualization libraries.

data-analysis linear-regression machine-learning predictive-modeling python scikit-learn

Last synced: 17 Apr 2026

https://github.com/eliasdehondt/learn-r

Welcome to the Learn-R repository! This is your go-to resource for learning the R programming language, whether you're a beginner or looking to enhance your skills.

data-analysis data-visualization education machine-learning programming r statistics tutorials

Last synced: 03 Apr 2026

https://github.com/jhrcook/checkplease

Analysis of an immune checkpoint-blockade screen.

bayesian-statistics data-analysis pymc3 python python3 r

Last synced: 17 Apr 2026

https://github.com/shimazadeh/ft_linear_regression

Implementing a modular linear regression from scratch to predict the price of cars using a gradient descent algorithm.

data-analysis data-science hyperparameter-tuning linear-regression predictive-modeling

Last synced: 03 Jun 2026

https://github.com/mahmoudwal27/manufacturing_downtime

This project focuses on improving manufacturing efficiency by analyzing production data. Using Python, SQL, and Power BI, we built interactive dashboards to uncover patterns, minimize downtime, and optimize operations. The goal is to help stakeholders make data-driven decisions for enhanced productivity.

data-analysis data-analysis-python data-visualization google-colab powerbi python sql

Last synced: 17 Apr 2026

https://github.com/ridemountainpig/education-level-data-analysis

An analysis of the relationship between education levels, unemployment rates, and credit card spending in Taiwan's six major cities.

data-analysis matplotlib pandas-python

Last synced: 17 Apr 2026

https://github.com/nathaliacosim/migration-patrim

Automation for extracting, converting, and migrating asset data to the Patrimônio Cloud system from Betha Sistemas. The project ensures a structured and secure flow of information transfer, using C# (.NET Framework), PostgreSQL, and API integration.

conversion-tool data-analysis data-conversion data-transformation dotnet dotnet-code dotnet-console-app migration-tool

Last synced: 17 Apr 2026

https://github.com/rishisolanke/pdf_query_langchain

PDF Query LangChain is a tool that extracts and queries information from PDF documents using advanced language processing. Leveraging LangChain, OpenAI, and Cassandra, this app enables efficient, interactive querying of PDF content. Ideal for data analysis, research, and automated reporting, it simplifies detailed document analysis with ease.

artificial-intelligence data-analysis document-query langchain natural-language-processing nlp openai pdf-analysis pdf-extraction python research-tool

Last synced: 17 Apr 2026

https://github.com/victoorv/criminalite_us

An analysis of crime as a function of socio-economic variables, including the selection and comparison of multiple regression models as well as hypothesis tests on the coefficients and on the significance of the models.

data-analysis data-science r regression regression-analysis regression-models statistical-analysis statistical-tests statistics

Last synced: 04 Apr 2026

https://github.com/royungar/sql_chicago_data_analysis_project

SQL-based data analysis project using SQLite, pandas, and Jupyter SQL magic commands. Analyzes crime, school, and census data from Chicago to explore socioeconomic patterns using filtering, joins, aggregation, and subqueries.

aggregation census-data chicago crime-data data-analysis data-engineering education-data ibm jupyter-notebook pandas sql sqlite subqueries

Last synced: 04 Jun 2026

https://github.com/royungar/automotive_sales_insights_dashboard

Data visualization project analyzing automotive sales, recalls, and customer sentiment using IBM Cognos Analytics. Features KPIs, treemaps, heatmaps, and advanced visual storytelling techniques.

automotive-industry business-intelligence cognos-analytics csv customer-sentiment dashboard data-analysis data-engineering data-visualization eda excel heatmap ibm kpi recall-analysis sales-data treemap

Last synced: 04 Jun 2026

https://github.com/davidmalko87/steam-library-exporter

Python script to export your Steam game library to CSV — playtime, genres, reviews, metacritic scores, prices, tags & estimated owners via Steam Web API + Store API + SteamSpy

csv-export data-analysis game-data metacritic playtime-tracker python steam steam-api steam-games steam-library steamspy

Last synced: 04 Apr 2026

https://github.com/sevilaymuni/project-no.3-seaborn-plots

Pandas and Seaborn Mediated Comprehensive Analysis on Differentiated Thyroid Cancer

data-analysis data-structures data-visualization mathplotlib pandas python seaborn

Last synced: 18 Apr 2026

https://github.com/sanam2405/ahs

Analysis of the results of the AHS Madhyamik Examination 2022

data-analysis data-visualization jupyter-notebook python

Last synced: 18 Apr 2026

https://github.com/yuvrajsaraogi/sales-prediction-using-python

Sales prediction involves estimating future product sales based on factors like advertising spend, target audience, and platform. Businesses rely on data scientists to forecast sales and optimize advertising costs. Machine learning in Python can be used for this task.

data data-analysis data-science data-visualization machine-learning matplotlib natural-language-processing numpy pandas prediction python sales-prediction-using-python sql

Last synced: 19 Apr 2026

https://github.com/prangonghose/wikipedia-blocking-policies

This study investigates the relationship between editors’ disruptive behavior and regulation policies on English Wikipedia, focusing on the Blocking Policy page. The study collects and analyzes data from 2004 to 2022 using the Wikipedia API, page statistics, and keyword extraction.

data-analysis data-visualization matplotlib open-source pandas python3 seaborn

Last synced: 18 Apr 2026

https://github.com/pawlo77/airline-performance-data-analysis

Preprocessing of structured data - part of IAD study program, Faculty of Mathematics and Information Science, Warsaw University of Technology

data-analysis data-science visualization

Last synced: 10 May 2026

https://github.com/awanraskall/retail-demand-analysis

Data analysis of retail meal orders, fulfillment centers, and product demand using Python

data-analysis data-visualization jupyter-notebook numpy pandas python

Last synced: 18 Apr 2026

https://github.com/zeraphim/streamlit-iris-classification-dashboard

A Streamlit web application that performs Exploratory Data Analysis (EDA), Data Preprocessing, and Supervised Machine Learning to classify Iris species from the Iris dataset (Setosa, Versicolor, and Virginica) using Decision Tree Classifier and Random Forest Regressor.

classification dashboard data-analysis data-science decision-tree-classifier eda machine-learning python3 random-forest-regressor streamlit supervised-learning

Last synced: 18 Apr 2026