An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/arielle0222/data_analysis

📊 Data analysis projects for autonomous driving and smart mobility engineering using Python and SQL.

autonomous-driving composite data-analysis electric-vehicles environmental-data python visualizatoin

Last synced: 30 Apr 2026

https://github.com/ayu-hack/ayu-hack

Enthusiastic learner passionate about building software and exploring the world of technology. Eager to contribute to open-source projects and collaborate with the developer community. Continuously developing my skills in Python, SQL, HTML, CSS, Power BI, and macOS. Always open to feedback and excited to keep growing!

config css data-analysis github-config html powerbi-desktop python3 sql

Last synced: 30 Apr 2026

https://github.com/pradeepchegur/seamantic_web_design

We designed a semantic web for Instagram on the Wix platform.

data-analysis framework instagram semantic-web website-design wix

Last synced: 19 Mar 2026

https://github.com/pratik-khose/data-analysis-with-pandasai

PandasAI with Llama3 for Interactive Data Analysis

data-analysis llama3 llma pandasai streamlit visualization

Last synced: 11 May 2026

https://github.com/chayandatta/got_script_manipulation

Game of Thrones Script - String & file manipulation

data-analysis data-science pandas python3

Last synced: 11 May 2026

https://github.com/nafisalawalidris/hici-african-foods

HiCi African Foods: Excel dashboard & pivot table analysis of EU food rejection data to identify risks & recommend focus areas for market expansion.

data-analysis data-cleaning data-visualization eu-food-rejection excel-dashboard hici-african-foods market-expansion pivot-tables

Last synced: 19 Mar 2026

https://github.com/easycris-software/easycris

Professional statistical analysis and RNA-seq for researchers — no coding required

anova bioinformatics data-analysis desktop-app genomics pharmacology research-tools rna-seq statistics tauri

Last synced: 11 May 2026

https://github.com/mehulcode12/atliq-bank_creditcard_transaction_analysis

The credit card project at Atliq Bank comprises two key phases: market identification and trial. This initiative aims to leverage mathematical and statistical concepts to analyze data related to demographics, income, credit scores, and spending patterns in order to identify the target audience for the credit card.

codebasics data-analysis data-science data-visualization mathematics python python3 statistics

Last synced: 30 Apr 2026

https://github.com/edisedis777/duckdb-analyzer

A powerful tool for analyzing large CSV datasets using DuckDB.

csv data-analysis database duckdb

Last synced: 16 Apr 2026

https://github.com/phillbertnevinemmanuel/movieindustryanalysis-correlation

This project is a comprehensive data analysis endeavor within the Movie Industry, spanning from Data Cleaning to Exploratory Data Analysis, Correlation Analysis, and Temporal Analysis. The dataset was sourced from Kaggle, purportedly scraped using the IMDb API. Python was the primary tool utilized for analysis.

data-analysis data-cleaning python

Last synced: 30 Apr 2026

https://github.com/elissorokin/data-analyst-portfolio-rus

This is a repository where I showcase my skills, share projects, and track my progress in data analysis and data science.

ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis

Last synced: 25 Feb 2026

https://github.com/alcestide/scianalytics

Playground for Data Analysis and Visualization for Research and Scientific Purposes with Pandas and Plotly.

csv data-analysis data-science data-visualization pandas plotly python science-research statistics

Last synced: 30 Apr 2026

https://github.com/alrza2003/alrza2003.github.io

This repository contains the source files for my personal portfolio website. It highlights my background as a data analyst and radiology student, and showcases real-world projects, tools I use, and ways to connect with me. The site is based on a pre-built template that I customized to reflect my profile and experience.

data data-analysis data-visualization portfolio portfolio-website python

Last synced: 30 Apr 2026

https://github.com/leosimoes/datascienceacademy-powerbi-3.0

Projects from the DataScienceAcademy course "Microsoft Power BI Para Data Science", version 3.0. Dashboards for a variety of business cases.

business-intelligence dashboards data-analysis data-visualization microsoft-power-bi

Last synced: 19 Mar 2026

https://github.com/shadan100/sales-prediction-analysis

The aim is to build a predictive model and find out the sales of each product at a particular store. Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.

artificial-intelligence data-analysis data-science django django-framework jupyter-notebook machine-learning matplotlib pandas predictive-modeling python sales-prediction

Last synced: 01 Mar 2026

https://github.com/mkk-1817/hr-attrition

This project, conducted during my internship at MeriSKILL, focuses on HR Attrition Prediction using advanced Machine Learning models. The initiative includes the development of a dynamic Dashboard and in-depth Analysis to offer actionable insights for proactive human resource strategies.

data-analysis data-science data-visualization jupyter-notebook machine-learning-algorithms powerbi python

Last synced: 03 May 2026

https://github.com/affec-ds/dashboard-ventas-vinilos

Interactive sales dashboard for a vinyl record store. Visual analysis, key KPIs, and dynamic filters for business decisions.

business-intelligence data-analysis data-visualization ipywidgets jupyter-notebook kpis matplotlib music-industry notebook-project python retail-analytics sales-analysis seaborn vinyl-records

Last synced: 30 Apr 2026

https://github.com/antononcube/wl-datareshapers-paclet

Wolfram Language (aka Mathematica) paclet for data reshaping functions, such as long-form and wide-form conversion, cross tabulation, etc.

contingency-table cross-tabulation data-analysis data-transformation long-form wide-form

Last synced: 20 Mar 2026

https://github.com/is-leeroy-jenkins/sherpa

A budget execution and data analysis tool for EPA analysts, built on WinForms and .NET 6 and written in C#.

budget-management data-analysis data-science data-visualization federal-government

Last synced: 13 May 2026

https://github.com/antononcube/wl-quantileregression-paclet

Wolfram Language (aka Mathematica) paclet that provides various Quantile Regression functions.

data-analysis machine-learning quantile-regression time-series time-series-analysis

Last synced: 20 Mar 2026

https://github.com/seekinginfiniteloop/fedcal

A feature-rich Python calendar that enables time series analyses of changes in federal workforce schedules and shifts in executive department funding status.

data-analysis data-science econometrics economic-data economics federal federal-government hr pandas pandas-library pandas-python pydata python

Last synced: 15 Apr 2026

https://github.com/hetuvpatel/research-chatgpt

Research and data analysis project evaluating the social, ethical, and educational impacts of ChatGPT using survey-driven insights and Python-powered data analysis. 📚🤖

data-analysis matplotlib pandas python seaborn

Last synced: 01 May 2026

https://github.com/zpreisler/modules

Python libraries and modules for processing simulation outputs

data-analysis python scripts tensorflow

Last synced: 13 May 2026

https://github.com/lijesh010/globalsuperstoresalesanalysis

The Global Superstore Sales Analysis repository showcases a comprehensive Power BI dashboard that provides valuable insights into sales performance. This project is designed to present key information and trends to stakeholders, enabling informed decision-making.

dashboard data-analysis data-visualization msexcel power-bi sales-analysis

Last synced: 19 Mar 2026

https://github.com/iguptashubham/pizzahut-analysis-sql

Best dataset for data analysis. Pizza Hut data analysis done by Shubham Gupta in MySQL. The dataset was provided by a friend of mine who interned at Pizza Hut, where it was used for training and for asking practice questions. The data does not reveal anything about Pizza Hut and is safe to share.

data-analysis data-analytics database dataset datasets mysql mysql-database pizzahut

Last synced: 14 May 2026

https://github.com/tnleite/projeto_king_lift

This project presents a detailed analysis of the financial data of King Lift, a forklift rental company. Using Microsoft Excel, Power Query, and Power Pivot, I developed an interactive dashboard, also in Excel, that helps the company gain valuable insights to improve operational efficiency and increase revenue.

data-analysis data-science data-visualization excel

Last synced: 19 Mar 2026

https://github.com/harshmule1/store-sales-analysis

Sales Analysis Using Power BI

data-analysis powerbi

Last synced: 19 Mar 2026

https://github.com/pferreirafabricio/data-immersion

🏊🏻‍♂️ Activities and exercises from 'Imersão Dados' event

data data-analysis data-science dataset jupiter-notebook python

Last synced: 14 May 2026

https://github.com/scarblase/salary-comparison

Submission for the DataCamp Salary Competition (1 level). 🏆

data data-analysis data-science data-visualization engineering python sql structured-data

Last synced: 01 May 2026

https://github.com/nmsby/pca-machine-learning-lab

Principal Component Analysis (PCA) implementation and analysis lab for Machine Learning. Features manual PCA implementation, scikit-learn applications, data compression, and feature extraction with detailed visualizations.

data-analysis dimensionality-reduction jupyter-notebook machine-learning numpy pca python scikit-learn visualization

Last synced: 01 May 2026

https://github.com/adagio/ivoox_episodes

iVoox Episodes: Scraping & Analysis

beautifulsoup4 data-analysis ivoox pandas python python3 scraping

Last synced: 20 Apr 2026

https://github.com/jrbourbeau/cr-composition

IceCube cosmic-ray composition analysis

cosmic-rays data-analysis machine-learning physics python

Last synced: 20 Apr 2026

https://github.com/aaryan-agr/canadian-energy

This project analyzes Canada's energy trade, focusing on imports, exports, and market trends in the energy sector.

data-analysis data-cleaning data-manipulation data-processing data-science data-vizualisation energy-sector time-series-analysis

Last synced: 10 Jun 2025

https://github.com/nafisalawalidris/building-a-clustering-model-for-customer-segmentation

Customer Segmentation Using Clustering: This repo applies clustering algorithms to a customer transaction dataset, grouping similar customers together based on their purchasing behavior. Targeted marketing strategies can be developed by analyzing distinct customer segments.

clustering customer-segmentation data-analysis data-visualization k-means machine-learning marketing-analytics unsupervised-learning

Last synced: 16 Mar 2025

https://github.com/nafisalawalidris/tools-for-data-science

It covers popular languages (Python, R, SQL) and libraries (NumPy, Pandas) used in the field. The author shares their objectives of teaching data analysis, web development, and critical thinking skills. The repository also includes code examples, explanations of arithmetic expressions, and contact information for the author.

arithmetic-expressions data-analysis data-science data-visualization languages libraries matplotlib numpy pandas programming python r sql tools web-development

Last synced: 11 Apr 2026

https://github.com/nafisalawalidris/buybuy-e-commerce-company

The BuyBuy E-commerce Company repository is a comprehensive hub for the company's e-commerce platform. It includes source code, documentation, and data analysis insights, providing a data-driven approach to improve customer experience, drive revenue, and inform decision-making.

buybuy cleaning-data company customer-experience data data-analysis decision-making documentation e-commerce excel insights postgresql repository revenue source-code sql

Last synced: 16 Mar 2025

https://github.com/nouman6093/advanced-statistical-models

In this repository I will upload everything I have learned about advanced statistical models in data science. There are over 42 statistical models, each of which works on algorithms, and there are over 32 algorithms. Each library has its own way of writing such statistical models. After learning, I will try to upload as many statistical models as possible

data data-analysis data-science data-visualization

Last synced: 11 Jun 2026

https://github.com/akankshaaa013/30-day-machine-learning-deep-learning

To practically Learn, Explore, and Share my Insights on the Libraries and Tools that power Machine Learning.

data-analysis machine-learning python

Last synced: 15 Mar 2025

https://github.com/greed2411/ndl

Numbers Don't Lie: an attempt at data analysis using pandas and matplotlib.

cities data-analysis data-science data-visualization india kaggle

Last synced: 19 Apr 2026

https://github.com/ssreeramj/youtube_channels_analysis

This web app gives a detailed analysis of the videos uploaded to a particular YouTube channel.

data-analysis heroku pandas python streamlit youtube

Last synced: 29 Apr 2026

https://github.com/ayobami6/tweet-data-analysis

WeRateDogs tweets scraped using the Twitter API

data-analysis data-science twitter webscraping

Last synced: 31 May 2026

https://github.com/sanam2405/chatinfo

Analysing the WhatsApp Chat with my crush over a 6M period

data-analysis data-visualization python

Last synced: 27 Apr 2026

https://github.com/carterlasalle/sportsarbfinder

Sports Betting Arbitrage Finder: Python tool for identifying profitable arbitrage opportunities across bookmakers. Features multi-region support, customizable profit margins, interactive calculator, and web interface. Uses real-time odds data from The Odds API. Ideal for betting enthusiasts, analysts, and educational purposes.

arbitrage-betting betting-strategy data-analysis finance gambling odds-api python sports-analytics sports-betting

Last synced: 31 Mar 2025

https://github.com/bristolmyerssquibb/blockr.workshop

R in Pharma 2024 blockr workshop

data-analysis nocode r

Last synced: 18 Apr 2026

https://github.com/lobooooooo14/badwords-pt-br

💬 Wordlist of pt-BR profanity for data analysis, filters, or text considered "avoidable"

badword-filter badwords brasil data-analysis filter filter-lists filterlist portugues portuguese text-analysis wordlist

Last synced: 25 Mar 2025

https://github.com/asifdotexe/air-quality-analysis-aqa

AQA is a data-driven project focused on analyzing air quality data sourced from data.gov.in. The project encompasses data preprocessing, analysis, and visualization to gain insights into air pollution levels across various locations in India. By examining six key pollutants, the project aims to raise awareness about the environmental issues

aqi-analysis data-analysis data-preprocessing data-science data-visualization presentation

Last synced: 07 Jun 2026

https://github.com/narenkhatwani/arkouda-projects

This repository contains the source code of projects built with Arkouda, a software package that lets a user interactively issue massively parallel computations on distributed data using functions and syntax that mimic NumPy, the underlying computational library used in most Python data science workflows.

arkouda data-analysis data-analytics data-science high-performance high-performance-computing highperformancecomputing numpy pandas parallel-computing parallel-processing parallelization python

Last synced: 17 Apr 2026

https://github.com/kishlayjeet/zomato-data-exploration

In this project, we will be exploring a dataset containing information on various restaurants and their ratings, location, and other attributes.

data-analysis eda matplotlib numpy pandas zomato-data-exploration

Last synced: 10 Apr 2026

https://github.com/prangonghose/analysis_of_bangladesh_economic_complexity

In this project, our team carried out a brief analysis of the export economy of Bangladesh over the past three decades.

data-analysis data-science data-visualization inequalipy matplotlib pandas plotly

Last synced: 22 May 2026

https://github.com/prernarohra/heart-disease-prediction

This project develops a machine learning model to predict heart disease risk based on symptoms and medical history. The best accuracy was achieved with Logistic Regression, which works well for binary classification problems.

artificial-intelligence data-analysis data-science dataset heartdisease-prediction machine-learning models

Last synced: 06 Nov 2025

https://github.com/matthewgrosman/messenger-analytics

Project that ingests Facebook Messenger conversations and generates analytics.

analytics data-analysis excel facebook facebook-messenger java mongodb

Last synced: 15 Apr 2025

https://github.com/raad07/sql_project-world_layoffs_dataset

This is a SQL project comprising data cleaning in the first part and exploratory data analysis (EDA) in the second part.

data-analysis database mysql sql

Last synced: 27 Jan 2026

https://github.com/sathyasris27/statistical-analysis-on-rehoming-time-for-different-dog-breeds-in-animal-shelter

Given a collection of records documenting stray, unwanted, or neglected dogs sent to animal shelters to be rehomed, this project analyses their rehoming patterns by breed.

data-analysis r statistical-analysis statistical-inference statistical-models

Last synced: 05 Jun 2026

https://github.com/noturlee/imdb-dataanalysis

A model that predicts the IMDb rating of a movie from features like genre, director, and actors, using regression techniques.

data-analysis data-cleaning data-modeling data-science data-visualization

Last synced: 08 Apr 2025

https://github.com/agustinmusanti/delitosencaba-proyectofinal-dataanalytics-coderhouse

In this repository I present my final project for the "Data Analytics" course at Coderhouse.

data-analysis excel powerbi

Last synced: 22 Jan 2026

https://github.com/jossimmar/ensa-ss25

Repository for handling consumption data from the Clientes Mayores (major customers) of ENSA, part of Grupo Distriluz.

data-analysis electrical-engineering python sqlite

Last synced: 30 Mar 2025

https://github.com/seabbs/explorebcgonoutcomes

Analysis to explore the association of BCG vaccination and TB outcomes.

bcg data-analysis regression rstats tuberculosis

Last synced: 23 Feb 2026

https://github.com/rkolehov/retail-sales-analysis-project

End-to-end e-commerce analysis showcasing SQL and data visualization skills. Tracks sales, customer behavior, product performance, and delivery efficiency. Interactive dashboards provide actionable insights for business decision-making.

analytics dashboard data-analysis ecommerce jupyter-notebook postgresql python sql tableau vscode

Last synced: 19 Apr 2026

https://github.com/ndohvich/ibm-data-science-professional-certificate

Kickstart your career in data science & ML. Build data science skills, learn Python & SQL, analyze & visualize data, build machine learning models. No degree or prior experience required.

coursera dash data-analysis data-science html5 ibm ibm-professional-certificate javascript machine-learnng python sql

Last synced: 16 Nov 2025

https://github.com/unnatmalik/dattavism-ai-powered-data-insight-generator-

Dattavism is an AI-powered data insight platform that transforms raw CSV files into comprehensive, contextualized reports, complete with visualizations, statistical summaries, and natural language insights. Dattavism is designed to handle datasets across diverse domains. It is built using Python, Streamlit, Gemini API, Pandas, Matplotlib, NumPy,

data-analysis python streamlit

Last synced: 24 Jul 2025

https://github.com/ryanfranklin237/data-visualization-spreadsheets

Data visualization done with Microsoft Excel and Google spreadsheets

data-analysis data-science data-visualization google-spreadsheets microsoft-excel

Last synced: 22 Feb 2026

https://github.com/nirmit27/book-recommender-system

This is a book recommendation system using a memory-based, item-based collaborative filtering model, built with Flask.

data-analysis data-science flask python python3 recommender-system render

Last synced: 05 May 2026

https://github.com/rohithsaji97/open_gate_dip

An automatic gate opening system with an additional parking system (using Raspberry Pi).

automated data-analysis digital-image-processing opencv python3 raspberry-pi-3 trained-models

Last synced: 04 Feb 2026

https://github.com/shadan100/stroke-prediction-analysis

A web-based application that predicts whether a patient is likely to have a stroke based on input parameters such as gender, age, various diseases, and smoking status. Each row in the data provides relevant information about the patient.

artificial-intelligence data-analysis data-science django django-framework jupyter-notebook machine-learning matplotlib pandas predictive-modeling python stroke-prediction web-application

Last synced: 08 Mar 2026

https://github.com/alejo1630/chicago_crimes

A Jupyter Notebook with the data analysis and data visualization of crimes in Chicago from 2017 to 2023 using libraries such as seaborn and folium

data-analysis data-visualization folium pandas python seaborn

Last synced: 12 Apr 2026

https://github.com/akash1070/data-science-virtual-internship-by-anz

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.

data-analysis data-science predictive-analytics presentation-slides

Last synced: 24 Mar 2025

https://github.com/jubinjacob03/heartdiseaseclassify-ml

Heart Disease Dataset Analysis & Classification using ML models such as linear regression, support vector machine, k-means, k-nearest neighbors, and logistic regression.

data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine

Last synced: 18 Jan 2026

https://github.com/5ekastanx/data-analysis

Extracting data by parsing (for example, in the style of hacking) with Python, using all sorts of functions and methods

data-analysis html python

Last synced: 14 Mar 2025

https://github.com/nafisalawalidris/springforth-university-foodbank

Springforth University Food Bank: A collaborative initiative with UNESCO to address student food insecurity. Contains code and resources for the web application, data analysis, and insights into the prevalence and impact of food insecurity on academic performance.

academic-performance collaborative-initiative data-analysis data-visualization excel pivot-tables powerbi springforth-university-food-bank student-food-insecurity unesco

Last synced: 17 Feb 2026

https://github.com/md-emon-hasan/1-simple-stock-price-ml-app

A simple machine learning application for stock prices, demonstrating data preprocessing, model training, and deployment using scikit-learn.

data-analysis data-science eda ml-app streamlit-webapp time-series time-series-analysis webapp

Last synced: 31 May 2026

https://github.com/shadowk29/cusumtools

An eclectic collection of python scripts I have found to be useful in processing nanopore data

data-analysis data-visualization time-series-analysis

Last synced: 16 Mar 2026

https://github.com/dcs-training/intronetworkanalysis

This is a repository for the Introduction to Network Analysis course provided by Brian Wong for the CDCS. The repository contains files with sample datasets and a guide to building datasets. It will be updated before each section. Go to the README file.

data-analysis data-visualisation gephi network-analysis text-analysis

Last synced: 27 Jan 2026

https://github.com/dcs-training/good-data-visualisation-with-r

Our guide on how we create data visualisations through R. Go to the readme file

data-analysis data-visualisation r rmarkdown

Last synced: 16 Jun 2026

https://github.com/rani-sikdar/pwc-virtual-internship-powerbi

Comprehensive Power BI dashboards showcasing insights on Call Centre Trends, Customer Retention, and Diversity & Inclusion to drive business impact.

business-analytics business-intelligence data-analysis data-cleaning data-visualization interactive interactive-visualizations powerbi

Last synced: 07 Jan 2026