An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/akash1070/predicting-zomato-restaurant-ratings

Performs extensive exploratory data analysis (EDA) on the Zomato dataset, builds a machine learning model that helps Zomato restaurants predict their ratings from selected features, and deploys the model via Flask.

data-analysis extratreesregressor flask linear-regression machine-learning random-forest zomato-bangalore zomato-data-analysis

Last synced: 18 May 2026

https://github.com/huynhtanphatt/diagnosing-uk-railway-performances

This project analyzes UK railway ticket and operation data to show how revenue, passenger demand, and on-time performance are connected.

data-analysis data-visualization datastorytelling python railway sql ticketing transportation

Last synced: 24 Apr 2026

https://github.com/sbera01/credit-card-approval-predictor

End-to-end Machine Learning project to predict credit card approval decisions using real-world financial features. Includes EDA, model training, and deployment-ready architecture

credit-card-approval-prediction data-analysis machine-learning python scikit-learn streamlit

Last synced: 24 Dec 2025

https://github.com/sebastianurdaneguibisalaya/enfermedades-fissal

Holistic analysis of care provided for rare diseases, orphan diseases, and transplants covered by FISSAL in Peru.

data-analysis data-visualization python

Last synced: 24 Feb 2025

https://github.com/dinamohsin/toman-bikeshare-data-analysis-sql-power-bi

This project involves data analysis using SQL, Power BI, and CSV datasets to extract insights and visualize key business metrics.

csv-files data-analysis data-visualization database powerbi sql sql-server

Last synced: 22 Apr 2026

https://github.com/jerinpious/house-price-prediction

This project is a machine learning-based application to predict house prices. A frontend interface has been developed using Streamlit to make the prediction process user-friendly for regular customers. The project is structured

data-analysis data-engineering data-science eda machine-learning pandas python random-forest scikit-learn streamlit

Last synced: 05 Apr 2026

https://github.com/sreejabethu/smart-report-analyzer

An AI-powered app to analyze and summarize Excel, CSV, and PDF reports using Hugging Face language models. Built with Streamlit.

data-analysis huggingface llm nlp pdf-analysis python question-answering streamlit summarization

Last synced: 18 May 2026

https://github.com/cowboymrzamo2380/json-to-excel-converter

This repository provides a tool to convert JSON data to Excel format (.xlsx). It allows you to easily transform structured JSON data into a well-organized spreadsheet for better analysis and visualization.

automation-script automation-tools data-analysis data-converter data-export data-formatting data-tools data-visualization excel excel-automation excel-converter excel-tools json json-exporter json-parser json-processing json-to-csv json-to-excel programming-tools spreadsheet-tools

Last synced: 05 Apr 2025

https://github.com/clarajacintho/ig4-ds

The final project for the Multidimensional Data Analysis and Data Mining courses, where we analyze data from motorcyclists to determine what causes accidents

data-analysis data-science shiny-apps

Last synced: 11 May 2025

https://github.com/saadhaniftaj/logistic--lasso-regression-data-analysis

Iris dataset analysis with logistic and Lasso regression, using coordinate descent for feature selection and binary classification. Includes preprocessing and data visualizations

data-analysis lasso-regression-model logistic-regression python statistics

Last synced: 18 May 2026

https://github.com/thoratstuti/power-bi-dashboards-for-finance-analysis

Power BI can gather information from multiple systems and present the whole picture of business data analytics in a single view. It let the staff of the financial institution work on a shared digital platform where they can compute and share relevant data.

data-analysis data-visualizations excel graph pie-chart powerbi

Last synced: 07 Mar 2026

https://github.com/steviecurran/multi-dish

Scripts to reduce data from large radio telescopes (GMRT, VLA)

data-analysis interferometer pipeline radio-astronomy telescopes

Last synced: 09 May 2026

https://github.com/ljadhav25/django-data-analyzer

Django Data Analyzer is a web application built using the Django framework, designed to streamline data analysis tasks. Users can upload CSV files containing data for analysis. The application utilizes the powerful data manipulation capabilities of Python libraries like pandas and numpy to perform various analyses on the uploaded data.

data-analysis data-visualization django-application matplotlib numpy pandas python seaborn

Last synced: 01 Mar 2026

https://github.com/lucycatherine/healthinsuranceproject

This repository contains a machine learning project that analyzes the factors influencing health insurance charges, such as age, smoking status, and medical conditions.

data-analysis data-science data-visualization jupyter-notebook machine-learning python

Last synced: 18 May 2026

https://github.com/gui-sitton/y.music

In this project I compared the musical preferences of the citizens of Springfield and Shelbyville. I examined real Y.Music data to test hypotheses and compare the behavior of users in these two cities.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 18 May 2026

https://github.com/niniola-creator/niniola-creator

This is a repository that I have created to show my skills, share my projects and track my progress in my data science/web development journey.

bootstrap5 css3 data-analysis data-science data-visualization database html5 javascipt javascript matplotlib pandas powerbi python spreadsheets sql

Last synced: 07 Apr 2026

https://github.com/cosmoduende/r-ggcats

StrangeR things: Adding… Cats? to your plots in R. How to analyze and visualize data with the help of funny cats using the “ggcat” package.

data-analysis data-analytics data-science data-visualisation data-visualization data-viz dataviz ggcats r-language r-library r-package r-programming r-scripts r-studio rstats rstudio

Last synced: 22 Jul 2025

https://github.com/a19xys/dm-csgo_analysis

Analysis to address the most important aspects of the knowledge discovery process from data.

data-analysis data-mining data-science dataset jupyter-notebook python

Last synced: 18 May 2026

https://github.com/datalopes1/bankabc_churn

This project carries out exploratory data analysis (EDA) focused on churn analysis, using the Bank Customer Churn Dataset, which is available on Kaggle and was provided by Gaurav Topre.

churn-analysis data-analysis data-science eda python

Last synced: 18 May 2026

https://github.com/1adityakadam/carnegie-classifications-ancestry-grid

A concise, interactive tool for exploring the historical lineage of U.S. higher education institutions using Carnegie Classification data from 1973–2021.

dash data-analysis html javascript pandas python

Last synced: 25 Jun 2025

https://github.com/andersoncrs/analisis-de-texto-tweets

In this project I explore text analysis of tweets to discover trends, opinions, and relevant topics on social networks. Using natural language processing tools, I turn large volumes of messages into clear, visually appealing information.

data-analysis data-visualization eda text-mining

Last synced: 21 Jul 2025

https://github.com/artemzarubin/xml-document-processor

XML processing tool using the Strategy design pattern.

csharp data-analysis data-transformation design-patterns strategy xml

Last synced: 21 Jul 2025

https://github.com/BingyanStudio/github-analyzer

A sharp-tongued review of everything you have written on GitHub

data-analysis github llm reports selfhosted typescript

Last synced: 12 May 2025

https://github.com/1adityakadam/Carnegie_classifications_website

A comprehensive data analytics platform analyzing 50+ years of U.S. higher education trends through interactive visualizations and historical institution tracking.

css data-analysis html javascript python ui-design web-development

Last synced: 25 Jun 2025

https://github.com/nadamarei/data-analyzer

The Qualitative Data Analysis Tool is a powerful Streamlit application designed for researchers to analyze word frequencies in corporate documents. This tool processes PDF reports, identifies target words and their contextually relevant synonyms, and generates comprehensive reports with document statistics, summary analysis, and per-file breakdowns

data-analysis data-visualization python-3 streamlit

Last synced: 18 May 2026

https://github.com/rohansoni45/movie-recommendation-system

This project is a Content-Based Recommender System that suggests movies to users based on their preferences and watched history. The system leverages cosine similarity to find and recommend movies similar to a selected title. It is built using Python and libraries like Pandas, NumPy, and Scikit-learn.

content-based-filtering cosine-similarity data-analysis data-science machine-learning numpy pandas python recommender-system render scikit-learn

Last synced: 17 Apr 2026
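
The cosine-similarity approach named in the movie-recommendation-system entry above can be illustrated with a minimal sketch. This is not the repository's code: it uses only the Python standard library rather than Pandas, NumPy, or Scikit-learn, and the function names, the catalog, and the tag strings are all hypothetical, made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(title, catalog, top_n=2):
    """Return the top_n titles whose tag vectors are closest to `title`."""
    # Turn each whitespace-separated tag string into a bag-of-words vector.
    vectors = {name: Counter(tags.lower().split()) for name, tags in catalog.items()}
    target = vectors[title]
    scores = [
        (cosine_similarity(target, vec), name)
        for name, vec in vectors.items()
        if name != title
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scores[:top_n]]


# Hypothetical catalog: titles and tags are invented for this example.
CATALOG = {
    "Space Quest": "sci-fi space adventure",
    "Star Voyage": "sci-fi space drama",
    "Love in Paris": "romance drama",
    "Galaxy Wars": "sci-fi space action adventure",
}

print(recommend("Space Quest", CATALOG))
```

With this toy catalog, the titles that share the most tags with the selected one rank first. A real system would likely weight terms (for example with TF-IDF) instead of using raw counts.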

https://github.com/ddihora1604/iitk_task

A comprehensive financial data analysis system that collects, processes, and analyzes data from approximately 500 tickers in the S&P Global Index. It provides detailed financial information, ESG metrics, and various financial statements for comprehensive market analysis.

beautifulsoup4 data-analysis data-visualization datamodelling dataset esg machine-learning python yahoo-finance

Last synced: 29 Oct 2025

https://github.com/pronzzz/diabetes-prediction

Diabetes prediction using a KNN model and Pima Indian Diabetes Dataset

data-analysis data-manipulation data-preprocessing data-visualization knn machine-learning outlier-detection seaborn

Last synced: 13 Apr 2025

https://github.com/jelhamm/model-ensembles-boosting-in-machine-learning

This repository contains implementations of boosting methods, a popular family of model-ensemble techniques aimed at improving predictive performance by combining multiple models, applied to the Titanic dataset.

boosting boosting-algorithms boosting-ensemble boosting-machine data-analysis database-analysis datamining datamining-algorithms jupyter-notebook machine-learning machine-learning-models machine-learning-projects matplotlib-python model-ensemble module numpy-library pandas-library python sklearn-library

Last synced: 16 May 2026

https://github.com/ireneflorez/exploration_r

Data exploration on the 'White Wine Quality' dataset using R

data-analysis data-visualization r

Last synced: 16 Jun 2026

https://github.com/jelhamm/singular-value-decomposition-data-mining

This repository hosts an implementation of the Singular Value Decomposition (SVD) algorithm tailored for data mining tasks. SVD is utilized for efficient dimensionality reduction, aiding in the extraction of key patterns and features from large and complex datasets.

data-analysis dimension-reduction jyputer-notebook machine-learning matplotlib numpy-library pandas-library preprocessing python scipy-library singular-value-decomposition sklearn-library standardscaler svd svd-matrix-factorisation

Last synced: 18 May 2026

https://github.com/carmoreno/nobelprizes

Final project of Big Data Module.

data-analysis mongodb

Last synced: 29 Apr 2026

https://github.com/dineshdhamodharan24/singapore_flat_resale_

This project focuses on developing a machine learning model to predict the resale values of apartments in Singapore. The goal is to create a user-friendly online application that enables users to obtain accurate predictions for the resale values of specific properties.

data-analysis flat json numpy pandas pickle project python streamlit

Last synced: 07 Apr 2026

https://github.com/dinamohsin/ai-job-market-analysis-using-sql-excel

This project explores a dataset of AI-related jobs to uncover insights about salary trends, in-demand skills, education levels, and remote work preferences. The analysis was done using SQL for querying and Excel for data cleaning and preparation.

data-analysis data-preprocessing excel functions query sql sql-server

Last synced: 25 Jun 2025

https://github.com/vbhvsingh0/nflteam_corr_population

The goal of this project is to find the correlation between NFL teams' win-loss records and the population of their home cities.

data-analysis data-cleaning-and-preprocessing data-manipulation-with-pandas numpy-library pandas-python pearson-correlation python3

Last synced: 04 Mar 2025
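
The Pearson correlation tagged in the nflteam_corr_population entry above can be computed as in the following minimal sketch. It is not taken from the repository, which is tagged as using pandas and NumPy: this version uses only the standard library, the function name is hypothetical, and the sample numbers are placeholders rather than real NFL or population figures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only: not real city populations or win/loss ratios.
populations = [1.0, 2.5, 4.0, 8.5]
win_loss_ratios = [0.40, 0.55, 0.45, 0.60]

print(round(pearson_r(populations, win_loss_ratios), 3))
```

A coefficient near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship. It says nothing about causation.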

https://github.com/shubhammittal-data/sales-customer_dashboard_tableau

An interactive Tableau project showcasing advanced data visualization techniques for sales performance and customer analytics. This dashboard provides key business insights using KPIs, trend analysis, and customer segmentation. Designed for executives, sales managers, and marketing teams to drive data-driven decision-making.

customer-behavior-analysis customer-segmentation data-analysis data-visualization product-analytics sales-analysis tableau tableau-dashboards tableau-public

Last synced: 07 Mar 2026

https://github.com/jlee9503/defense-risk-prediction

Build a machine learning pipeline that ingests defense procurement data, identifies high-risk contracts, and visualizes the results in an interactive dashboard.

data-analysis data-visualization exploratory-data-analysis python

Last synced: 25 Jan 2026

https://github.com/harmanveer-2546/motor-vehicle-accidents-in-india

As per the report, a total of 4,61,312 road accidents have been reported by States and Union Territories (UTs) during the calendar year 2022, which claimed 1,68,491 lives and caused injuries to 4,43,366 persons.

accidents accidents-analysis darkgrid data-analysis eda exploratory-data-analysis indian-roads inline matplotlib motor-vehicles numpy pandas review seaborn visualization

Last synced: 19 Jan 2026

https://github.com/mrendiks/analyst-data-survey-monkey

Learn how to analyze data from a SurveyMonkey dataset using Excel and Python

data-analysis ipynb-jupyter-notebook python

Last synced: 07 Mar 2026

https://github.com/mituskillologies/aiml-dypiemr-sep24

Programs conducted at DYPIEMR, Pune in training on AIML during September 2024.

artificial-intelligence data-analysis data-science machine-learning matplotlib neural-network numpy pandas python3

Last synced: 05 Apr 2025

https://github.com/chahelgupta/fitness-data-analysis-r-project

This project focuses on analyzing fitness data collected from various tracking devices to gain insights into users' activity levels, sleep patterns, calorie expenditure, and heart rate. The dataset used in this project consists of multiple CSV files, each containing different aspects of fitness-related data.

data-analysis data-cleaning data-exploration data-science data-visualization r r-language r-programming r-studio

Last synced: 18 May 2026

https://github.com/jonathancaleb/adap

📊🌱 Agricultural Data Analysis Platform 🌍🚜 A personal initiative to analyze coffee growth trends in Uganda using Python, data science, and machine learning. This project supports sustainable farming with predictive models and interactive visualizations. 🍃📈

data-analysis data-science python

Last synced: 18 May 2026

https://github.com/Fisseha-Estifanos/telecom

A showcase repository for a specific telecommunication company. Used to analyze several telecommunication dataset features and generate useful insights accordingly. The generated insights can be seen at https://github.com/Fisseha-Estifanos/telecom-visualizer or at https://fisseha-estifanos-telecom-visualizer-home-huxgy0.streamlitapp.com/

data-analysis notebooks-jupyter python visual-studio-code visualization

Last synced: 11 Mar 2025

https://github.com/majajuri/text-classification-using-string-kernels

Project for the course Introduction to Data Science

data-analysis string-kernel

Last synced: 05 Apr 2025

https://github.com/ashvinhandoo/bionic-lab-projects

Computational neurophysiology pipelines for analyzing astrocyte and vascular dynamics. Includes Python- and MATLAB-based analysis frameworks for modeling calcium, vasomotion, and pupil-linked activity, demonstrating advanced signal processing, transfer entropy estimation, and data visualization skills used in biomedical research.

biocomputation bioinformatics biomedical-engineering computational-biology data-analysis matlab neuroscience python signal-processing time-series

Last synced: 18 May 2026

https://github.com/martachesnova/python-apis

A weather analysis that randomly selects more than 500 cities across the globe and pulls data from the OpenWeatherMap API for each city. The analysis of the weather and of the perfect vacation spot is viewable in my Jupyter Notebook.

api data-analysis python

Last synced: 24 Feb 2025

https://github.com/martachesnova/python

Created a Python script to calculate and analyze financial records of a company. Created another Python script to do calculations and analysis of the voting process in a small town.

data-analysis python

Last synced: 24 Apr 2026

https://github.com/data-edd/e-commercestore_analysis

This project analyzes e-commerce data to provide insights into sales performance, profitability, and customer behavior using Power BI.

data-analysis powerbi powerbidashboard

Last synced: 02 Feb 2026

https://github.com/lparham2/factors-driving-ev-adoption-charging-station-deployment

This project explores factors driving EV adoption and charging station deployment using Python-based data analysis. It examines sales trends, infrastructure growth, and socioeconomic influences to uncover key insights. The goal is to aid policymakers and businesses in optimizing EV infrastructure and accelerating sustainable transportation.

data-analysis data-visualization electric-vehicle-charging-station electric-vehicles powerpoint-presentations python

Last synced: 18 May 2026

https://github.com/antononcube/wl-mosaicplot-paclet

Wolfram Language (aka Mathematica) paclet for mosaic plots over datasets or lists of records.

data-analysis machine-learning mosaic mosaic-plots

Last synced: 16 Jan 2026

https://github.com/gui-sitton/games

Identify patterns that determine whether a game is successful or not. This will allow you to identify potential big winners and plan advertising campaigns.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 18 May 2026

https://github.com/wikidata/purdue-data-mine-2024

Program materials for WMDE's 2024 Purdue Data Mine project

analytics data-analysis data-quality data-science etl open-data python wikidata wikimedia

Last synced: 12 May 2025

https://github.com/xjwllmsx/profitable-app-profiles

Analyzes Google Play & App Store data to recommend profitable profiles for free, ad-supported mobile apps

data data-analysis data-cleaning jupyter pandas python

Last synced: 18 May 2026

https://github.com/pyramidheadshark/ai-mirea-sem1p

Completed set of all MIREA AI and DA practices (1 sem.)

beginner-friendly data-analysis data-science jupyter mirea

Last synced: 05 Apr 2025

https://github.com/shubhamprajapati7748/end-to-end-house-price-prediction

A machine learning model that accurately predicts housing prices using the Boston Housing dataset by analyzing various house features, and it utilizes a CatBoost model to assist potential buyers or sellers in estimating housing prices.

boston-housing-price-prediction data-analysis data-science-projects machine-learning regression regression-models

Last synced: 30 Oct 2025

https://github.com/ddihora1604/social_media_analysis

A powerful, interactive dashboard for analyzing social media conversations, trends, and network dynamics. This tool allows researchers and analysts to explore patterns in social media data, identify key trends, and detect coordinated behavior.

aiml css data-analysis data-visualization html javascript python

Last synced: 30 Oct 2025

https://github.com/adriangalvanzamora/ecommerce-analytics-olist

Data analysis project based on the Olist Brazilian E-Commerce dataset. Includes data cleaning, exploratory analysis, delivery performance metrics, customer satisfaction modeling, and geospatial insights. Built entirely in Python (Jupyter Notebook) using real-world data from Kaggle.

brazil customer-satisfaction data-analysis data-visualization ecommerce folium geospatial-analysis machine-learning matplotlib notebook pandas plotly python seaborn

Last synced: 06 May 2026

https://github.com/drisskhattabi6/meteo-data-mining

This repo uses data mining techniques to analyze meteorological (meteo) data. The objective is to extract meaningful insights and patterns from the data that can aid in understanding weather phenomena and predicting future weather conditions.

cart data-analysis data-mining data-visualization decision-making decision-tree extract-data extract-insights insights-analytics insights-data k-means knn machine-learning svm

Last synced: 21 Mar 2025

https://github.com/liebsen/overlemon

Overlemon institutional application

data-analysis design devops sysadmin webdev

Last synced: 21 Jul 2025

https://github.com/capjamesg/personal-notebooks

Notebooks for personal experiments with machine learning and computer vision.

data-analysis machine-learning notebooks

Last synced: 03 Apr 2025

https://github.com/bamresearch/utah-saxs-tools

The Utah SAXS Tools (USToo), adapted for Python 3, originally by David P. Goldenberg, 2009-2012

data-analysis saxs small-angle-scattering small-angle-xray-scattering

Last synced: 17 Jan 2026

https://github.com/lavkalsi/tableau-project-stock-market-analysis

The Tableau Project: Stock Market Analysis features a dashboard that combines descriptive, diagnostic, predictive, and prescriptive analytics to provide insights into stock market trends. Using Python for data processing and an LSTM model for forecasting, this project visualizes historical and predicted stock prices to support informed decisions.

dashboard data-analysis deep-learning lstm-model python tableau

Last synced: 18 May 2026

https://github.com/caprogs/paris-events-analyzer

A project to analyze events in Paris using open source data provided by the city.

data data-analysis data-platform dbt docker ingestion python streamlit transformation vizualisation

Last synced: 04 May 2026

https://github.com/rathod-shubham/google-data-analytics

Learning a wide range of skills that are useful in everyday life as well as being a data analyst.

data-analysis data-analysis-in-r data-analyst data-analyst-nanodegree data-analytics data-visualization google

Last synced: 03 Feb 2026

https://github.com/dsrodrigovieira/rossmannsales

This repository contains a project developed to practice data analysis and the application of regression models (supervised learning)

data-analysis data-science machine-learning python telegram-bot xgboost-regression

Last synced: 19 May 2026

https://github.com/kevin-rsj/the-substance-sentiment-analysis

Analyzes Reddit users' comments on the film The Substance (2024) using NLP techniques. The average sentiment score obtained was 0.19, and keywords such as "horror" and "like" stand out among the opinions.

data-analysis notebook python sentiment-analysis tableau visualization

Last synced: 19 May 2026

https://github.com/kianaasd93/faostat

Builds a multilayer perceptron model that can be used to forecast the export value of crop products for a geographical region three years into the future

agriculture data-analysis data-science faostat machine-learning ml multiplayer python rnn

Last synced: 19 May 2026

https://github.com/marcogdepinto/olympichistoryanalysis

Python visual analysis of the Olympic Games history. Kaggle gold medal with 15000+ views, 200+ upvotes and 100+ comments.

data-analysis data-science jupyter-notebook olympic-games python seaborn

Last synced: 29 Apr 2026

https://github.com/shrunga92/5g_qos_data_transformation_python

Resource Allocation in 5G Network Service

5g-nr data-analysis python

Last synced: 19 May 2026

https://github.com/jesusgomez-data/retail-sales-data-analysis

End-to-end retail sales data analysis project using SQL, SQLite and Python (Pandas). Includes data generation, KPIs and business insights.

data-analysis junior-data-analyst pandas portfolio-project python retail-analysis sql sqlite sqlite3

Last synced: 11 Apr 2026

https://github.com/saidabderrahmane/bus_line_supervision

Performance evaluation of the Saint-Sébastien bus line using real data to predict the number of passengers.

beautifulsoup4 data-analysis data-science deep-learning machine-learning python scraper sklearn

Last synced: 11 Apr 2026

https://github.com/jatin-s16/netflix_analysis

This project involves a comprehensive analysis of Netflix's movies and TV shows data using SQL. The goal is to extract valuable insights and answer various business questions based on the dataset. The following README provides a detailed account of the project's objectives, business problems, solutions, findings, and conclusions.

data-analysis excel postgresql sql

Last synced: 19 May 2026

https://github.com/jidesamuell/data-analytics-projects

This is a repository I have created to showcase my skills, share projects, and track my progress in data analytics.

data-analysis excel matplotlib powrebi python sql

Last synced: 04 May 2026

https://github.com/first-coding/aidanalyst

AIDAnalyst is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system ensures error correction in SQL queries, delivering detailed reports and visualizations in a streamlined workflow

data-analysis llm openai prompt-engineering python

Last synced: 19 May 2026

https://github.com/vubacktracking/freecodecamp-data-analysis-with-python

5 Projects in Data Analysis With Python Course on Freecodecamp

data-analysis freecodecamp freecodecamp-project python

Last synced: 19 May 2026

https://github.com/hamzacham/data_set-projet-8

Analyzing a real world data-set with SQL and Python

data-analysis database dataset jupyter-notebook paython sql

Last synced: 19 May 2026

https://github.com/bjornmelin/data-analytics-playground

🧐 Collection of academic data analytics projects showcasing exploratory data analysis, geographic visualization, and interactive dashboards.

data-analysis data-analytics data-visualization geographic-analysis ggplot interactive-maps leaflet r r-programming shiny tidyverse

Last synced: 06 Apr 2025

https://github.com/abdoomohamedd/data-science-projects

A collection of data science projects ranging from exploratory data analysis to predictive modeling and clustering. Each project is designed to solve specific problems or explore particular datasets using various data science techniques and tools.

data-analysis data-analysis-python data-cleaning data-science data-visualization machine-learning machine-learning-algorithms

Last synced: 14 May 2025