An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mpoojithavigneswari/bangalore-house-price-prediction

This project involves creating a website that predicts Bangalore house prices with 94.65% accuracy using a machine learning algorithm.

data-analysis data-science flask-server machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/skivhisink/econometricnsu

Семестровый магистерский курс по эконометрике на первом курсе магистратуры экономического факультета НГУ

data-analysis econometrics economics education nsu r

Last synced: 09 Apr 2025

https://github.com/sarveshdhond/top_25_cad_stocks

In this project I have used Python Jupyter lab and Pandas to import data set from Yahoo stocks website. I have imported the top 25 most active Canadian stocks on 12th July 2024. This project shows skills such as Python, Web Scrapping and Pandas.

data-analysis pandas-dataframe python webscraping

Last synced: 01 Apr 2025

https://github.com/fatihilhan42/the-office-eda

Data analysis study of my favorite sitcom, The Office (US).

data-analysis data-science data-visualization fatihilhan office python sitcom

Last synced: 04 May 2026

https://github.com/abdelrahmanbayoumi/titanic-machine-learning-from-disasters

Knowing from a training set of samples listing passengers who survived or did not survive the Titanic disaster, can our model determine based on a given test dataset not containing the survival information, if these passengers in the test dataset survived or not.

data-analysis data-science data-visualization machine-learning pandas

Last synced: 09 Apr 2025

https://github.com/borjamome/soho_cholera

Cholera deaths in the Soho District (London)

data-analysis data-visualization london r

Last synced: 04 Sep 2025

https://github.com/b-varun-reddy/fairwai-bias-detection

Submission for the FairwAI Hospitality Intern Challenge. This project analyzes bias signals in Yelp hospitality reviews using open-source data, Python, and fairness-focused keyword detection.

bias-detection data-analysis ethical-ai fairness hospitality machine-learning natural-language-processing python social-impact yelp-dataset

Last synced: 19 Apr 2025

https://github.com/manel15279/datamining-project

A university project that aims to explore various data mining techniques like Data Exploration, Association Rule Mining, Supervised and Unsupervised Learning, applied to real-world datasets, focusing on soil fertility analysis and COVID-19 cases evolution over time.

covid-19 data-analysis data-mining data-visualization datascience gradio machine-learning python soil-properties

Last synced: 10 Jun 2025

https://github.com/kernix13/github-readme-seo-analysis

A Jupyter Notebook GitHub README and Repo SEO Analysis to determine what makes a repo rank in the SERPS

accessibility data-analysis readme seo seo-analysis

Last synced: 29 May 2026

https://github.com/yanny-alt/competitor-sales-analysis-in-power-bi

This project aims to analyze competitor sales for a fictional manufacturing company, Sintec, using Power BI. The focus is on integrating, cleaning, and modeling data from multiple sources to generate insightful reports on company and competitor performance.

data-analysis powerbi sales-analysis

Last synced: 07 Jan 2026

https://github.com/tolumie/rfm-marketing-analysis

This project focuses on RFM (Recency, Frequency, and Monetary) Analysis, a powerful customer segmentation technique used in marketing and business analytics. The analysis helps businesses identify their most valuable customers, potential loyalists, at-risk customers, and churned users.

business-analytics customer-behavior-analysis customer-loyalty customer-retention customer-segmentation-analysis data-analysis data-driven-decisions ecommerce marketing-analytics python

Last synced: 18 May 2026

https://github.com/lfariello/atmospheric_reentry

Matlab code for the determination of the reentry trajectory, deceleration profiles, and heat flux of the ARD capsule during orbital reentry into Earth's atmosphere.

data-analysis heat-flux-prediction heat-transfer hypersonic hypersonic-capsule matlab-programming trajectory-prediction

Last synced: 23 Mar 2025

https://github.com/soham7998/data-analysis-projects

My Data Analysis Projects which are completed by me and gain a hands on Experience from each project. the project showcase different Concepts , Visualization and many things.

data data-analysis data-science machine-learning nlp python soham visualization

Last synced: 04 May 2026

https://github.com/mikeesto/ausvotes19

:bird: A collection of 67,284 public tweets published on the night of the 2019 Australian election

australia data-analysis data-visualization elections open-data twitter

Last synced: 06 Apr 2025

https://github.com/prakashjha1/new-analysis-using-llm-locally

An interactive news analysis tool built with Streamlit and local LLMs. This app allows users to analyze and gain insights from the latest news articles using advanced language models, all running locally. Explore trends, sentiment, and key topics with an intuitive interface.

artificial-intelligence data-analysis data-science llms ollama python streamlit

Last synced: 14 Mar 2025

https://github.com/thbaylson/datascience

All of my past data science assignments put into one singular notebook. Most of this comes from my Machine Learning course.

data-analysis data-science data-visualization decision-tree jupyter-notebook k-nearest-neighbors linear-regression machine-learning neural-network pandas-library python3 scikit-learn

Last synced: 09 May 2026

https://github.com/tszon/data-science-projects

Included are all the worth-noting Data Science projects in my learning journey with DataCamp.

data-analysis data-science exploratory-data-analysis feature-engineering machine-learning modelling preprocessing-data scikit-learn supervised-learning

Last synced: 15 Mar 2025

https://github.com/ibrahimm7004/supermarket-sales-analysis

This project focuses on Data Mining techniques to gather inisights about customer behaviour regarding Supermarket Sales. Includes: Association Rule Mining, Temporal Patterns in customer behavior, Sequential Pattern Mining, Classification, Regression, and Outlier Detection.

apriori association-rules data-analysis data-mining data-science data-visualization fpgrowth python sales-analysis supermarket-sales

Last synced: 04 May 2026

https://github.com/jnyambok/epl_dashboard

English Premier League Dashboard summarizing match data from 2009-2024

data-analysis data-science gcp powerbi

Last synced: 04 Sep 2025

https://github.com/ehsan-behzadi/online-retail-data-analysis-and-preprocessing

This project analyzes and preprocesses the Online Retail dataset to uncover insights into customer purchasing behaviors, sales trends, and product performance. It includes data cleaning, exploration, and visualization, with the goal of enhancing understanding of online retail dynamics.

cohort-analysis data-analysis data-cleaning data-exploration duplicate-detection exploratory-data-analysis-eda feature-encoding feature-engineering handling-missing-values online-retail outlier-detection preprocessing trends-visualization visualization z-score-method

Last synced: 16 Apr 2026

https://github.com/nilayhangarge/data-analysis-with-python

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

data-acquisition data-analysis data-analytics data-binning data-cleaning data-engineering data-fundamentals data-insights data-integration data-preprocessing data-science data-wrangling numpy pandas python

Last synced: 12 Apr 2026

https://github.com/noorulhudaajmal/customer-segmentation-analysis

Customer segmentation and analysis of purchasing behaviour

cluster-analysis customer-segmentation data-analysis

Last synced: 07 Oct 2025

https://github.com/syed-amjad-ali/-bank-churn-ml

Predicting bank customer churn using machine learning. This project includes exploratory data analysis (EDA), feature engineering, classification models (Logistic Regression, Random Forest), and customer segmentation using K-Means clustering.

classification data-analysis data-science eda jupyter-notebook k-means-clustering machine-learning ml python segmentation

Last synced: 09 Mar 2025

https://github.com/akashvarma26/data-analysis-on-olympics-csv-dataset

Data Analysis on Olympics dataset of csv format using re and Pandas in Jupyter notebook.

data-analysis jupyter-notebook pandas regex

Last synced: 02 May 2026

https://github.com/reddyprasade/r-program

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

data-analysis data-science r-programming

Last synced: 11 Apr 2026

https://github.com/jbalooshie/election_analysis

A Python script built to analyze specific election's results, and be re-purposed to analyze the results of other elections. The script provides you with different breakdowns of the vote based on candidate and county,

data-analysis data-science elections python

Last synced: 09 Apr 2025

https://github.com/wisdom-osborn/data-analytics-course-online-

🔍 Data Analytics with Python — Hands-on Course Materials Jupyter notebooks, projects, and datasets based on the freeCodeCamp Data Analysis with Python certification. Learn NumPy, Pandas, data cleaning, and visualization through real-world examples

data data-analysis data-science data-visualization freecodecamp numpy pandas pandas-dataframe project python

Last synced: 19 Apr 2026

https://github.com/sweta-kaundilya/python_for_data_analysis

Learning Python and all the relevant libraries in python for Data field.

cufflinks data-analysis data-science matplotlib numpy pandas plotly python seaborn

Last synced: 04 May 2026

https://github.com/bkataru/physics-e.e

Project repository for IB physics extended essay. Topic: Predictive data modeling of a variable binary star’s brightness over a period of time using astrostatistics.

astrometry astronomical-algorithms astronomical-images astronomy astrophotography astrostatistics data-analysis data-science data-visualization modeling physics polynomial-regression regression-analysis

Last synced: 09 Apr 2025

https://github.com/mnoalett/cscrawler

BSc degree thesis - crawler for www.couchsurfing.org

bsc-thesis couchsurfing crawler data-analysis database python

Last synced: 02 May 2026

https://github.com/adilshamim8/eda-on-health-and-sleep-data

Exploratory Data Analysis (EDA) on health and sleep data, uncovering patterns and insights using Python and visualization tools.

data-analysis data-visualization eda health healthcare sleep sleep-analysis

Last synced: 15 Mar 2025

https://github.com/devexpress-examples/wpf-pivot-grid-define-custom-cell-template-to-performing-data-editing

This example shows how to edit a cell with the cell editing template in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 02 May 2026

https://github.com/rohitblaze10/netflix_analysis_using_tableau

The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.

data data-analysis data-science data-visualization netflix tableau

Last synced: 04 Feb 2026

https://github.com/pngo1997/life-expectancy-logistic-regression

Life expectancy analysis project using logistic regression.

data-analysis logistic-regression r rmarkdown

Last synced: 10 Jun 2026

https://github.com/rtgrt5645/numpy-lab

🧮 Explore, manipulate, and visualize data with NumPy to enhance your Python skills in scientific computing and data analysis.

array-operations data-analysis data-science jupyter-notebook machine-learning numerical-computing numpy numpy-arrays numpy-library numpy-python python python3 scientific-computing

Last synced: 04 May 2026

https://github.com/ljadhav25/logistic-regression-data-science-

Logistic regression estimates the probability of an event occurring, such as voted or didn’t vote, based on a given data set of independent variables.

data-analysis data-science data-visualization logestic-regression machine-learning

Last synced: 04 May 2026

https://github.com/abelarduu/power_bi_analyst

Projeto Power BI para relatório de dados financeiros, com navegação intuitiva e recursos interativos. Oferece uma experiência completa ao usuário, combinando apresentação sofisticada e funcionalidade eficaz para análise de dados.

dashboard data-analysis data-analytics modelagem-de-dados powerbi tratamento-de-dados

Last synced: 08 Sep 2025

https://github.com/bhaskarbharati/ibm-datascience-hands-on-lab

This is the basic hands-on exercise using Jupyter Notebook. This lab is done in the process of learning course Tools For Data Science | IBM

data-analysis data-science data-visualization datawrangling eda machine-learning

Last synced: 23 Apr 2025

https://github.com/kailenroa/dashboad-excel-huisprijzen

This project focuses on developing a dashboard powered by Funda to visualize house pricing in the Netherlands. The dashboard simplifies the home-buying process by allowing users to compare prices, energy labels, number of rooms, and square meters across different provinces, all in one interactive platform..

dashboard data-analysis excel house-prices

Last synced: 05 Jan 2026

https://github.com/ankitmishralive/machinelearning

Continuously deep diving in understanding & advancing my expertise in Machine Learning through ongoing education and hands on experience with practical learning.

artificial-intelligence data-analysis data-cleaning data-gathering machine-learning machinel-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 22 Mar 2025

https://github.com/annnieglez/nlp-stock-market-and-news

This project focuses on detecting fake news from news headlines using advanced Natural Language Processing (NLP) techniques. It combines sentiment analysis with news headlines embeddings, generated from Hugging Face transformer models, to train a binary classification model that distinguishes between real and fake news.

classification-model data-analysis embeddings machine-learning machine-learning-models nlp nlp-deep-learning nlp-machine-learning python scraping-websites sentiment-analysis

Last synced: 25 Apr 2026

https://github.com/taralas209/moscow-programmer-salaries-analysis-dvmn

A Python script analyzing the average salaries of programmers in Moscow by popular programming languages using data from HeadHunter and SuperJob.

api data-analysis headhunter job-market-analysis python superjob

Last synced: 15 Mar 2025

https://github.com/deliprofesor/joblocationmapper

JobLocationMapper is a Python tool that visualizes job listings on an interactive map. It uses city and state data to place job markers accurately and color-codes them by occupation (Software, Marketing, Design). The map clusters markers for better organization, and users can click on them to view job details.

clustrered-markers data-analysis data-visualization folium geocoding geographical-visualization interactive-map job-listings map-visualization pandas python

Last synced: 14 May 2026

https://github.com/abhash-rai/analyzing-credit-card-eligibility

This work was performed as part of BCU undergraduate course.

data-analysis data-visualization ggplot ggplot2 latex r

Last synced: 20 Jan 2026

https://github.com/faysalalmahmud/bd-med-professional-analysis

Analysis of healthcare professionals in Bangladesh through web scraping, data processing, and interactive visualization.

data-analysis data-visualization jupyter-notebook python scraper selenium selenium-webdriver tableau

Last synced: 04 Sep 2025

https://github.com/felpzreiz/stockdata_pipeline

Este projeto consiste no desenvolvimento de um pipeline de dados que consome informações financeiras de uma API da Bolsa de Valores Americana (StockData.org) para análise e tratamento. Utilizando Python e bibliotecas como pandas, matplotlib e pyarrow

api data-analysis data-science jupyter-notebook pandas python

Last synced: 19 Apr 2026

https://github.com/bishopce16/pyber_analysis

The purpose of this project was to complete an exploratory analysis and create visualizations of the 2019 ride sharing data from PyBer.

data-analysis data-visualization jupyter-notebook matplotlib pandas python

Last synced: 04 May 2026

https://github.com/loginchik/mid_contracts

Анализ контрактов государственных закупок МИДа РФ

data-analysis dataset pandas python

Last synced: 17 Apr 2025

https://github.com/ymorsi7/caliwageanalysis

California employment and wage analysis on data from the past decade.

data-analysis data-science ipynb jupyter-notebook

Last synced: 21 Jan 2026

https://github.com/kernelshreyak/kaggle-notebooks

Collection of my Kaggle notebooks for data analysis and machine learning on a variety of datasets

data-analysis data-science data-visualization kaggle kaggle-competition machine-learning

Last synced: 27 Apr 2026

https://github.com/devlucho/modelos-predictivos

Modelos predictivos utilizando los algoritmos de Regresión Lineal, Regresión Logística y Árboles de Decisión.

data-analysis jupyter-notebook python3

Last synced: 03 May 2026

https://github.com/codeslash21/tmdb_data_analysis

We analysed TMDB dataset which contains around 11000 movies details. We analyzed to find some interesting facts about the dataset.

data-analysis data-visualization matplotlib nanodegree-project numpy pandas python tmdb-movie

Last synced: 03 May 2026

https://github.com/salma-mamdoh/project-writing-functions-for-product-analysis

My Project to learn the Basics of Analysis on DataCamp

data-analysis data-camp pandas python

Last synced: 03 May 2026

https://github.com/syed-m-nofel/python-data-science-fundamentals

Python notebooks for data manipulation (Pandas/NumPy) and API workflows – from basics to practical examples.

api beginner-friendly data-analysis data-science http-requests jupyter-notebook numpy pandas pandas-dataframe python tutorial

Last synced: 03 May 2026

https://github.com/nurulashraf/logistic-regression-loan-prediction

Loan approval prediction using logistic regression based on applicant data, including income, credit history, and property details, after data preparation and feature engineering.

data-analysis data-science loan-prediction logistic-regression machine-learning predictive-modeling python sklearn

Last synced: 03 May 2026

https://github.com/matteospanio/speed-analysis

A project to analyze the internet speed

bash-script data-analysis

Last synced: 03 May 2026

https://github.com/iguptashubham/ev-market-exploration

So, market size analysis is a crucial aspect of market research that determines the potential sales volume within a given market

data-analysis data-analysis-projects data-science-project forecast projects python

Last synced: 03 May 2026

https://github.com/samruddhi3012/screen-time-analysis

Hi! This repo demonstrates a python project on Screen Time Analysis.

data-analysis data-visualization python

Last synced: 04 May 2026

https://github.com/nickenshidqia/uber-new-york-data-analysis

Analyze Uber pickups on New York to get insight from this data

data-analysis data-analyst exploratory-data-analysis python

Last synced: 04 May 2026

https://github.com/xiaohan2012/myunisport

Visualize your Unisport annual training records

data-analysis data-visualization pandas pygal sports-stats tikzposter

Last synced: 04 May 2026

https://github.com/sanchittechnogeek/rental-data-visualization_python

Statistics and visualization of rental data with python

data-analysis data-science data-visualization statistics

Last synced: 04 May 2026

https://github.com/damisparks/become_data_analyst

Are you new to Data Analysis ? Here you will find simple notebook that will help through your journey. These are personal projects I work on and still working.

data data-analysis data-visualization matplotlib numpy pandas-tutorial

Last synced: 04 May 2026

https://github.com/mr-chang95/sf_data_visualization

In this personal project, I am interested in examining all of the active businesses in the San Francisco Bay Area while performing some simple data visualizations, mainly on categorical variables.

business data-analysis data-visualization jupyter-notebook pandas python san-francisco

Last synced: 04 May 2026

https://github.com/fatihilhan42/book-recommendation-system-with-python

In this project, we are making a book recommendation system that recommends similar books according to the genres or ratings that the user enters, using a large book dataset. The link of the dataset is given below. Happy reading...

books data-analysis data-science data-visualization kaggle python recommendation-engine recommendation-system

Last synced: 04 May 2026

https://github.com/hyperplasma/olympic-visualization-analysis

Multidimensional analysis and visualization of Olympic medals, economy, and happiness index.

data-analysis data-visualization matplotlib numpy pandas python wordcloud

Last synced: 04 May 2026

https://github.com/halyusa16/e-commerce-analysis

This project analyzes a public e-commerce dataset to uncover valuable insights and answer critical business questions. The dataset contains customer, product, order, and transaction details, providing a comprehensive view of the e-commerce platform's operations.

data-analysis data-cleaning data-exploration data-visualization self-project

Last synced: 09 Jun 2026

https://github.com/abhinav330/911-emergency-calls-analysis

This Python Notebook analyzes emergency call data from the '911.csv' dataset. It uses various data visualization techniques to explore and gain insights into the emergency call data, including the types of calls, reasons for calls, and call patterns over time.

data-analysis data-science data-visualization eda exploratory-data-analysis exploratory-data-visualizations numpy pandas python

Last synced: 09 Jun 2026

https://github.com/saitoxu/data-analysis-workspace

Docker image for data analysis

data-analysis docker python

Last synced: 04 May 2026

https://github.com/youssefyaser/scrape-the-imdb-site-for-the-top-250-movies

Web scraping the top 250 movies in IMDB site.

data-analysis numpy pandas python

Last synced: 04 May 2026

https://github.com/drod75/nyc-arrests-analysis

This is a simple Data Science Project made to analyze and display data and trends found within the NYC Arrests Year to Date Dataset.

data-analysis data-visualization folium jupyter-notebook matplotlib-pyplot nyc-opendata nypd python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/flytomarsz/bike-sharing-system-analysis

This analysis project aim to identify bike rental's behavior in 2012 from Capital Bikeshare system, Washington D.C., USA. This project is part of my Data Analysis study at Dicoding.

data-analysis data-visualization jupyter-notebook python streamlit

Last synced: 04 May 2026

https://github.com/tasosfotiadis/time-series-analysis-and-forecasting-of-cryptocurrency-prices

Forecasted Cardano (ADA) cryptocurrency prices using time series analysis. The project involved data preprocessing, trend and seasonality analysis, and model building with ARIMA, SARIMA, and LSTM. Models were evaluated using metrics like MAE and MAPE, providing insights for financial decision-making.

applied-st classical-statistical-models data-analysis deep-learning lstm machine-learning neural-network python r time-series

Last synced: 05 May 2026

https://github.com/dhruvsrikanth/basic-data-science

A short Data Science Project I took up for fun! This is a data analysis based on a dataset I created to predict the distribution of wealth within an economy as well as several characteristics of each class within society!

analysis data-analysis data-pipeline data-science data-visualization machine-learning matplotlib pandas python seaborn sklearn

Last synced: 05 May 2026

https://github.com/badranalyst/residential-unit-prices-data-analysis-application

Python-based analysis of residential unit prices, focusing on data cleaning, visualization, and exploratory data analysis (EDA). Key features include price distribution, and correlation analysis between factors like size, location, and pricing.

data-analysis data-visualization dataset matplotlib numpy pandas python seaborn

Last synced: 05 May 2026

https://github.com/jacktheprogrammer/time-series-forecasting-and-analysis

My personal project consisting of my personally created notebooks to work with time series forecasting and analysis. In these projects, I've used deep learning using tensorflow, xgboost, statsmodels and scipy libraries of python. The series were of weather, energy consumption and that of stocks.

data-analysis data-science deep-neural-networks energy-consumption machine-learning portfolio prophet-facebook prophet-model python python3 scipy statsmodels stocks tensorflow time-series time-series-analysis timeseries-forecasting weather xgboost

Last synced: 05 May 2026

https://github.com/codewithmayank-py/box-office-analysis-with-seaborn-and-python

This repository contains Python code and datasets for analyzing box office data. Explore trends, patterns, and factors influencing movie performance.

analysis box-office-data-analysis data-analysis data-visualization dataset jupyter-notebook matplotlib pandas python3 seaborn

Last synced: 05 May 2026

https://github.com/pcanadas/weather_scraper

Este proyecto automatiza la recopilación y el procesamiento de datos meteorológicos históricos y previsionales. Utiliza Selenium para extraer información de sitios web de clima, procesa los datos con Pandas y los almacena en archivos CSV limpios. Es ideal para análisis climáticos, visualización de datos o integración en otros sistemas.

beautifulsoup data-analysis pandas python selenium

Last synced: 05 May 2026

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction A machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/akash-47-tank/personalized-e-commerce-review-summarizer

Personalized E-commerce Product Review Summarizer: A Streamlit app that summarizes product reviews (e.g., from a CSV) using T5-small and tailors summaries to user preferences (price, durability, etc.) with NLP and lightweight ML.

data-analysis e-commerce machine-learning nlp personalization portfolio python scikit-learn sentiment-analysis streamlit t5 transformers web-app

Last synced: 05 May 2026

https://github.com/ayaatmohammed/amazon-sales-analysis-pyspark

In-depth analysis of the Olist E-commerce dataset from Kaggle using PySpark for customer segmentation (RFM) and market basket analysis.

big-data big-data-analytics customer-segmentation data-analysis data-science ecommerce jupyter-notebook kaggle pyspark python rfm-analysis

Last synced: 05 May 2026