An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/henriquetourinho/s.i.g.m.a

File search and analysis platform for Linux, with an advanced PySide6 GUI and a focus on rich metadata for in-depth investigations.

data-analysis developer-tools file-search metadata open-source pyqt pyside6 python python-brasil qt6 sysadmin-tools

Last synced: 24 Apr 2026

https://github.com/voidnire/redditviralmysteryposts

Analysis of posts from mystery subreddits. What defines a viral post in this kind of sub?

data-analysis data-visualization mysteries mystery nlms python-3 reddit

Last synced: 24 Apr 2026

https://github.com/muthukumar0908/youtube-data-harvesting-and-warehousing-using-sql-mongodb-and-streamlit

A simple, intuitive Streamlit user interface that extracts data from YouTube using an API key and stores it in a database.

data-analysis mongodb-atlas python sqldatabase streamlit-webapp youtube-api

Last synced: 24 Apr 2026

https://github.com/manisharora96/data-analysis-of-smartwatch

The project is structured with sample data, step-by-step Jupyter notebooks, and modular Python scripts for automated analysis.

data-analysis data-visualization jupyter-notebook python smartwatch-analysis

Last synced: 24 Apr 2026

https://github.com/cyberoctane29/python-for-data-analysis

A repository dedicated to learning Python for data analysis, data science, and data analytics. This collection of Jupyter notebooks covers practical exercises and concepts from the Google Advanced Data Analytics Professional Certificate program.

data data-analysis data-analytics data-science python

Last synced: 24 Apr 2026

https://github.com/pedrohdosanjos/economic-data-analysis

This project aims to analyze the export data from various states in the United States to Brazil over time. The data is sourced from the FRED (Federal Reserve Economic Data) API and processed to identify the top 5 exporting states for each year, as well as the states with the highest total export value across all years.

api data-analysis data-visualization jupyter-notebook python

Last synced: 24 Apr 2026

https://github.com/mehmetkahya0/gallstone_dataset_analysis_project

Gallstone Disease (Gallstone-1) Dataset Analysis (https://archive.ics.uci.edu/dataset/1150/gallstone-1)

analysis analytics data data-analysis data-science data-visualization database graph matplotlib python

Last synced: 25 Apr 2026

https://github.com/fbarffmann/belly-button-challenge

Built an interactive JavaScript dashboard to visualize bacterial biodiversity from belly button samples. Analyzed data from 153 participants and identified OTU 1167 as the most common bacteria.

biodiversity dashboard data-analysis data-visualization interactive-charts javascript json plotly

Last synced: 25 Apr 2026

https://github.com/tmoulik/bikeshare-python

Analysis of Bikeshare data from three major cities

data-analysis data-visualization python udacity-nanodegree

Last synced: 25 Apr 2026

https://github.com/xjwllmsx/hacker-news-engagement

Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.

data data-analysis jupyter python

Last synced: 25 Apr 2026

https://github.com/ddihora1604/iit_patna

A multifaceted project involving applying ML models like Ridge Classifier, RNN, RIDOR, Rotation Forest and RUSBoost, integrating SMOTE for class balancing, and handling diverse datasets including those for seating arrangement tasks.

data-analysis data-visualization datamodelling machine-learning-algorithms python

Last synced: 25 Apr 2026

https://github.com/m-biriulova/python-job-market-analysis

Web scraping, data analysis, and visualization of Python developer vacancies in the Czech Republic.

automation beautifulsoup data-analysis data-visualization portfolio-project python selenium web-scraping

Last synced: 25 Apr 2026

https://github.com/sarangs1621/weather-prediction

Weather Prediction Using Machine Learning is a project that leverages machine learning algorithms to predict weather conditions based on historical data. It evaluates three popular ML models (Decision Tree, KNN, and Logistic Regression) and provides performance insights through metrics and visualizations.

data-analysis decision-tree jupyter-notebook knn logistic-regression machine-learning predictive-modeling python scikit-learn weather-prediction

Last synced: 25 Apr 2026

https://github.com/edwinrlambert/investigating-netflix-movies

Demonstrates data analysis and visualization techniques for Netflix movies using Python in a Jupyter notebook. This is a DataCamp project.

data-analysis data-analysis-python netflix python

Last synced: 25 Apr 2026

https://github.com/devexpress-examples/winforms-create-a-custom-exporter-for-pivotgridcontrol-with-xtrareport

This example illustrates how to dynamically create a custom report based on PivotGridControl content in WinForms.

data-analysis dotnet pivot-grid pivot-grid-for-winforms winforms

Last synced: 26 Apr 2026

https://github.com/devexpress-examples/wpf-pivotgrid-customize-the-cell-template

This example demonstrates how to customize the cell appearance in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 26 Apr 2026

https://github.com/dcs-training/2023-10-22-carpentry-social-science

Go to https://dcs-training.github.io/2023-10-22-Carpentry-Social-Science/ to follow along with the material.

data-analysis data-visualisation data-wrangling intro-to-programming r

Last synced: 06 Jun 2026

https://github.com/akashvarma26/data-analysis-on-imbd-using-sqlite3

Data Analysis on IMDb dataset using sqlite3 and Pandas in Jupyter notebook.

data-analysis jupyter-notebook pandas-dataframe sqlite

Last synced: 27 Apr 2026

https://github.com/mumtaz4118/amazon-iphone-12-data-scrapped

Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

data-analysis data-extraction data-science data-scraping html mark-up python

Last synced: 27 Apr 2026

https://github.com/malexandersalazar/covid-19-peru-estimacion-oxigeno-requerido

Technical analysis of confirmed COVID-19 cases in Peru to estimate the medical oxygen required.

covid-19 data-analysis data-science peru python

Last synced: 27 Apr 2026

https://github.com/busesimsek/sql-projects

A collection of my SQL projects with insights into real-world datasets.

data-analysis data-analytics mysql sql

Last synced: 07 Jun 2026

https://github.com/hfzdzakii/dicoding-airqualityanalysisdata

This repo is a master submission for my Dicoding Final Project. The Air Quality Dataset is used to fulfill the submission. Feel free to explore; I hope my work gives you some insight!

data-analysis data-visualization streamlit

Last synced: 27 Apr 2026

https://github.com/banyc/dfplot

Summarize a data frame by plotting. `cargo install --git https://github.com/Banyc/dfplot.git`.

csv data-analysis plotly plotting statistics

Last synced: 27 Apr 2026

https://github.com/airdac/ml-palmerpenguins

Classification and analysis of the palmerpenguins dataset in Python. Team project from UPC's Master's Degree in Data Science

classification data-analysis data-science machine-learning palmer-penguin python upc

Last synced: 07 Jun 2026

https://github.com/mango606/da__

Data Analysis Programming assignment, September 2021

data-analysis python task

Last synced: 27 Apr 2026

https://github.com/lotfiferaga/hotel-reviews-sentiment-analysis

Efficient Python-driven sentiment analysis for hotel reviews, providing insightful evaluations.

data-analysis data-visualization nlp python

Last synced: 07 Jun 2026

https://github.com/l2nce/datamining-study

Introduction to data mining

data-analysis data-mining matplotlib numpy panda

Last synced: 28 Apr 2026

https://github.com/sweta2501/netflix_dataanalysis

A data analysis of Netflix data.

data-analysis data-science jupyter-notebook python

Last synced: 28 Apr 2026

https://github.com/sujata-adhikari/data-analysis

Data analysis of market sales data using Power BI, with a dashboard presenting the results.

data-analysis excel pandas powerbi

Last synced: 12 Jun 2026

https://github.com/simranshaikh20/diwali-sales-analysis-for-business-insights

A data analysis project on Diwali sales. It breaks down sales volume by state, gender, and age.

data-analysis data-visualization python

Last synced: 28 Apr 2026

https://github.com/stefagnone/movies-dataset-analysis-project

Comprehensive analysis of the Movies dataset, exploring genre trends, comparisons, and qualitative insights using Python, Pandas, and visualizations. Designed to uncover actionable findings for stakeholders.

data-analysis data-visualization exploratory-data-analysis matplotlib movies-analysis pandas python seaborn storytelling-with-data

Last synced: 28 Apr 2026

https://github.com/datalopes1/warehouse_rfv

This project performs an RFV (Recency, Frequency, Value) analysis using data I found in a YouTube video from the Jie Jenn channel.

analise-rfv data-analysis data-science kmeans python rfm-analysis

Last synced: 28 Apr 2026

https://github.com/hadson0/chess-live-ratings-data

A study project focused on web scraping the live chess ratings from chess.com, with data analysis and visualization on nearly 5000 players in the classical world ranking.

beautifulsoup chess data-analysis data-visualization numpy pandas python seaborn web-scraping

Last synced: 28 Apr 2026

https://github.com/emmanuelletocs/steam-game-recommender

A powerful recommendation system for Steam games, combining Content-Based and Collaborative Filtering techniques. Built with Python, Scikit-learn, and Streamlit to deliver accurate, real-time game recommendations. Perfect for gamers and data scientists interested in building intelligent recommendation engines.

als-algorithm data-analysis gaming-industry knn machine-learning mds mysql ncf neural-network pyspark recommendation-engine recommendation-system scikit-learn spark

Last synced: 28 Apr 2026

https://github.com/elmezianech/autoinventory

This project is an end-to-end, fully automated warehouse management solution designed to tackle real-world inventory challenges in the FMCG sector. From real-time data ingestion and predictive analytics to interactive dashboards, this project combines cutting-edge technologies and an event-driven architecture to simulate a business-ready system.

automation dashboard data-analysis data-engineering-pipeline docker etl glue-job inventory-management kafka kpis lambda-functions lstm ml-pipeline mlflow power-bi pytorch redshift s3 streamlit warehouse-management

Last synced: 28 Apr 2026

https://github.com/abdeldjalilchafai/us-flight-delay-eda

Structured EDA on 2015 US flight delay data. Clean, reproducible notebook using a 6-step data analysis framework for real-world datasets.

data-analysis data-cleaning eda exploratory-data-analysis flight-delays kaggle matplotlib numpy pandas python seaborn

Last synced: 28 Apr 2026

https://github.com/george-njuguna/spotify-etl-pipeline

This is an ETL pipeline that uses the Spotify API, Docker, and Airflow.

apache-airflow data-analysis docker pipelines python

Last synced: 28 Apr 2026

https://github.com/dcs-training/decode-winterschool

Here you can find material on cluster analysis, data wrangling, and network analysis. See the README file for more info.

data-analysis data-visualisation data-wrangling gephi network-analysis python r statistics

Last synced: 28 Apr 2026

https://github.com/manalisbhavsar/stock-price-prediction

Stock Price Prediction model using Machine Learning and LSTM to forecast future stock prices based on historical data. Achieved a low error rate of 3.2% by leveraging moving averages and deep learning techniques, ensuring accurate predictions.

data-analysis deep-learning lstm machine-learning matplotlib numpy pandas python

Last synced: 28 Apr 2026

https://github.com/abhi227070/car-price-prediction

This project implements a machine learning model to predict the price of cars based on various features such as mileage, manufacturing date, fuel type, and more. Users can input car information, and the model will estimate the price of the car based on the provided data. This tool can be useful for both car buyers and sellers to estimate car price.

data-analysis machine-learning machine-learning-algorithms machinelearning python3 regression regression-models scikit-learn scikitlearn-machine-learning

Last synced: 28 Apr 2026

https://github.com/buabaj/fortran-assignment

Code repository for a Fortran and Python climatology assignment.

big-data climatology data-analysis data-visualization fortran90 python

Last synced: 28 Apr 2026

https://github.com/priyanshubiswas-tech/e-commerce_data_analysis

Analyzes 9,994 e-commerce transactions to uncover insights on sales trends, customer behavior, profitability, and logistics using EDA and visualization. Identifies top products, customer segments, and shipping efficiencies to optimize marketing, inventory, and operations, making it valuable for retail, finance, and logistics.

data data-analysis data-visualization pandas pandas-dataframe plotly-analytics-projects plotly-express python

Last synced: 28 Apr 2026

https://github.com/szapp/candyanalysis

Case study: Analyze the candy power ranking to identify and recommend popular candy characteristics

data-analysis data-visualization feature-selection interaction-terms

Last synced: 28 Apr 2026

https://github.com/wei-rongrong2/openfoodfactclustering

A project that explores clustering food products based on nutritional attributes using K-Means, Fuzzy C-Means, and DBSCAN algorithms, with a Streamlit dashboard for visualizing results.

clustering dashboard data-analysis dbscan food-products fuzzy-cmeans k-means machine-learning nutrition nutrition-clustering open-food-facts streamlit

Last synced: 28 Apr 2026

https://github.com/gaurav-van/optimizing-rate-of-penetration-in-geothermal-drilling-a-digital-twin-approach

Let’s explore something interesting together. In this project, we developed a machine learning digital twin using Intel-optimized XGBoost and daal4py to simulate and optimize the Rate of Penetration (ROP) in geothermal drilling. We leveraged SHAP for Explainable AI (XAI) to interpret model predictions.

data-analysis data-science digital-twin explainable-ai geothermal geothermal-energy jupyter-notebook machine-learning python shap xai xgboost

Last synced: 28 Apr 2026

https://github.com/josedanielchg/efficient-data-storage-for-predictive-modeling

DataCamp project from the Associate Data Scientist track, focusing on optimizing dataset storage by transforming data types and filtering. Prepares data for efficient machine learning workflows.

cleaning-dataset data-analysis jupyter-notebook python

Last synced: 28 Apr 2026

https://github.com/kisaa-fatima/data-visualization-with-tableauleu

Conducted Exploratory Data Analysis (EDA) on the large-scale Berkeley Earth Dataset, which features high-resolution land and ocean time series data. Created interactive dashboards using Tableau to effectively visualize and highlight trends and patterns within the data.

data-analysis data-science exploratory-data-analysis insights python tableau visualizations

Last synced: 29 Apr 2026

https://github.com/i-am-uchenna/sql-data-warehouse-project

The Data Warehouse and Analytics Project is a comprehensive initiative designed to demonstrate the end-to-end process of building a modern data warehouse and deriving actionable insights through SQL-based analytics.

architecture business-intelligence crm data data-analysis database database-management datawarehouse erp etl etl-pipeline model sql sqlserver

Last synced: 15 May 2026

https://github.com/devexpress-examples/web-forms-pivot-grid-change-summary-display-mode

This example shows how to use different summary display modes in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 29 Apr 2026

https://github.com/chrispsang/healthcare-dataanalysis

Analyze synthetic patient data to identify trends, improve healthcare delivery, and predict patient outcomes using machine learning models. Includes data exploration, preprocessing, model building, and visualizations.

data-analysis data-science data-visualization healthcare jupyter-notebook machine-learning python

Last synced: 29 Apr 2026

https://github.com/kawshik-khan/fake-news-analysis

A fake news detection ML model. It utilizes the Bag of Words model for text vectorization and a Multinomial Naive Bayes classifier to predict whether news articles are real or fake. The project covers data preprocessing, model training, and performance evaluation with accuracy metrics and a confusion matrix.

data-analysis data-science machine-learning ml python3

Last synced: 08 Jun 2026

https://github.com/devexpress-examples/winforms-visualize-pivot-grid-data-in-chart

The following example shows how to integrate the Pivot Grid with the Chart control.

charting data-analysis dotnet pivot-grid-for-winforms winforms

Last synced: 29 Apr 2026

https://github.com/nivasharmaa/spiderverse

A comprehensive Java program for analyzing and managing events and data points within a fictional spiderverse. Features event handling, anomaly detection, cluster management, and robust file I/O operations.

advanced-algorithms anomaly-detection clustering data-analysis file-io object-oriented-programming

Last synced: 29 Apr 2026

https://github.com/mumtaz4118/scraping-medium-and-data-analytics

The file DataExtraction.py extracts information from the JSON files scraped by the scraper medium_scrapper_post.py. To extract information from JSON files scraped by medium_scrapper_tag_archive.py (which scrapes the tags archive), use Data_Extraction_Archive_Tags.py.

data data-analysis data-analytics data-extraction data-preprocessing data-science data-scraping deep-learning machine-learning python

Last synced: 29 Apr 2026

https://github.com/anilyigitsel/istanbul-rental-apartments-analysis

This project analyzes the Istanbul Rental Apartments Dataset (2025), which includes rental apartment listings from Istanbul, Turkey.

data-analysis data-visualization jupyter-notebook matplotlib pandas python rental-housing

Last synced: 29 Apr 2026

https://github.com/eco786786/restaurant_orders

This analysis seeks to uncover patterns in customer behaviour by examining restaurant order data.

data-analysis git postgresql tableau

Last synced: 29 Apr 2026

https://github.com/i7t5/sentimentnlp

Sentiment analysis for COMP 435 Introduction to Machine Learning, Spring 2025

data-analysis jupyter-notebook machine-learning nlp python sentiment-analysis

Last synced: 29 Apr 2026

https://github.com/findmyway/dataframe-in-julia

A quick introduction to DataFrame in Julia for users coming from Python

data-analysis dataframe julia jupyter-notebook

Last synced: 29 Apr 2026

https://github.com/lankesathwik7/sql-query-assistant

Natural language to SQL query converter using Groq LLM. Ask questions in plain English and get SQL queries, visualized results, and natural language explanations. Built with Streamlit and PostgreSQL.

data-analysis database groq llm natural-language-processing python sql

Last synced: 29 Apr 2026

https://github.com/khushi-sabarad/web_scraping

This project is a Python-based web scraper that extracts the menu from a cafe and saves it to an Excel file. It was created to automate the process of retrieving and updating menu prices, a task that was observed to be done manually at the hostel.

beautifulsoup data-analysis data-visualization market-analysis pandas python requests web-scraping wordcloud

Last synced: 29 Apr 2026

https://github.com/fatihilhan42/starbucks_analysis_turkey_and_world_with_python

This project first examines coffee brands worldwide and then the same brands in Turkey. The data, available in the repo, was first organized using data cleaning algorithms. The cleaned data was then plotted using data visualization algorithms.

data-analysis data-cleaning data-science data-visualization jupyter-notebook python

Last synced: 29 Apr 2026

https://github.com/mfakhriazhar/python-data-analyst-tutorial

A collection of my Python learning files for data analysis. Covers fundamental to advanced topics such as data exploration, visualization, statistical analysis, and the use of popular libraries like Pandas, NumPy, Matplotlib, and Seaborn. Suitable for personal documentation or shared learning references.

data-analysis data-science data-visualization exploratory-data-analysis portfolio python

Last synced: 29 Apr 2026

https://github.com/teja-1403/forage-standard-bank-data-science

This repository contains solutions to the 4 different tasks that must be performed during the Data Science virtual internship provided by Standard Bank via Forage.

automl communication-skills data-analysis data-science machine-learning python sql

Last synced: 29 Apr 2026

https://github.com/hardikk-7/election-analysis-project

A data analytics project exploring the 2024 Indian General Election results using Python. Includes party-wise, state-wise, and vote share analysis with visualizations.

data-analysis data-science election-analysis jupyter-notebook python

Last synced: 29 Apr 2026

https://github.com/chandantech2023/sales-trend-analysis

This repository features the Superstore Sales Analysis project, demonstrating data cleaning and analysis using Python and SQL, along with interactive visualization in Power BI.

data-analysis data-science dax kaggle powerbi-desktop python3 sql

Last synced: 29 Apr 2026

https://github.com/valikmorinko/ecommerce-sales-analysis

E-commerce sales analysis: data, visualizations, analytical findings.

data-analysis e-commerce jupyter matplotlib pandas python seaborn

Last synced: 29 Apr 2026

https://github.com/sdley/cas_pratique-del_annuel

Del-Annuel is annual deliberation software for higher-education schools or universities.

data-analysis pandas python tkinter-gui

Last synced: 29 Apr 2026

https://github.com/alunera-data/sql-use-cases

Practical SQL use cases for Business Intelligence and IT Service Management (BI & ITSM)

business-intelligence dashboards data-analysis data-quality eda itsm kpis postgresql process-monitoring query reporting sql sqlserver

Last synced: 29 Apr 2026

https://github.com/meinhere/ta-pendat

Final Project for the Data Mining Course - Patient Trauma Classification Using the Naive Bayes Method

data-analysis data-mining naive-bayes-classifier python trauma

Last synced: 29 Apr 2026

https://github.com/varshan1123/sql-tableau-project

We analyze key indicators for our pizza sales data to gain insights into our business performance. A data analysis project performed with Tableau and SQL.

analysis data-analysis data-science data-visualization excel mysql powerbi sql sql-server tableau tableau-dashboards

Last synced: 29 Apr 2026

https://github.com/shimaa83/eda-repo

Exploratory data analysis of police and retail datasets from Kaggle

data-analysis python

Last synced: 29 Apr 2026

https://github.com/dindagustiayu/data-processing

A digital textbook on interpreting characterisation results.

characterisation data-analysis gitbook latex-package myst qualitative-analysis quantitative-analysis

Last synced: 08 Jun 2026

https://github.com/al-ghaly/e-commerce-a-b-testing

A statistical analysis project in which I performed an A/B test to analyze the effect of changing the user interface of an e-commerce company's website.

data-analysis matplotlib numpy pandas python python-data-analysis seaborn statistical-analysis statistics

Last synced: 29 Apr 2026

https://github.com/supertetelman/frc-data-analysis

A collection of R, Matlab, and Bash scripts that were developed in real time from the stands of an FRC competition. Gathered data from various online sources, parsed it, and ran some basic analysis on it to calculate ratings and make basic match predictions. Results were made public and hosted live via AWS. Developed as a student teaching tool under poor Internet connectivity with minimal access to real-time match data.

bash data-analysis matlab r teaching

Last synced: 29 Apr 2026

https://github.com/prithviraj-2003/cognifyz-data-science-internship

🎓 Data Science Internship at Cognifyz Technologies 📅 Duration: 2 Months 🧠 Worked on real-world restaurant data 🗂️ Completed structured tasks across 3 levels 📌 Tasks focused on EDA, data preprocessing, visualization, and analysis 📎 Task descriptions provided in an attached PDF

data-analysis data-science data-visualization matplotlib numpy pandas python3

Last synced: 29 Apr 2026

https://github.com/mominurr/amazon-best-sellers-data-analysis

Exploring trends and product insights in Amazon Best Sellers data.

data-analysis data-visualization python scraping selenium tableau

Last synced: 29 Apr 2026

https://github.com/ahshah322/world-happiness-report-2025

Data analysis and visualization of the World Happiness Report 2025 using Python (pandas, seaborn, matplotlib). Explores how GDP, health, freedom, generosity, and corruption perception influence global happiness.

data-analysis data-science matplotlib numpy pandas python seaborn worldhappiness

Last synced: 29 Apr 2026

https://github.com/farhad-here/student_performance_analyzer

Student Performance Analyzer in Python, one of my data analysis course projects. It teaches filter(), lambda, and map() in Python.

data-analysis data-visualization filter kaggle kaggle-dataset lambda map pandas python python-tutorial streamlit

Last synced: 29 Apr 2026

https://github.com/alam025/algo-trading-bot

Backtested 20+ strategies achieving 18% annualised returns on historical S&P 500 data

api ccxt data-analysis finance fintech pandas postgresql python

Last synced: 08 Jun 2026