An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/rohitblaze10/-excel-_seller_store_analysis

A collection of data analysis projects showcasing data cleaning, exploration, visualization, and machine learning. Uses Excel and other tools to uncover insights and support data-driven decision-making. Feel free to explore, contribute, or collaborate!

data-analysis data-visualization excel excel-export

Last synced: 12 Feb 2026

https://github.com/projects-developer/ransomware-prediction-using-machine-learning-project

The project aims to develop a machine learning-based system to predict and detect ransomware attacks on computer systems. Ransomware is a type of malware that encrypts a victim's files and demands a ransom in exchange for the decryption key. The project includes source code, a PPT, a synopsis, a report, documents, the base research paper, and video tutorials.

artificial-intelligence btechproject computerscienceproject cybersecurity-malware data-analysis data-mining deep-learning machinelearning mtechproject neural-networks ransomware-machine-learning

Last synced: 12 Feb 2026

https://github.com/rahulsm20/storedata

A data analysis project aimed at analyzing the sales data of the super store and providing useful insight into customer preferences.

data-analysis matplotlib numpy pandas python streamlit

Last synced: 16 Apr 2026

https://github.com/mananabbasi/dashboard-power-bi

This repository showcases Power BI projects focused on data visualization and business intelligence. Each project transforms raw data into interactive dashboards and reports, providing actionable insights for decision-making. The repository includes Power BI files, datasets, and documentation for each project.

data-analysis data-science data-visualization powerbi

Last synced: 13 Feb 2026

https://github.com/hfzdzakii/dicoding-solvinghrproblem

This repo is a master submission for my Dicoding Final Project, which uses the Employee Attrition & Performance dataset. Feel free to explore; I hope my work gives you some insight!

data-analysis data-visualization

Last synced: 16 May 2025

https://github.com/mo-elshamy/machine-learning-practice

This repository collects my machine learning work and study notes from my internship at Cellual-Technologies, including algorithm explanations, data preprocessing workflows, and two projects.

data-analysis data-science dbscan decision-trees eda gradient-boosting gxboost hierarchical-clustering kmeans-clustering knn-classification linear-regression logistic-regression machine-learning model pca polynomial-regression preprocessing random-forest support-vector-machines training

Last synced: 14 Feb 2026

https://github.com/risdorn/restaurant-delivery-platforms-analysis-bdm-project

This project analyzes restaurant delivery platforms to understand customer preferences, industry competition, and expansion opportunities. Conducted as part of the BDM project from IITM, it includes descriptive stats, distribution, correlation, regression, and geospatial analysis using multiple datasets.

data-analysis data-visualization jupyter-notebook kaggle

Last synced: 15 Feb 2026

https://github.com/chandrashekhar-01/globalterrorism-analysis

A data mining and analytics project on the Global Terrorism Database (Kaggle) that explores worldwide terrorism trends through Python-based data visualization and statistical analysis.

data-analysis data-mining data-visualization exploratory-data-analysis

Last synced: 28 Feb 2026

https://github.com/siddhant2105s/bring-your-own-device-boyd-system

This repository contains the design and implementation of the Bring Your Own Device (BYOD) System for managing personal devices at Life Insurance Company. It includes an ERD diagram, MySQL scripts for database creation, data insertion, and queries, as well as detailed data definitions and system requirements documentation.

data-analysis database-design database-normalization entity-relationship-diagram entity-relationship-models my-sql relational-databases relational-model sql-queries

Last synced: 15 Feb 2026

https://github.com/swethajoseph/sales-eda-project

Performed an advanced Excel-based exploratory data analysis (EDA) of an E-Commerce sales dataset to create an interactive dashboard for uncovering key business insights.

advancedexcel data-analysis data-visualization datacleaning dataformatting exploratory-data-analysis msexcel pivot-tables

Last synced: 19 Mar 2026

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior and the impact of COVID-19 on the NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 15 Jun 2026

https://github.com/k-bloch/car-theft-analysis

A dashboard created to inform the public about car theft, providing insights extracted from real-world police stats.

data-analysis maven-analytics tableau

Last synced: 19 Mar 2026

https://github.com/devexpress-examples/aspxpivotgrid-group-date-time-values

This example shows how to group date-time values in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 01 Mar 2026

https://github.com/yash22222/pwc-power-bi-virtual-case-experience

The Power BI PwC Virtual Case Experience is an exciting and educational program designed to provide participants with hands-on exposure to Power BI, a prominent business intelligence and data visualization tool, within the context of consulting at PwC.

business-analyst business-analytics business-intelligence dashboard data-analysis data-analyst data-analytics dax microsoft-power-bi powerbi powerbi-dashboards powerbi-visuals pwc

Last synced: 02 Mar 2026

https://github.com/aya-jafar/python

Practice files & exercises from my Python learning journey 🐍

data-analysis dl exercises ml

Last synced: 16 May 2025

https://github.com/dcs-training/data-wrangling-and-vis-pandas

Introduction to analyzing structured data with the Python libraries pandas, for CSV and TSV data, and ElementTree, for XML data. Go to the readme file

data-analysis data-visualisation data-wrangling python

Last synced: 16 Jun 2026

https://github.com/anderson-andre-p/exploratory-data-analysis.roller-coaster

This repository contains an exploratory data analysis (EDA) project focused on roller coasters. The project involved organizing, cleaning, and visualizing the data to gain insights into roller coasters' characteristics and performance.

data-analysis eda exploratory-data-analysis exploratory-data-visualizations notebook

Last synced: 15 Mar 2025

https://github.com/chaitanyaprasad60/sql-queries

This is a list of complex SQL Queries I have practiced.

data-analysis sql window-functions

Last synced: 03 Mar 2026

https://github.com/jjfiv/csc212spellchecking

Data Structure Analysis for Spell Checking

data-analysis smith-csc212

Last synced: 03 Mar 2026

https://github.com/geoninja/reddit_data_analysis

Data analysis application presented at the 2016 NTC (Non-profit Technology Conference) in San Jose, CA.

data-analysis python reddit-data-analysis text-analysis

Last synced: 03 May 2026

https://github.com/chitranjan806/predicting-on-time-premium-deposits

A predictive analysis project to estimate the success rate of on-time premium deposits by policyholders.

analytics-vidhya analytics-vidhya-competition catboostregressor data-analysis data-science linear-regression logistic-regression python3

Last synced: 16 May 2026

https://github.com/chardyb/prob-and-stats-bmi6106

A repository for Spring 2025 BMI 6106: Statistics and Probability. This repository contains coursework, code examples, and projects exploring statistical methods and probabilistic models in biomedical informatics.

biomedical-informatics data-analysis data-science probability r statistical-modeling

Last synced: 02 Sep 2025

https://github.com/ilovenooodles/probstat-water-potability

Major Assignment for Probability and Statistics 1

csv data-analysis jupyter-notebooks python

Last synced: 03 May 2026

https://github.com/banner-19/extraction-and-analysis-of-text

The objective is to analyze text content from a list of URLs. This involves extracting article titles and text, then performing natural language processing to generate metrics like sentiment, readability, and word usage. Finally, the results are stored for further analysis or visualization.

data-analysis data-analytics data-science nlp nltk python3 text-analysis text-extraction

Last synced: 03 May 2026

https://github.com/adrianlardies/feelms_predict_by_emotion

Feelms is a mood-based movie recommendation app that uses collaborative filtering and machine learning to suggest films based on your emotions. Built with Streamlit and powered by AWS, Feelms personalizes each user's experience through simulated interactions and tailored predictions.

aws-ec2 aws-rds data-analysis data-science machine-learning python streamlit

Last synced: 16 Apr 2026

https://github.com/johannaschmidle/netflix-subscription-analysis

Examined Netflix subscription data to understand market behaviour, predict future trends, and identify consumer preferences. [SQL, Tableau]

data-analysis data-cleaning data-trend data-visualization netflix

Last synced: 05 Mar 2026

https://github.com/vishal-verma-96/pre-owned-car-price-prediction-using-streamlit-app

Capstone project by Skill Academy: exploratory analysis, visualization, and prediction of used car prices. Deploys the highest-scoring model as a Streamlit web app.

data-analysis data-science jupyter-notebook machine-learning machine-learning-algorithms matplotlib numpy pandas python3 regression-algorithms scikit-learn seaborn streamlit

Last synced: 11 Apr 2026

https://github.com/totonga/ods-exd-api-box

Helper package to build ASAM ODS EXD API grpc plugins.

asam data-analysis grpc grpc-server ods plugin python

Last synced: 03 Feb 2026

https://github.com/eliasdehondt/learn-r

Welcome to the Learn-R repository! This is your go-to resource for learning the R programming language, whether you're a beginner or looking to enhance your skills.

data-analysis data-visualization education machine-learning programming r statistics tutorials

Last synced: 03 Apr 2026

https://github.com/vitornegromonte/eda_stroke

Exploratory data analysis of the stroke prediction dataset

data-analysis data-science exploratory-data-analysis kaggle-dataset visualization

Last synced: 17 Apr 2026

https://github.com/q-viper/blog-notebooks

This repo stores the notebooks for most of my blog posts on dataqoil.com and q-viper.github.io.

data-analysis data-science machine-learning-algorithms timeseries

Last synced: 04 Apr 2026

https://github.com/pawlo77/airline-performance-data-analysis

Preprocessing of structured data - part of IAD study program, Faculty of Mathematics and Information Science, Warsaw University of Technology

data-analysis data-science visualization

Last synced: 10 May 2026

https://github.com/rajeev2806/retail-order-data-analysis

A dataset is downloaded via the Kaggle API, then cleaned and analyzed.

data-analysis data-cleaning postgresql

Last synced: 18 Apr 2026

https://github.com/vvhacker007/technocolabs

This repo contains the projects that were assigned to me during the internship.

data-analysis data-science flask heroku-deployment internship machine-learning project streamlit website

Last synced: 18 Apr 2026

https://github.com/masum184e/exploratory_data_analysis_projects

This space showcases my journey in exploring various datasets, uncovering patterns, and extracting meaningful insights. Each project highlights different aspects of EDA, demonstrating techniques and tools that are essential for making sense of data.

data-analysis data-analysis-projects data-science data-science-projects eda eda-projects exploratory-data-analysis exploratory-data-analysis-projects

Last synced: 31 Mar 2025

https://github.com/llnl/cap

HPC workflow that automates the tedious actions of compiling, analyzing, and parsing with bincfg

data-analysis hpc python workflows

Last synced: 17 Jun 2026

https://github.com/vl1507/data_science_pro_course

Course "Data Analyst PRO (PRO DA-6)"

da data-analysis data-science ds jupyter-notebook machine-learning ml pro-da python

Last synced: 18 Apr 2026

https://github.com/stimulsoft/samples-dashboards.web-for-blazor-webassembly

Blazor WebAssembly (Wasm) samples for Reports.BLAZOR embedded components, Visual Studio C# projects, .NET 6, .NET 7, .NET 8 dashboards tool

blazor client-side converter dashboard data data-analysis data-sources database datagrid designer diagram dimension json net presentation print runtime viewer wasm webassembly

Last synced: 18 Apr 2026

https://github.com/mksingh431/free-data-science-courses

Data science is a rapidly growing tech field that’s transforming business decision-making. To break into this field, you need the right skills. Fortunately, top institutions like Harvard and IBM offer free online courses. These courses cover everything from basic programming to advanced machine learning.

course data data-analysis data-science data-visualization free freecou python

Last synced: 19 Apr 2026

https://github.com/rodriguesl1/analise-ibovespa-fiap

Forecasting model for the IBOVESPA index using time series techniques. The project includes exploratory analysis, seasonal decomposition, stationarity tests, and modeling with Prophet, AutoARIMA, and other statistical models to support investment decisions.

autoarima b3 brasil data-analysis economia finance forecasting ibovespa pandas prophet python statsmodels time-series

Last synced: 19 Apr 2026

https://github.com/vyjayanthipolapragada/data_analytics_medical_appointments

Analyzes a data set of medical appointments to draw insights about patient no-shows.

data-analysis data-analytics data-cleaning data-visualization data-wrangling jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 19 Apr 2026

https://github.com/mlucifer27/bilateral-visualization

A Streamlit app that visualizes bilateral relationship scores between 100 countries from 1945 to 2024. It supports interactive heatmaps, network graphs, pairwise comparisons, and more.

d3blocks data-analysis data-visualization plotly-python python streamlit

Last synced: 04 Jun 2026

https://github.com/samwhaaa/superfoodsmax

A customer demographic & spending trend analysis on the fictional SuperFoodsMax grocery chain

data-analysis data-analytics data-visualization jupyter jupyter-notebook python

Last synced: 20 Apr 2026

https://github.com/namratha2301/carprice_analysisandprediction

This project analyzes factors influencing vehicle prices using a dataset of various attributes, including Engine capacity, Power, Mileage, and Seating capacity.

data-analysis data-visualization exploratory-data-analysis machine-learning pandas predictive-modeling random-forest-classifier regression scikit-learn seaborn

Last synced: 20 Apr 2026

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/xre22zax/roller-coaster

Explore award-winning wood and steel coasters from 2013-2018 Golden Ticket Awards & Captain Coaster, all powered by Python and interactive visualizations.

analytics data-analysis data-visualization pandas python python-lambda python3 visualization

Last synced: 20 Apr 2026

https://github.com/abinashsahoo007/project-bankruptcy-prevention

This project builds a classification model that predicts the chance of a business facing bankruptcy based on key features such as Industrial Risk, Management Risk, Financial Flexibility, Credibility, Competitiveness, and Operating Risk.

data-analysis data-mining data-visualization deployments eda machine-learning pickle python statistics streamlit

Last synced: 20 Apr 2026

https://github.com/robinmillford/hr-analytics-employee-performance-analysis

HR Analytics: Unveiling Employee Performance - A comprehensive exploration of employee data using SQL and Power BI, uncovering key insights for strategic HR decision-making.

data-analysis data-visualization jupyter-notebook powerbi python3 sql

Last synced: 20 Apr 2026

https://github.com/duoan/ds-nbs

Data analysis and machine learning notebook.

data-analysis data-scientists deep-learning kaggle-competition machine-learning

Last synced: 18 Jun 2026

https://github.com/docuvesta/la-mer-skincare-chicago-duty-free-analysis

Comparing La Mer product selection, availability and pricing from 3 different purchase locations ✈️

analytics cremedelamer data-analysis data-analytics data-science data-visualization lamer luxury plotly python seaborn skincare

Last synced: 21 Apr 2026

https://github.com/nxion/sql-data-warehouse-project

Building a modern data warehouse with MS SQL Server, ETL processes, data modeling, and analytics.

data data-analysis data-analytics data-engineering data-lakehouse data-warehouse datalake datascience etl etl-job medallion-architecture ms mssql sql sql-query sql-server

Last synced: 05 Jun 2026

https://github.com/meerantajalli/networksecuritydefense

This network security defense system acts as an indicator for SMP floods, UDP floods, and ICMP floods. The model is trained on packets captured with Wireshark and can differentiate between normal network traffic and traffic targeted at the machine by an attacker, using the packet transfer rate and the source IP.

anomaly-detection classification cyber-security data-analysis ddos-detection icmp-flood intrusion-detection machine-learning network-security packet-analysis python random-forest security smp-flood udp-flood wireshark

Last synced: 21 Apr 2026

https://github.com/mhuwaimel/data-analysis-of-students-results-in-qiyas

Analysis of student performance data from Qiyas (قياس), the Saudi Arabian National Center for Assessment

data-analysis jupyter-notebook python

Last synced: 22 Apr 2026

https://github.com/prgermux/yield-reporter

This Python application provides a graphical user interface (GUI) for analyzing and visualizing production data from various machines. It uses the PyQt5 framework for the GUI and Matplotlib for plotting data.

automation data-analysis python reporting

Last synced: 22 Apr 2026

https://github.com/rorrell/lifeexpectancy

A Jupyter Notebook where I create a chart with two line plots on it to check out the life expectancy of men vs. women from 1900-2018

data-analysis data-visualization jupyter-notebook python3

Last synced: 22 Apr 2026

https://github.com/leabrodyheine/california-schools-data-visualization

This front-end project provides interactive visualizations of learning models adopted by California schools during the pandemic. Using D3.js and Mapbox, it dynamically presents data through bar charts, bubble charts, heatmaps, and geographic maps, allowing users to explore trends across school types, sizes, and districts.

d3-visualization d3js data-analysis data-visualization mapbox openai plotly

Last synced: 22 Apr 2026

https://github.com/thc1006/nycu_timtable_crawler

🎓 NYCU Course Data Crawler & Timetable System | 國立陽明交通大學課程爬蟲與選課系統 - Python web scraper for course schedules, syllabi & educational data analysis. Crawls 18K+ courses with 98% success rate. Features: interactive timetable, JSON API, Google Colab support, batch processing, resume capability.

academic course course-selection crawler data-analysis education educational-data google-colab json-api nycu open-data python schedule student-tools syllabus taiwan timetable university web-automation web-scraping

Last synced: 24 Apr 2026

https://github.com/henriquetourinho/s.i.g.m.a

File search and analysis platform for Linux, with an advanced PySide6 GUI and a focus on rich metadata for in-depth investigations.

data-analysis developer-tools file-search metadata open-source pyqt pyside6 python python-brasil qt6 sysadmin-tools

Last synced: 24 Apr 2026

https://github.com/yuvrajsaraogi/-iris-flower-classification

The iris flower has three species (setosa, versicolor, and virginica), which differ in their measurements. Given measurements of iris flowers labeled by species, the task is to train a machine learning model that learns from those measurements and classifies the species.

classification data data-analysis data-science data-visualization flower flower-classification iris iris-classification iris-flower iris-flower-classification knn knn-classification machine-learning machine-learning-algorithms ml natural-language-processing nlp python

Last synced: 24 Apr 2026

https://github.com/edwinrlambert/emomap-sentiment-analysis

Analyzes public sentiment related to specific locations in a city (e.g., parks, transit stations, restaurants, neighborhoods) using geo-tagged social media posts, reviews, and comments. The goal is to visualize how people feel across different areas and times.

data-analysis jupyter-notebook python sentiment-analysis

Last synced: 24 Apr 2026

https://github.com/pedrohdosanjos/economic-data-analysis

This project aims to analyze the export data from various states in the United States to Brazil over time. The data is sourced from the FRED (Federal Reserve Economic Data) API and processed to identify the top 5 exporting states for each year, as well as the states with the highest total export value across all years.

api data-analysis data-visualization jupyter-notebook python

Last synced: 24 Apr 2026

https://github.com/ismielabir/pycsvsummarizer

A lightweight tool to summarize CSV files using various features.

csv data-analysis data-summary python

Last synced: 25 Apr 2026

https://github.com/fbarffmann/belly-button-challenge

Built an interactive JavaScript dashboard to visualize bacterial biodiversity from belly button samples. Analyzed data from 153 participants and identified OTU 1167 as the most common bacteria.

biodiversity dashboard data-analysis data-visualization interactive-charts javascript json plotly

Last synced: 25 Apr 2026

https://github.com/xjwllmsx/hacker-news-engagement

Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.

data data-analysis jupyter python

Last synced: 25 Apr 2026

https://github.com/m-biriulova/python-job-market-analysis

Web scraping, data analysis, and visualization of Python developer vacancies in Czech Republic.

automation beautifulsoup data-analysis data-visualization portfolio-project python selenium web-scraping

Last synced: 25 Apr 2026

https://github.com/angelmtenor/idafc

Udacity's Intro to Data Analysis

data-analysis

Last synced: 20 Jun 2026

https://github.com/pararang/nams-thesis-fuzzy

A specialized data processing tool designed to help with Fuzzy Delphi Method calculations for thesis research data analysis. It was later extended with new features for data processing with other methods.

data-analysis dematel hacktoberfest hacktoberfest-accepted house-of-quality python sustainability vibecoding

Last synced: 27 Apr 2026

https://github.com/mumtaz4118/amazon-iphone-12-data-scrapped

Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

data-analysis data-extraction data-science data-scraping html mark-up python

Last synced: 27 Apr 2026

https://github.com/malexandersalazar/covid-19-peru-estimacion-oxigeno-requerido

Technical analysis of confirmed COVID-19 cases in Peru to estimate the medicinal oxygen required.

covid-19 data-analysis data-science peru python

Last synced: 27 Apr 2026