An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mindlessmuse666/eda-pandas

A project on exploratory data analysis (EDA) of Titanic passenger data using the Pandas library. It includes data loading, preprocessing, statistical analysis, visualization, and pivot table creation. The project's goal is to demonstrate the core EDA methods and tools for analyzing and understanding data.

data-analysis data-processing data-science data-visualization eda exploratory-data-analysis matplotlib pandas python titanic

Last synced: 18 Apr 2026

https://github.com/lavkalsi/tableau-project-stock-market-analysis

The Tableau Project: Stock Market Analysis features a dashboard that combines descriptive, diagnostic, predictive, and prescriptive analytics to provide insights into stock market trends. Using Python for data processing and an LSTM model for forecasting, this project visualizes historical and predicted stock prices to help users make informed decisions.

dashboard data-analysis deep-learning lstm-model python tableau

Last synced: 18 May 2026

https://github.com/caprogs/paris-events-analyzer

A project to analyze events in Paris using open source data provided by the city.

data data-analysis data-platform dbt docker ingestion python streamlit transformation vizualisation

Last synced: 04 May 2026

https://github.com/rathod-shubham/google-data-analytics

Learning a wide range of skills that are useful in everyday life as well as in work as a data analyst.

data-analysis data-analysis-in-r data-analyst data-analyst-nanodegree data-analytics data-visualization google

Last synced: 03 Feb 2026

https://github.com/dsrodrigovieira/rossmannsales

This repository contains a project developed to practice data analysis and the application of regression models (supervised learning).

data-analysis data-science machine-learning python telegram-bot xgboost-regression

Last synced: 19 May 2026

https://github.com/kevin-rsj/the-substance-sentiment-analysis

Analyzes Reddit users' comments about the film The Substance (2024) using NLP techniques. The average sentiment score was 0.19, and keywords such as "horror" and "like" stand out among the opinions.

data-analysis notebook python sentiment-analysis tableau visualization

Last synced: 19 May 2026

https://github.com/kianaasd93/faostat

Builds a multilayer perceptron model that can be used to forecast the export value of crop products for a geographical region three years into the future.

agriculture data-analysis data-science faostat machine-learning ml multiplayer python rnn

Last synced: 19 May 2026

https://github.com/kakri787/alcoholism-and-grade-analysis

A mini project for a university data science module in which we analyzed the relationship between students' alcohol consumption and their academic performance, using exploratory data analysis and machine learning techniques to see whether we can predict students' grades.

data-analysis data-science data-vizualisation lasso-regression machine-learning neural-network

Last synced: 12 Apr 2025

https://github.com/marcogdepinto/olympichistoryanalysis

Python visual analysis of the Olympic Games history. Kaggle gold medal with 15000+ views, 200+ upvotes and 100+ comments.

data-analysis data-science jupyter-notebook olympic-games python seaborn

Last synced: 29 Apr 2026

https://github.com/oubiche-ishak19/stock_evaluation_python

A Python script to classify companies based on financial metrics like Piotroski F-Score and Stock Valuation, using CSV financial data for analysis and output.

backtesting-frameworks classification csv-processing data-analysis expert-system finance financial-analysis-tools python rule-based-classifier stock stock-market streamlit tkinter-gui yahoo-finance

Last synced: 15 May 2026

https://github.com/ujjwalll/econometrics_analysis_of_india_gdp_misestimation

An econometric analysis of India's GDP to determine whether there is any flaw in India's GDP, as quoted by Dr. Arvind Subramanian.

coefficient-estimates data-analysis econometrics economics gdp india r statistics

Last synced: 04 Jul 2026

https://github.com/shrunga92/5g_qos_data_transformation_python

Resource Allocation in 5G Network Service

5g-nr data-analysis python

Last synced: 19 May 2026

https://github.com/ofir-frd/predict-success-of-a-restaurant

Apply machine learning to a restaurant database. Study and analyze the data to predict whether a restaurant will be successful.

data-analysis data-science machine-learning visualization

Last synced: 11 Jun 2026

https://github.com/jesusgomez-data/retail-sales-data-analysis

End-to-end retail sales data analysis project using SQL, SQLite and Python (Pandas). Includes data generation, KPIs and business insights.

data-analysis junior-data-analyst pandas portfolio-project python retail-analysis sql sqlite sqlite3

Last synced: 11 Apr 2026

https://github.com/saidabderrahmane/bus_line_supervision

Performance evaluation of the Saint-Sébastien bus line using real data to predict the number of passengers.

beautifulsoup4 data-analysis data-science deep-learning machine-learning python scraper sklearn

Last synced: 11 Apr 2026

https://github.com/jatin-s16/netflix_analysis

This project involves a comprehensive analysis of Netflix's movies and TV shows data using SQL. The goal is to extract valuable insights and answer various business questions based on the dataset. The following README provides a detailed account of the project's objectives, business problems, solutions, findings, and conclusions.

data-analysis excel postgresql sql

Last synced: 19 May 2026

https://github.com/r12habh/canada-imigration-data-analysis

Dataset: Immigration to Canada from 1980 to 2013 - International migration flows to and from selected countries - The 2015 revision from the United Nations' website. (Cognitive Class Data Analysis with Python)

canada data-analysis data-science data-visualization datascience python python3

Last synced: 23 May 2026

https://github.com/pooja-manjunatha/nyc_parking_violations_dbt

This project uses dbt to transform NYC parking violations data through a layered architecture: Bronze (raw ingested data), Silver (cleaned and enriched data), and Gold (aggregated tables for analytics). Using DuckDB as the warehouse backend, it ensures data quality with tests and documentation. The project enables reliable analysis of parking violations

data data-analysis data-engineering dbt duckdb python sql

Last synced: 14 May 2026

https://github.com/jidesamuell/data-analytics-projects

This is a repository I have created to showcase my skills, share projects, and track my progress in data analytics.

data-analysis excel matplotlib powrebi python sql

Last synced: 04 May 2026

https://github.com/first-coding/aidanalyst

AIDAnalyst is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system ensures error correction in SQL queries, delivering detailed reports and visualizations in a streamlined workflow

data-analysis llm openai prompt-engineering python

Last synced: 19 May 2026

https://github.com/vubacktracking/freecodecamp-data-analysis-with-python

5 projects from the Data Analysis with Python course on freeCodeCamp

data-analysis freecodecamp freecodecamp-project python

Last synced: 19 May 2026

https://github.com/hamzacham/data_set-projet-8

Analyzing a real-world dataset with SQL and Python

data-analysis database dataset jupyter-notebook paython sql

Last synced: 19 May 2026

https://github.com/bjornmelin/data-analytics-playground

🧐 Collection of academic data analytics projects showcasing exploratory data analysis, geographic visualization, and interactive dashboards.

data-analysis data-analytics data-visualization geographic-analysis ggplot interactive-maps leaflet r r-programming shiny tidyverse

Last synced: 06 Apr 2025

https://github.com/abdoomohamedd/data-science-projects

A collection of data science projects ranging from exploratory data analysis to predictive modeling and clustering. Each project is designed to solve specific problems or explore particular datasets using various data science techniques and tools.

data-analysis data-analysis-python data-cleaning data-science data-visualization machine-learning machine-learning-algorithms

Last synced: 14 May 2025

https://github.com/kaoutarmi/analyse-des-ventes-pour-optimiser-la-performance

Analysis of sales data to identify opportunities to improve commercial performance. Uses Pandas for data processing and Matplotlib/Seaborn to visualize trends and results.

business-intelligence data-analysis data-visualization jupyter-notebook matplotlib pandas sales-optimization seaborn

Last synced: 01 Jul 2026

https://github.com/venkat-023/thyroid-cancer-prediction

This project aims to develop a machine learning pipeline to predict thyroid cancer based on patient data. The dataset was sourced from multiple public repositories, cleaned, and merged to create a comprehensive dataset for modeling. Various classification algorithms were implemented, including Random Forest, Logistic Regression, K-Nearest Neighbors

data-analysis data-cleaning deep-learning ensembling-methods hyperparameter-tuning machine-learning-algorithms nueral-networks

Last synced: 17 May 2026

https://github.com/saob007/tablero_subsidios_servicio_agua

A dashboard built to analyze the distribution and allocation of drinking water and sewerage subsidies granted by the Planning Secretariat of the Sincelejo Mayor's Office in 2020, with the aim of identifying patterns in coverage, consumption, billing, and subsidies to support public policy decision-making

dashboard data-analysis data-visualization looker-studio

Last synced: 31 Jan 2026

https://github.com/jyrki69pro/pdf-insight-agent

📄 Extract insights from PDFs effortlessly with this AI-powered summarizer, transforming documents into structured, actionable points.

agent-based-model agentic-ai agentic-workflow agents ai-agent data-analysis finance-management financial-analysis generative-ai langchain langgraph llama3 llm multiagent-systems pdf phidata python toolcalling

Last synced: 11 Apr 2026

https://github.com/idaraabasiudoh/drug_prescribtion_decision_tree_model

This repository contains a machine learning project focused on classifying drugs based on patient characteristics using a Decision Tree classifier. The project uses Python and popular data science libraries such as scikit-learn, pandas, and matplotlib.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 10 Apr 2026

https://github.com/jofaval/iris-flowers

Multiclass classification of the famous Iris flower dataset published by Ronald Aylmer Fisher in 1936

classification data-analysis data-science data-visualization google-colab iris-flowers kaggle machine-learning python scikit-learn xgboost

Last synced: 05 Apr 2026

https://github.com/touppercase78/salary-prediction-collection

Salary predictions with ML models and analyses on datasets from several other GitHub repos

data-analysis data-visualization datasets machine-learning python3 regression-models

Last synced: 02 May 2026

https://github.com/ramonanf/tc1002s_semanatec

Computational tools: The art of analytics

data-analysis data-visualization jupiter-notebook pandas-python

Last synced: 15 Jun 2025

https://github.com/ddjain/jsonl-visualizer

A beautiful web tool for visualizing JSONL files with syntax highlighting and multiple view modes

data-analysis json jsonl viusal

Last synced: 01 Jul 2026

https://github.com/chanupadeshan/atliq-bank-insights

A complete data analytics and A/B testing project for Atliq Bank using synthetic customer and transaction data. Includes data cleaning, EDA, and statistical evaluation of a targeted marketing campaign.

ab-testing data-analysis data-visualization eda python3 statistics

Last synced: 04 Jul 2026

https://github.com/eco786786/salaries

This analysis explores the factors influencing salaries for data professionals from 2020 to 2024, including job titles, experience levels, remote work ratios, employment types, company locations and sizes. Using data from Kaggle, the project uncovers trends and insights to guide both companies and professionals in the tech industry.

data-analysis git postgresql powerbi

Last synced: 19 May 2026

https://github.com/mimi-netizen/python-and-machine-learning-in-financial-analysis

This comprehensive repository covers financial data analysis using Python and machine learning techniques, including time series modeling, portfolio optimization, risk assessment, credit risk prediction, and deep learning applications in finance.

data-analysis data-science data-visualization finance financial-analysis financial-data financial-modeling

Last synced: 19 May 2026

https://github.com/silianpan/python-data-analysis-course

python data analysis course of drotion-lega

data-analysis jupyter-notebook panda

Last synced: 11 Apr 2025

https://github.com/jabulente/tanzania-geographical-zones

This project provides a geospatial visualization of Tanzania's geographical zones and regions. It uses geospatial data to map each zone, display regions, and annotate them for easy identification. The visualizations include simulated data to demonstrate thematic mapping techniques.

ai data-analysis data-science data-visualization geopandas geospatial location matplotlib ml python tanzania tanzania-geographic tanzania-locations

Last synced: 19 May 2026

https://github.com/mysftz/statistics-analysis

A Python statistical analysis of a dataset and probability.

data-analysis matplotlib python python3 statistical-analysis

Last synced: 29 Jun 2025

https://github.com/galahad20/b244006e_analisis_data

Data analysis project for the Dicoding course "Belajar Analisis Data dengan Python". I learned to analyze data and visualize it to gain meaningful insights.

data-analysis data-analytics python streamlit

Last synced: 06 Apr 2026

https://github.com/jprmaulion/cholera-gedeo-ethiopia-spatial-analysis

Exploratory spatial analysis and visualization of cholera case clusters in Gedeo Zone, Ethiopia that integrates demographic and geographic data to identify environmental risk patterns and inform public health interventions. Includes geospatial mapping of cholera incidence relative to waterways and administrative boundaries.

cholera data-analysis data-analysis-python epidemiology ethiopia openstreetmap python spatial-analysis

Last synced: 12 Apr 2026

https://github.com/brunomontezano/sleep-cognition-and-functioning

💤 Data analysis of a brief communication published in Psychiatry Research Communications journal by Montezano et al (2023).

bipolar-disorder cognition data-analysis data-visualization data-viz depression ggplot2 pelotasrs psychiatry psychology published-article r sleep ucpel

Last synced: 13 Jun 2026

https://github.com/iamsainikhil/data-visualization

Visualization of Web data using Python

data-analysis data-visualization python webscraping

Last synced: 13 Jun 2026

https://github.com/srvcl/lung-cancer-survival-analysis

Data Cleaning of a dataset and Survival Analysis in R Language

data-analysis data-science data-visualization r survival-analysis

Last synced: 11 May 2026

https://github.com/coditheck/data_analysis

Data analysis is the process of inspecting, cleaning, transforming, and modeling data in order to discover useful information, draw conclusions, and support decision making.

data-analysis python

Last synced: 17 Jun 2025

https://github.com/maheera421/pandas

Implementation of essential Pandas functions.

data-analysis data-manipulation pandas-dataframes pandas-datareader pandas-python

Last synced: 17 Jul 2025

https://github.com/luminati-io/target-dataset-samples

A sample dataset of over 1000 target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.

api data-analysis data-mining datasets target web-scraper web-scraping

Last synced: 04 Jan 2026

https://github.com/madi-s/tennispredictor

Program to predict outcomes of major tennis matches.

data-analysis prediction-algorithm python scraper tennis webdriver

Last synced: 06 Jul 2025

https://github.com/lucaspadoni/9-11-hijackers-social-network-analysis

Social Network Analysis focused on the events of 9/11/2001. By examining publicly available data through SNA techniques, we gain insights into the organizational structure of the terrorist network, offering valuable perspectives on key relationships and connections.

9-11 data-analysis data-analytics graph-theory hijacking network-analysis sna social-network-analysis terrorism terrorist-attacks

Last synced: 26 Jun 2026

https://github.com/jabulente/kruskall-wallis-test

This repository contains a project that provides a reusable Python function to perform the Kruskal-Wallis H-test across multiple continuous variables, grouped by a categorical feature.

data-analysis data-science eda hypothesis-tests kruskal-wallis kruskals-algorithm scipy-stats statistics

Last synced: 22 Jul 2025

https://github.com/aaisha-nexus/sql_company_insights

A beginner-friendly SQL project for managing employee records, departments, and sales transactions. Includes table creation, optimized queries, stored procedures, and window functions to extract business insights.

business-analytics data data-analysis dataanalysis-projects dataanalytics database-schema mssql-database query relational-databases sql sql-query ssms

Last synced: 12 Aug 2025

https://github.com/amr-yasser226/interactive-sales-analytics-dashboard

An interactive web-based dashboard for visualizing multinational electronics sales data. This project for the DSAI 203 course integrates a Python/Flask backend with an amCharts frontend to provide dynamic insights into product revenues, sales distribution, and employee statistics across different countries.

am5charts amcharts business-intelligence css dashboard data-analysis data-analytics data-visualization flask html javascript python sqlalchemy sqlite web-application

Last synced: 13 Apr 2026

https://github.com/carvalhoandre/coletor-tweets

Created to collect and store tweets using the Twitter API. Initially inspired by the use case in the book Um Voluntário na Campanha de Obama, this project aims to demonstrate the importance of monitoring on X. The collector lets you search for tweets on any desired term

data-analysis mongodb python twiter-analysis twitter

Last synced: 19 May 2026

https://github.com/prasad-chavan1/bank_data_analysis_r

Bank data analysis in R language

data data-analysis data-science r

Last synced: 24 Feb 2025

https://github.com/chaganti-reddy/ai-prototype-customer-segmentation

A prototype AI product-based model for customer segmentation in the e-commerce industry.

artificial-intelligence cluster-analysis customer-segmentation data-analysis machine-learning product-based prototype

Last synced: 13 Mar 2025

https://github.com/aakk23/netflix_sql_project

This SQL project provides an analytical overview of Netflix's movies and TV shows dataset, uncovering key insights related to content types, ratings, release trends, and geographic distribution. It helps explore patterns in content availability, audience targeting, and regional preferences to support data-driven decisions.

data-analysis netflix-data-analysis postgresql sql

Last synced: 10 Apr 2025

https://github.com/marcomadera/test-for-random-numbers

Tests for random numbers between 0 and 1

data-analysis statistics

Last synced: 09 Jul 2025

https://github.com/sweta-kaundilya/911-calls-capstone-project

For this capstone project we will be analyzing some 911 call data from Kaggle.

data data-analysis data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 28 Apr 2026

https://github.com/sweta-kaundilya/sql_projects_data_analytics

This repository contains SQL portfolio projects

data-analysis mysql-database mysql-workbench

Last synced: 10 Sep 2025

https://github.com/al-ogr/sf_pr2_job_analysis_hh_sql

SkillFactory DataScience PROJECT-2. Analysis of job vacancies from HeadHunter

data-analysis data-science ipynb plotly python sql

Last synced: 19 May 2026

https://github.com/arun-data-analyst/finance-reporting-sql

End-to-end SQL project for project/portfolio finance: schema, seed data, validation, data-quality checks, business queries, and KPI views (Power BI–ready).

data-analysis data-modeling data-quality database finance kpi portfolio-management powerbi sql sql-server ssms

Last synced: 18 May 2026

https://github.com/lmuffato/jiboia

Jiboia is a Python package for automatically normalizing and optimizing DataFrames efficiently.

data-analysis data-science dataframe normalization pandas python

Last synced: 19 May 2026

https://github.com/gonzalofuentes28/dpeek

Interactive terminal data viewer for CSV, TSV, JSON, and JSONL files

bubbletea cli csv csv-viewer data-analysis data-viewer golang json json-viewer sqlite terminal tui

Last synced: 06 Apr 2026

https://github.com/shubhammittal-data/hr_dashboard_tableau

An interactive HR Analytics Dashboard built using Tableau. Provides insights into workforce demographics, hiring trends, salary analysis, and employee records for data-driven decision-making.

chatgpt4 data data-analysis data-visualization drawio-tools faker-generator hr-analytics hr-analytics-dashboard human-resources numpy python tableau tableau-public

Last synced: 17 May 2026

https://github.com/valyaevgeorgiy/r_basic

Working with the basics of the R environment and thereby learning a new programming language directly related to data analysis and to building plots and charts.

coding data data-analysis r rstudio

Last synced: 12 Dec 2025

https://github.com/jatin-mehra119/sales-analysis

Sales analysis of a supermarket

data-analysis salesanalysis visualization

Last synced: 30 Jun 2026

https://github.com/joe-stifler/llm-sig-playground

This repository is a collaborative space for MSc Earth Science students at Imperial College London to experiment with and apply Large Language Models (LLMs) to real-world Earth Science problems. The persona playground link follows below.

data-analysis earth-science llms machine-learning research-automation

Last synced: 29 Mar 2025

https://github.com/mansiikumarii/mysql

A curated collection of MySQL scripts covering DDL, DML, and DRL operations. Ideal for beginners to practice and understand core SQL concepts.

backend data-analysis data-modeling database database-integration database-management database-performance database-schema mysql mysql-admin mysql-database orm php-mysql query-optimization rdbms sql sql-query sql-script stored-procedure

Last synced: 19 May 2026

https://github.com/faith99/water_pollution_dashboard

A data visualization project exploring water access, contamination and health outcomes

data-analysis data-visualization powerbi public-health publichealth

Last synced: 02 Feb 2026

https://github.com/the-pinbo/dimensionalityredux-pca-vs-autoencoders

Comparative study of PCA and Autoencoders for effective dimensionality reduction, assessed through PSNR and SSIM metrics.

autoencoder-mnist autoencoders data-analysis dimensionality-reduction image-compression mnist neural-networks pca psnr ssim

Last synced: 13 May 2025

https://github.com/julie-fliorko/rockbuster-insights-sql-project

Data analysis using PostgreSQL to help Rockbuster Stealth LLC identify revenue trends, customer insights, and rental behavior patterns.

data-analysis postgresql sql

Last synced: 22 Jul 2025

https://github.com/natgluons/fmcg-data-modeling

SQL, ARIMA, and K-Means clustering for data analysis and customer segmentation of sales data

arima-forecasting arima-model customer-segmentation data-analysis data-science-projects kmeans-clustering sales-forecasting

Last synced: 13 Aug 2025