An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/pyramidheadshark/ai-mirea-sem1p

Completed set of all MIREA AI and DA practices (1st semester)

beginner-friendly data-analysis data-science jupyter mirea

Last synced: 05 Apr 2025

https://github.com/jwt218/sinc

MATLAB Standardization and Isotope Normalization for CSIA (with integrated correction and uncertainty quantification)

data-analysis geochemistry isotopes matlab

Last synced: 23 Jun 2025

https://github.com/debjyotisaha/data-analytics-projects-phase-2

Developed and showcased various data analytics projects, including data preprocessing, exploratory data analysis, and visualization. Utilized tools such as Python, Pandas, NumPy, and Matplotlib to derive actionable insights and demonstrate problem-solving capabilities.

data-analysis data-preprocessing eda matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/jofaval/boston-housing

Regression analysis of Boston Housing in-demand pricing in 1978

boston-housing data-analysis data-science data-visualization machine-learning python regression

Last synced: 16 May 2026

https://github.com/shubhamprajapati7748/end-to-end-house-price-prediction

A machine learning model that accurately predicts housing prices using the Boston Housing dataset by analyzing various house features, and it utilizes a CatBoost model to assist potential buyers or sellers in estimating housing prices.

boston-housing-price-prediction data-analysis data-science-projects machine-learning regression regression-models

Last synced: 30 Oct 2025

https://github.com/ddihora1604/social_media_analysis

A powerful, interactive dashboard for analyzing social media conversations, trends, and network dynamics. This tool allows researchers and analysts to explore patterns in social media data, identify key trends, and detect coordinated behavior.

aiml css data-analysis data-visualization html javascript python

Last synced: 30 Oct 2025

https://github.com/nikbarb810/motif_detection_in_r

Motif Detection for TFBS in Glycolysis and Glyconeogenesis pathways

bioinformatics data-analysis null-hypothesis pwm r

Last synced: 23 Jun 2025

https://github.com/rociobenitez/airbnb-data-mining

Detailed analysis and predictive modeling of accommodations in Madrid using Big Data and statistical techniques in R, focused on data optimization and prediction of property characteristics.

airbnb data-analysis data-mining estadistica prediction-model predictive-analytics predictive-modeling qmd r rstudio

Last synced: 23 Jun 2025

https://github.com/adriangalvanzamora/ecommerce-analytics-olist

Data analysis project based on the Olist Brazilian E-Commerce dataset. Includes data cleaning, exploratory analysis, delivery performance metrics, customer satisfaction modeling, and geospatial insights. Built entirely in Python (Jupyter Notebook) using real-world data from Kaggle.

brazil customer-satisfaction data-analysis data-visualization ecommerce folium geospatial-analysis machine-learning matplotlib notebook pandas plotly python seaborn

Last synced: 06 May 2026

https://github.com/drisskhattabi6/meteo-data-mining

This repo uses data mining techniques to analyze meteorological (meteo) data. The objective is to extract meaningful insights and patterns from the data that can aid in understanding weather phenomena and predicting future weather conditions.

cart data-analysis data-mining data-visualization decision-making decision-tree extract-data extract-insights insights-analytics insights-data k-means knn machine-learning svm

Last synced: 21 Mar 2025

https://github.com/jayita11/eda-student-exam-performance

This project performs Exploratory Data Analysis (EDA) and hypothesis testing on student performance data. It explores trends based on attributes like gender, race/ethnicity, parental education, lunch type, and test preparation course completion.

data-analysis eda hypothesis-testing matplotlib pandas python seaborn statsmodels student-performance-analysis

Last synced: 11 Jul 2025

https://github.com/balajimohan18/loan-clustering-datascience-project

This project uses machine learning to cluster loans based on their similarities. It uses a dataset of loan applications that includes information about the loan amount and balance, then applies a clustering algorithm to group similar loans together.

clustering-algorithm data-analysis data-science data-visualization eda kmeans-clustering machine-learning sql unsupervised-learning

Last synced: 27 Jul 2025

https://github.com/thc1006/taiwan-ai-usage-index

Taiwan AI Usage Index (TAUI): an open-source data analysis framework for measuring and analyzing AI technology adoption rates across Taiwan's regions

ai-adoption anthropic-index bilingual data-analysis human-ai-collaboration onet-classification open-source policy-analysis privacy-protection python research taiwan tdd usage-index visualization

Last synced: 03 Oct 2025

https://github.com/vedantshi/tableau-bike-data-dashboard

London Bike Rides Analysis explores bike usage patterns using data visualization and machine learning. It identifies trends through a dynamic moving average, analyzes weather impact with heatmaps, and provides actionable insights via an interactive Tableau dashboard. Tools: Python, Tableau.

data-analysis data-visualization python tableau weather-data

Last synced: 16 May 2026

https://github.com/maxbiostat/diehl_ebola_cell_2016

Supplementary code and data for Diehl et al., 2016 (Cell)

data-analysis data-visualization disease-spread ebola mutation

Last synced: 11 Jul 2025

https://github.com/liebsen/overlemon

Overlemon institutional application

data-analysis design devops sysadmin webdev

Last synced: 21 Jul 2025

https://github.com/capjamesg/personal-notebooks

Notebooks for personal experiments with machine learning and computer vision.

data-analysis machine-learning notebooks

Last synced: 03 Apr 2025

https://github.com/bamresearch/utah-saxs-tools

The Utah SAXS Tools (USToo), adapted for Python 3, originally by David P. Goldenberg, 2009-2012

data-analysis saxs small-angle-scattering small-angle-xray-scattering

Last synced: 17 Jan 2026

https://github.com/lit26/data_jobs_analyzing

Data analysis for data jobs

data-analysis topic-modeling

Last synced: 26 Mar 2025

https://github.com/sakan811/gachascope

Evaluate the cost-effectiveness of various in-app purchase bundles available in gacha games.

data data-analysis data-visualization game honkai honkai-star-rail honkai-starrail hoyoverse javascript nextjs tableau tableau-public typescript wutheringwaves

Last synced: 04 May 2026

https://github.com/lavkalsi/tableau-project-stock-market-analysis

The Tableau Project: Stock Market Analysis features a dashboard that combines Descriptive, Diagnostic, Predictive, and Prescriptive analytics to provide insights into stock market trends. Using Python for data processing and an LSTM model for forecasting, this project visualizes historical and predicted stock prices, helping users make informed decisions.

dashboard data-analysis deep-learning lstm-model python tableau

Last synced: 18 May 2026

https://github.com/caprogs/paris-events-analyzer

A project to analyze events in Paris using open source data provided by the city.

data data-analysis data-platform dbt docker ingestion python streamlit transformation vizualisation

Last synced: 04 May 2026

https://github.com/rathod-shubham/google-data-analytics

Learning a wide range of skills that are useful in everyday life as well as in working as a data analyst.

data-analysis data-analysis-in-r data-analyst data-analyst-nanodegree data-analytics data-visualization google

Last synced: 03 Feb 2026

https://github.com/dsrodrigovieira/rossmannsales

This repository contains a project developed to practice data analysis and the application of regression models (supervised learning).

data-analysis data-science machine-learning python telegram-bot xgboost-regression

Last synced: 19 May 2026

https://github.com/adnanrahin/nlp-with-disaster-tweets

Kaggle Competition: Predict which Tweets are about real disasters and which ones are not. Natural Language Processing.

data-analysis data-science data-visualization kaggle-competition machine-learning natural-language-processing regular-expression tweets

Last synced: 21 Jun 2025

https://github.com/kevin-rsj/the-substance-sentiment-analysis

Analyzes Reddit user comments about the film The Substance (2024) using NLP techniques. The average sentiment score obtained was 0.19, and keywords such as "horror" and "like" stand out among the opinions.

data-analysis notebook python sentiment-analysis tableau visualization

Last synced: 19 May 2026

https://github.com/kianaasd93/faostat

Builds a multilayer perceptron model for forecasting the export value of crop products for a geographical region three years into the future.

agriculture data-analysis data-science faostat machine-learning ml multiplayer python rnn

Last synced: 19 May 2026

https://github.com/marcogdepinto/olympichistoryanalysis

Python visual analysis of the Olympic Games history. Kaggle gold medal with 15000+ views, 200+ upvotes and 100+ comments.

data-analysis data-science jupyter-notebook olympic-games python seaborn

Last synced: 29 Apr 2026

https://github.com/pkjjoshi/restaurants-analysis

Performed beginner-level EDA on a restaurant dataset using Python. Analyzed top cuisines, city-wise ratings, price ranges, and online delivery impact using Pandas and Matplotlib. Includes 4 well-structured notebooks with visual insights.

beginner-project data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas python restaurant-data seaborn

Last synced: 21 Jun 2025

https://github.com/shrunga92/5g_qos_data_transformation_python

Resource Allocation in 5G Network Service

5g-nr data-analysis python

Last synced: 19 May 2026

https://github.com/teditae/data-analysis-with-pandas

Mini data science projects focused on Pandas-powered analysis.

data-analysis data-manipulation pandas python

Last synced: 30 Apr 2026

https://github.com/jesusgomez-data/retail-sales-data-analysis

End-to-end retail sales data analysis project using SQL, SQLite and Python (Pandas). Includes data generation, KPIs and business insights.

data-analysis junior-data-analyst pandas portfolio-project python retail-analysis sql sqlite sqlite3

Last synced: 11 Apr 2026

https://github.com/saidabderrahmane/bus_line_supervision

Performance evaluation of the Saint-Sébastien bus line using real data to predict the number of passengers.

beautifulsoup4 data-analysis data-science deep-learning machine-learning python scraper sklearn

Last synced: 11 Apr 2026

https://github.com/jatin-s16/netflix_analysis

This project involves a comprehensive analysis of Netflix's movies and TV shows data using SQL. The goal is to extract valuable insights and answer various business questions based on the dataset. The following README provides a detailed account of the project's objectives, business problems, solutions, findings, and conclusions.

data-analysis excel postgresql sql

Last synced: 19 May 2026

https://github.com/devexpress-examples/web-forms-pivot-grid-calculate-running-totals

This example demonstrates how to calculate running totals in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 08 Aug 2025

https://github.com/jidesamuell/data-analytics-projects

This is a repository I have created to showcase my skills, share projects, and track my progress in data analytics.

data-analysis excel matplotlib powrebi python sql

Last synced: 04 May 2026

https://github.com/first-coding/aidanalyst

AIDAnalyst is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system corrects errors in SQL queries and delivers detailed reports and visualizations in a streamlined workflow.

data-analysis llm openai prompt-engineering python

Last synced: 19 May 2026

https://github.com/vubacktracking/freecodecamp-data-analysis-with-python

5 Projects in Data Analysis With Python Course on Freecodecamp

data-analysis freecodecamp freecodecamp-project python

Last synced: 19 May 2026

https://github.com/atharvkadammm/suicide-prediction-system

A machine learning project predicting suicide risk based on multiple socio-economic and environmental factors using data mining techniques.

csv data-analysis data-science data-visualization datamining exploratory-data-analysis feature-engineering machine-learnin matplotlib mental-health numpy pandas riskassesment seaborn sklearn suicide-prediction supervised-

Last synced: 01 Jul 2025

https://github.com/hamzacham/data_set-projet-8

Analyzing a real-world dataset with SQL and Python

data-analysis database dataset jupyter-notebook paython sql

Last synced: 19 May 2026

https://github.com/jedrzej-wydra/competition-cooperation

Competition, cooperation, and parental effects in larval aggregations formed on carrion by communally breeding beetles Necrodes littoralis (Staphylinidae: Silphinae)

data-analysis non-linear-regression r

Last synced: 20 Aug 2025

https://github.com/bjornmelin/data-analytics-playground

🧐 Collection of academic data analytics projects showcasing exploratory data analysis, geographic visualization, and interactive dashboards.

data-analysis data-analytics data-visualization geographic-analysis ggplot interactive-maps leaflet r r-programming shiny tidyverse

Last synced: 06 Apr 2025

https://github.com/abdoomohamedd/data-science-projects

A collection of data science projects ranging from exploratory data analysis to predictive modeling and clustering. Each project is designed to solve specific problems or explore particular datasets using various data science techniques and tools.

data-analysis data-analysis-python data-cleaning data-science data-visualization machine-learning machine-learning-algorithms

Last synced: 14 May 2025

https://github.com/atharvkadammm/calmlytic

An end-to-end machine learning project that predicts anxiety severity using classification models (Naive Bayes, Decision Tree, SVM, Logistic Regression, XGBoost), based on lifestyle, health, and behavioral features.

anxiety-prediction classification csv data-analysis data-preprocessing-and-cleaning data-science data-visualization ensemble-learning logistic-regression machine-learning-algorithms matplotlib mental-health numpy pandas python sci-kit-learn seaborn supervised-learning svm xgboost

Last synced: 21 Jun 2025

https://github.com/rezowanrahat/netflix_analysis

Data analysis of Netflix content using Python, Pandas, and Seaborn

data-analysis data-visualization netflix pandas python

Last synced: 07 May 2026

https://github.com/kushagrakumar04/visual-age-distribution

A bar chart or histogram that visually depicts the distribution of a categorical or continuous variable, such as the age distribution or gender composition within a population. This graphical representation provides a clear and insightful overview of the data's patterns and trends.

data-analysis data-science google-colab

Last synced: 21 Jun 2025

https://github.com/jyrki69pro/pdf-insight-agent

📄 Extract insights from PDFs effortlessly with this AI-powered summarizer, transforming documents into structured, actionable points.

agent-based-model agentic-ai agentic-workflow agents ai-agent data-analysis finance-management financial-analysis generative-ai langchain langgraph llama3 llm multiagent-systems pdf phidata python toolcalling

Last synced: 11 Apr 2026

https://github.com/idaraabasiudoh/drug_prescribtion_decision_tree_model

This repository contains a machine learning project focused on classifying drugs based on patient characteristics using a Decision Tree classifier. The project uses Python and popular data science libraries such as scikit-learn, pandas, and matplotlib.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 10 Apr 2026

https://github.com/muneeb706/human_activity_recognition

This project performs data cleaning and data exploration steps for the Human Activity Recognition Using Smartphones Data Set in the R programming language.

data-analysis data-cleaning data-exploration r-programming

Last synced: 08 Aug 2025

https://github.com/touppercase78/salary-prediction-collection

Salary predictions with ML models and analyses on datasets from several other GitHub repos

data-analysis data-visualization datasets machine-learning python3 regression-models

Last synced: 02 May 2026

https://github.com/ramonanf/tc1002s_semanatec

Computational tools: The art of analytics

data-analysis data-visualization jupiter-notebook pandas-python

Last synced: 15 Jun 2025

https://github.com/eco786786/salaries

This analysis explores the factors influencing salaries for data professionals from 2020 to 2024, including job titles, experience levels, remote work ratios, employment types, company locations and sizes. Using data from Kaggle, the project uncovers trends and insights to guide both companies and professionals in the tech industry.

data-analysis git postgresql powerbi

Last synced: 19 May 2026

https://github.com/alpkanoz/ibm_data_science_professional_certificate

The repository contains projects and training materials carried out throughout the IBM data science professional course.

classification clustering data-analysis data-science data-visualization dataframe ibm ibm-watson machine-learning mathplotlib pandas predictive-modeling python scikit-learn

Last synced: 07 Mar 2026

https://github.com/mimi-netizen/python-and-machine-learning-in-financial-analysis

This comprehensive repository covers financial data analysis using Python and machine learning techniques, including time series modeling, portfolio optimization, risk assessment, credit risk prediction, and deep learning applications in finance.

data-analysis data-science data-visualization finance financial-analysis financial-data financial-modeling

Last synced: 19 May 2026

https://github.com/jgohel9902/toronto-airbnb-snowflake

This project analyzes Airbnb listings in Toronto using Snowflake’s cloud data platform. It follows a Bronze → Silver → Gold medallion architecture and leverages Snowflake Cortex to generate AI-driven executive insights.

data-analysis python snowflake sql

Last synced: 07 Mar 2026

https://github.com/jabulente/tanzania-geographical-zones

This project provides a geospatial visualization of Tanzania's geographical zones and regions. It uses geospatial data to map each zone, display regions, and annotate them for easy identification. The visualizations include simulated data to demonstrate thematic mapping techniques.

ai data-analysis data-science data-visualization geopandas geospatial location matplotlib ml python tanzania tanzania-geographic tanzania-locations

Last synced: 19 May 2026

https://github.com/mysftz/statistics-analysis

A statistical analysis of a dataset and probability in Python.

data-analysis matplotlib python python3 statistical-analysis

Last synced: 29 Jun 2025

https://github.com/galahad20/b244006e_analisis_data

Data analysis project for the Dicoding course "Belajar Analisis Data dengan Python". I learned to analyze data and visualize it to gain meaningful insights.

data-analysis data-analytics python streamlit

Last synced: 06 Apr 2026

https://github.com/marlysson/craw

A system to show the data collected from various sources using Chart.js

chartsjs data-analysis data-science web-scraping

Last synced: 21 Jun 2025

https://github.com/bho0920/crime-data-analysis-eu

Crime Data Analysis for Self-Defense Tool Market Entry in the EU.

data data-analysis sql sqlite tableau

Last synced: 21 Jun 2025

https://github.com/iamsainikhil/data-visualization

Visualization of Web data using Python

data-analysis data-visualization python webscraping

Last synced: 13 Jun 2026

https://github.com/srvcl/lung-cancer-survival-analysis

Data Cleaning of a dataset and Survival Analysis in R Language

data-analysis data-science data-visualization r survival-analysis

Last synced: 11 May 2026

https://github.com/akunna1/energy-data-analysis-unc-campus

Link to Report: https://adminliveunc-my.sharepoint.com/:w:/r/personal/tadennis_ad_unc_edu/Documents/Capstone%20Group/Final%20Report%20Draft.docx?d=wba9e7182a9b948898133e4f89def1d90&csf=1&web=1&e=fQGAfy

arcgis-pro data-analysis dplyr excel geospatial-data-analysis ggplot ggplot2 lubricants tidyr tidyverse

Last synced: 08 Aug 2025

https://github.com/dmytrori/himalayan_expeditions

Himalayan expedition stats, 1905–2020

alpinism data-analysis data-visualization pandas-python

Last synced: 21 Jun 2025

https://github.com/jabercrombia/invoice-tracker

Created an invoice tracker with sample data using Next.js and data visualizations.

data-analysis nextjs postgres shadcn vercel

Last synced: 07 Apr 2026

https://github.com/madi-s/tennispredictor

Program to predict outcomes of major tennis matches.

data-analysis prediction-algorithm python scraper tennis webdriver

Last synced: 06 Jul 2025

https://github.com/jabulente/kruskall-wallis-test

This repository contains a project that provides a reusable Python function to perform the Kruskal-Wallis H-test across multiple continuous variables, grouped by a categorical feature.

data-analysis data-science eda hypothesis-tests kruskal-wallis kruskals-algorithm scipy-stats statistics

Last synced: 22 Jul 2025

https://github.com/madrury/hot-sauce

Simulation of a Hot Sauce Spiciness Dataset

data-analysis data-science data-visualization dataset machine-learning

Last synced: 16 May 2026

https://github.com/whisplnspace/insightgenie

InsightGenie is an AI-powered data analyst that lets you upload files, ask questions, and get insights with visualizations.

data-analysis data-science data-visualization deployment gemini-api huggingface nlp

Last synced: 19 Jun 2025