An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/namratagulati/tweets_analysis

This repository focuses on sentiment analysis of Twitter data using Python, Natural Language Processing (NLP), and the Natural Language Toolkit (NLTK). The goal is to extract valuable insights from social media discussions, such as word frequency, hashtag trends, and sentiment patterns.

analysis data-analysis natural-language-processing nlp-machine-learning nltk-corpus nltk-python sentiment-analysis twitter-sentiment-analysis

Last synced: 07 Aug 2025

https://github.com/v41bh4vr4jput/data-analysis-with-python

This repository is a comprehensive collection of data analysis projects and tutorials using Python's most powerful libraries: NumPy, Pandas, Seaborn, and Matplotlib. It is designed to help you explore, clean, visualize, and analyze data efficiently.

api data data-analysis data-visualization matplotlib numpy pandas python sakila-db seaborn

Last synced: 09 Apr 2026

https://github.com/gmasson/datadash

DataDash is a JavaScript and CSS library for creating interactive dashboards that visualize dynamic data on web pages.

dashboard dashboard-application dashboards data-analysis data-science data-visualization javascript

Last synced: 08 Aug 2025

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026
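
The linear-regression-insurance-premium entry above mentions evaluating a simple linear regression with MSE and RMSE. A minimal sketch of those two metrics follows, using a closed-form least-squares fit in plain Python. The function names and the age/premium figures are hypothetical illustrations, not taken from the repository, which lists scikit-learn among its tags.

```python
import math


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: MSE expressed in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


if __name__ == "__main__":
    # Hypothetical ages and premiums, for illustration only.
    ages = [25, 35, 45, 55]
    premiums = [210.0, 295.0, 410.0, 480.0]
    slope, intercept = fit_simple_linear_regression(ages, premiums)
    predictions = [slope * age + intercept for age in ages]
    print("MSE:", mse(premiums, predictions))
    print("RMSE:", rmse(premiums, predictions))
```

RMSE is often reported alongside MSE because it is in the same units as the predicted quantity, which makes the typical error size easier to read.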

https://github.com/jagoda11/elastic-vision

This repository contains a full-stack application designed to explore data from Elasticsearch indices 🧐 and visualize it using charts and graphs. The backend is built using Node.js and the frontend is powered 🚀 by React.

backend chartjs dashboard-development data-analysis data-visualization docker elasticsearch frontend fullstack javascript material-ui monorepo mui-x node pie-chart react restful-api tables

Last synced: 09 Apr 2026

https://github.com/debjyotisaha/data-analytics-projects-phase-2

Developed and showcased various data analytics projects, including data preprocessing, exploratory data analysis, and visualization. Utilized tools such as Python, Pandas, NumPy, and Matplotlib to derive actionable insights and demonstrate problem-solving capabilities.

data-analysis data-preprocessing eda matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/thc1006/taiwan-ai-usage-index

Taiwan AI Usage Index (TAUI): an open-source data analysis framework for measuring and analyzing AI technology adoption rates across Taiwan's regions.

ai-adoption anthropic-index bilingual data-analysis human-ai-collaboration onet-classification open-source policy-analysis privacy-protection python research taiwan tdd usage-index visualization

Last synced: 03 Oct 2025

https://github.com/devexpress-examples/web-forms-pivot-grid-calculate-running-totals

This example demonstrates how to calculate running totals in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 08 Aug 2025

https://github.com/muneeb706/human_activity_recognition

This project performs data cleaning and data exploration on the Human Activity Recognition Using Smartphones Data Set in the R programming language.

data-analysis data-cleaning data-exploration r-programming

Last synced: 08 Aug 2025

https://github.com/akunna1/energy-data-analysis-unc-campus

Link to Report: https://adminliveunc-my.sharepoint.com/:w:/r/personal/tadennis_ad_unc_edu/Documents/Capstone%20Group/Final%20Report%20Draft.docx?d=wba9e7182a9b948898133e4f89def1d90&csf=1&web=1&e=fQGAfy

arcgis-pro data-analysis dplyr excel geospatial-data-analysis ggplot ggplot2 lubricants tidyr tidyverse

Last synced: 08 Aug 2025

https://github.com/jakobzmrzlikar/trg-dela

Data analysis of student job offers.

data-analysis ipython-notebook web-scraping

Last synced: 09 Aug 2025

https://github.com/busradeveci/odev2-branching

This project is prepared for Artificial Intelligence and Technology Academy Git GitHub Assignment 2. Using the “Wine Reviews” dataset from Kaggle, it converts wine ratings into star ratings and analyzes them.

data-analysis kaggle-dataset python wine-reviews-dataset

Last synced: 03 Oct 2025

https://github.com/lashawnfofung/super-heroes-analysis-project

This portfolio project involves a detailed analysis of 732 superhero records from the heroes_information.csv dataset, comprising 11 columns of unique characteristics for each hero. The primary goal is to showcase key insights derived from this rich dataset, demonstrating proficiency in data analysis using SQL.

data-analysis datasets mysql-database mysql-server mysql-workbench sql

Last synced: 07 Jul 2025

https://github.com/yash22222/data-analysis-on-real-time-social-media-comments

EngageInsight analyzes user interactions in comment data. It provides insights through visualizations created using Python libraries like Pandas and Matplotlib. The project aims to uncover patterns and trends in user engagement. The visualizations provide an overview of comment lengths and the frequency of different types of replies.

data-analysis data-cleaning-and-preprocessing data-visualization matplotlib pandas pattern-recognition real-time-social-media-data seaborn trend-analysis

Last synced: 14 May 2026

https://github.com/svetlanam/pycon-workshop

PyCon CZ workshop: Better data analyses and product recommendations with Instagram data

data-analysis data-science martinus matplotlib pandas pycon2016 pyconcz python scikit-learn workshop

Last synced: 09 Apr 2026

https://github.com/abhigyan126/prompt2query

A Python desktop application for streamlined data analysis, enabling users to generate and execute Pandas and SQL queries with ease. It focuses on reducing analysis time through an intuitive interface and efficient workflows.

data-analysis data-science data-visualization database gemini generative-ai ide llm pandas pandas-interface python sql-interface

Last synced: 13 Feb 2026

https://github.com/brunomontezano/digital-interventions-for-depression

📱 "Digital interventions for depressive symptoms: a randomized clinical trial" code

academia clinical-trials cognitive-behavioral-therapy data-analysis digital-health open-science smartphone-app

Last synced: 03 Oct 2025

https://github.com/blackcub3s/msc-finalthesis

The most important programming files, code functions and data processing pipelines for the Machine learning final thesis of my Master's degree. Also, the LaTeX code of the thesis.

data-analysis latex machine-learning numpy python sklearn

Last synced: 09 Apr 2026

https://github.com/dcostachar/telco-customer-churn-dashboard

An interactive Tableau dashboard using the Telco Customer Churn dataset to analyze key drivers of customer churn and develop data-driven retention strategies for the telecommunications industry.

business-intelligence customer-churn-analysis data-analysis data-visualization marketing-analytics tableau

Last synced: 09 Mar 2026

https://github.com/alan-oliveir/state-of-data-2022

In this project I analyze the distribution of salary ranges for junior-level professionals in data analyst, data scientist, and data engineer roles.

data-analysis jupyter-notebook pandas-python seaborn-python

Last synced: 03 Oct 2025

https://github.com/m-coder-umer/sales-dashboard-power-bi-project

An interactive Sales Dashboard built with Power BI using MySQL data, showcasing monthly trends, top-performing products, and key sales KPIs (Key Performance Indicators).

business-intelligence data-analysis data-cleaning data-modeling data-visualization dax interactive-dashboard mysql power-query powerbi sales-dashboard sql time-series-analysis

Last synced: 07 Jul 2025

https://github.com/susshiii/sql-layoffs-data-cleaning-and-eda

Full SQL project using MySQL to clean and analyze a real-world tech layoff dataset from 2020–2023.

data-analysis data-analytics-project data-cleaning eda layoffs mysql sql

Last synced: 07 Jul 2025

https://github.com/mkoeppe/jiawei-computations

Computations supporting Chapters 2 and 3 of Jiawei Wang's dissertation "Subadditivity of Piecewise Linear Functions", UC Davis, Ph.D. program in Mathematics, 2020

benchmark-framework branch-and-bound cluster cutting-planes data-analysis hpc integer-programming reproducible-research sagemath

Last synced: 10 Aug 2025

https://github.com/hemangsharma/hotel-revenue-booking-analysis

This project provides a comprehensive revenue and reservation analysis for Highfield Hotel using historical data exported from booking systems and internal revenue reports. The goal is to derive actionable insights to improve room profitability, understand booking patterns, and support data-driven decision-making.

analysis data-analysis data-visualization hotel

Last synced: 10 Aug 2025

https://github.com/nafisrayan/decentai

A comprehensive platform built using ReactJS and Flask, combining blockchain technology with AI to create a secure and intelligent space for community engagement and policy discussions. Leverages NLP and LLM for meaningful interactions and sentiment analysis while ensuring data security and user privacy.

chatbot data-analysis data-visualization flask gemini gemini-ai gemini-ai-chatbot gemini-api government government-tech llm mongodb nlp polls python react tailwind voting-systems winknlp

Last synced: 12 Apr 2026

https://github.com/ifigeneiatsiflidou/applied-statistics-project

Project for an Applied Statistics course, involving exploratory data analysis and predictive modeling of movie revenue using engineered features and multiple linear regression.

correlation-analysis data-analysis linear-regression python scikit-learn visualization

Last synced: 29 Apr 2026

https://github.com/r8vnhill/hdp

Hentai data processing

data-analysis e-hentai hentai kotlin

Last synced: 02 Apr 2025

https://github.com/antononcube/java-tilestats

Java package for statistics over 2D tilings. (Tile binning, aggregation functions application, etc.)

data-analysis hexagonal-grids

Last synced: 02 Apr 2025

https://github.com/nuraj250/datainsighthub

A Node.js backend application that processes and analyzes personal user data to generate personalized insights and recommendations. It features secure user authentication, data upload and storage, custom algorithms for data analysis, and optional real-time notifications and third-party API integrations. Perfect for showcasing backend development

api-development backend-development bcrypt data-analysis data-analytics data-insights dotenv express jwt-authentication mongodb nodejs passport secure-api user-authentication

Last synced: 09 Apr 2026

https://github.com/srikarveluvali/dataanalysis

The "Dataset - Extraction, Analysis, and Visualization" project is a Python-based data analysis venture that focuses on exploring and interpreting the "Video Game Sales Analysis" dataset.

css data-analysis html javascript matplotlib numpy pandas python seaborn tableau

Last synced: 09 Apr 2026

https://github.com/gui-sitton/prepaid

In this project I work as an analyst for the telecommunications company Megaline. The company offers its customers two prepaid plans, Surf and Ultimate. The sales department wants to know which plan brings in more revenue in order to adjust the advertising budget.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 22 May 2026

https://github.com/antononcube/wl-tilestats-paclet

Wolfram Language (aka Mathematica) paclet for statistics over 2D tilings. (Tile binning, aggregation functions application, etc.)

2d-data data-analysis geospatial-data mathematica wolfram-language

Last synced: 20 Mar 2026

https://github.com/andrii04/andreamonforte-bi-assignment

Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.

automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql

Last synced: 09 Nov 2025

https://github.com/gutow/langmuir_trough

Code to run a homebuilt Langmuir trough using Jupyter and Python.

data-acquisition data-analysis jupyter langmuir-trough plotting

Last synced: 11 Aug 2025

https://github.com/rajkumargara/bike_rental_data_analysis

Chicago bike rental data analysis for business insights using R programming

data-analysis data-visualization data-wrangling large-dataset machine-learning-algorithms

Last synced: 11 Aug 2025

https://github.com/jovicdev97/Financial-Loan-DataScience-Notebook

Uses NumPy and pandas to analyze a synthetic loan dataset with Python.

data-analysis matlabplot numpy pandas plotting python seaborn

Last synced: 12 Mar 2025

https://github.com/erayagdogan/simplecharts

Simple Charts is a chart-making Jetpack Compose app with Material 3 design. Charts are created using the lets-plot-compose library.

android android-app charts data-analysis data-visualization jetpack-compose lets-plot-kotlin material-3 viewmodel

Last synced: 11 Aug 2025

https://github.com/ct83/become-a-data-analyst-udacity

This repository contains all of the code, projects and reports that I wrote as I pursued my Udacity - Data Analyst NanoDegree.

data-analysis data-analysis-python data-analyst data-visualisation data-visualization-project datascience python udacity udacity-data-analyst-nanodegree

Last synced: 12 Aug 2025

https://github.com/mindlessmuse666/eda-pandas

An exploratory data analysis (EDA) project on Titanic passenger data using the Pandas library. It covers data loading, preprocessing, statistical analysis, visualization, and the creation of pivot tables. The project aims to demonstrate the core EDA methods and tools for analyzing and understanding data.

data-analysis data-processing data-science data-visualization eda exploratory-data-analysis matplotlib pandas python titanic

Last synced: 18 Apr 2026

https://github.com/r12habh/canada-imigration-data-analysis

Dataset: Immigration to Canada from 1980 to 2013 - International migration flows to and from selected countries - The 2015 revision from the United Nations' website. (Cognitive Class Data Analysis with Python)

canada data-analysis data-science data-visualization datascience python python3

Last synced: 23 May 2026

https://github.com/nabilalibou/uber_fare_prediction_explained

This repository documents a complete ML workflow to model Uber fares in Paris, from granular EDA and feature engineering to building and fine-tuning a stacking regressor on 10k real-world rides.

data-analysis data-science eda feature-engineering machine-learning predictive-analytics pricing-model python regression-model stacking-ensemble uber

Last synced: 12 Aug 2025

https://github.com/omari-kd/data-analytics

Welcome to my Data Analytics Portfolio, which includes structured projects in both Data Science and Data Analysis, implemented in R and Python.

data-analysis data-analytics data-science machine-learning

Last synced: 12 Aug 2025

https://github.com/jprmaulion/cholera-gedeo-ethiopia-spatial-analysis

Exploratory spatial analysis and visualization of cholera case clusters in Gedeo Zone, Ethiopia that integrates demographic and geographic data to identify environmental risk patterns and inform public health interventions. Includes geospatial mapping of cholera incidence relative to waterways and administrative boundaries.

cholera data-analysis data-analysis-python epidemiology ethiopia openstreetmap python spatial-analysis

Last synced: 12 Apr 2026

https://github.com/aaisha-nexus/sql_company_insights

A beginner-friendly SQL project for managing employee records, departments, and sales transactions. Includes table creation, optimized queries, stored procedures, and window functions to extract business insights.

business-analytics data data-analysis dataanalysis-projects dataanalytics database-schema mssql-database query relational-databases sql sql-query ssms

Last synced: 12 Aug 2025

https://github.com/darkdk123/house-valuation-model

A bootcamp challenge project to build an ML model that predicts house prices in Boston, Massachusetts, from multiple parameters using multivariable regression.

data-analysis data-science data-visualization matplotlib-pyplot multivariate-regression predictive-modeling statistics

Last synced: 07 Jul 2025

https://github.com/arun-data-analyst/finance-reporting-sql

End-to-end SQL project for project/portfolio finance: schema, seed data, validation, data-quality checks, business queries, and KPI views (Power BI–ready).

data-analysis data-modeling data-quality database finance kpi portfolio-management powerbi sql sql-server ssms

Last synced: 18 May 2026

https://github.com/farhad-here/adventureworks_interactive_sales_dashboard_powerbi

An interactive Power BI dashboard for Adventure Works sales team to analyze performance, customers, products, and employees. Includes data cleaning, data modeling, DAX measures and advanced visualization features.

business-intelligence chart csv data-analysis data-cleaning data-cleaning-and-preprocessing data-visualization dax powerbi

Last synced: 13 Aug 2025

https://github.com/imgabreuw/minicurso-python-para-financas

Python for finance mini-course, provided by Varos.

data-analysis financial-analysis python

Last synced: 13 Aug 2025

https://github.com/natgluons/fmcg-data-modeling

SQL, ARIMA, and K-Means clustering for data analysis and customer segmentation of sales data

arima-forecasting arima-model customer-segmentation data-analysis data-science-projects kmeans-clustering sales-forecasting

Last synced: 13 Aug 2025

https://github.com/itsachrafmansari/moroccan-real-estate-analysis

Scrape, process, analyze, and visualize data from Avito.ma to uncover current trends in Morocco's real estate market.

api-scraping data data-analysis data-mining data-science data-scraping data-visualization eda exploratory-data-analysis morocco real-estate web-scraping

Last synced: 13 Aug 2025

https://github.com/baguilar6174/python-jupyter-notebooks

Explore data analysis projects with Python, Jupyter and more tools. Discover stunning visualizations and reveal meaningful information in datasets to make informed decisions.

data-analysis jupyter-notebook kaggle pandas python

Last synced: 09 Apr 2026

https://github.com/lulloooo/bizdata-nexus

Collection of my Business & Data Analysis projects, from professional/academic endeavors to passion-driven explorations 📊

business-analysis data-analysis economics etl excel finance mysql python r risk-analysis

Last synced: 05 Apr 2026

https://github.com/emmarhoffmann/analysis-of-sleep-patterns-and-psychological-well-being-among-college-students

Explores the relationship between sleep patterns, psychological well-being, and lifestyle choices among college students using statistical analysis on 253 observations.

college-students data-analysis r statistical-models

Last synced: 04 Oct 2025

https://github.com/applicativesystem/numpy-builder

Code getter and NumPy operator for NumPy operations

data-analysis numpy numpy-python shell-script

Last synced: 15 Aug 2025

https://github.com/Solrikk/PicTrace-Web

PicTraceV2 is a highly efficient image matching platform that leverages computer vision using OpenCV, deep learning with TensorFlow and the ResNet50 model, asynchronous processing with aiohttp, and Selenium for browser automation. PicTraceV2 allows users to upload images directly or provide URLs, quickly scanning a vast database to find image

automation computer-vision data-analysis data-extraction deep-learning image-processing image-search machine-learning natural-language-processing opencv openpyxl pandas python selenium tensorflow web-scraping yandex yandex-api

Last synced: 15 Aug 2025

https://github.com/shubhamgoyal575/tableau-visualization-dashboard

This repository features interactive Tableau dashboards for sales performance and healthcare analysis. It includes insights on revenue trends, regional sales, patient demographics, and hospital occupancy for data-driven decision-making. 🚀

dashborad data-analysis data-cleaning-and-preprocessing healthcare-analysis healthcare-dashboard sales-dashboard sales-data-analysis-project tableau tableau-dashboards tableau-public visualization visualization-tools

Last synced: 20 Feb 2026

https://github.com/ggarciajavier/udacity-dalf-project1-investigate-dataset

Work performed for the 1st project of Udacity Data Analyst Nanodegree: exploratory data analysis of a football dataset.

data-analysis football-analytics python python36 udacity-data-analyst-nanodegree

Last synced: 15 May 2026

https://github.com/clchinkc/zombie

Personal project, Python, NumPy, Matplotlib, Pygame, Scikit-learn, TensorFlow, Docker

algorithms data-analysis docker machine-learning matplotlib numpy pygame python sklearn tensorflow zombie-simulation

Last synced: 05 Apr 2026

https://github.com/lit26/data_jobs_analyzing

Data analysis for data jobs

data-analysis topic-modeling

Last synced: 26 Mar 2025

https://github.com/zen204/accenture-tech-news-summarization-engine

A tool developed to analyze knowledge graphs from technology news articles, uncovering insights and trends about technology products, platforms, services, and their industry impact. Built during an internship at Accenture to inform decision-making in the tech landscape.

data-analysis decision-making graph-visualization industry-insights jupyter-notebook knowledge-graph machine-learning python tech-news tech-trends

Last synced: 29 Apr 2026

https://github.com/dcs-training/scottishaccounts

This repo contains various examples of analysis that can be performed on the Statistical Accounts of Scotland dataset. Go to the readme file

data-analysis data-visualisation data-wrangling geographical-data r rmarkdown text-analysis

Last synced: 16 Aug 2025

https://github.com/cyberoctane29/noaa-lightning-analysis

This project explores lightning strike data from the National Oceanic and Atmospheric Administration (NOAA) to identify seasonal trends and analyze strike frequency across months. It demonstrates data manipulation, aggregation, and visualization using Python, providing insights into lightning activity patterns.

data-analysis data-science data-visualization eda python

Last synced: 20 Apr 2026

https://github.com/sebastiansauer/hans-hackathon2025

Materials for a course on the evaluation of the AI student learning tool "HaNS"

ai data-analysis evaluaton r

Last synced: 04 Oct 2025

https://github.com/rachkat/random-foresst-analysis-r-studio-plotting-classification-tree

Classification analysis in R using the birthwt dataset. Built and compared Decision Tree and Random Forest models to predict low birth weight. Both achieved 71.05% accuracy, with Random Forest reducing overfitting and confirming maternal weight and age as key predictors.

classification data-analysis decision-trees machine-learning predictive-modeling r random-forest

Last synced: 04 Oct 2025

https://github.com/chandkund/loan-eligibility-prediction

This project is designed to predict the eligibility of loan applicants based on various factors such as income, credit history, and marital status. By analyzing historical loan application data, the model helps to determine whether a loan application should be approved or not.

data-analysis data-science data-visualization machine-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/i-e-b/dynamictimewarp

A quick C# implementation of https://jeremykun.com/2012/07/25/dynamic-time-warping/

data-analysis pattern-matching working

Last synced: 17 Aug 2025

https://github.com/edoardotosin/january-2025-southern-california-wildfires-burn-severity-sentinel2

Scripts and data for analyzing burn severity of the January 2025 Southern California wildfires using Sentinel-2 satellite imagery. This project explores the use of the Differenced Normalized Burn Ratio (dNBR) and Relativized Burn Ratio (RBR) to classify burn severity, leveraging publicly available satellite data.

burn-severity copernicus data-analysis earth-observation satellite-imagery sentinel-2 wildfire wildfire-detection wildfires

Last synced: 09 Feb 2026
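
The burn-severity entry above names the Differenced Normalized Burn Ratio (dNBR) and the Relativized Burn Ratio (RBR). As a rough sketch of how these indices are commonly defined for a single pixel, assuming near-infrared and shortwave-infrared reflectance values are already available (for Sentinel-2 these are often taken from bands B8A or B8 and B12, though the repository's own band choice is not stated here), the calculation might look like this. The function names and sample reflectances are hypothetical.

```python
def nbr(nir, swir):
    """Normalized Burn Ratio for one pixel: (NIR - SWIR) / (NIR + SWIR)."""
    return (nir - swir) / (nir + swir)


def dnbr(nbr_pre, nbr_post):
    """Differenced NBR: pre-fire NBR minus post-fire NBR."""
    return nbr_pre - nbr_post


def rbr(nbr_pre, nbr_post):
    """Relativized Burn Ratio: dNBR divided by (pre-fire NBR + 1.001).

    The 1.001 offset keeps the denominator away from zero.
    """
    return dnbr(nbr_pre, nbr_post) / (nbr_pre + 1.001)


if __name__ == "__main__":
    # Hypothetical reflectance values for one pixel, for illustration only.
    pre_fire = nbr(nir=0.5, swir=0.1)
    post_fire = nbr(nir=0.2, swir=0.3)
    print("dNBR:", dnbr(pre_fire, post_fire))
    print("RBR:", rbr(pre_fire, post_fire))
```

Higher dNBR and RBR values indicate a larger drop in NBR after the fire. The thresholds used to turn them into severity classes vary between studies and are not assumed here.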

https://github.com/harshindcoder/online_retail_data_clustering_project

This marketing analytics project uses RFM (Recency, Frequency, Monetary) features for customer classification, inspired by the online retail mining paper. The RFM model helps segment customers, identify high-value ones, and optimize marketing strategies.

customer-segmentation data-analysis data-visualization market-analytics

Last synced: 17 Aug 2025
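
The RFM (Recency, Frequency, Monetary) features mentioned in the entry above can be sketched as a per-customer aggregation over a transaction log. The following is a minimal illustration under assumed inputs: the `(customer_id, purchase_date, amount)` tuple layout, the function name, and the sample transactions are all hypothetical, and the repository's own pipeline may differ.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, reference_date):
    """Compute RFM features per customer.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    reference_date: the date from which recency is measured.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        # Track the most recent purchase date seen for each customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        customer_id: {
            "recency_days": (reference_date - last_purchase[customer_id]).days,
            "frequency": frequency[customer_id],
            "monetary": monetary[customer_id],
        }
        for customer_id in last_purchase
    }


if __name__ == "__main__":
    # Hypothetical transactions, for illustration only.
    sample = [
        ("a", date(2024, 1, 1), 10.0),
        ("a", date(2024, 1, 10), 5.0),
        ("b", date(2024, 1, 5), 20.0),
    ]
    for customer, features in rfm_features(sample, date(2024, 1, 11)).items():
        print(customer, features)
```

Customers with low recency, high frequency, and high monetary totals are the ones such a segmentation would typically flag as high value.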

https://github.com/jofaval/iris-flowers

Multiclass classification of the famous Iris flower dataset introduced by Ronald Aylmer Fisher in 1936

classification data-analysis data-science data-visualization google-colab iris-flowers kaggle machine-learning python scikit-learn xgboost

Last synced: 05 Apr 2026

https://github.com/davidzajac1/four-percent-rule-pandas-analysis

Analysis of the 4% Personal Finance Rule of Thumb

data-analysis data-visualization pandas python

Last synced: 20 Apr 2026

https://github.com/sukitsubaki/screen-time-tracker

A minimalist Python tracker that records the usage time of various applications and provides insights into your computer usage habits.

application-usage data-analysis monitoring productivity python python-cli screen-time time-tracking

Last synced: 12 Apr 2025