An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/djo/data-analysis

Data Analysis course notebooks in R

data-analysis r

Last synced: 29 Mar 2025

https://github.com/kiranmayi5/r-projects

This repository showcases R projects designed to tackle real-world problems through data-driven solutions.

data-analysis exploratory-data-analysis predictive-modeling r statistical-analysis

Last synced: 25 Jun 2025

https://github.com/vara-co/space-missions

Space Missions Over Time (1957-2022): Successes vs Failures, and Rocket Usage

data-analysis data-analysis-python history matplotlib pandas pandas-python space space-race spaceships team-project

Last synced: 18 May 2026

https://github.com/gurpreetkaurjethra/ai-data-visualization-agent

This Streamlit application creates an interactive Data Visualization Assistant that can understand Natural Language Queries and generate appropriate Visualizations using LLMs.

aiagents aichatbot aidevelopment artificial-intelligence data-analysis data-visualization generative-ai llms

Last synced: 25 Jun 2025

https://github.com/ajwad-shaikh/sristi-sanshodh-collect

SRISTI Sanshodh Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments. Contribute and make the world a better place! ✨📋✨ https://docs.opendatakit.org/collect-…

collect data-analysis data-collection javarosa odk opendatakit

Last synced: 04 Apr 2025

https://github.com/p2-718na/alice-simulation

Code for my Lab-2 course.

cern-root data-analysis

Last synced: 13 Mar 2025

https://github.com/sufiyanahmed4566/sql-musicmaven

This Music Store Database Project showcases SQL skills through comprehensive database design, query optimization, and data analysis. Includes ER diagram, database file, query questions (Easy, Medium, Hard), answered queries, and CSV table data. Ideal for recruiters seeking skilled SQL developers for music store management and data analysis.

data-analysis database insights mysql-database oracle-database relational-databases sql

Last synced: 18 May 2026

https://github.com/makosai/covid19datachart

A basic chart for checking corona data. Written in a single HTML file for convenience. Grab the single file and run it anywhere. Or visit the webpage.

chart chartjs corona coronavirus coronavirus-analysis covid-19 covid-2019 covid19 covid19-data data data-analysis datasets

Last synced: 23 Feb 2026

https://github.com/onome-joseph/ml-fraud-dectection

This project is designed to identify fraudulent transactions with high accuracy.

classfication-model data-analysis data-science machine-learning problem-solving

Last synced: 06 Apr 2025

https://github.com/steciuk/ium-recommendation-system

Evaluation and comparison of three recommendation models for a simulated web shopping service.

data-analysis model-evaluation recomendation-system

Last synced: 29 Oct 2025

https://github.com/ebowwa/chatgpt-export-processor

🤖 Extract, analyze & search your ChatGPT conversations locally | Privacy-first tool for OpenAI ChatGPT data export processing | Python CLI with embeddings support

ai-tools chatgpt chatgpt-export chatgpt-tools cli conversation-analysis data-analysis data-extraction embeddings local-first nlp openai openai-api privacy python

Last synced: 19 May 2026

https://github.com/salman-khan-mohammed/predicting-the-intent-of-online-shoppers

This project aims to predict online shoppers' purchase intentions using browsing history and user data from e-commerce sites. By analyzing clickstream and session information, the goal is to create a machine learning model that accurately forecasts customers' likelihood of making a purchase.

cluster-analysis data-analysis data-pre eda outliers prediction

Last synced: 28 Jun 2026

https://github.com/adolbyb/data-science-python

An Introduction to Data Science and Data Visualization with the FAU Data Science and Machine Learning Club

data-analysis data-science data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/fatihilhan42/web_scraping_football_statistics_per_game_data-main

In this notebook I describe the process of scraping data from the web portal understat.com, which holds a lot of statistical information about all games in the top 5 European football leagues.

data-analysis data-manipulation data-science data-scraping data-visualization jupyter-notebook python

Last synced: 19 May 2026

https://github.com/subhojit45/python3-iphones-x-flipkart-sales-analysis

Six simple questions, and the insights derived from them, about iPhone sales in a Flipkart dataset.

data-analysis jupyter-notebook python3 visual-studio-code visualization

Last synced: 19 May 2026

https://github.com/geobatpo07/office-hours-bootcamp

Practical case studies and labs from the Akademi 2025 Data Science & AI Bootcamp office hours.

artificial-intelligence data-analysis data-science data-visualization database deep-learning learning learning-by-doing machine-learning statistics

Last synced: 07 Mar 2026

https://github.com/airdac/mva-bank_india

Multivariate Analysis of an Indian bank's dataset about loan paybacks in R. Team project from UPC's Master's Degree in Data Science

data-analysis data-science multivariate-analysis r upc

Last synced: 26 May 2026

https://github.com/2003harsh/house-price-prediction-using-machine-learning

This project features a web app that predicts house prices using a linear regression model. Users can input details like location, square footage, bathrooms, and bedrooms through an HTML form. I've added a CI/CD pipeline with GitHub Actions, unit testing with pytest, and automated Docker containerization to improve deployment and robustness.

ci-cd data-analysis docker-image flask linear-regression machine-learning matplotlib mlops-workflow requests scikit-learn

Last synced: 04 Jan 2026

https://github.com/tiwarishubham635/uber-data-analysis-using-r

Analyzes the Uber Cab data using plots, heatmaps and dataframes

data-analysis data-visualization r

Last synced: 14 Apr 2025

https://github.com/aldomann/tropical-cyclones

Scripts to replicate the analyses and figures from "Scaling of tropical-cyclone dissipation" by Corral et al.

bachelor-thesis data-analysis hurricanes mathematical-statistics ocean studies

Last synced: 19 May 2026

https://github.com/reusjimenez/python-data-analysis

Complete case studies and practical exercises in data analysis. 📊

data-analysis data-visualization jupyter-notebook machine-learning matplotib numpy pandas python sklearn

Last synced: 12 Apr 2026

https://github.com/sumitgirwal/procoder-public

"ProCoder" is a web-based application providing massive open online courses for both professionals and students. It aims to offer a platform for learning coding skills online, accessible to anyone who is interested in learning programming or enhancing their coding knowledge. ProCoder provides courses on various programming languages and tools.

blog-platform bootstrap-4 chat-application css3 data-analysis django-crud django-project html5 javascript numpy-library pandas-library python3

Last synced: 05 Mar 2026

https://github.com/lmuffato/project-restaurant-orders-trybe

Restaurant orders project: a graded Trybe project for Block 36, Data Structures I: Arrays, Hashmaps, and Sets

array array-set csv data data-analysis hashmap python set trybe trybe-projects

Last synced: 13 Sep 2025

https://github.com/pseudomanifold/us-inauguration-speeches

Data & feature extraction for U.S. inauguration speeches

data-analysis data-science inauguration politics speech speeches

Last synced: 15 May 2025

https://github.com/bretsw/subreddits-over-time

Study of the r/Teachers and r/education subreddits over time

data-analysis dataset reddit

Last synced: 12 Jan 2026

https://github.com/vhtua/group4_data_analysis

Hierarchical Cluster Analysis: Movie Genres Preferences

data-analysis hierarchical-clustering r unsupervised-learning

Last synced: 29 Mar 2025

https://github.com/nafisalawalidris/investigating-netflix-movies-and-guest-stars-in-the-office

Dive into the world of Netflix and explore the average duration of movies. Netflix, one of the largest entertainment companies, offers a wide range of movies for its viewers. In this project, we analyse movie durations using pandas, creating a DataFrame from a dictionary and examining average durations from 2011 to 2020.

average-duration csv-files data-analysis data-visualization dataframe filtering movie-durations movie-length-distribution netflix pandas python trends

Last synced: 20 May 2026

https://github.com/jabhij/tableau_dashboards

Brief information about all of my Tableau dashboards, the insights I got out of them, and the outcomes of analyzing those visualizations.

data-analysis data-analytics data-science data-visualization tableau visualisation

Last synced: 07 Mar 2026

https://github.com/lijesh010/netflix_dataset_exploratory_data_analysis_python_project

This repository contains an Exploratory Data Analysis (EDA) Python project on the Netflix dataset. The purpose of this project is to gain insights and better understand the characteristics of the content available on Netflix, including movies and TV shows.

data-analysis data-exploration data-visualization exploratory-data-analysis jupyter-notebook python

Last synced: 20 May 2026

https://github.com/gursv/stocksage

Predict the next day's closing price for a stock such as NSEI, NYA, HSI, IXIC, or TWII.

data-analysis data-preprocessing data-science gridsearchcv machine-learning python3 random-forest-regressor stock-data stock-price-prediction streamlit

Last synced: 18 Apr 2026

https://github.com/riju18/advanced-data-analysis-and-visualization

Advanced data preparation, level-of-detail calculations, animation, table calculations, etc. for data analysis and visualization.

data-analysis data-science data-visualization tableau

Last synced: 18 Feb 2026

https://github.com/gappeah/british-airways-analysis

This project focuses on analyzing and visualising travel data from British Airways using Tableau. The goal is to extract insights and present them in an interactive and visually appealing manner.

data data-analysis data-visualization tableau

Last synced: 11 Jun 2025

https://github.com/samruddhi3012/shopping-habits-customer-behavior-analysis

Hello there! This repo contains a Python project based on e-commerce customer behavior analysis.

customer-segmentation customerbehavior data-analysis ecommerce python

Last synced: 29 Mar 2025

https://github.com/swarchal/morar

Processing phenotypic screening data

biology data data-analysis drug-discovery hts phenotypic

Last synced: 19 Jun 2025

https://github.com/mijisu0103/ukhsa-dashboard-project

Simple dashboard that downloads and displays data about infectious diseases (Influenza, Rhinovirus and COVID-19) from the UK Health Security Agency (UKHSA) dashboard.

data-analysis data-visualisation ipywidgets python voila-dashboard

Last synced: 17 Jun 2025

https://github.com/mgobeaalcoba/analisis_con_r

Analysis projects carried out in the R language

data-analysis data-science dataset r r-package r-programming r-studio

Last synced: 21 May 2026

https://github.com/csoren66/diabetics_prediction

Predicts whether a patient has diabetes based on the features provided to a machine learning model.

data-analysis machine-learning python svm

Last synced: 03 Mar 2025

https://github.com/victorherdz10/rainsense-iot

IoT system for early rain detection with Arduino. Monitors weather conditions using DHT22/BMP280 sensors and multivariable prediction algorithms for real-time alerts. Processes data and sends it via HTTP/JSON.

arduino bmp280 data-analysis dht22 embedded-systems iot platformio rain-detection real-time sensor-network weather-prediction weather-station

Last synced: 17 Apr 2026

https://github.com/yard1/linearordering

An R package. Provides various methods of linear ordering of data. Supports weights and positive/negative impacts.

data-analysis data-analysis-in-r data-analysis-r data-science r

Last synced: 21 May 2026

https://github.com/hafeez-urrehman/mental-health-analyzer

Mental-Health-Analyzer is an AI-Based project for predicting mental health disorders such as stress, anxiety, depression, and loneliness. By applying machine learning techniques, this project analyzes user inputs and behavioral data to provide accurate predictions, aiming to support mental well-being and early intervention.

data-analysis data-science early-diagnonosis machine-learning mental-health mental-wellbeing predictive-modeling python

Last synced: 17 May 2026

https://github.com/diacod-i/bournetokill

Analysis on inhibition assay data for Monoamine Oxidase protein family

data-analysis data-science data-visualization python3

Last synced: 21 May 2026

https://github.com/markoshb/machine-learning-subject

Implementation of multiclass classification problems in R

classification-model data-analysis r

Last synced: 14 Mar 2025

https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform

This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.

bigquery data data-analysis data-modeling live-streaming sql

Last synced: 23 Jun 2025

https://github.com/hfxbse/dhbw-data-analysis

Exploratory data analysis R notebook for the module T3INF4333 "Grundlagen Data Science" held in 2024 by Lothar B. Blum at the DHBW Stuttgart.

data-analysis data-science dhbw dhbw-stuttgart ggplot2 r r-notebook

Last synced: 04 May 2026

https://github.com/virajbhutada/article-recommendation-system

This project aims to redefine content discovery by delivering personalized article recommendations tailored to individual user preferences. We use advanced machine learning techniques like PCA and K-means clustering to analyze user behavior and article characteristics to provide highly accurate recommendations.

anaconda article-recommendation clustering-algorithm data-analysis data-science keras-tensorflow machine-learning machine-learning-algorithms ml-models numpy pandas plotly python scikit-learn scipy

Last synced: 06 Jan 2026

https://github.com/tnickster/ai-analyst-agent

Ask questions about your business data in plain English, get automatic SQL queries and visualizations, and receive AI-powered insights and recommendations. No SQL knowledge required.

ai-assistant business-analytics business-intelligence data-analysis data-analyst data-visualization database-query gpt-4 langchain llm mysql natural-language-processing openai plotly python sql-generation streamlit

Last synced: 08 Apr 2026

https://github.com/roberto-butti/fit_explorer

FIT file explorer, written in Go

data-analysis fitness geospatial golang

Last synced: 12 Apr 2025

https://github.com/jen-uis/loan-status-prediction

This repository contains project materials for the Winter STAT 206 class, University of California, Riverside, A. Gary Anderson School of Management.

data data-analysis data-analytics data-cleaning data-visualization descriptive-analytics julia julia-language jupyter-notebook predictive-analytics predictive-modeling team-collaboration

Last synced: 02 Jan 2026

https://github.com/messi10tom/ai-based-grade-prediction

GDSC task-1: Build a model to predict a student’s final grade based on features such as attendance, participation, assignment scores, and exam marks.

ai data-analysis data-science regression streamlit

Last synced: 02 May 2026

https://github.com/jcbritobr/iris

Iris dataset and data analysis with the Julia language.

data-analysis data-science data-visualization iris-dataset julia-language

Last synced: 06 Apr 2025

https://github.com/anurag-kumar-molankala/anurag-kumar-molankala

👋 About Me: I'm a Power BI Developer with a passion for data visualization and UI/UX design. I create interactive dashboards that turn data into clear, actionable insights for smarter decision-making.

business-intelligence dashboards data-analysis data-visualization dax-query mlanguage powerbi sqlserver uiuxdesigner

Last synced: 25 Jan 2026

https://github.com/sumidcyber/dataviz-master

This Python application provides a user-friendly interface to load and visualize the contents of a CSV file. Users can choose from various types of graphs and perform analyses on the dataset.

data-analysis data-analysis-project data-analysis-python database databases python python3

Last synced: 02 Jan 2026

https://github.com/pymarcus/tcc_sistemasdeinformacao2025

This application is part of a research project that aims to use a Gemini AI agent to identify "atoms of confusion" (minimal code elements that cause misunderstandings) in the context of Software Engineering.

atoms-of-code ci-cd clean-architecture concurrent-programming data-analysis design-patterns gemini-api golang ifmg inteligencia-artificial postgresql software-engineering solid tcc tdd workerpool

Last synced: 14 May 2026

https://github.com/theanujsinha01/rainfall-prediction-using-machine-learning

This project predicts whether it will rain or not based on weather features like pressure, humidity, dew point, cloud cover, sunshine, wind direction, and wind speed. We use a Random Forest Classifier, a popular ML algorithm, trained on historical weather data. The model learns patterns and helps us forecast rain chances.

classification data-analysis eda machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn supervised-learning

Last synced: 11 Apr 2026

https://github.com/arv-anshul/easy-analysis

A python package to perform Data Analysis easily. (Not Recommended)

arv-dumped data-analysis data-science easy-analysis eda pypi pypi-package python3

Last synced: 14 May 2025

https://github.com/jen-uis/la-crime-data-analysis

This repository contains project materials for the Fall 2023 MGT 256 class. This project was completed with assistance from Professor Adem Orsdemir.

business-analytics crime-data crime-data-analysis data-analysis knn la-crimes-from-2020 la-safe r r-markdown r-studio report-generation rmd united-states visualization

Last synced: 14 Mar 2025

https://github.com/karatechop/noaa-storm-database-data-analysis

Analysis of population health and economic consequences of events documented in the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.

data-analysis knitr r rmarkdown

Last synced: 14 Mar 2025

https://github.com/x1ao4/doc-merger

Merge two relatively incomplete documents into one complete document via a Python script

data-analysis data-merging document-analysis document-comparison document-processing documents filtering filtering-data merge merge-documents

Last synced: 28 Jun 2025

https://github.com/garciparedes/castile-and-leon-crops

Data analysis of crop area in Castile and León over recent years

castile-and-leon crops data-analysis data-science jupyter jupyter-notebook notebook spain

Last synced: 06 Jun 2026

https://github.com/palwisha-18/weather-api

Weather API built in Flask

data-analysis flask html pandas python

Last synced: 10 May 2026

https://github.com/sarincr/basics-of-julia-programming-language

Julia is a high-level, high-performance, dynamic programming language. While it is a general purpose language and can be used to write any application, many of its features are well-suited for high-performance numerical analysis and computational science.

data data-analysis data-mining data-science data-visualization dataanalysis dataanalytics datascience julia julia-language julia-library julia-package julialang machine-learning

Last synced: 19 May 2026

https://github.com/ronaldkanyepi/python-sreamlit-duplicate-records-finder-remover

A duplicate-record remover for CSV, Excel, or TXT files, based on a single column or multiple columns

css data-analysis data-visualization datascience python streamlit

Last synced: 07 May 2026

https://github.com/gher-uliege/stareso-data-processing

A set of tools to read, plot and process data from STARESO

coastal corsica data-analysis data-processing ocean-sciences oceanography

Last synced: 30 Mar 2025

https://github.com/manfredhair/wine-analysis-knn

Wine data analysis using KNN with Python, pandas, and scikit-learn

data-analysis data-science knn wine

Last synced: 16 Sep 2025

https://github.com/udaykumar-dhokia/d8a

d8a is a modern data analytics platform that provides powerful visualization and analysis tools for your data.

data-analysis data-visualization fullstack-development

Last synced: 16 Sep 2025

https://github.com/jamesnw/wtb-data

Explore beer addition and style info from WhatToBrew.com

data-analysis homebrewing jupyter-notebook python3

Last synced: 18 Apr 2026

https://github.com/atharvbyadav/dark-store-feasibility-analysis

A hackathon project analyzing the feasibility of setting up dark stores using data-driven insights. Focuses on demand clustering, location intelligence and logistics optimization.

business-intelligence dark-store data-analysis geospatial-analysis hackathon hackathon-project location-intelligence logistics pandas python retail-analytics urban-planning visualization

Last synced: 20 Jan 2026

https://github.com/edseldim/FirstRoundElectionsFr

A data visualization spreadsheet in Excel

data-analysis data-visualization excel pandas python

Last synced: 02 Aug 2025

https://github.com/viseshrp/community_health_indicator

Android app to fetch, organize, and represent NYC health data

android data-analysis data-visualization health

Last synced: 03 Mar 2025

https://github.com/sourabh-kumar04/numpy-basic

Numpy-Basic is a structured learning repo covering NumPy from basics to advanced. It includes arrays, indexing, reshaping, filtering, vector ops, angle functions, stats, and .npy file handling. Each concept is explained with code, examples, and Matplotlib visualizations in both light and dark modes. Ideal for students and data learners.

data-analysis data-science data-visualization learning learning-resources machine-learning matplotlib numerical-computing numpy python python-library python-programming

Last synced: 10 May 2026

https://github.com/mksingh431/python-project

Learn Pandas with exercises and sample projects

data-analysis data-science data-visualization project projects python

Last synced: 03 May 2026

https://github.com/luminati-io/amazon-dataset-samples

A sample dataset of over 1,000 Amazon product listings, extracted using the Bright Data API, perfect for competitive analysis, market trends, and eCommerce insights.

amazon api data-analysis data-science dataset ecommerce products web-scraping

Last synced: 03 Jan 2026