An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/eikevons/pandas-paddles

Access the parent Pandas data frame in loc[], iloc[], assign(), and other Pandas helpers

data-analysis data-exploration data-science pandas pandas-dataframe pandas-library pandas-loc

Last synced: 16 Jun 2025

https://github.com/rahul-jha98/restauranttrends.stats

Visualise the trends in food and restaurant choices of customers in a city by scraping data from Zomato.

data-analysis data-science visualization vuejs zomato zomato-api zomato-scraper

Last synced: 08 Jul 2025

https://github.com/sodascience/map-explorer

Map Explorer is a Vue.js web application for rendering GeoJSON maps with dynamic region coloring based on external data.

choropleth data-analysis data-visualization geojson

Last synced: 10 Feb 2026

https://github.com/virajbhutada/tableau-data-vizzes

Engage with a growing collection of Tableau dashboards covering financial trends, HR analytics, streaming service insights, real estate dynamics, and more. Meticulously crafted for valuable insights, this repository continues to expand with new and compelling visualizations.

business-analytics data-analysis data-visualization hr-analytics industry-trends netflix performance-metrics stock-market-analysis strategic-analytics tableau visual-insights

Last synced: 02 Mar 2026

https://github.com/tuliosg/cdp

Repository for the course "Ciência de Dados para Pesquisa" (Data Science for Research).

data-analysis data-manipulation data-science data-visualization google-colab jupyter-notebook python

Last synced: 03 Mar 2026

https://github.com/farahibrar/kpmg-job-simulation

This repository showcases my work from the KPMG Technology Job Simulation by Forage, focusing on Data Analytics and Cloud Engineering. Explore how I tackled real-world business challenges through sales data analysis, regional growth strategies, and AWS architecture design, highlighting my analytical and technical expertise.

aws-architecture business-intelligence cloud-engineering cloud-strategy-and-design data-analysis data-visualization fintech-solutions forage kpmg kpmg-careers python-for-data-analysis sales-data-insights sustainable-retail-analysis

Last synced: 24 Jan 2026

https://github.com/patilni3/project_sql

Data Analysis using SQL

census-data data-analysis sql

Last synced: 16 Feb 2026

https://github.com/cyyeh/duckdb-data-agent

An AI-powered data analysis agent with a built-in SQL playground. Upload data files (CSV, JSON, Parquet, Excel) and ask questions in plain English — the agent delegates to a specialized subagent for SQL queries and renders charts inline — or switch to the SQL editor for direct queries.

agent claude-code csv data-analysis duckdb excel json langfuse llm parquet python react sql typescript

Last synced: 04 Jun 2026

https://github.com/happybono/avocadosmoothie

VB.NET project for running-median filtering. Users set kernel radius, border count, and pick MiddleMedian or AllMedian. Processing runs in parallel with a progress bar and smooth UI.

algorithms calibration correction data-analysis median outliers quicksort running-median runningmedian smoothing smoothing-methods statistics visual-basic

Last synced: 10 Feb 2026

https://github.com/rudra496/science

🔬 Interactive science experiments and research simulations — physics, chemistry, biology with 3D visualizations and real-time data analysis

data-analysis education experiments hacktoberfest javascript python research science simulation threejs

Last synced: 09 Jun 2026

https://github.com/quantumudit/analyzing-whiskyexchange-whisky

This project focuses on scraping data about Japanese whisky from The Whisky Exchange website, performing the necessary transformations on the scraped data, and then analyzing and visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 02 May 2026

https://github.com/niamoto/niamoto

Niamoto is a command-line application and library focused on processing and publishing botanical data

botany cli-application data-analysis data-processing data-publication python-library

Last synced: 23 Apr 2026

https://github.com/quantumudit/consumer-goods-sales-analysis

This project focuses on analyzing and visualizing consumer goods sales in the United States from 2015 to 2016 using Python & Power BI.

data-analysis data-visualization database jupyter-notebook python sqlite

Last synced: 29 Apr 2026

https://github.com/chaitanyac22/house-price-prediction-project-for-a-us-based-housing-company

The goal of this project is to use data analytics to gain insights for purchasing houses at a price below their actual value and flipping them at a higher price. The project aims to build an effective regression model using regularization (i.e. advanced linear regression: Ridge and Lasso regression) in order to predict the actual values of prospective housing properties and decide whether to invest in them.

advanced-linear-regression business-analytics data-analysis data-cleaning data-manipulation data-visualization exploratory-data-analysis feature-engineering lasso-regression linear-regression machine-learning model-building model-evaluation prediction-model python3 regularization rfe ridge-regression statistics

Last synced: 30 Apr 2026

https://github.com/dcs-training/digital-method-of-the-month

In this repository you will find the documents we produced to support the discussion in our Digital Methods of the Month. These documents will help you orient yourself if you want to pick up the method in your research. Go to the readme file

3d-data data-analysis data-visualisation data-wrangling geographical-data gis good-practices-digital-research machine-learning network-analysis open-research preregistration statistics text-analysis

Last synced: 25 Feb 2026

https://github.com/banisterious/obsidian-oneirometrics

OneiroMetrics (Turning Dreams Into Data). A plugin for Obsidian to track and analyze dream journal metrics.

data-analysis dream-analysis dream-diary dream-journal dreams journaling metrics obsidian obsidian-plugin self-improvement tracking

Last synced: 22 Apr 2026

https://github.com/zmyzheng/signature-authentication-pen

Signature Authentication Pen, a cloud-based IoT project that performs identity authentication by exploiting the biometric features of users' signatures. Details:

android aws data-analysis identity-authentication iot neural-network signature-authentication-pen

Last synced: 03 May 2026

https://github.com/mscbuild/mscbuild

🏆 Creating digital experiences that not only meet user expectations, but also drive engagement, loyalty and, ultimately, business success. Passionate developer from Latvia.

analysis best-practices coding config data-analysis data-science design developer freelance fullstack github-config latvia mscbuild profile readme seo site software-engineering web webapp

Last synced: 31 Jan 2026

https://github.com/quantumudit/regional-sales-analysis

This project focuses on analyzing and visualizing United States regional sales for a fictitious company from 2018 to 2020 using Python & Power BI.

data-analysis data-visualization databases jupyter-notebook power-bi python sqlite

Last synced: 02 May 2026

https://github.com/jggautier/dataverse-curation-assistant

A small software application that provides a UI for automating things in repositories that use the Dataverse software

data-analysis dataverse hacktoberfest python

Last synced: 01 Mar 2026

https://github.com/mertcandav/julenum

A high-performance library for numerical methods and scientific computing in Jule

data-analysis jule julelang math matrix scientific-computing statistics

Last synced: 09 Feb 2026

https://github.com/depressioncenter/mden

Mobile technologies code from the University of Michigan's Mobile Data Experts Network (MDEN), featuring data cleaning automations, REDCap project templates, and links to useful external modules. [DOI: 10.6084/m9.figshare.25438714]

automation data-analysis data-cleaning fitness-tracker heart-rate-data mobile-data mobile-development mquery powerautomate powerbi powerquery python r sleep-data smartwatch-data tableau

Last synced: 25 Feb 2026

https://github.com/edaaydinea/dataquest-projects

This repository includes guided data analyst and data science projects from Dataquest.

data-analysis data-science

Last synced: 07 Feb 2026

https://github.com/maksimekin/umd_data_challange_2020

Ocean cleanup data analysis project for the UMD Data Challenge 2020. Data Exploration for a Sustainable Planet.

cleanup competition data-analysis data-science folium geolocation machine-learning ocean planet pollution sklearn sustainability time-series trash umd

Last synced: 05 Jul 2025

https://github.com/ethan-wickstrom/rrrs

Welcome to RRRS, a rapid, hyper-optimized CSV random sampling tool designed with performance and efficiency at its core. Crafted meticulously in Rust, RRRS offers an unparalleled solution for extracting random data samples from CSV files swiftly and effortlessly.

analytics cli command-line command-line-tool data data-analysis data-science dataset rust rust-lang sample samples

Last synced: 16 May 2025

https://github.com/dcs-training/datavisualisationwithr

Data Visualisation with R Workshop (delivered by the Centre in December 2020). This workshop focuses on visualising your data. Go to the readme file

data-analysis data-visualisation data-wrangling r

Last synced: 25 Apr 2025

https://github.com/globeandmail/startr-cli

A command-line scaffolder for the startr R project template

data-analysis data-journalism data-visualization journalism r

Last synced: 23 Apr 2025

https://github.com/theengineeringworld/numpy-data-science

NumPy Data Science Essential Training Course. Part of a YouTube course offered by TheEngineeringWorld.

data-analysis data-science numpy numpy-exercises numpy-library numpy-tutorial python python-3-6 python3 scipy2018

Last synced: 09 Oct 2025

https://github.com/sonigarima/donation-management-system

A donation management system for NGOs and Donors. The project is designed for Cognizance IITR 2021 - Salesforce Codathon.

data-analysis donation-management reactjs

Last synced: 07 Sep 2025

https://github.com/paezha/edashop

An open educational resource to teach a workshop on Exploratory Data Analysis in R

data-analysis exploratory-data-analysis open-educational-resources package r rstats workshop-materials

Last synced: 18 Mar 2025

https://github.com/cparmet/pandas-checks

🐼🩺 Pandas Checks: Non-invasive health checks for Pandas method chains

data-analysis data-engineering data-science method-chaining pandas

Last synced: 27 May 2026

https://github.com/sondosaabed/introduction-to-data-analysis-with-pandas-and-numpy

Learning the data analysis process of questioning, wrangling, exploring, analyzing, and communicating data. Working with data in Python using libraries like NumPy and pandas.

data-analysis data-analyst-nanodegree data-wrangling numpy pandas python

Last synced: 09 Apr 2025

https://github.com/nafisalawalidris/predicting-credit-card-approvals

Explore credit card approval prediction through data analysis and machine learning. Preprocess data, train logistic regression models, and optimize hyperparameters. Learn data preprocessing, feature engineering, model training, and evaluation. Dive into the world of machine learning with Python and popular libraries.

approval-prediction credit-card data-analysis data-preprocessing feature-engineering hyperparameter-optimization libraries logistic-regression machine-learning model-evaluation model-training python python3

Last synced: 19 Apr 2025

https://github.com/abhash-rai/traffic-image-classifier

A web-based solution that uses a TensorFlow model to classify traffic conditions, built with ReactJS and a FastAPI backend.

cnn cnn-classification cnn-keras cnn-model data-analysis data-science data-visualization fastapi keras keras-tensorflow python python-3 python3 react reactjs tensorflow traffic traffic-classification transfer-learning

Last synced: 23 Feb 2026

https://github.com/yuukidach/twitchanal

Using AI for eGaming analytics to discover community interactions and behaviors of Twitch.

data-analysis data-analytics twitch

Last synced: 05 Jan 2026

https://github.com/lucasbotang/real_estate_management_data_analysis

Data analysis for real estate management

data-analysis excel mysql tableau

Last synced: 06 Oct 2025

https://github.com/robinmillford/cortex-ai-multi-model-insights-hub

Cortex AI: Multi-Model Insights Hub is a platform that uses AI to support your research, analysis, and data exploration. By integrating multiple Large Language Models (LLMs) with a Retrieval-Augmented Generation (RAG) system

article-extractor chatbot data-analysis data-visualization deepseek-chat deepseek-r1 llama3 llm pdf-document-processor rag streamlit-webapp summarizer vector-database

Last synced: 28 Oct 2025

https://github.com/efharkin/ez-ephys

Easy IO, inspection, and manipulation of electrophysiological data.

data-analysis electrophysiology neurophysiology neuroscience patch-clamp python

Last synced: 14 Jan 2026

https://github.com/sondosaabed/data-visualization-with-matplotlib-and-seaborn

Learning to apply sound design and data visualization principles to the data analysis process. Also learning how to use analysis and visualizations to tell a story with data.

data-analysis data-analyst-nanodegree data-visualization matplotlib python seaborn seaborn-plots

Last synced: 09 Apr 2025

https://github.com/afsalashyana/whatsapp-chat-analyzer

Analyze WhatsApp chats with beautiful graphs. Written in JavaFX

data-analysis data-visualization javafx javafx-14 javafx-application whatsapp

Last synced: 04 Sep 2025

https://github.com/darsan-in/rumour-monger-spotter

Rumour Monger Spotter is a prototype developed during a national-level cyber hackathon to identify false information on Twitter. Using the Google Fact Check API and a Multinomial Naive Bayes classifier, the tool analyzes tweet content to assess the likelihood of misinformation. Despite a development window of less than 24 hours, the project won a t

ai data-analysis fact-checking hackathon india naive-bayes national-competition natural-language-processing prototype real-time-analysis social-media text-classification tweet-content twitter

Last synced: 12 Oct 2025

https://github.com/itzmeanjan/indian-railway

Exploring the Indian Railways timetable dataset, with ❤️

data-analysis data-visualization indian-railways matplotlib python python3 railway

Last synced: 17 Oct 2025

https://github.com/fatihilhan42/data-science-projects

This repo contains beginner to upper-level projects in the field of data science. I will add the projects I complete in this field to this repo every day, in the hope that they will be useful to those who, like me, are interested in data science and are just starting out...

data-analysis data-engineering data-mining data-science data-structures data-visualization database datascience fatihilhan fortytwo fortytwofficial jupyter-notebook python

Last synced: 11 Oct 2025

https://github.com/martinthoma/bad-stats

Examples of how not to do statistics / visualizations

data-analysis statistics visualizations

Last synced: 07 Jan 2026

https://github.com/neutrinoceros/gpgi

A lightweight Python library for efficient in-RAM particle deposition on rectilinear, unrefined grids.

data-analysis grid particles performance

Last synced: 22 Apr 2025

https://github.com/sushantdhumak/traffic-forecasting-using-iot-sensor-data

Demonstrates how to utilize XGBoost for traffic forecasting using data gathered from IoT sensors, highlighting its efficiency in processing complex datasets and delivering accurate predictions.

data-analysis data-visualization exploratory-data-analysis feature-engineering feature-importance feature-selection gridsearchcv hyperparameter-optimization hyperparameter-tuning iot random-search xgboost-regression

Last synced: 26 Mar 2025

https://github.com/cheminfo/compass

Strategy for improved characterisation of human metabolic phenotypes using a COmbined Multiblock Principal components Analysis with Statistical Spectroscopy (COMPASS)

data-analysis metabolomics metabonomics multiblock nmr-spectroscopy pca population-analysis population-model

Last synced: 23 Mar 2025

https://github.com/jackfiszr/pl2xl

Nodejs-polars wrapper with `readExcel` and `writeExcel` methods.

data-analysis data-science deno excel excel-reader excel-writer nodejs polars

Last synced: 21 Jan 2026

https://github.com/1ayanabil1/healthcare-machine-learning

Explore our open-source repository focused on healthcare machine learning. We've developed predictive models for cardiovascular disease, diabetes, breast cancer, and more. Our projects employ diverse machine learning algorithms and data science techniques, enhancing early detection, diagnosis, and patient outcomes.

data-analysis data-science deep-learning disease disease-detection disease-modeling disease-prediction eda healthcare-application heathcare jupyter-notebook machine-learning machine-learning-algorithms machinelearning-python python

Last synced: 28 Apr 2025

https://github.com/waveform80/structa

A small utility for analyzing data structures (e.g. JSON files)

csv data-analysis data-visualization datajournalism datawrangling json yaml

Last synced: 06 Sep 2025

https://github.com/bjpop/gurita

A convenient and expressive tool for data analytics and plotting on the command line

command-line data-analysis data-science pandas plotting python

Last synced: 04 Feb 2026

https://github.com/jonzeolla/lab-securitydataanalysis

An introductory lab to Security Data Analysis (using Apache Metron (incubating)).

apache-metron data-analysis lab metron security

Last synced: 03 Jul 2025

https://github.com/lit26/trump_tweet_analysis

Analysis of Trump's original tweets.

data-analysis lda-model topic-modeling

Last synced: 12 Apr 2025

https://github.com/dsnchz/solid-g6

A SolidJS component library for graph visualization, powered by @antv/g6

analysis data-analysis data-visualization graph graph-visualization node-ui solidjs visualization

Last synced: 13 Oct 2025

https://github.com/trainingbypackt/splunk-7-essentials-elearning

Build an elaborate Splunk enterprise environment that will extract powerful insights from your machine-generated big data

data-analysis eventgen indexing machine-learning splunk sub-search visualization

Last synced: 01 Mar 2026

https://github.com/shervinnd/bazar_app_store_eda

Data analysis code for the Bazar app store that finds the most downloaded category and the most popular installed apps

data data-analysis data-science dataanalysis eda python

Last synced: 15 Apr 2025

https://github.com/pepe-god/dataprophet

Extracts citizens' identity information from MySQL, builds a family network based on TC ID No., and exports it to CSV

101m 109m adres data-analysis data-extraction database-connector family-tree genealogy gsm hsys identity mysql-database python-script pyton

Last synced: 13 Jul 2025

https://github.com/ficaan/data-analysis-with-python-2023_2024-mooc.fi

These are all the solutions for exercises from Data Analysis with Python 2023/2024, a course offered by the University of Helsinki, Finland.

data-analysis machine-learning mooc-fi programming python

Last synced: 08 Jun 2026

https://github.com/csparpa/last.fm-stats

Exercise on Last.fm data aggregation

data-analysis exercise lastfm lastfm-api python

Last synced: 21 May 2026

https://github.com/kellyjadams/spotify-data-analyze

A serverless data pipeline that logs my Spotify listening history to BigQuery using Cloud Run, then visualizes trends with Looker Studio. Built with Python, Flask, Docker, and GCP.

data-analysis data-engineering

Last synced: 07 May 2025

https://github.com/louis-heraut/card

🎴 Card of Analyse and Diagnostic in R for a user-friendly experience of data aggregation with a parametrisation file.

aggregation climate-change climate-data climate-science data-analysis data-science diagnostic environment environment-variables hydrology hydrology-statistical inrae r statistics tools user-friendly

Last synced: 09 Mar 2026

https://github.com/c0deta1ker/matbasex

MatBaseX is an all-in-one database and analytical tool for photoelectron spectroscopy (PES) analysis, focused on materials and their X-ray interactions. It offers features like a Materials Properties Database, IMFP & XPS Sensitivity Factor Calculator, and PES N-Layer Simulations & Curve Fitting utilities. Explore its powerful capabilities today!

cross-sections crystal-structure crystallography data-analysis data-fitting database electron imfp imfp-calculator-matlab material material-database matlab matlab-application matlab-gui matlab-toolbox pes-modelling photoelectron-spectroscopy photoionization simulation xps

Last synced: 01 Jul 2025

https://github.com/rhenkin/visxhclust

A Shiny app and functions for visual exploration of hierarchical clustering.

clustering data-analysis data-science r r-package r-shiny rstats shiny-apps

Last synced: 02 Apr 2025

https://github.com/super-lou/exstat

🌾 R package to provide an efficient and simple solution to aggregate and analyze the stationarity of time series

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae low-water mann-kendall mann-kendall-tests r stationarity-test statistics time-series

Last synced: 13 Apr 2025

https://github.com/louis-heraut/exstat

🌾 R package to provide an efficient and simple solution to aggregate and analyze the stationarity of time series

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae low-water mann-kendall mann-kendall-tests r stationarity-test statistics time-series

Last synced: 16 Jun 2025

https://github.com/richiejp/jdp

Automatically collect and normalise data, then run algorithms on it.

automation-framework data-analysis suse-qa

Last synced: 02 Jan 2026

https://github.com/johnsell620/sentiment-analysis-goodreads-reviews

Document-level sentiment analysis of book reviews scraped from the Goodreads website. Technologies used include TensorFlow, Spark, HDFS, Sqoop, Scrapy, and D3.js.

data-analysis data-visualization recurrent-neural-networks web-scraping

Last synced: 30 Apr 2025

https://github.com/avinashkranjan/basic-data-analysis-and-visualization-in-python

📊 Some of the most important python tools in data science for Data Analysis and Data Visualization.

data-analysis data-science matplotlib matplotlib-pyplot numpy pandas plotly seabourne

Last synced: 30 Oct 2025

https://github.com/quantumudit/movie-ratings-analysis

This project focuses on analyzing and finding correlations between audience and critic ratings for some of the popular movies released from 2009 to 2011 using Python & Power BI

data-analysis data-visualization jupyter-notebook power-bi python

Last synced: 19 Apr 2026

https://github.com/rikard-helgegren/leverage_analysis_tool

Analyst tool for portfolio construction. How can leveraged certificates be used to increase returns in a portfolio while keeping the risk as low as possible? Use the tool and find out.

cpp data-analysis investment kivy-framework python3

Last synced: 12 Apr 2025

https://github.com/stimulsoft/stimulsoft.dashboards.php

Dashboards.PHP is a complete software package for designing and viewing dashboards. It includes the JS data analysis engine, dashboard designer, and viewer. Supports PHP 5, PHP 7, and PHP 8.

charts dashboard-builder dashboards data-analysis data-grid data-visualization datatable dynamic-dashboard interactive-dashboards live-data mysql-data php php-bi-tools php-dashboard php-kpi php7 php8 pivot-tables sql-datasources statistics

Last synced: 14 Oct 2025