An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/misaghmomenib/airport-flight-analysis

Flight data analysis project aimed at exploring and visualizing airport operations, flight patterns, and delay trends using Python. This project involves data cleaning, preprocessing, and statistical analysis with tools like Pandas, Matplotlib, and Scikit-learn to uncover insights and improve operational efficiency.

analysis data-analysis data-visualization git open-source python python3

Last synced: 30 Apr 2025

https://github.com/danieldacosta/airbnb-analysis

Data analysis of Airbnb website history in the city of Rio de Janeiro

airbnb-analysis airbnb-website-history data-analysis

Last synced: 30 Apr 2025

https://github.com/vedadiyan/genql

GenQL is a generic querying language fully written in Go

data-analysis data-mapping data-processing data-science data-translation json json-data sql

Last synced: 22 Jun 2025

https://github.com/femtotrader/dukascopyticksreader.jl

A Julia library to download tick data from Dukascopy https://www.dukascopy.com/swiss/english/marketwatch/historical/

data data-analysis dataset dukascopy html julia stock-data

Last synced: 10 Apr 2025

https://github.com/eikevons/pandas-paddles

Access the parent Pandas data frame in loc[], iloc[], assign(), and other Pandas helpers

data-analysis data-exploration data-science pandas pandas-dataframe pandas-library pandas-loc

Last synced: 16 Jun 2025

https://github.com/theakashshukla/r-project

🎓 A collection of programming assignments for the R language

algorithms data-analysis data-science data-science-projects ml r

Last synced: 24 Jul 2025

https://github.com/virajbhutada/tableau-data-vizzes

Engage with a growing collection of Tableau dashboards covering financial trends, HR analytics, streaming service insights, real estate dynamics, and more. Meticulously crafted for valuable insights, this repository continues to expand with new and compelling visualizations.

business-analytics data-analysis data-visualization hr-analytics industry-trends netflix performance-metrics stock-market-analysis strategic-analytics tableau visual-insights

Last synced: 02 Mar 2026

https://github.com/fburic/pandance

Advanced relational operations for pandas DataFrames

data-analysis data-science data-wrangling pandas

Last synced: 27 Jun 2026

https://github.com/tuliosg/cdp

Repository for the course "Ciência de Dados para Pesquisa" (Data Science for Research).

data-analysis data-manipulation data-science data-visualization google-colab jupyter-notebook python

Last synced: 03 Mar 2026

https://github.com/patilni3/project_sql

Data Analysis using SQL

census-data data-analysis sql

Last synced: 16 Feb 2026

https://github.com/sodascience/map-explorer

Map Explorer is a Vue.js web application for rendering GeoJSON maps with dynamic region coloring based on external data.

choropleth data-analysis data-visualization geojson

Last synced: 10 Feb 2026

https://github.com/chaitanyac22/house-price-prediction-project-for-a-us-based-housing-company

The goal of this project is to use data analytics to identify houses that can be purchased below their actual value and flipped at a higher price. The project builds a regression model with regularization (i.e. advanced linear regression: Ridge and Lasso regression) to predict the actual values of prospective housing properties and decide whether to invest in them.

advanced-linear-regression business-analytics data-analysis data-cleaning data-manipulation data-visualization exploratory-data-analysis feature-engineering lasso-regression linear-regression machine-learning model-building model-evaluation prediction-model python3 regularization rfe ridge-regression statistics

Last synced: 30 Apr 2026

https://github.com/quantumudit/analyzing-whiskyexchange-whisky

This project focuses on scraping data related to Japanese whisky from The Whisky Exchange website, performing the necessary transformations on the scraped data, and then analyzing and visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 02 May 2026

https://github.com/quantumudit/regional-sales-analysis

This project focuses on analyzing and visualizing United States regional sales for a fictitious company between 2018 and 2020 using Python & Power BI.

data-analysis data-visualization databases jupyter-notebook power-bi python sqlite

Last synced: 02 May 2026

https://github.com/edaaydinea/dataquest-projects

This repository includes guided data analyst and data science projects from Dataquest.

data-analysis data-science

Last synced: 07 Feb 2026

https://github.com/jggautier/dataverse-curation-assistant

A small software application that provides a UI for automating things in repositories that use the Dataverse software

data-analysis dataverse hacktoberfest python

Last synced: 01 Mar 2026

https://github.com/campos20/wca-statistics

This repository is meant to provide statistics for the WCA Statistics group on Facebook. It's also my repo to study data science with Python.

data-analysis data-science statistics world-cube-association

Last synced: 29 Jun 2026

https://github.com/banisterious/obsidian-oneirometrics

OneiroMetrics (Turning Dreams Into Data). A plugin for Obsidian to track and analyze dream journal metrics.

data-analysis dream-analysis dream-diary dream-journal dreams journaling metrics obsidian obsidian-plugin self-improvement tracking

Last synced: 22 Apr 2026

https://github.com/cyyeh/duckdb-data-agent

An AI-powered data analysis agent with a built-in SQL playground. Upload data files (CSV, JSON, Parquet, Excel) and ask questions in plain English — the agent delegates to a specialized subagent for SQL queries and renders charts inline — or switch to the SQL editor for direct queries.

agent claude-code csv data-analysis duckdb excel json langfuse llm parquet python react sql typescript

Last synced: 04 Jun 2026

https://github.com/zmyzheng/signature-authentication-pen

Signature Authentication Pen is a cloud-based IoT project that authenticates identity using the biometric features of users' signatures.

android aws data-analysis identity-authentication iot neural-network signature-authentication-pen

Last synced: 03 May 2026

https://github.com/mscbuild/mscbuild

🏆 Creating digital experiences that not only meet user expectations, but also drive engagement, loyalty and, ultimately, business success. Passionate developer from Latvia.

analysis best-practices coding config data-analysis data-science design developer freelance fullstack github-config latvia mscbuild profile readme seo site software-engineering web webapp

Last synced: 31 Jan 2026

https://github.com/niamoto/niamoto

Niamoto is a command-line application and library focused on processing and publishing botanical data

botany cli-application data-analysis data-processing data-publication python-library

Last synced: 23 Apr 2026

https://github.com/happybono/avocadosmoothie

VB.NET project for running-median filtering. Users set kernel radius, border count, and pick MiddleMedian or AllMedian. Processing runs in parallel with a progress bar and smooth UI.

algorithms calibration correction data-analysis median outliers quicksort running-median runningmedian smoothing smoothing-methods statistics visual-basic

Last synced: 10 Feb 2026

https://github.com/depressioncenter/mden

Mobile technologies code from the University of Michigan's Mobile Data Experts Network (MDEN), featuring data cleaning automations, REDCap project templates, and links to useful external modules. [DOI: 10.6084/m9.figshare.25438714]

automation data-analysis data-cleaning fitness-tracker heart-rate-data mobile-data mobile-development mquery powerautomate powerbi powerquery python r sleep-data smartwatch-data tableau

Last synced: 25 Feb 2026

https://github.com/rudra496/science

🔬 Interactive science experiments and research simulations — physics, chemistry, biology with 3D visualizations and real-time data analysis

data-analysis education experiments hacktoberfest javascript python research science simulation threejs

Last synced: 09 Jun 2026

https://github.com/dcs-training/digital-method-of-the-month

In this repository you will find the documents we produced to support the discussion in our Digital Methods of the Month. These documents will help you orient yourself if you want to pick up the method in your research. Go to the readme file.

3d-data data-analysis data-visualisation data-wrangling geographical-data gis good-practices-digital-research machine-learning network-analysis open-research preregistration statistics text-analysis

Last synced: 25 Feb 2026

https://github.com/quantumudit/consumer-goods-sales-analysis

This project focuses on analyzing and visualizing consumer goods sales in the United States between 2015 and 2016 using Python & Power BI.

data-analysis data-visualization database jupyter-notebook python sqlite

Last synced: 29 Apr 2026

https://github.com/mertcandav/julenum

A high-performance library for numerical methods and scientific computing in Jule

data-analysis jule julelang math matrix scientific-computing statistics

Last synced: 09 Feb 2026

https://github.com/cparmet/pandas-checks

🐼🩺 Pandas Checks: Non-invasive health checks for Pandas method chains

data-analysis data-engineering data-science method-chaining pandas

Last synced: 27 May 2026

https://github.com/martinthoma/bad-stats

Examples of how not to do statistics / visualizations

data-analysis statistics visualizations

Last synced: 07 Jan 2026

https://github.com/shervinnd/bazar_app_store_eda

Bazar App data analysis code to find the most downloaded category and the most popular installed apps

data data-analysis data-science dataanalysis eda python

Last synced: 15 Apr 2025

https://github.com/cheminfo/compass

Strategy for improved characterisation of human metabolic phenotypes using a COmbined Multiblock Principal components Analysis with Statistical Spectroscopy (COMPASS)

data-analysis metabolomics metabonomics multiblock nmr-spectroscopy pca population-analysis population-model

Last synced: 23 Mar 2025

https://github.com/robinmillford/cortex-ai-multi-model-insights-hub

Cortex AI: Multi-Model Insights Hub is an advanced platform that leverages cutting-edge AI to empower your research, analysis, and data exploration. By integrating multiple Large Language Models (LLMs) with a sophisticated Retrieve-and-Generate (RAG) system

article-extractor chatbot data-analysis data-visualization deepseek-chat deepseek-r1 llama3 llm pdf-document-processor rag streamlit-webapp summarizer vector-database

Last synced: 28 Oct 2025

https://github.com/yuukidach/twitchanal

Using AI for eGaming analytics to discover community interactions and behaviors of Twitch.

data-analysis data-analytics twitch

Last synced: 05 Jan 2026

https://github.com/ethan-wickstrom/rrrs

Welcome to RRRS, a rapid, hyper-optimized CSV random sampling tool designed with performance and efficiency at its core. Crafted meticulously in Rust, RRRS offers an unparalleled solution for extracting random data samples from CSV files swiftly and effortlessly.

analytics cli command-line command-line-tool data data-analysis data-science dataset rust rust-lang sample samples

Last synced: 16 May 2025

https://github.com/lit26/trump_tweet_analysis

Analysis of Trump's original tweets.

data-analysis lda-model topic-modeling

Last synced: 12 Apr 2025

https://github.com/nafisalawalidris/predicting-credit-card-approvals

Explore credit card approval prediction through data analysis and machine learning. Preprocess data, train logistic regression models, and optimize hyperparameters. Learn data preprocessing, feature engineering, model training, and evaluation. Dive into the world of machine learning with Python and popular libraries.

approval-prediction credit-card data-analysis data-preprocessing feature-engineering hyperparameter-optimization libraries logistic-regression machine-learning model-evaluation model-training python python3

Last synced: 19 Apr 2025

https://github.com/jonzeolla/lab-securitydataanalysis

An introductory lab to Security Data Analysis (using Apache Metron (incubating)).

apache-metron data-analysis lab metron security

Last synced: 03 Jul 2025

https://github.com/neutrinoceros/gpgi

A lightweight Python library for efficient in-RAM particle deposition on rectilinear, unrefined grids.

data-analysis grid particles performance

Last synced: 22 Apr 2025

https://github.com/afsalashyana/whatsapp-chat-analyzer

Analyze WhatsApp chats with beautiful graphs. Written in JavaFX

data-analysis data-visualization javafx javafx-14 javafx-application whatsapp

Last synced: 04 Sep 2025

https://github.com/sondosaabed/data-visualization-with-matplotlib-and-seaborn

Learning to apply sound design and data visualization principles to the data analysis process. Also learning how to use analysis and visualizations to tell a story with data.

data-analysis data-analyst-nanodegree data-visualization matplotlib python seaborn seaborn-plots

Last synced: 09 Apr 2025

https://github.com/sondosaabed/introduction-to-data-analysis-with-pandas-and-numpy

Learning the data analysis process of questioning, wrangling, exploring, analyzing, and communicating data. Working with data in Python using libraries like NumPy and pandas.

data-analysis data-analyst-nanodegree data-wrangling numpy pandas python

Last synced: 09 Apr 2025

https://github.com/bjpop/gurita

A convenient and expressive tool for data analytics and plotting on the command line

command-line data-analysis data-science pandas plotting python

Last synced: 04 Feb 2026

https://github.com/globeandmail/startr-cli

A command-line scaffolder for the startr R project template

data-analysis data-journalism data-visualization journalism r

Last synced: 23 Apr 2025

https://github.com/dcs-training/datavisualisationwithr

Data Visualisation with R Workshop (delivered by the Centre in December 2020). This workshop focuses on visualising your data. Go to the readme file.

data-analysis data-visualisation data-wrangling r

Last synced: 25 Apr 2025

https://github.com/maksimekin/umd_data_challange_2020

Ocean cleanup data analysis project for the UMD Data Challenge 2020. Data Exploration for a Sustainable Planet.

cleanup competition data-analysis data-science folium geolocation machine-learning ocean planet pollution sklearn sustainability time-series trash umd

Last synced: 05 Jul 2025

https://github.com/trainingbypackt/splunk-7-essentials-elearning

Build an elaborate Splunk enterprise environment that will extract powerful insights from your machine-generated big data

data-analysis eventgen indexing machine-learning splunk sub-search visualization

Last synced: 01 Mar 2026

https://github.com/waveform80/structa

A small utility for analyzing data structures (e.g. JSON files)

csv data-analysis data-visualization datajournalism datawrangling json yaml

Last synced: 06 Sep 2025

https://github.com/1ayanabil1/healthcare-machine-learning

Explore our open-source repository focused on healthcare machine learning. We've developed predictive models for cardiovascular disease, diabetes, breast cancer, and more. Our projects employ diverse machine learning algorithms and data science techniques, enhancing early detection, diagnosis, and patient outcomes.

data-analysis data-science deep-learning disease disease-detection disease-modeling disease-prediction eda healthcare-application heathcare jupyter-notebook machine-learning machine-learning-algorithms machinelearning-python python

Last synced: 28 Apr 2025

https://github.com/sushantdhumak/traffic-forecasting-using-iot-sensor-data

Demonstrates how to utilize XGBoost for traffic forecasting using data gathered from IoT sensors, highlighting its efficiency in processing complex datasets and delivering accurate predictions.

data-analysis data-visualization exploratory-data-analysis feature-engineering feature-importance feature-selection gridsearchcv hyperparameter-optimization hyperparameter-tuning iot random-search xgboost-regression

Last synced: 26 Mar 2025

https://github.com/efharkin/ez-ephys

Easy IO, inspection, and manipulation of electrophysiological data.

data-analysis electrophysiology neurophysiology neuroscience patch-clamp python

Last synced: 14 Jan 2026

https://github.com/abhash-rai/traffic-image-classifier

A web-based solution that uses a robust TensorFlow model for precise traffic condition classification, built with ReactJS and a FastAPI backend.

cnn cnn-classification cnn-keras cnn-model data-analysis data-science data-visualization fastapi keras keras-tensorflow python python-3 python3 react reactjs tensorflow traffic traffic-classification transfer-learning

Last synced: 23 Feb 2026

https://github.com/sonigarima/donation-management-system

A donation management system for NGOs and Donors. The project is designed for Cognizance IITR 2021 - Salesforce Codathon.

data-analysis donation-management reactjs

Last synced: 07 Sep 2025

https://github.com/paezha/edashop

An open educational resource to teach a workshop on Exploratory Data Analysis in R

data-analysis exploratory-data-analysis open-educational-resources package r rstats workshop-materials

Last synced: 18 Mar 2025

https://github.com/lucasbotang/real_estate_management_data_analysis

Data analysis for real estate management

data-analysis excel mysql tableau

Last synced: 06 Oct 2025

https://github.com/theengineeringworld/numpy-data-science

NumPy Data Science Essential Training Course. Part of a YouTube course offered by TheEngineeringWorld.

data-analysis data-science numpy numpy-exercises numpy-library numpy-tutorial python python-3-6 python3 scipy2018

Last synced: 09 Oct 2025

https://github.com/jackfiszr/pl2xl

Nodejs-polars wrapper with `readExcel` and `writeExcel` methods.

data-analysis data-science deno excel excel-reader excel-writer nodejs polars

Last synced: 21 Jan 2026

https://github.com/fatihilhan42/data-science-projects

This repo contains beginner to upper-level projects in the field of data science. I will add the projects I complete in this field to this repo every day, in the hope that they will be useful to those who, like me, are interested in data science and are just starting out...

data-analysis data-engineering data-mining data-science data-structures data-visualization database datascience fatihilhan fortytwo fortytwofficial jupyter-notebook python

Last synced: 11 Oct 2025

https://github.com/darsan-in/rumour-monger-spotter

Rumour Monger Spotter is a prototype developed during a national-level cyber hackathon to identify false information on Twitter. Using the Google Fact Check API and a Multinomial Naive Bayes classifier, the tool analyzes tweet content to assess the likelihood of misinformation. Despite a development window of less than 24 hours, the project won a t

ai data-analysis fact-checking hackathon india naive-bayes national-competition natural-language-processing prototype real-time-analysis social-media text-classification tweet-content twitter

Last synced: 12 Oct 2025

https://github.com/dsnchz/solid-g6

A SolidJS component library for graph visualization, powered by @antv/g6

analysis data-analysis data-visualization graph graph-visualization node-ui solidjs visualization

Last synced: 13 Oct 2025

https://github.com/itzmeanjan/indian-railway

Exploring the Indian Railways timetable dataset, with ❤️

data-analysis data-visualization indian-railways matplotlib python python3 railway

Last synced: 17 Oct 2025

https://github.com/franpog859/top-of-the-world

🌍🔝 Proof that your country is the top of the world using GeoTIFF images and a little bit of geometry. Data mining project

data data-analysis data-mining elevation geometry geotiff image-processing matplotlib nvector rasterio

Last synced: 16 Feb 2026

https://github.com/louis-heraut/card

🎴 Card of Analyse and Diagnostic in R, for a user-friendly experience of data aggregation with a parametrisation file.

aggregation climate-change climate-data climate-science data-analysis data-science diagnostic environment environment-variables hydrology hydrology-statistical inrae r statistics tools user-friendly

Last synced: 09 Mar 2026

https://github.com/saksham-joshi/sentiment_analyzer

Analyze the sentiment of a text stored in a string or file and understand the reason why your blogs and posts are not ranking up.

data-analysis data-analytics python sentiment-analyser sentiment-analysis sentiment-analysis-without-nltk

Last synced: 22 Aug 2025

https://github.com/sowinskibraeden/schedulegeneratorapp

The Desktop Application for my schedule-generator algorithm, allowing users to easily interact with the algorithm and its variables to generate schedules as documents for students individually as well as the master timetable

algorithm csv data-analysis dataclasses python-docx python-typing python311 xlsxwriter

Last synced: 09 Jul 2025

https://github.com/thecoderpinar/earthquake_prediction_analysis_project

🌍 Welcome to the Earthquake Prediction Analysis Project! 🚀 This project aims to predict earthquake magnitudes using LSTM neural networks and analyze seismic data. Explore, analyze, and forecast earthquakes with ease! 📈🔮

analysis data-analysis data-science earthquake-prediction geocoding geology lstm lstm-neural-networks machine-learning matlab matlab-deep-learning open-source time-series visualization

Last synced: 16 Aug 2025

https://github.com/gxjansen/user-analysis-with-r-google-analytics

Analyzing user behavior of an E-commerce website with R and (mainly) Google Analytics Data

analytics analytics-api conversion-rate-optimization data-analysis ecommerce google google-analytics r

Last synced: 27 Mar 2025

https://github.com/ynikitenko/lena

Lena is an architectural framework for data analysis

analysis-framework analysis-pipeline data-analysis data-science

Last synced: 30 Apr 2025

https://github.com/lisa-ho/three-investigators

Repository for scraping and analysing fan data on a German audio drama called 'Die Drei Fragezeichen' (the three investigators).

data-analysis data-viz datawrapper python webscraping

Last synced: 25 Oct 2025

https://github.com/nicucalcea/raise

An R library that uses ChatGPT / GPT to generate data

chatgpt chatgpt-api chatgpt-app data-analysis gpt gpt-35-turbo openai openai-chatgpt parsing r

Last synced: 05 Mar 2025

https://github.com/ficaan/data-analysis-with-python-2023_2024-mooc.fi

These are all the solutions for exercises from Data Analysis with Python 2023/2024, a course offered by the University of Helsinki, Finland.

data-analysis machine-learning mooc-fi programming python

Last synced: 08 Jun 2026