An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/sondosaabed/paltaqdeer

🇵🇸 PalTaqdeer is an AI-driven student success forecaster. It was developed for the Google Launchpad hackathon using data analysis techniques, a linear regression model, and Flask for the web 🇵🇸

data-analysis hackathon hackathon-project linear-regression matplotlib outliers-detection pandas python student-grades

Last synced: 09 Apr 2025

https://github.com/sciruby/daru-io

daru-io is a plugin gem for the existing daru gem that aims to add support for importing DataFrames from, and exporting DataFrames to, multiple formats.

daru data-analysis exporter importer parser ruby ruby-gem

Last synced: 15 Jun 2025

https://github.com/integerman/gitstractor

A library for visualizing the commits, authors, and files of any git repository

code-analysis data-analysis data-visualization dotnet git powerbi repository-management static-code-analysis utilities visualization

Last synced: 14 Jan 2026

https://github.com/unipept/unipept

🌐 Unipept frontend for metaproteomics data analysis

data-analysis data-visualization metaproteomics unipept uniprot

Last synced: 14 Sep 2025

https://github.com/hoangsonww/north-carolina-household-analysis

🏠 This repository contains data analysis scripts for the 2022 American Community Survey (ACS) focusing on individuals aged 25 and over in North Carolina, based on 75,340 observations. This repository offers valuable insights into demographic and economic patterns across North Carolina's urban areas.

confidence-interval confidence-score data data-analysis data-analytics data-science data-visualization ggplot2 hypothesis-testing hypothesis-tests north-carolina r r-language r-programming stata

Last synced: 11 Apr 2025

https://github.com/PiotrZakrzewski/merge-chance

Source code of https://merge-chance.info

analysis data data-analysis open-source

Last synced: 26 Mar 2025

https://github.com/pravj/ospi

Open Source Presence Infographic of Indian Startups

data-analysis data-visualization india open-source startup

Last synced: 13 Apr 2025

https://github.com/ahmedosamamath/statistics-basics

A comprehensive guide to applying statistical techniques in machine learning, including data preprocessing, model development, evaluation metrics, and real-world applications. This repository provides beginner-to-advanced insights into the statistical foundations of machine learning.

artificial-intelligence data-analysis data-science machine-learning statistics

Last synced: 12 Apr 2025

https://github.com/goplus/pandas

Flexible and powerful data analysis / manipulation library for Go+, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

data-analysis data-science data-tech go golang gop goplus pandas scientific-computing

Last synced: 30 Apr 2025

https://github.com/jasdumas/ttbbeer

An R Dataset Package for US Beer Statistics From TTB 🍺

beer-statistics data-analysis r

Last synced: 05 Mar 2025

https://github.com/piquette/edgr

A set of tools for dealing with SEC EDGAR corporate filings

api cli-app data-analysis data-mining edgar-database edgar-scraper finance financial-analysis sec-edgar sec-filings

Last synced: 15 May 2025

https://github.com/amkrajewski/nimcso

nim Composition Space Optimization is a high-performance tool leveraging metaprogramming to implement several methods for selecting components (data dimensions) in compositional datasets, so as to optimize data availability and density for applications such as machine learning.

data-analysis data-optimization data-science materials-informatics metaprogramming nim nim-lang

Last synced: 09 Apr 2025

https://github.com/hoangsonww/global-covid19-analysis

🌍 This repository hosts an in-depth analysis of COVID-19's impact across five key countries from Jan 2020 to Dec 2021. Through advanced data analysis and visualization, we aim to provide insights into how the pandemic evolved differently across these nations, shedding light on the effectiveness of various health measures and vaccination campaigns.

covid covid-19 covid19-tracker data data-analysis data-analytics data-science data-visualization ggplot2 julia julia-language python r r-language r-markdown r-programming sas sas-programming stata vaccination

Last synced: 10 Apr 2025

https://github.com/simfg/etcd-analysis

🔦 Etcd Data Analysis Tool

data-analysis etcd go raft

Last synced: 20 Aug 2025

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 14 Jul 2025

https://github.com/rfordatascience/r4dswebsite

Public repository for the R4DS community website.

blogdown data-analysis data-analytics data-science data-visualization r r4ds tidyverse

Last synced: 11 Apr 2025

https://github.com/cmudig/texture

Visualize your text data with structured attributes

data-analysis llm text visualization

Last synced: 07 May 2025

https://github.com/skyzh/meteor

🚆 Fine-grained analysis and visualization of the Hangzhou Metro for efficient travel in the metro system. Project report, slides, and presentation video included.

cmake data-analysis hangzhou metro qt sqlite visualize

Last synced: 23 Mar 2025

https://github.com/itzmeanjan/chanalyze

A simple WhatsApp Chat Analyzer (for both Private & Group chats), made with ❤️

chat-analysis data-analysis datascience dataviz matplotlib python3 visualization whatsapp whatsapp-chat whatsapp-chat-analyzer

Last synced: 06 Oct 2025

https://github.com/skyzh/Meteor

🚆 Fine-grained analysis and visualization of the Hangzhou Metro for efficient travel in the metro system. Project report, slides, and presentation video included.

cmake data-analysis hangzhou metro qt sqlite visualize

Last synced: 12 Apr 2025

https://github.com/saranshbansal/data-science-with-python

Data science with Python: This repository mostly contains DataCamp data-science courses/exercises that I have completed.

data-analysis data-science datacamp-exercises numpy python

Last synced: 07 Oct 2025

https://github.com/mattools/matstats

Statistical Data Analysis Toolbox for Matlab. Provides a Table class similar to R's data frame, as well as exploratory data analysis tools.

data-analysis data-table matlab matlab-toolbox statistics

Last synced: 21 Jun 2025

https://github.com/mr-easy/badminton-stroke-classification

Classifying badminton strokes based on data from accelerometer and gyroscope sensors attached to the player's wrist. An end-to-end machine learning project, from data collection and preprocessing to final model evaluation.

badminton-stroke-classification data-analysis data-analytics data-science deep-learning machine-learning model-evaluation notebook project time-series-analysis tutorial

Last synced: 31 Aug 2025

https://github.com/danvk/march-madness-data

NCAA brackets in JSON form

data-analysis ncaa-basketball sports

Last synced: 03 Mar 2025

https://github.com/PySloth/pysloth

A Python Package for Probabilistic Prediction

data-analysis data-science machine-learning python statistics

Last synced: 11 May 2025

https://github.com/rubydamodar/the-ultimate-pandas-bootcamp

Welcome to the Pandas for Data Science repository! This course is designed to take you from beginner to proficient in using Pandas, the powerful data manipulation library in Python. Whether you're just starting your data science journey or looking to sharpen your skills, this repository contains all the resources

beginner-friendly csv-data data-analysis data-cleaning data-manipulation data-science data-visualization dataframe exploratory-data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python python-pandas series statistical-analysis time-series titanic-dataset

Last synced: 19 Apr 2025

https://github.com/cengel/R-data-wrangling

Materials for my R data workshop. https://cengel.github.io/R-data-wrangling/

data-analysis data-workshop datascience material r rstats social-sciences teaching tidyverse workshop

Last synced: 06 May 2025

https://github.com/shdev/phpflashtext

Extract keywords from sentences or replace keywords in sentences. @ https://github.com/vi3k6i5/flashtext

data-analysis data-extraction flashtext keyword-extraction nlp php search-in-text string-manipulation string-matching word2vec

Last synced: 12 Jan 2026

https://github.com/alejandrodumas/kodiak

Enhance your feature engineering workflow with Kodiak

data-analysis pandas

Last synced: 19 Jul 2025

https://github.com/rob-med/everything-shapelets

This repo contains useful links to research papers and implementations of shapelets discovery/learning techniques from different sources.

data-analysis data-mining shapelets time-series-analysis timeseries

Last synced: 03 Mar 2025

https://github.com/alessandrocorradini/harvard-data-analysis-for-life-science-xseries

Lectures, Code and Quizzes for the Data Science for Life Science XSeries

data-analysis datascience edx harvardx machine-learning

Last synced: 02 Jan 2026

https://github.com/lgrcia/nuance

Efficient detection of planets transiting quiet or active stars

correlated-noise data-analysis exoplanets search systematics transits variability

Last synced: 29 Dec 2025

https://github.com/csinva/data-viz-utils

Functions for easily making publication-quality figures with matplotlib.

big-data data-analysis data-science data-visualization eda legend matplotlib python python3 scatterplot time-series

Last synced: 05 May 2025

https://github.com/UtrechtUniversity/iBridges

A wrapper around the python-irodsclient to allow for easy interaction with iRODS servers.

data-analysis data-engineering data-science datascience irods-client

Last synced: 29 Jun 2025

https://github.com/osl-pocs/skdata

Python tools for data analysis

data data-analysis data-science open-data python

Last synced: 12 Dec 2025

https://github.com/squey/squey

Squey is a visualization software designed to interactively explore and understand large amounts of tabular data (this is the read-only mirror of https://gitlab.com/squey/squey)

cybersecurity data-analysis data-science data-visualization exploratory-data-visualizations parallel-coordinates parquet parquet-files parquet-viewer pcap timeseries timeseries-analysis visualization

Last synced: 08 Mar 2025

https://github.com/iBridges-for-iRODS/iBridges

A wrapper around the python-irodsclient to allow for easy interaction with iRODS servers.

data-analysis data-engineering data-science datascience irods-client

Last synced: 14 Jul 2025

https://github.com/pgomba/mdpi_explorer

A simple package to explore MDPI's articles by journal. A series of functions helps to obtain lists of papers, extract data from them (turnaround times, special issues, and article types), and create summary graphs.

analysis data-analysis data-visualization mdpi metrics scientific-journals visualization web-scraping

Last synced: 13 May 2025

https://github.com/jm199504/python-exercises

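Python exercise book (basic operations / databases / statistical data processing / image and text generation / data conversion / algorithm problems / small applications / program development)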

data-analysis database python

Last synced: 07 May 2025

https://github.com/luizbizzio/grafana-wallpaper

🖥️ A detailed guide on how to set up Grafana and display its dashboards as your desktop wallpaper. This project allows you to transform your data visualizations into an interactive real-time monitoring background, making data always visible.

app automation data-analysis data-visualization exporter grafana grafana-dashboard graph graphs guide homeautomation iot lively-wallpaper metrics monitoring prometheus real-time tutorial wallpaper windows

Last synced: 26 Oct 2025

https://github.com/ajayarunachalam/gui-pandas-ai

GUIPandasAI - Integrating generative AI capabilities into Pandas as a web interface, along with keyword-based data analysis services

ai chatgpt data data-analysis data-analytics data-science generative-ai gpt-3 gpt-4 llm pandas python streamlit web-app

Last synced: 06 Jul 2025

https://github.com/bcgov/shinyrems

An R package to launch shinyrems, an online application that allows a user to access, download, clean, and plot data from the B.C. government Environmental Monitoring System database and calculate simple statistics from it.

data-analysis environment environmental-data water-quality

Last synced: 30 Jul 2025

https://github.com/yashksaini-coder/zomato-data-analysis

Zomato data analysis to explore insights and build predictive models, with a dynamic, interactive dashboard built as a Streamlit web application. Also covers deploying and scaling with Cyclops-UI.

data-analysis docker kubernets streamlit streamlit-webapp

Last synced: 26 Jul 2025

https://github.com/ptyadana/sql-for-data-analysis-parch-and-posey

SQL for Data Analysis using PostgreSQL - analyzing the fictional company Parch & Posey

challenges data-analysis pgadmin pgadmin4 posey-company postgres postgresql postgresql-database schema sql udacity

Last synced: 12 Apr 2025

https://github.com/mcwaage1/qs

Quantified Self: A Personal Data Aggregator and Dashboard for Self-Trackers and Quantified Self Enthusiasts

activity-tracking data-analysis data-visualization fitbit goodreads google-sheets lastfm mood personal-data quantified-self quantifiedself self-tracking writing

Last synced: 16 Apr 2025

https://github.com/adamdempsey90/ndtamr

N-dimensional adaptive mesh refinement tree structure in Python

amr data-analysis python visual

Last synced: 28 Oct 2025

https://github.com/samuroi/samuroi

SamuROI - Structured analysis of multiple user-defined ROIs, an open source Python-based analysis environment for imaging data.

calcium-imaging conda data-analysis data-visualization opencv python scikit-image

Last synced: 15 Apr 2025

https://github.com/aifred-health/vulcanai

A high level deep learning framework for quickly prototyping networks with added tools in data visualisation, model interpretability and performance metrics

data-analysis data-cleaning data-science data-visualization deep-learning deep-neural-networks feature-engineering mental-health python3 pytorch scikit-learn

Last synced: 01 Aug 2025

https://github.com/chuongmep/aps-bot

Explore Data By CLI With Autodesk Platform Services

aps autodesk-forge autodesk-platform-services cli data-analysis data-science forge

Last synced: 12 Apr 2025

https://github.com/chyikwei/bnp

Bayesian nonparametric models for python

bayesian data-analysis probabilistic-graphical-models python topic-modeling

Last synced: 03 May 2025

https://github.com/llnl/topoms

Topological Analysis for Molecular Systems

data-analysis data-viz

Last synced: 29 Apr 2025

https://github.com/andrewreynen/lazylyst

Lazylyst is a GUI created for time series review, using a flexible framework for new workflows

data-analysis earthquakes gui python qt seismology

Last synced: 22 Apr 2025

https://github.com/thesimonho/rok-data

Data for Rise of Kingdoms

d3js data-analysis dataset dataviz gaming statistics

Last synced: 12 Apr 2025

https://github.com/nicovandenhooff/top-repo-analysis

This repository contains my work that supports my article on Towards Data Science: "Exploring the Most Popular Machine Learning and Deep Learning GitHub Repositories."

altair automation data-analysis data-science data-visualization pygithub python

Last synced: 21 Aug 2025

https://github.com/msyriac/orphics

A library containing analysis and theory tools for cosmological data.

analysis cmb cosmology data-analysis theory

Last synced: 16 Mar 2025

https://github.com/tushar2704/everyday_python

Welcome to Everyday Python Sheets – your go-to resource for everyday Python cheat sheets, pro tips, interview questions, Python one-liners, and Python data structures. Whether you're a beginner looking to learn Python or an experienced developer seeking quick reference materials, this Streamlit application has got you covered.

artificial-intelligence cheatsheet data data-analysis data-science data-structures data-visualization database protips python streamlit streamlit-tushar2704 tushar2704

Last synced: 04 Nov 2025

https://github.com/alipsa/ride

A nice R development and analytics environment, for the Renjin JVM implementation of R

analytics data-analysis data-science data-visualization integrated-development-environment r sql

Last synced: 14 Oct 2025

https://github.com/kennethleungty/fifa-football-world-rankings

Analyzing FIFA World Football Rankings with Python and R

data-analysis data-analytics data-science football python r soccer sports

Last synced: 12 Jul 2025

https://github.com/sigbla/sigbla-app

Sigbla is a framework for working with data in tables, using the Kotlin programming language. It supports various data types, reactive programming and events, user input, charts, and more.

dashboard dashboard-application data-analysis data-science data-visualization kotlin kotlin-dsl kotlin-library sigbla spreadsheet table

Last synced: 17 Jan 2026

https://github.com/kongruksiamza/python-datascience

Teaching materials for Python data science and machine learning work

data-analysis data-science numpy pandas python

Last synced: 05 May 2025

https://github.com/zekeriyyaa/pyspark-structured-streaming-ros-kafka-apachespark-cassandra

Structured streaming applied to robot data from the ROS-Gazebo simulation environment using Apache Spark. Data is collected in Kafka, analyzed by Apache Spark, and stored in Cassandra.

apache-cassandra apache-kafka apache-spark cqlsh data-analysis kafka-consumer kafka-producer pyspark python python3 ros ros-noetic spark-cassandra spark-cassandra-connector spark-kafka-connector spark-kafka-integration spark-sql spark-streaming structured-streaming

Last synced: 30 Jun 2025

https://github.com/nceas/metajam

Bringing data and metadata togetheR

data data-analysis metadata r repositories

Last synced: 25 Oct 2025

https://github.com/junioralive/indian-medicine-dataset

A curated dataset of Indian medicines, organized by brand. Essential for healthcare research and pharmaceutical analysis.

brand-medicines data-analysis drug-information healthcare-analytics healthcare-data indian-medicine medical-research medicine-database pharmaceuticals pharmacy-data

Last synced: 18 Aug 2025

https://github.com/yashuv/python-for-data-science-ai-and-development

Python for Data Science, AI & Development - offered by IBM on Coursera

coursera-course data-analysis data-science ibm numpy pandas python

Last synced: 12 Apr 2025

https://github.com/mohammadreza-mohammadi94/data-analysis-and-machine-learning-projects

A comprehensive collection of data analysis and machine learning projects, showcasing techniques and models for various data challenges. Dive in to explore code examples, analyses, and machine learning workflows.

data-analysis data-science dataframes deep-learning exploratory-data-analysis hyperparameter-tuning machine-learning machine-learning-algorithms pandas python scikit-learn visualization

Last synced: 06 Oct 2025