An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mainakrepositor/data-analysis

Different types of data analytics projects: EDA, PDA, DDA, TSA, and much more.

data-analysis data-science deeplearning machine-learning-algorithms neural-networks time-series-analysis tsa

Last synced: 06 Mar 2026

https://github.com/cdeweyx/medium-stats-analysis

Exploring data and analyzing metrics for user-specific Medium Stats

data-analysis data-mining data-visualization python

Last synced: 25 Apr 2025

https://github.com/rshkarin/quanfima

Quanfima (Quantitative Analysis of Fibrous Materials)

data-analysis material-science morphological-analysis volumetric-data

Last synced: 07 May 2025

https://github.com/alessandrocorradini/harvard-data-science-professional

Repository for the Data Science Professional Program from Harvard University on edX

data-analysis data-science datascience edx harvardx machine-learning machinelearning mooc moocs r r-language

Last synced: 13 Jul 2025

https://github.com/ncbi/tree-tool

Incremental building of phylogenetic distance trees

bioinformatics bioinformatics-tool data-analysis distance-measures evolution phylogenetic-trees

Last synced: 31 Jan 2026

https://github.com/dawievlill/datascience-871

Data science module for economists written mostly in Julia and R

data-analysis data-science machine-learning

Last synced: 27 Feb 2025

https://github.com/open-cogsci/datamatrix

An intuitive, Pythonic way to work with tabular data

analysis data-analysis data-structures python scientific-computing

Last synced: 03 Jun 2026

https://github.com/hugohadfield/bayesfilter

Pure Python/Numpy Bayesian Filtering and Smoothing

data-analysis ekf filtering smoothing ukf

Last synced: 25 Oct 2025

https://github.com/dotbithq/das-account-indexer

Mapping relationship between multi-chain's addresses and accounts

data-analysis docker golang nervos server

Last synced: 09 Oct 2025

https://github.com/tsffarias/data-analysis-queries

This repository was carefully created to provide an extensive collection of SQL queries intended to make data analysts' work easier across many areas of a company, including marketing, logistics, sales, finance, human resources, operations, legal, support, and much more.

business-intelligence comercial data-analysis data-insights esg finance-management fraud-prevention human-resources juridico kpis logistics marketing marketing-analytics operacao pricing sql suporte

Last synced: 05 Apr 2025

https://github.com/serkor1/slmetrics

A high-performance R :package: for supervised and unsupervised machine learning evaluation metrics written in 'C++'.

armadillo armadillo-library artificial-intelligence cpp cran cran-r data-analysis data-science eigen3 machine-learning performance-metrics r r-package r-stats rcpp rcpparmadillo rcppeigen statistics supervised-learning

Last synced: 18 Feb 2026

https://github.com/isisneutronmuon/mdanse

MDANSE: Molecular Dynamics Analysis for Neutron Scattering Experiments

data-analysis molecular-dynamics neutron-scattering python qt-gui science

Last synced: 22 Aug 2025

https://github.com/ActivityWatch/aw-research

Tools to analyse and experiment with ActivityWatch data

activitywatch data-analysis python quantified-self

Last synced: 01 May 2025

https://github.com/activitywatch/aw-research

Tools to analyse and experiment with ActivityWatch data

activitywatch data-analysis python quantified-self

Last synced: 14 Apr 2025

https://github.com/mkcor/advanced-pandas

Pandas is a powerful tool for data exploration and analysis (including timeseries).

data-analysis data-science labeled-data notebooks python3 teaching-materials

Last synced: 12 Oct 2025

https://github.com/mark-hoffmann/icd

Tools for working with icd codes and comorbidities

data-analysis icd python

Last synced: 02 Apr 2026

https://github.com/anselmoo/spectrafit

📊📈🔬 SpectraFit is a command-line and Jupyter-notebook tool for quick data-fitting based on the regular expression of distribution functions.

console-application curve-fitting data-analysis data-analysis-python data-science data-visualization fitting juypter-notebook python science science-research scientific-plotting spectral-analysis spectroscopy

Last synced: 25 Nov 2025

https://github.com/fatbobman/objects2xlsx

A powerful, type-safe Swift library for converting Swift objects to Excel (.xlsx) files. Objects2XLSX provides a modern, declarative API for creating professional Excel spreadsheets with full styling support, multiple worksheets, and real-time progress tracking.

business data-analysis dataset excel export-excel reporting spredsheet swift xlsx xlsxwriter

Last synced: 18 Jul 2025

https://github.com/mrankitgupta/python-roadmap

I am sharing Python lessons, from scratch to intermediate level, with practice sets that I studied during my 66DaysofData journey into Data Analytics.

66daysofdata analytics ankitgupta data-analysis data-analysis-python data-analytics data-mining data-science data-structures data-visualization jupyter matplotlib mrankitgupta numpy pandas programming python python-library python3

Last synced: 14 Jul 2025

https://github.com/computationalcore/introduction-to-python

A very useful collection of Jupyter Notebooks, which aims to introduce the Python programming language.

data-analysis data-science fundamental google-colab jupyter-notebook jupyter-notebooks numpy pandas python python-language python-programming python3

Last synced: 24 Apr 2025

https://github.com/theengineeringworld/python-data-science

Python Data Science has all the data sets and Jupyter notebook files for the YouTube course at http://youtube.com/theengineeringworld under the name "Python Data Science Course".

data data-analysis data-mining data-science data-visualization jupyter-notebook jupyter-notebooks machine-learning python python27

Last synced: 17 Nov 2025

https://github.com/codingforentrepreneurs/try-pandas

In this series, we're going to learn the fundamentals of the popular Python data science tool called Pandas.

data-analysis data-science deepnote jupyter nba-api nba-stats notebook pandas python python-pandas

Last synced: 18 Jan 2026

https://github.com/jpenuchot/ctbench

Compiler-assisted variable size benchmarking for the study of C++ metaprogram compile times.

benchmark clang compilation data-analysis data-visualization gcc metaprogramming

Last synced: 26 Oct 2025

https://github.com/sciruby/daru-io

daru-io is a plugin gem for the existing daru gem that aims to add support for importing DataFrames from, and exporting DataFrames to, multiple formats.

daru data-analysis exporter importer parser ruby ruby-gem

Last synced: 12 Mar 2026

https://github.com/probcomp/cgpm

Library of composable generative population models which serve as the modeling and inference backend of BayesDB.

bayesian-inference data-analysis machine-learning probabilistic-programming tabular-data

Last synced: 19 Oct 2025

https://github.com/staircase-dev/piso

Pandas Interval Set Operations: providing methods for set operations, analytics, lookups and joins on pandas' Interval, IntervalArray and IntervalIndex

data-analysis data-science data-structures interval interval-arithmetic interval-set pandas set set-operations set-theory

Last synced: 20 Aug 2025

https://github.com/sondosaabed/paltaqdeer

🇵🇸 PalTaqdeer is an AI-driven student success forecaster. It was developed for the Google Launchpad Hackathon using data analysis techniques, a linear regression model, and Flask for the web 🇵🇸

data-analysis hackathon hackathon-project linear-regression matplotlib outliers-detection pandas python student-grades

Last synced: 02 Mar 2026

https://github.com/ptyadana/data-analysis-for-digital-music-store

Helping a digital music store optimize its business practices using PostgreSQL

chinook chinook-database data-analysis datavisualization pgadmin4 postgresql sql tableau

Last synced: 12 Apr 2025

https://github.com/cataseven/statistics-graph-chart-card

A highly customizable, smooth, and advanced graph card. Shows historical sensor data with dynamic trend colors, statistics (min, max, avg), and more. A great alternative to the default history graph and sensor cards.

analysis analytics bar-chart chart data data-analysis data-science data-visualization graph graphics histogram historical-data history home-assistant statistical-analysis statistics

Last synced: 12 Apr 2026

https://github.com/csbiology/fsharpgephistreamer

F# functions for streaming any kind of graph/network data to the network visualization tool gephi

data-analysis exploratory-data-analysis fsharp gephi graph-visualization streaming-graph-data visualization

Last synced: 30 Jul 2025

https://github.com/luizbizzio/grafana-wallpaper

🖥️ A detailed guide on how to set up Grafana and display its dashboards as your desktop wallpaper. This project allows you to transform your data visualizations into an interactive real-time monitoring background, making data always visible.

app automation data-analysis data-visualization exporter grafana grafana-dashboard graph graphs guide homeautomation iot lively-wallpaper metrics monitoring prometheus real-time tutorial wallpaper windows

Last synced: 23 Feb 2026

https://github.com/jatinagrawal0/youtube-comment-sentimental-analysis

YouTube Sentiment Analysis is a web application that analyzes the sentiment of YouTube comments, providing insights into comment sentiment using VADER sentiment analysis and interactive visualizations.

data-analysis data-visualization natural-language-processing plotly python sentiment-analysis streamli streamlit-cloud vader-lexicon youtube-api-v3 youtube-comment-scraper youtube-comments-downloader

Last synced: 14 Apr 2025

https://github.com/hoangsonww/north-carolina-household-analysis

🏠 This repository contains data analysis scripts for the 2022 American Community Survey (ACS) focusing on individuals aged 25 and over in North Carolina, based on 75,340 observations. This repository offers valuable insights into demographic and economic patterns across North Carolina's urban areas.

confidence-interval confidence-score data data-analysis data-analytics data-science data-visualization ggplot2 hypothesis-testing hypothesis-tests north-carolina r r-language r-programming stata

Last synced: 11 Apr 2025

https://github.com/asadiahmad/edit-distance-spark

Calculating Edit Distance with PySpark

data-analysis edit-distance nlp pyspark spark

Last synced: 28 Apr 2026

https://github.com/djangoaddicts/django-pygwalker

Easily add PyGWalker visualizations to your Django applications

data-analysis django pygwalker tableau tableau-alternative visualization

Last synced: 08 Apr 2026

https://github.com/pravj/ospi

Open Source Presence Infographic of Indian Startups

data-analysis data-visualization india open-source startup

Last synced: 13 Apr 2025

https://github.com/ahmedosamamath/statistics-basics

A comprehensive guide to applying statistical techniques in machine learning, including data preprocessing, model development, evaluation metrics, and real-world applications. This repository provides beginner-to-advanced insights into the statistical foundations of machine learning.

artificial-intelligence data-analysis data-science machine-learning statistics

Last synced: 12 Apr 2025

https://github.com/phisanti/mcpr

MCPR enables AI agents to participate in interactive R sessions for professional analysis workflows.

data-analysis mcp mcp-server r

Last synced: 17 May 2026

https://github.com/nnthanh101/sentiment-analysis

Voice of the Customer (VoC) to enhance customer experience with serverless architecture and sentiment analysis, using Amazon Kinesis, Amazon Athena, Amazon QuickSight, Amazon Comprehend, and ChatGPT-LLMs for sentiment analysis.

aws-athena aws-comprehend aws-kinesis aws-quicksight cdk data-analysis data-visualization sentiment-analysis voice-of-the-customer

Last synced: 28 Feb 2026

https://github.com/integerman/gitstractor

A library for visualizing the commits, authors, and files of any git repository

code-analysis data-analysis data-visualization dotnet git powerbi repository-management static-code-analysis utilities visualization

Last synced: 14 Jan 2026

https://github.com/PiotrZakrzewski/merge-chance

Source code of https://merge-chance.info

analysis data data-analysis open-source

Last synced: 26 Mar 2025

https://github.com/unipept/unipept

🌐 Unipept frontend for metaproteomics data analysis

data-analysis data-visualization metaproteomics unipept uniprot

Last synced: 21 Jan 2026

https://github.com/amkrajewski/nimcso

nim Composition Space Optimization is a high-performance tool leveraging metaprogramming to implement several methods for selecting components (data dimensions) in compositional datasets, so as to optimize the data availability and density for applications such as machine learning.

data-analysis data-optimization data-science materials-informatics metaprogramming nim nim-lang

Last synced: 09 Apr 2025

https://github.com/goplus/pandas

Flexible and powerful data analysis / manipulation library for Go+, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

data-analysis data-science data-tech go golang gop goplus pandas scientific-computing

Last synced: 30 Apr 2025

https://github.com/piquette/edgr

A set of tools for dealing with SEC EDGAR corporate filings

api cli-app data-analysis data-mining edgar-database edgar-scraper finance financial-analysis sec-edgar sec-filings

Last synced: 15 May 2025

https://github.com/jasdumas/ttbbeer

An R Dataset Package for US Beer Statistics From TTB :beer:

beer-statistics data-analysis r

Last synced: 05 Mar 2025

https://github.com/skyzh/meteor

🚆 Fine-grained analysis and visualization of Hangzhou Metro for efficient traveling in the metro system. Project report, slides, and presentation video included.

cmake data-analysis hangzhou metro qt sqlite visualize

Last synced: 23 Mar 2025

https://github.com/simfg/etcd-analysis

🔦 Etcd Data Analysis Tool

data-analysis etcd go raft

Last synced: 20 Aug 2025

https://github.com/itzmeanjan/chanalyze

A simple WhatsApp Chat Analyzer ( for both Private & Group chats ), made with :heart:

chat-analysis data-analysis datascience dataviz matplotlib python3 visualization whatsapp whatsapp-chat whatsapp-chat-analyzer

Last synced: 06 Oct 2025

https://github.com/hoangsonww/global-covid19-analysis

🌍 This repository hosts an in-depth analysis of COVID-19's impact across five key countries from Jan 2020 to Dec 2021. Through advanced data analysis and visualization, we aim to provide insights into how the pandemic evolved differently across these nations, shedding light on the effectiveness of various health measures and vaccination campaigns.

covid covid-19 covid19-tracker data data-analysis data-analytics data-science data-visualization ggplot2 julia julia-language python r r-language r-markdown r-programming sas sas-programming stata vaccination

Last synced: 10 Apr 2025

https://github.com/cmudig/texture

Visualize your text data with structured attributes

data-analysis llm text visualization

Last synced: 07 May 2025

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 14 Jul 2025

https://github.com/skyzh/Meteor

🚆 Fine-grained analysis and visualization of Hangzhou Metro for efficient traveling in the metro system. Project report, slides, and presentation video included.

cmake data-analysis hangzhou metro qt sqlite visualize

Last synced: 12 Apr 2025

https://github.com/saranshbansal/data-science-with-python

Data science with Python: This repository mostly contains DataCamp data-science courses/exercises that I have completed.

data-analysis data-science datacamp-exercises numpy python

Last synced: 07 Oct 2025

https://github.com/rfordatascience/r4dswebsite

Public repository for the R4DS community website.

blogdown data-analysis data-analytics data-science data-visualization r r4ds tidyverse

Last synced: 11 Apr 2025

https://github.com/danvk/march-madness-data

NCAA brackets in JSON form

data-analysis ncaa-basketball sports

Last synced: 03 Mar 2025

https://github.com/mattools/matstats

Statistical Data Analysis Toolbox for Matlab. Provides a Table class similar to R's data frame, as well as exploratory data analysis tools.

data-analysis data-table matlab matlab-toolbox statistics

Last synced: 21 Jun 2025

https://github.com/mr-easy/badminton-stroke-classification

Classifying badminton strokes based on data from accelerometer and gyroscope sensors attached to the player's wrist. An end-to-end machine learning project, from data collection and preprocessing to final model evaluation.

badminton-stroke-classification data-analysis data-analytics data-science deep-learning machine-learning model-evaluation notebook project time-series-analysis tutorial

Last synced: 31 Aug 2025

https://github.com/yefee/xcesm

Python package for CESM output diagnosis

cesm data-analysis regrid

Last synced: 06 Apr 2026

https://github.com/PySloth/pysloth

A Python Package for Probabilistic Prediction

data-analysis data-science machine-learning python statistics

Last synced: 11 May 2025

https://github.com/nasa/ziggy

Ziggy, a portable, scalable infrastructure for science data processing pipelines, is the child of the Transiting Exoplanet Survey Satellite (TESS) pipeline and the grandchild of the Kepler Pipeline.

algorithm analysis arc data data-analysis data-reduction java k2 kepler linux macos nasa open-source pipeline science tess ziggy

Last synced: 09 May 2026

https://github.com/rubydamodar/the-ultimate-pandas-bootcamp

Welcome to the Pandas for Data Science repository! This course is designed to take you from beginner to proficient in using Pandas, the powerful data manipulation library in Python. Whether you're just starting your data science journey or looking to sharpen your skills, this repository contains all the resources

beginner-friendly csv-data data-analysis data-cleaning data-manipulation data-science data-visualization dataframe exploratory-data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python python-pandas series statistical-analysis time-series titanic-dataset

Last synced: 19 Apr 2025

https://github.com/shdev/phpflashtext

Extract keywords from a sentence or replace keywords in sentences. @ https://github.com/vi3k6i5/flashtext

data-analysis data-extraction flashtext keyword-extraction nlp php search-in-text string-manipulation string-matching word2vec

Last synced: 12 Jan 2026

https://github.com/cengel/R-data-wrangling

Materials for my R data workshop. https://cengel.github.io/R-data-wrangling/

data-analysis data-workshop datascience material r rstats social-sciences teaching tidyverse workshop

Last synced: 06 May 2025