An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/moosetechnology/Moose

MOOSE - Platform for software and data analysis.

data-analysis moose pharo smalltalk software-analysis

Last synced: 11 May 2025

https://github.com/moosetechnology/moose

MOOSE - Platform for software and data analysis.

data-analysis moose pharo smalltalk software-analysis

Last synced: 04 Apr 2025

https://github.com/NCAS-CMS/cf-python

A CF-compliant Earth Science data analysis library

cf cfdm cfunits data-analysis earth-science metadata netcdf pp python um

Last synced: 20 Jul 2025

https://github.com/hemansnation/Data-Analyst-Roadmap

Data-Analyst-Roadmap for Professionals. This roadmap contains 8 Chapters that can be completed in 8 weeks, whether you are a fresher in the field or an experienced professional who wants to transition into Data Analysis.

analytics data-analysis data-analysis-python data-analytics data-science numpy predictive-analytics project-based-learning python statistics tableau

Last synced: 07 Sep 2025

https://github.com/hemansnation/data-analyst-roadmap

Data-Analyst-Roadmap for Professionals. This roadmap contains 8 Chapters that can be completed in 8 weeks, whether you are a fresher in the field or an experienced professional who wants to transition into Data Analysis.

analytics data-analysis data-analysis-python data-analytics data-science numpy predictive-analytics project-based-learning python statistics tableau

Last synced: 15 Apr 2025

https://github.com/ing-bank/probatus

SHAP-based validation for linear and tree-based models. Applied to binary, multiclass and regression problems.

binary-classifiers data-analysis data-science feature-elimination machine-learning multi-class-classification recursive-feature-elimination regressors shap statistics tree-model

Last synced: 07 Apr 2025

https://github.com/nickpoison/astsa

R package to accompany the books "Time Series Analysis and Its Applications: With R Examples" and "Time Series: A Data Analysis Approach Using R"

astsa data-analysis data-science dna-sequences em-algorithm kalman-filter missing-data package r state-space-models time-series-analysis

Last synced: 21 Oct 2025

https://github.com/dmnfarrell/tablexplore

Table analysis and plotting application written in PySide2/PyQt5

data-analysis data-science dataframe pandas plotting pyqt5 pyside2 python qt

Last synced: 01 Aug 2025

https://github.com/anicetngrt/jiro-nn

A Deep Learning and preprocessing framework in Rust with support for CPU and GPU.

adam classification cuda data-analysis deep-learning dropout gpu gpu-computing machine-learning ml nalgebra neural-networks nn opencl pipelines regression rust sgd

Last synced: 09 Apr 2025

https://github.com/AnicetNgrt/jiro-nn

A Deep Learning and preprocessing framework in Rust with support for CPU and GPU.

adam classification cuda data-analysis deep-learning dropout gpu gpu-computing machine-learning ml nalgebra neural-networks nn opencl pipelines regression rust sgd

Last synced: 25 Sep 2025

https://github.com/dkedar7/fast_dash

Turn your Python functions into interactive apps! Fast Dash is an innovative way to deploy your Python code as interactive web apps with minimal changes.

dash data-analysis data-science data-visualization deep-learning fast-dash flask machine-learning plotly-dash python ui webdevelopment

Last synced: 20 Apr 2026

https://github.com/hbuschme/TextGridTools

Read, write, and manipulate Praat TextGrid files with Python

annotation data-analysis elan linguistics praat python textgrid

Last synced: 19 Jul 2025

https://github.com/iam-mhaseeb/skytrax-data-warehouse

A full data warehouse infrastructure with ETL pipelines running inside Docker, using Apache Airflow for data orchestration, AWS Redshift as the cloud data warehouse, and Metabase to serve data visualization needs such as analytical dashboards.

airflow data-analysis data-analytics data-cleaning data-engineering data-orchestration data-processing data-visualization data-warehouse data-warehousing database docker metabase python python3 redshift s3 s3-bucket sql

Last synced: 12 Aug 2025

https://github.com/miniql/miniql

A tiny JSON-based query language inspired by GraphQL

data data-analysis data-science graphql javascript json queries query query-language typescript

Last synced: 15 Apr 2025

https://github.com/subhashhhhhh/Fastlytics

A modern web app for Formula 1 data analytics, providing race results, standings, and telemetry charts.

data-analysis formula1 javascript

Last synced: 13 Apr 2025

https://github.com/subhashhhhhh/fastlytics

A modern web app for Formula 1 data analytics, providing race results, standings, and telemetry charts.

data-analysis formula1 javascript

Last synced: 20 Jan 2026

https://github.com/Canner/wren-engine

🤖 The semantic engine for LLMs, bringing semantic context to AI agents. 🔥

business-intelligence data data-analysis data-analytics data-lake data-warehouse hacktoberfest llm semantic semantic-layer sql

Last synced: 01 Apr 2025

https://github.com/ArturSepp/QuantInvestStrats

The Quantitative Investment Strategies (QIS) package implements Python analytics for visualisation of financial data, performance reporting, and analysis of quantitative strategies.

asset-management data-analysis data-visualization investment-analysis performance-attribution portfolio-optimization portfolio-risk-management python quantitative-finance

Last synced: 21 May 2026

https://github.com/jadianes/spark-r-notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

big-data bigdata data-analysis data-science exploratory-data-analysis jupyter jupyter-notebook notebook r sparkr

Last synced: 21 Apr 2025

https://github.com/hay/dataknead

Effortless conversion between data formats like JSON, XML and CSV

csv data-analysis data-conversion json python python3

Last synced: 28 Aug 2025

https://github.com/juliadata/indexedtables.jl

Flexible tables with ordered indices

data-analysis data-manipulation indexedtables julia juliadb

Last synced: 06 Apr 2025

https://github.com/hdfgroup/hsds

Cloud-native, service based access to HDF data

asyncio aws data-analysis docker hdf5 multi-dimensional python scientific-data

Last synced: 05 Apr 2025

https://github.com/chasmani/piecewise-regression

Piecewise regression (aka segmented regression) in Python, for fitting straight-line models to data with one or more breakpoints where the gradient changes.

data-analysis linear-regression model-fitting piecewise-regression python3 regression segmented-regression statistics

Last synced: 21 Oct 2025

https://github.com/apachecn/ds100-textbook-zh

:book: [Translation] UCB DS100: Principles and Techniques of Data Science

data-analysis ds100 machine-learning python textbook ucb

Last synced: 02 May 2025

https://github.com/acerbilab/pyvbmc

PyVBMC: Variational Bayesian Monte Carlo algorithm for posterior and model inference in Python

bayesian-inference data-analysis gaussian-processes machine-learning python variational-inference

Last synced: 04 Apr 2025

https://github.com/winvector/data_algebra

Codd method-chained SQL generator and Pandas data processing in Python.

data-analysis data-science pandas python

Last synced: 09 Apr 2025

https://github.com/ajayarunachalam/msda

Library for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and a prototype of explainable AI for anomaly detection

anamoly-detection-using-graphs anomaly-detection correlation data-analysis deep-learning deep-neural-networks explainable-artificial-intelligence feature-engineering feature-selection multidimensional-data multisensor python pytorch sensor sensor-data signal-processing tabular-data time-series variation visualization

Last synced: 05 Oct 2025

https://github.com/phanxuanquang/askdb

Interact with your relational databases using natural language, and even more

csharp dapper data-analysis database dotnet gemini genai generative-ai llm llms sql windows winui winui3

Last synced: 05 Apr 2025

https://github.com/ujjwalkarn/xda

R package for exploratory data analysis

data-analysis data-science exploratory-data-analysis r

Last synced: 26 Apr 2025

https://github.com/bccp/nbodykit

Analysis kit for large-scale structure datasets, the massively parallel way

astrophysics clustering cosmology data-analysis large-scale-structure mpi mpi4py parallel-computing python

Last synced: 07 Aug 2025

https://github.com/deanmarchiori/analysis-flow

Data Analysis Workflows & Reproducibility Learning Resources

data-analysis reproducibility reproducible-data-science reproducible-science tooling workflow

Last synced: 29 Jul 2025

https://github.com/firmai/business-analytics-and-mathematics-python-book

Advanced Business Analytics and Mathematics with Python (by @firmai)

analytics business data-analysis data-science mathematics python

Last synced: 06 May 2025

https://github.com/abhiamishra/ggshakeR

An analysis and visualization R package that works with publicly available soccer data

analysis data-analysis data-visualization football-analytics library machine-learning plotting r soccer soccer-analytics visualization

Last synced: 06 May 2025

https://github.com/aphalo/ggpmisc

R package ggpmisc is an extension to ggplot2 and the Grammar of Graphics

data-analysis dataviz ggplot2-annotations ggplot2-stats statistics

Last synced: 28 Jan 2026

https://github.com/Nesvilab/philosopher

PeptideProphet, PTMProphet, ProteinProphet, iProphet, Abacus, and FDR filtering

bioinformatics data-analysis go mass-spectrometry ms-data proteomics

Last synced: 19 Apr 2025

https://github.com/innat/ML-Resource

A concise resource repository for machine learning

data-analysis data-science deep-learning kaggle machine-learning python spark

Last synced: 29 Apr 2025

https://github.com/imsanjoykb/data-science-regular-bootcamp

Regular practice in Data Science, Machine Learning, and Deep Learning, solving ML project problems and analytical issues, to regularly boost my knowledge. The goal is to help learners with learning resources in the Data Science field.

artificial-intelligence data-analysis data-science data-science-notebook data-science-projects data-visualization database-connection deep-learning etl-pipeline etl-process feature-engineering machine-learning mysql-database neural-network numpy pandas postgresql python python-automation sqlite

Last synced: 30 Oct 2025

https://github.com/DataWithBaraa/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-queries sql-query sql-server window-functions window-functions-in-sql

Last synced: 14 Oct 2025

https://github.com/basedosdados/analises

📊 Repository of simple, reproducible code for the published analyses.

data-analysis data-visualization open-source

Last synced: 05 Apr 2025

https://github.com/spectrochempy/spectrochempy

SpectroChemPy is a framework for processing, analyzing and modeling spectroscopic data for chemistry with Python

chemistry data-analysis datasets ftir ftir-data-analysis infrared nmr nmr-data nmr-spectroscopy processing python raman raman-spectra raman-spectroscopy spectroscopy uv-vis

Last synced: 16 May 2026

https://github.com/aershov24/machine-learning-ds-interview-questions

🔴 1704 Machine Learning, Data Science & Python Interview Questions (ANSWERED) To Kill Your Next ML & DS Interview. Get All Answers + PDFs on MLStack.Cafe. Post your ML Jobs 👉

algorithms-and-data-structures data-analysis data-science interview-practice interview-preparation interview-questions machine-learning machine-learning-algorithms machinelearning

Last synced: 17 Aug 2025

https://github.com/pmelchior/pygmmis

Gaussian mixture model for incomplete (missing or truncated) and noisy data

data-analysis gmm

Last synced: 17 Jan 2026

https://github.com/dcwuser/metanumerics

Meta.Numerics is a library for advanced numerical computing on the .NET platform. It offers an object-oriented API for statistical analysis, advanced functions, Fourier transforms, numerical integration and optimization, and matrix algebra.

csharp-library data-analysis dotnet math math-library matrix matrix-algebra matrix-factorization matrix-library matrix-multiplication numerical-analysis numerical-integration numerical-optimization numerics optimization scientific-computing special-functions statistical-analysis statistical-tests statistics

Last synced: 19 Jul 2025

https://github.com/sanmeet007/logger

Logger is a Flutter-based Android app that enables you to view and export call logs in CSV or JSON format and perform lightweight on-device analysis.

andriod call-data-record-analysis call-logs csv-export csv-import data-analysis flutter json-export open-source

Last synced: 06 Apr 2025

https://github.com/tidypyverse/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 13 Mar 2026

https://github.com/jepegit/cellpy

extract and tweak data from electrochemical tests of cells

battery chemistry data-analysis electrochemistry opensource physics

Last synced: 21 Feb 2026

https://github.com/scverse/mudata

Multimodal Data (.h5mu) implementation for Python

anndata data-analysis genomics mudata multi-omics multimodal-omics-analysis muon scverse

Last synced: 07 Jul 2025

https://github.com/talegari/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 03 Mar 2025

https://github.com/sciruby/daru-view

daru-view is for easy and interactive plotting in web applications and IRuby notebooks. daru-view is a plugin gem for the existing daru gem.

charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra

Last synced: 06 Oct 2025

https://github.com/SciRuby/daru-view

daru-view is for easy and interactive plotting in web applications and IRuby notebooks. daru-view is a plugin gem for the existing daru gem.

charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra

Last synced: 27 Mar 2025

https://github.com/Coorsaa/shinyMlr

shiny-mlr: Integration of the mlr package into shiny

data-analysis data-visualization machine-learning mlr r r-package shiny shiny-apps

Last synced: 30 Jul 2025

https://github.com/coorsaa/shinymlr

shiny-mlr: Integration of the mlr package into shiny

data-analysis data-visualization machine-learning mlr r r-package shiny shiny-apps

Last synced: 16 Mar 2025

https://github.com/synthesized-io/fairlens

Identify bias and measure fairness of your data

bias data data-analysis data-science fairness ml pandas python statistics

Last synced: 24 Jun 2025

https://github.com/stanfordnlp/edu-convokit

Edu-ConvoKit: An Open-Source Framework for Education Conversation Data

data data-analysis data-science education language natural-language-processing

Last synced: 15 Apr 2025

https://github.com/jadianes/data-journalism

Data journalism and easy to replicate notebooks using Python, R, and Web visualisations

data-analysis data-journalism data-visualisation data-visualization exploratory-data-analysis notebook

Last synced: 03 Oct 2025

https://github.com/mne-tools/mne-rsa

Representational Similarity Analysis on MEG and EEG data

data-analysis eeg meg mne-python neuroscience

Last synced: 07 Apr 2026

https://github.com/leerob/facebook-data-analyzer

📊Python script to analyze the contents of your Facebook data export

beautifulsoup data-analysis facebook python

Last synced: 30 Apr 2025

https://github.com/easonlai/azure_openai_langchain_sample

This repository contains various examples of how to use LangChain, a way to use natural language to interact with a large language model (LLM), with Azure OpenAI Service.

azure-openai azure-openai-api azure-openai-service csv data-analysis langchain langchain-python openai python python3

Last synced: 26 Apr 2025

https://github.com/woz-u/DS-Student-Resources

Data Science Student Companion Notebooks and Data Lake

data-analysis data-science data-visualization machine-learning nosql python r sql statistics

Last synced: 20 Jul 2025

https://github.com/inutano/chip-atlas

ChIP-Atlas: Browse and analyze all public ChIP/DNase-seq data on your browser

bioinformatics data-analysis database

Last synced: 24 Dec 2025

https://github.com/luanborelli/ipeadatapy

ipeadatapy is a data and metadata extraction package written in Python that uses the official Ipeadata database API. In essence, it is an API wrapper.

api api-wrapper brazil dados-abertos dados-historicos data data-analysis datasets econometrics economic-data economics geographic-data geography ipea ipeadata wrapper

Last synced: 07 Apr 2026

https://github.com/thoughtspile/hippotable

👩🏻‍🔬📊 Lightweight data analysis in your browser

csv dashboard data-analysis data-science javascript table visualization

Last synced: 06 Oct 2025