An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/gher-uliege/divand.jl

DIVAnd performs an n-dimensional variational analysis of arbitrarily located observations

data-analysis earth-observation eosc-hub interpolation julia ocean-sciences oceanography smoothing-splines spatial-data-analysis toolbox

Last synced: 05 Apr 2025

https://github.com/ndleah/8-week-sql-challenge

#8WeekSQLChallenge by Danny Ma.

data-analysis data-science sql

Last synced: 25 Oct 2025

https://github.com/visivo-io/visivo

✨ Build dashboards with end-to-end version control. 🔋 CLI w/ batteries included, no infra required. Develop on your laptop for instant results, deploy changes safely (with automated checks), and keep every report trustworthy for stakeholders, analysts and agents 🤖

analytics bi bi-analytics bi-as-code business-intelligence data data-analysis data-visualization duckdb plotlyjs pydantic python reactjs sql

Last synced: 16 Oct 2025

https://github.com/pixelspark/warp

Convert and analyze large data sets at light speed, on Mac and iOS.

big-data data-analysis mysql postgresql rethinkdb sqlite

Last synced: 15 May 2025

https://github.com/kianweelee/Edator

A Python package that performs exploratory data analysis for users. It also generates three types of output files (a cleaned CSV, plots, and a text report).

data-analysis data-science exploratory-data-analysis

Last synced: 08 May 2025

https://github.com/capitalone/dataCompareR

dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.

compare-data data data-analysis data-science r

Last synced: 30 Jul 2025

https://github.com/impetus/jumbune

Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,

aiops apm cluster-monitoring data-analysis data-quality developer-tools devops-tools hadoop hadoop-cluster hadoop-monitor hadoop-monitoring monitoring-tool optimization-framework yarn yarn-hadoop-cluster

Last synced: 17 Dec 2025

https://github.com/yusufcinarci/data-science-projects

This repo contains beginner to upper-level projects in the field of data science. I will add the projects I have done in this field to this repo every day, in the hope that they will be useful to those who, like me, are interested in data science and are just starting out...

data-analysis data-science data-science-projects jupyter jupyter-notebook python

Last synced: 14 Mar 2026

https://github.com/paezha/spatial-analysis-r

Open Educational Resource for teaching spatial data analysis and statistics with R

data-analysis open-educational-resource r r-package r-spatial rstats spatial-data-analysis spatial-statistics statistics

Last synced: 09 Apr 2025

https://github.com/arpanghosh8453/dji-logbook

A high-performance universal dashboard application for organizing and analyzing DJI drone flight logs privately in one place. Built with Tauri v2, DuckDB, and React.

dashboard data-analysis data-visualization database desktop dji docker drone duckdb flight linux logs macos react self-hosted statistics tauri uav windows

Last synced: 13 Feb 2026

https://github.com/njanakiev/folderstats

Python module that collects detailed statistics from a folder structure

data-analysis filesystem pandas python statistics

Last synced: 20 Aug 2025

https://github.com/analyseether/ether_sql

A Python library to push Ethereum blockchain data into an SQL database.

analytics blockchain csv data-analysis ethereum etl export postgresql python sql

Last synced: 14 Jan 2026

https://github.com/VUKOZ-OEL/3d-forest

Visualization, processing, and analysis of Lidar point clouds, mainly focused on forest environments. New version of 3D Forest. Process files with terabytes of data. Edit new point attributes. Add new features simply through plugins.

3d classification cpp cross-platform data-analysis desktop-application editor forest gui interactive-visualization las laser-scanning lidar opengl plugins point-cloud qt scientific-computing segmentation tree

Last synced: 07 May 2025

https://github.com/alanderex/pydata-pandas-workshop

Material for my PyData Jupyter & Pandas Workshops. I'm also available for personal in-house training on request.

data-analysis jupyter-notebook pandas visualisation workshop

Last synced: 14 Apr 2025

https://github.com/AllenInstitute/openscope_databook

OpenScope databook: a collaborative, versioned, data-centric collection of foundational analyses for reproducible systems neuroscience 🐁🧠🔬🖥️📈

dandi-archive data-analysis data-visualization nwb python reproducible-research visualization

Last synced: 01 May 2025

https://github.com/randyzwitch/streamlit-embedcode

Streamlit component for embedding code snippets such as GitHub gists, CodePen snippets, Gitlab snippets, etc.

data-analysis data-science data-visualization python streamlit streamlit-component

Last synced: 12 May 2025

https://github.com/apachecn/pandas-cookbook-code-notes

📖 Annotated source code for the Pandas Cookbook

code data-analysis notes pandas python

Last synced: 02 May 2025

https://github.com/rickiepark/hg-da

Code repository for the book 혼자 공부하는 데이터 분석 with 파이썬 (Self-Study Data Analysis with Python)

data-analysis data-science data-visualization machine-learning matplotlib numpy pandas scikit-learn scipy

Last synced: 06 Apr 2025

https://github.com/renumics/sliceguard

A library for detecting problematic data segments in structured and unstructured data with a few lines of code.

data-analysis data-cleaning data-curation data-exploration data-science data-visualization deep-learning eda exploratory-data-analysis machine-learning python visualization

Last synced: 16 Mar 2025

https://github.com/tatevkaren/tatevkaren-data-science-portfolio

Data Science Portfolio of Tatev Karen Aslanyan including Case Studies and Research Projects that I have completed that solve business problems or introduce new products. Case Study papers, codes, and additional resources are all included.

blog case-study computer-science data-analysis data-science deep-learning econometrics machine-learning papers portfolio portfolio-website statistics

Last synced: 10 Apr 2025

https://github.com/cvjena/libmaxdiv

Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.

anomalydetection anomalydiscovery data-analysis data-mining datamining machine-learning machine-learning-library machinelearning time-series timeseries

Last synced: 28 Jan 2026

https://github.com/b0o/apple-autofill-domains

Apple's allowed autofill domains

apple data-analysis github-actions web-scraping

Last synced: 25 Mar 2025

https://github.com/dask-contrib/dask-awkward

Native Dask collection for awkward arrays, and the library to use it.

columnar-format dask data-analysis data-science data-structure jagged-array python ragged-array

Last synced: 12 Apr 2025

https://github.com/datamole-ai/edvart

An open-source Python library for Data Scientists & Data Analysts designed to simplify the exploratory data analysis process. Using Edvart, you can explore data sets and generate reports with minimal coding.

analysis data-analysis data-science data-visualization data-viz eda exploration exploratory-data-analysis exploratory-data-analysis-eda plots python

Last synced: 11 Feb 2026

https://github.com/airoldilab/sgd

An R package for large scale estimation with stochastic gradient descent

big-data data-analysis gradient-descent r statistics

Last synced: 12 Jul 2025

https://github.com/404notf0und/FXY

Security-Scenes-Feature-Engineering-Toolkit, Continuous Integration. A tool for featurizing security data.

data-analysis data-mining feature-engineering machine-learning security security-scenes

Last synced: 11 Jul 2025

https://github.com/staircase-dev/staircase

A powerful data analysis package based on mathematical step functions. Strongly aligned with pandas.

analysis data-analysis data-structures library numpy pandas python step-function stepfunction

Last synced: 04 Apr 2025

https://github.com/glotaran/pyglotaran

A Python library for Global and Target Analysis of time-resolved spectroscopy data

data-analysis glotaran modelling pyglotaran python-library target-analysis

Last synced: 26 Feb 2026

https://github.com/404notf0und/fxy

Security-Scenes-Feature-Engineering-Toolkit, Continuous Integration. A tool for featurizing security data.

data-analysis data-mining feature-engineering machine-learning security security-scenes

Last synced: 16 Oct 2025

https://github.com/lkuffo/data-viz

More than 50 examples of data visualization and analysis in Matplotlib, Pandas, Seaborn, Plotly, Bokeh, and Networkx

data-analysis data-science dataviz geoviz jupyter jupyter-notebook matplotlib networkx pandas plotly python seaborn

Last synced: 30 Jul 2025

https://github.com/jmwoloso/pychattr

Python Channel Attribution (pychattr) - A Python implementation of the excellent R ChannelAttribution library

channel-attribution data-analysis data-science machine-learning python python-channel-attribution rpy2 wrapper

Last synced: 06 May 2025

https://github.com/DistrictDataLabs/cultivar

Multidimensional data explorer and visualization tool.

data-analysis data-exploration data-management visualization

Last synced: 15 Apr 2025

https://github.com/districtdatalabs/cultivar

Multidimensional data explorer and visualization tool.

data-analysis data-exploration data-management visualization

Last synced: 01 Feb 2026

https://github.com/antononcube/mathematicavsr

Example projects, code, and documents for comparing Mathematica with R.

comparison data-analysis data-science machine-learning mathematica r time-series

Last synced: 17 Oct 2025

https://github.com/angelfp/visualpic

Data Visualization for Particle-in-Cell Codes.

data-analysis data-visualization openpmd particle-in-cell python vtk

Last synced: 16 Apr 2025

https://github.com/chiphuyen/metrotwitter

What Twitter reveals about the differences between cities and the monoculture of the Bay Area

data-analysis data-visualization emojis nlp nlp-datasets python twitter twitter-dataset

Last synced: 14 Apr 2025

https://github.com/wrinth/data_analyst_projects

Projects created from Udacity's Data Analyst Nanodegree

data-analysis python udacity-nanodegree

Last synced: 05 Apr 2025

https://github.com/spratiher9/sparkora

Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟

apache apache-spark data data-analysis data-analysis-python data-analytics easy-to-use eda exploratory-data-analysis open-source opensource pyspark python python3 toolkit

Last synced: 10 Jul 2025

https://github.com/okfn-brasil/serenata-notebooks

Notebooks from Operação Serenata de Amor | ** This repository does not receive frequent updates **

data-analysis ipynb jupyter-notebook python

Last synced: 28 Oct 2025

https://github.com/alleninstitute/openscope_databook

OpenScope databook: a collaborative, versioned, data-centric collection of foundational analyses for reproducible systems neuroscience 🐁🧠🔬🖥️📈

dandi-archive data-analysis data-visualization nwb python reproducible-research visualization

Last synced: 11 Apr 2025

https://github.com/ulikoehler/uliengineering

A Python library for calculations performed in electronics engineering

data-analysis data-science electronics engineering python

Last synced: 05 Apr 2025

https://github.com/toolsforexperiments/plottr

A flexible plotting and data analysis tool.

data-analysis live-plotting physics plotting pyqt qcodes science

Last synced: 20 Feb 2026

https://github.com/dcoles/prometheus-pandas

Pandas integration for Prometheus.

data-analysis jupyter-notebook pandas prometheus python

Last synced: 08 Oct 2025

https://github.com/contextlab/computational-neuroscience

Short undergraduate course taught at University of Pennsylvania on computational and theoretical neuroscience. Provides an introduction to programming in MATLAB, single-neuron models, ion channel models, basic neural networks, and neural decoding.

computational-neuroscience course-materials data-analysis matlab modeling neuron problem-set simulation

Last synced: 07 Oct 2025

https://github.com/dbis-ilm/stark

A framework for Spatio-Temporal Data Analytics on Spark

apache-spark data-analysis rdd scala spatial spatial-data-analysis spatio-temporal-data

Last synced: 13 Oct 2025

https://github.com/kavvkon/enlopy

enlopy is an open source Python library with methods to generate, process, analyze, and plot energy-related timeseries.

data-analysis energy python timeseries visualization

Last synced: 21 Feb 2026

https://github.com/zblz/naima

Derivation of non-thermal particle distributions through MCMC spectral fitting

astronomy astrophysics data-analysis gamma-ray-astronomy python

Last synced: 13 Apr 2025

https://github.com/cosmoduende/r-youtube-personal-history-analysis

Explore your activity on YouTube with R: how to analyze and visualize your personal data history. Find out how you consume YouTube using a copy of your personal data from Google Takeout.

data-analysis data-analytics data-visualization data-viz google-takeout r-data r-language r-programming youtube youtube-accounts youtube-analytics youtube-api youtube-data youtube-data-analysis youtube-data-api youtube-data-api-v3 youtube-data-scraping youtube-dataset youtube-scrape youtube-scraper

Last synced: 26 Jul 2025

https://github.com/palewire/first-python-notebook

A step-by-step guide to analyzing data with Python and the Jupyter notebook.

altair data-analysis data-journalism education journalism jupyter jupyter-notebook jupyterlab news pandas python sphinx tutorial

Last synced: 11 Apr 2025

https://github.com/theengineeringworld/statistics-using-python

These files are part of the YouTube course "Statistics Using Python" offered by The Engineering World: http://youtube.com/theengineeringworld

cleaning data-analysis data-mining data-science data-visualization database jupyter-notebooks python python3 statistics

Last synced: 06 Sep 2025

https://github.com/asnelt/mixedvines

Python package for canonical vine copula trees with mixed continuous and discrete marginals

c-vines copula copula-models copulae copulas data-analysis dependency-analysis dependency-modeling modeling python regular-vines statistics

Last synced: 17 Mar 2026

https://github.com/david26694/cluster-experiments

Power analysis and AB test analysis library

abtesting data-analysis mde pandas power-analysis python statistics

Last synced: 18 Feb 2026

https://github.com/ContextLab/computational-neuroscience

Short undergraduate course taught at University of Pennsylvania on computational and theoretical neuroscience. Provides an introduction to programming in MATLAB, single-neuron models, ion channel models, basic neural networks, and neural decoding.

computational-neuroscience course-materials data-analysis matlab modeling neuron problem-set simulation

Last synced: 19 Jul 2025

https://github.com/dfinke/psduckdb

PSDuckDB is a PowerShell module that provides seamless integration with DuckDB, enabling efficient execution of analytical SQL queries directly from the PowerShell environment.

data-analysis data-science duckdb powershell sql

Last synced: 16 Mar 2025

https://github.com/cdnjs/cf-stats

📈 Monthly usage statistics from Cloudflare for the cdnjs.cloudflare.com domain - The #1 free and open source CDN built to make life easier for developers.

cdnjs cloudflare data data-analysis statistics stats usage usage-data usage-reports

Last synced: 06 Jul 2025