An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/datawithbaraa/tableau-ultimate-course

All Tableau workbooks, datasets, and resources for students from Data with Baraa’s YouTube Tableau series. The most comprehensive Tableau & SQL guide from a real-world expert! Learn everything from basics to advanced analytics, dashboards, optimizations, and real-world business use cases.

bi-tools business-intelligence data-analysis data-analytics data-science data-visualization tableau tableau-course tableau-dashboard tableau-dashboards tableau-desktop tableau-project tableau-projects tableau-public tableau-repository tableau-server tableau-story tableau-training tableau-tutorial tableau-workbooks

Last synced: 02 Sep 2025

https://github.com/alexandroskyriakakis/strongappanalytics

Analytics and Charts for your fitness log exported from the Strong App (AppStore/PlayStore).

chartjs data-analysis data-representation fitness-log nodejs pyplot python-analytics react rsuitejs strong strong-app

Last synced: 28 Oct 2025

https://github.com/david26694/cluster-experiments

Power analysis and AB test analysis library

abtesting data-analysis mde pandas power-analysis python statistics

Last synced: 13 Dec 2025

https://github.com/khanhnamle1994/world-cup-2018

An exploratory data analysis and data visualization project for World Cup 2018

data-analysis data-visualization

Last synced: 25 Apr 2025

https://github.com/rafzamb/sknifedatar

sknifedatar is a package that serves primarily as an extension to the modeltime 📦 ecosystem. It also provides some functionality for spatial data and visualization.

data data-analysis data-science data-visualization forecasting r statistics time-series

Last synced: 22 Oct 2025

https://github.com/alibaba/table-computing

Table-Computing (abbreviated TC) is a high-performance, low-latency, distributed and lightweight computing framework built on relational operations. The project claims it is 10x faster than Flink for complicated use cases and simple to use: write less, do more.

big-data data-analysis java stream-processing table-computing tc

Last synced: 14 Oct 2025

https://github.com/sharmaroshan/Insurance-Claim-Prediction

This project predicts the insurance claim made by each user in the dataset. Machine learning algorithms for regression analysis are used, and data visualization is also performed to support the analysis.

beginner classification data-analysis data-visualization eda evaluation-metrics finance machine-learning radar-chart

Last synced: 20 Jul 2025

https://github.com/sircryptic/autoexif

Want to remove sensitive data from photos, or just view it? Use autoexif to do that easily: no more remembering syntax with this user-friendly tool.

data data-analysis exif-data exif-data-extraction exif-interface exif-metadata exif-reader exif-remover exiftool image meta metadata osint osint-tool viewer

Last synced: 14 Apr 2025

https://github.com/tstreamdoth/instacart-market-basket-analysis

Uses the Instacart public dataset to report which products are often bought together. 🍋🍉🥑🥦

data-analysis data-science instacart market-basket-analysis

Last synced: 21 Mar 2025

https://github.com/root-11/tablite

Multiprocessing-enabled, out-of-memory data analysis library for tabular data.

data-analysis data-science datatype disk etl excel filereader pandas pivot-tables python table tabular-data

Last synced: 28 Oct 2025

https://github.com/stellar/stellar-etl

Stellar ETL will enable real-time analytics on the Stellar network

bitcoin blockchain data-analysis ethereum etl-framework etl-pipeline stellar stellar-lumens stellar-network

Last synced: 07 Apr 2025

https://github.com/SirCryptic/autoexif

Want to remove sensitive data from photos, or just view it? Use autoexif to do that easily: no more remembering syntax with this user-friendly tool.

data data-analysis exif-data exif-data-extraction exif-interface exif-metadata exif-reader exif-remover exiftool image meta metadata osint osint-tool viewer

Last synced: 04 Mar 2025

https://github.com/jcm-ai/tata-data-visualisation-empowering-business-with-effective-insights

This repository holds all of the assignments I was required to complete for the TATA Data Visualization Empowering Business with Effective Insights Virtual Experience Program. 📊 📈 📉

analysis-and-reporting analytics analytics-and-decision-science charts communications dashboards data-analysis data-cleanup data-interpretation data-storytelling data-visualizations graph insights power-bi tableau visual-basic visualizations

Last synced: 26 Feb 2025

https://github.com/czyt1988/data-workbench

Data processing tool developed with Qt (C++)

data-analysis graphicsview qt qt-workflow qt5 workflow

Last synced: 20 Aug 2025

https://github.com/braph-software/BRAPH-2

BRAPH 2.0 is a comprehensive software package for the analysis and visualization of brain connectivity data, offering flexible customization, rich visualization capabilities, and a platform for collaboration in neuroscience research.

biomedical-engineering brain-connectivity-analysis brain-research computational-neuroscience connectomics data-analysis data-science data-visualization deep-learning graph-theory machine-learning matlab network-analysis neuroimaging neuroscience open-source reproducible-research research-tools scientific-software toolbox

Last synced: 01 May 2025

https://github.com/jmaasch/sanzo

R Color Palettes Based on the Works of Sanzo Wada – A CRAN Package

color-palettes data-analysis data-science data-visualization r sanzo-wada visualizations

Last synced: 17 Aug 2025

https://github.com/leeper/make-example

An example of using make for a data analysis project

data-analysis make manuscript reproducible-research

Last synced: 23 Mar 2025

https://github.com/davidfokkema/tailor

Application for data analysis and curve fitting

curve-fitting data-analysis python scatterplot science spreadsheet-editor

Last synced: 17 Jan 2026

https://github.com/atapas/covid-19

COVID-19 World is yet another project to build a dashboard-like app that showcases data related to COVID-19 (coronavirus).

analytics countries covid covid-19 covid-19-india covid19 dashboard data-analysis data-visualization jamstack react reactjs recharts saas showcase virus visualization

Last synced: 12 Apr 2025

https://github.com/ihrke/pypillometry

Pupillometry and eyetracking with python

data data-analysis eye-tracking eyetracking pupillometry

Last synced: 10 Oct 2025

https://github.com/inphyt/covid19-italy-integrated-surveillance-data

COVID-19 integrated surveillance data provided by the Italian Institute of Health and processed via UnrollingAverages.jl to deconvolve the weekly moving averages.

covid-19 covid19-data data data-analysis data-structures data-visualization data-wrangling database dataset epidemiological-data epidemiology italy italy-data italy-dataset open-data surveillance surveillance-data time-series time-series-analysis

Last synced: 26 Jul 2025

https://github.com/raycad/stream-processing

Stream processing guidelines and examples using Apache Flink and Apache Spark

apache-flink apache-spark batch-processing data-analysis streaming

Last synced: 13 Jul 2025

https://github.com/prakhar-ff13/customer-analytics

Machine Learning Case study on customer segmentation and prediction of groups.

analytics case-study data-analysis data-science data-visualization dimensionality-reduction machine-learning python python3

Last synced: 24 Jul 2025

https://github.com/abeltavares/finstockdash

📈 A streamlit web app for retrieving and analyzing financial data for a stock ticker.

dashboard data-analysis finance financial-analysis financial-reporting financial-statements investments python stocks streamlit web-app

Last synced: 19 Apr 2025

https://github.com/kwokhing/yandexcatboost-python-demo

Demo of the capabilities of the Yandex CatBoost gradient boosting classifier on a fictitious IBM HR dataset obtained from Kaggle. Data exploration, cleaning, preprocessing and model tuning are performed on the dataset.

catboost data-analysis data-preprocessing data-science feature-selection gradient-boosting gradient-boosting-classifier one-hot-encode pandas pearson-correlation python python27 seaborn variance-analysis visualization yandex-catboost

Last synced: 09 Apr 2025

https://github.com/davidchall/ipaddress

Data analysis for IP addresses and networks

cyber data-analysis ip-address ipv4 ipv6 r vctrs

Last synced: 21 Oct 2025

https://github.com/vinyl-davyl/versus-dashboard-v2

Versus admin dashboard with upgraded features and functional components, featuring widgets, activity charts, data tables and pagination, user page, product page, blog, login, register and more.

dashboard dashboards data-analysis prettier react-dashboard react-dashboard-demo reactjs styled-components

Last synced: 23 Apr 2025

https://github.com/petersontylerd/mlmachine

mlmachine accelerates machine learning experimentation

data-analysis data-science data-visualization machine-learning python

Last synced: 21 Mar 2025

https://github.com/chstan/arpes

Mirror of PyARPES (gitlab/lanzara-group/python-arpes), the open source ARPES analysis framework

angle-resolved-photoemission arpes condensed-matter-physics data-analysis electrons pes photoemission physics python spectroscopy xps

Last synced: 18 Jan 2026

https://github.com/cdhunt/pselect

PowerShell DSL for aggregating data

data-analysis dsl powershell powershell-module

Last synced: 21 Mar 2025

https://github.com/vvzen/houdini-geospatial-tools

Tools for geospatial exploration in Houdini (IPython notebooks + GeoJSON Python library)

data-analysis data-visualization geojson geospatial geotiff houdini python27

Last synced: 09 Apr 2025

https://github.com/davidgasquez/filecoin-data-portal

🧮 Open, serverless, and local friendly Data Platform for the Filecoin Ecosystem

data-analysis data-platform filecoin

Last synced: 12 Apr 2025

https://github.com/datawithbaraa/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-queries sql-query sql-server window-functions window-functions-in-sql

Last synced: 15 Apr 2025

https://github.com/hackersandslackers/pandas-sqlalchemy-tutorial

🐼 💻 Load or insert data into a SQL database using Pandas DataFrames.

data-analysis data-science dataframes pandas pandas-sqlalchemy-tutorial python sql-database sqlalchemy tutorial

Last synced: 16 Apr 2025

https://github.com/alexbykoff/datafield

Sort, select, filter, evaluate and perform maths on your arrays of data

arrays collections data-analysis data-structures filtering sorting

Last synced: 21 Apr 2025

https://github.com/cdeweyx/medium-stats-analysis

Exploring data and analyzing metrics for user-specific Medium Stats

data-analysis data-mining data-visualization python

Last synced: 25 Apr 2025

https://github.com/alessandrocorradini/harvard-data-science-professional

Repository for the Data Science Professional Program from Harvard University on edX

data-analysis data-science datascience edx harvardx machine-learning machinelearning mooc moocs r r-language

Last synced: 13 Jul 2025

https://github.com/rshkarin/quanfima

Quanfima (Quantitative Analysis of Fibrous Materials)

data-analysis material-science morphological-analysis volumetric-data

Last synced: 07 May 2025

https://github.com/mainakrepositor/data-analysis

Different types of data analytics projects: EDA, PDA, DDA, TSA and much more.

data-analysis data-science deeplearning machine-learning-algorithms neural-networks time-series-analysis tsa

Last synced: 02 May 2025

https://github.com/montara-io/dbt-command-center

Never sift through endless dbt™ logs again. dbt Command Center is a free, open-source, local web application that provides a user-friendly interface to monitor and manage dbt runs.

analytics-engineering bigquery data-analysis data-catalog data-engineering data-lineage data-observability data-pipeline data-pipelines data-validation data-warehouse dataops dbt dbt-packages elt etl orchestration python redshift

Last synced: 05 May 2025

https://github.com/dotbithq/das-account-indexer

Mapping relationships between multi-chain addresses and accounts

data-analysis docker golang nervos server

Last synced: 09 Oct 2025

https://github.com/open-cogsci/datamatrix

An intuitive, Pythonic way to work with tabular data

analysis data-analysis data-structures python scientific-computing

Last synced: 21 Oct 2025

https://github.com/hugohadfield/bayesfilter

Pure Python/Numpy Bayesian Filtering and Smoothing

data-analysis ekf filtering smoothing ukf

Last synced: 25 Oct 2025

https://github.com/dawievlill/datascience-871

Data science module for economists written mostly in Julia and R

data-analysis data-science machine-learning

Last synced: 27 Feb 2025

https://github.com/tsffarias/data-analysis-queries

This repository was carefully created to provide an extensive collection of SQL queries aimed at making the work of data analysts easier across many areas of a company, including marketing, logistics, sales, finance, human resources, operations, legal, support, and more.

business-intelligence comercial data-analysis data-insights esg finance-management fraud-prevention human-resources juridico kpis logistics marketing marketing-analytics operacao pricing sql suporte

Last synced: 05 Apr 2025

https://github.com/theengineeringworld/python-data-science

Python Data Science has all the datasets and Jupyter notebook files for the YouTube course "Python Data Science Course" at http://youtube.com/theengineeringworld.

data data-analysis data-mining data-science data-visualization jupyter-notebook jupyter-notebooks machine-learning python python27

Last synced: 17 Nov 2025

https://github.com/computationalcore/introduction-to-python

A very useful collection of Jupyter Notebooks, which aims to introduce the Python programming language.

data-analysis data-science fundamental google-colab jupyter-notebook jupyter-notebooks numpy pandas python python-language python-programming python3

Last synced: 24 Apr 2025

https://github.com/activitywatch/aw-research

Tools to analyse and experiment with ActivityWatch data

activitywatch data-analysis python quantified-self

Last synced: 14 Apr 2025

https://github.com/fatbobman/objects2xlsx

A powerful, type-safe Swift library for converting Swift objects to Excel (.xlsx) files. Objects2XLSX provides a modern, declarative API for creating professional Excel spreadsheets with full styling support, multiple worksheets, and real-time progress tracking.

business data-analysis dataset excel export-excel reporting spredsheet swift xlsx xlsxwriter

Last synced: 18 Jul 2025

https://github.com/anselmoo/spectrafit

📊📈🔬 SpectraFit is a command-line and Jupyter-notebook tool for quick data-fitting based on the regular expression of distribution functions.

console-application curve-fitting data-analysis data-analysis-python data-science data-visualization fitting juypter-notebook python science science-research scientific-plotting spectral-analysis spectroscopy

Last synced: 25 Nov 2025

https://github.com/mrankitgupta/python-roadmap

I am sharing Python lessons, from scratch to intermediate level, with practice sets that I studied during my 66DaysofData journey into data analytics.

66daysofdata analytics ankitgupta data-analysis data-analysis-python data-analytics data-mining data-science data-structures data-visualization jupyter matplotlib mrankitgupta numpy pandas programming python python-library python3

Last synced: 14 Jul 2025

https://github.com/mkcor/advanced-pandas

Pandas is a powerful tool for data exploration and analysis (including timeseries).

data-analysis data-science labeled-data notebooks python3 teaching-materials

Last synced: 12 Oct 2025

https://github.com/ActivityWatch/aw-research

Tools to analyse and experiment with ActivityWatch data

activitywatch data-analysis python quantified-self

Last synced: 01 May 2025

https://github.com/isisneutronmuon/mdanse

MDANSE: Molecular Dynamics Analysis for Neutron Scattering Experiments

data-analysis molecular-dynamics neutron-scattering python qt-gui science

Last synced: 22 Aug 2025

https://github.com/staircase-dev/piso

Pandas Interval Set Operations: providing methods for set operations, analytics, lookups and joins on pandas' Interval, IntervalArray and IntervalIndex

data-analysis data-science data-structures interval interval-arithmetic interval-set pandas set set-operations set-theory

Last synced: 20 Aug 2025

https://github.com/jpenuchot/ctbench

Compiler-assisted variable size benchmarking for the study of C++ metaprogram compile times.

benchmark clang compilation data-analysis data-visualization gcc metaprogramming

Last synced: 26 Oct 2025

https://github.com/jatinagrawal0/youtube-comment-sentimental-analysis

YouTube Sentiment Analysis is a web application that analyzes the sentiment of YouTube comments, providing insights into comment sentiment using VADER sentiment analysis and interactive visualizations.

data-analysis data-visualization natural-language-processing plotly python sentiment-analysis streamli streamlit-cloud vader-lexicon youtube-api-v3 youtube-comment-scraper youtube-comments-downloader

Last synced: 14 Apr 2025

https://github.com/csbiology/fsharpgephistreamer

F# functions for streaming any kind of graph/network data to the network visualization tool Gephi

data-analysis exploratory-data-analysis fsharp gephi graph-visualization streaming-graph-data visualization

Last synced: 30 Jul 2025

https://github.com/codingforentrepreneurs/try-pandas

In this series, we're going to learn the fundamentals of the popular Python data science tool called Pandas.

data-analysis data-science deepnote jupyter nba-api nba-stats notebook pandas python python-pandas

Last synced: 18 Jan 2026

https://github.com/sondosaabed/paltaqdeer

🇵🇸 PalTaqdeer is an AI-driven student success forecaster. It was developed for the Google Launchpad hackathon using data analysis techniques, a linear regression model, and Flask for the web 🇵🇸

data-analysis hackathon hackathon-project linear-regression matplotlib outliers-detection pandas python student-grades

Last synced: 09 Apr 2025

https://github.com/probcomp/cgpm

Library of composable generative population models which serve as the modeling and inference backend of BayesDB.

bayesian-inference data-analysis machine-learning probabilistic-programming tabular-data

Last synced: 19 Oct 2025

https://github.com/ptyadana/data-analysis-for-digital-music-store

Helping a digital music store optimize its business practices using PostgreSQL

chinook chinook-database data-analysis datavisualization pgadmin4 postgresql sql tableau

Last synced: 12 Apr 2025

https://github.com/ncbi/tree-tool

Incremental building of phylogenetic distance trees

bioinformatics bioinformatics-tool data-analysis distance-measures evolution phylogenetic-trees

Last synced: 04 Jul 2025