An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/xuri/excelize-py

Excelize is a Python port of the Go Excelize library that allows you to read from and write to XLAM / XLSM / XLSX / XLTM / XLTX files.

calculation chart data-analysis data-science data-visualization ecma-376 excel excelize golang microsoft office ooxml pipy python spreadsheet visualization xlsm xlsx xlsxreader xlsxwriter

Last synced: 07 May 2025

https://github.com/rubenknex/qtplot

Data visualization application for data taken with qtlab or QCoDeS

data-analysis data-visualization physics plotting python science

Last synced: 07 May 2025

https://github.com/chaganti-reddy/evmarket-india

Electric Vehicle Market Segmentation Analysis in India

data-analysis data-science machine-learning market-segmentation pandas python

Last synced: 12 Apr 2025

https://github.com/hoangsonww/standard-deviation-calculator

📊 This repository contains a Standard Deviation Calculator implemented in C++. It provides an efficient algorithm for calculating the statistical standard deviation of a dataset, making it a valuable tool for students, researchers, and analysts seeking a reliable method for data analysis.

algorithms cplusplus cpp data data-analysis data-analytics data-science standard-deviation standard-deviation-calculator standard-deviations

Last synced: 22 Sep 2025
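
For readers unfamiliar with the calculation the entry above refers to, here is a minimal Python sketch of a single-pass sample standard deviation using Welford's method. It is an illustration only: the repository itself is written in C++, and this sketch is not taken from it and may not match its approach. The function name `sample_std` is hypothetical.

```python
import math


def sample_std(values):
    """Sample standard deviation in one pass (Welford's method).

    Illustrative sketch only; not the linked repository's code.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count < 2:
        raise ValueError("need at least two values")
    # Divide by (n - 1) for the sample (not population) variance.
    return math.sqrt(m2 / (count - 1))


if __name__ == "__main__":
    print(sample_std([2, 4, 4, 4, 5, 5, 7, 9]))
```

The single-pass form avoids storing the dataset twice and is numerically steadier than subtracting the squared mean from the mean of squares.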

https://github.com/mathewroy/ynabr

Analyze and visualize your You Need A Budget (YNAB) data. YNAB meets the R programming language.

api data-analysis data-science data-visualization r ynab ynab-api

Last synced: 30 Jul 2025

https://github.com/toutiaoio/howtos

How-To Recipes, bite-sized practical tutorials, development tips

cookbook data-analysis howto-tutorial howtos pandas recipes tips-and-tricks tutorials

Last synced: 14 Apr 2025

https://github.com/30mb1/pandas-boost

This repo contains code comparing Pandas speedup libraries such as modin, dask, swifter, pandarallel, and numba

data-analysis machine-learning multiprocessing pandas pandas-tutorial python

Last synced: 30 Apr 2025

https://github.com/openbridge/chatlytics

Chatlytics is a data query and visualization platform for chat!

charts chatbot data-analysis data-visualization slack slack-bot

Last synced: 10 Apr 2025

https://github.com/maastrichtlawtech/law3025-legal-analytics

📚 Materials for Legal Analytics (LAW3025) @ Maastricht University

course-materials data-analysis data-visualization legaltech python

Last synced: 23 Jan 2026

https://github.com/chandraprakash-bathula/apparel-recommendations

This project implements a personalized apparel recommendation engine using content-based search with the Amazon API, NLTK, and Keras libraries.

boxplot cnn-keras data-analysis data-science deep-learning linear-regression machine-learning numpy pandas scatter-plot scikit-learn svm tensorflow xgboost

Last synced: 23 Mar 2025

https://github.com/engali94/twitter-account-analyzer

Uses various Python libraries such as Pandas, tweetPy, JSON, and Matplotlib to take a sneak peek at your Twitter account using Google Colab.

data-analysis data-visualization python3 twitter-api twitter-sentiment-analysis twitter-streaming-api

Last synced: 29 Apr 2025

https://github.com/iam-mhaseeb/Satellite-Imagery-Analysis-of-Vegetation-in-Southern-Pakistan

This repository contains a study of how the vegetation cover of a region can be examined with the help of satellite data. The notebook in this repository aims to familiarise readers with the concept of satellite imagery data and how it can be analyzed to investigate real-world environmental and humanitarian challenges.

data-analysis data-visualization jupyter-notebook notebook pakistan pakistan-analysis python3 satellite-data satellite-imagery satellite-images vegetation vegetation-analysis

Last synced: 08 May 2025

https://github.com/ptyadana/tableau_2020_a-z_hands-on

Tableau projects for data analysis, data analytics, and data visualization on different data sets

data-analysis data-science data-visualization tableau tableau-dashboards tableau-desktop tableau-public tableau-workbooks

Last synced: 03 Aug 2025

https://github.com/fabriziomusacchio/python_neuro_practical

This is the course material for the advanced course in Python for Data Scientists.

data-analysis data-science jupyter jupyter-notebook jupyter-notebooks open-source python teaching teaching-materials

Last synced: 22 Jul 2025

https://github.com/jethronap/jstat

A Java Library for Statistics, Data Analysis and Visualization.

classification data-analysis gradient-descent java linear-regression statistics

Last synced: 21 Jun 2025

https://github.com/kennethleungty/english-premier-league-var-analysis

Analyzing Video Assistant Referee (VAR) decisions in the English Premier League (2019 - 2021)

data-analysis data-analytics data-science english-premier-league football soccer var

Last synced: 27 Aug 2025

https://github.com/codeperfectplus/dataanalysiswithjupyter

A Perfect Repository For Data Analysis with Jupyter Notebook. 📈📉

codeperfe data-analysis database hacktoberfest jupyter jupyter-notebook matplotlib matplotlib-pyplot numpy pandas pandas-dataframe python-3 seaborn-plots

Last synced: 13 May 2025

https://github.com/luminousmen/python_for_ds

Python for Data Analysis workshop

data-analysis data-science python tutorial

Last synced: 01 May 2025

https://github.com/qurator-spk/mods4pandas

Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis

alto alto-xml data-analysis digital-humanities library mets mods pandas qurator

Last synced: 16 Jan 2026

https://github.com/venkat-0706/social-media-analysis

Clean & analyze social media data with Python. Explore trends, sentiments, & user behavior. Includes data cleaning, visualization, & insights.

data-analysis data-visualization hashtag-trends natural-language-processing sentiment-analysis social-media-analysis text-mining user-engagement-metrics

Last synced: 13 Apr 2025

https://github.com/cosmoduende/r-google-search-history-analysis

Explore your activity on Google with R: How to Analyze and Visualize Your Personal Data Search History. Find out how and how much you have used the most popular search engine in the world, using a copy of your personal data.

data-analysis data-visualisation data-visualization data-viz google google-history google-takeout personal-data-analysis r-language r-programming r-script rvest search-data search-history takeout takeout-data

Last synced: 01 Mar 2026

https://github.com/olow304/sqdata

📊 Simple SQL client for lightweight data analysis using the React.js framework.

chartsjs data-analysis highcharts javascript nodejs react reactjs sql sqljs visualization

Last synced: 07 May 2025

https://github.com/brad-cannell/freqtables

Quickly make tables of descriptive statistics (i.e., counts, percentages, confidence intervals) for categorical variables. This package is designed to work in a tidyverse pipeline, and consideration has been given to get results from R to Microsoft Word ® with minimal pain.

categorical-data data-analysis descriptive-statistics epidemiology r

Last synced: 04 Jul 2025

https://github.com/yangfa-zhang/lunax

Lunax is a machine learning framework specifically designed for the processing and analysis of tabular data.

data-analysis data-science lunax machine-learning tabular-data

Last synced: 14 Dec 2025

https://github.com/krlennon/mastercurves

Python package for automatically superimposing data sets to create a master curve, using Gaussian process regression and maximum a posteriori estimation.

automation data-analysis gaussian-processes interpreatable-ai machine-learning maximum-a-posteriori python statistical-analysis uncertainty-quantification

Last synced: 01 Apr 2026

https://github.com/zekeriyyaa/traffic-data-analysis-with-apache-spark-based-on-mobile-robot-data

Mobile robot data were analyzed with Apache Spark to extract five statistical results: travel time, waiting time, average speed, occupancy, and density.

agv apache-spark big-data data-analysis data-visualization industrial-robot mobile-robot mongodb mssql pyqt5 pyspark python spark

Last synced: 21 Apr 2025

https://github.com/thecoderpinar/spotify_trends_2023_analysis

Exploring Spotify's latest trends, top songs, genres, and artists using Python, Pandas, NumPy, Matplotlib, CNNs for image-based analysis, and advanced algorithms for music recommendation. Dive into the world of music data and discover what's trending on Spotify! 🎵📊

cnn cnn-keras data-analysis data-science data-visualization machine-learning matplotlib music-trend numpy pandas python spotify

Last synced: 30 Apr 2025

https://github.com/maugavilla/well_hello_stats

Tutorials to learn R from scratch

data-analysis r statistics tutorial

Last synced: 24 Jul 2025

https://github.com/ct83/surway

SurWay is a survey/polling website for cab drivers where they can report their typical work hours and which company they work for. This data is then stored anonymously and used to generate charts and insights.

charts data-analysis data-visualization data-visualization-project docker docker-compose express node nodeexpress nodejs react react-chartjs-2 react-charts react-project react-projects reactjs reactjs-demo

Last synced: 16 Sep 2025

https://github.com/sintel-dev/mtv

A Full-stack Platform for Multiple Time-series Visualization (MTV) and Anomaly Analysis.

anomaly-detection data-analysis visualization

Last synced: 10 Apr 2025

https://github.com/viper373/jd-comments

Scrape product review data from JD.com (Jingdong)

crawler-python data-analysis python spider

Last synced: 15 Apr 2025

https://github.com/jongan69/soltrendio

A Solana wallet address analyzer with AI and APIs for building useful tools on top of it

ai blockchain data-analysis solana trading-systems

Last synced: 07 Sep 2025

https://github.com/randlab/hytool

Hytool is a MATLAB toolbox for the interpretation of hydraulic tests in wells. The toolbox contains analytical solutions used to describe groundwater flow around wells, and functions for importing, displaying, and fitting a model to the data.

data-analysis hydrogeology matlab pumping-test well-testing

Last synced: 04 Apr 2026

https://github.com/sahahn/bpt

The Brain Predictability toolbox (BPt) is a Python-based machine learning library designed primarily for tabular and neuroimaging-specific data, but it can easily be generalized further.

bp bpt brain-predictability-toolbox data-analysis data-science machine-learning ml neuroimaging-data neuroscience neuroscience-methods pandas python sklearn

Last synced: 13 Apr 2025

https://github.com/OpenEnergyPlatform/data-preprocessing

Repository for data formatting, import of data, data and metadata review, and data curation.

data-analysis database oep open-energy-family

Last synced: 27 Jan 2026

https://github.com/lironmiz/data.intro

Introductory data science course from the Cyber Education Center at Campus IL, covering both the theoretical and practical aspects of big data analysis in Python

big-data course data-analysis data-science data-visualization education jupyter-notebook learning-by-doing matplotlib numpy pandas-library python3 statistics

Last synced: 05 Jul 2025

https://github.com/supercowpowers/scp-labs

SCP Labs (Open Source Team for SuperCowPowers)

data-analysis data-science pandas python scikit-learn security

Last synced: 06 May 2025

https://github.com/darenr/report_creator

Tool to assemble HTML reports using python components with charts and diagrams.

data-analysis data-presentation exploratory-data-analysis html-report html-report-generation pandas python

Last synced: 24 Apr 2026

https://github.com/joshuaulrich/stl-rug

Content presented at the Saint Louis R User Group

data-analysis data-science r

Last synced: 26 Aug 2025

https://github.com/emreyalvac/sulfur

Shaping, Processing, and Transforming Data with the Power of Sulfur with Rust

data data-analysis data-flow database

Last synced: 19 Aug 2025

https://github.com/pranabdas/arpespythontools

Explore, analyze, visualize Angle Resolved Photoemission Spectroscopy (ARPES) data.

arpes condensed-matter-physics data-analysis materials-science matplotlib photoelectron-spectroscopy python surface-science

Last synced: 17 Oct 2025

https://github.com/jmgirard/circumplex

R Package for the Analysis and Visualization of Circumplex Data

circular circumplex data-analysis ggplot2 interpersonal psychology r r-package rcpparmadillo rstats tidyverse

Last synced: 15 Aug 2025

https://github.com/ammarlodhi255/student_performance_indicator_end-to-end_implementation

An end-to-end machine learning project: a student performance indicator. The goal of this project is to understand the influence of parents' background, test preparation, and various other variables on students' performance.

aws cd-pipeline data-analysis data-science data-science-projects eda end-to-end-machine-learning machine-learning machine-learning-projects regression regression-analysis

Last synced: 27 Sep 2025

https://github.com/bfolkens/pandas-datareader-gdax

GDAX data for Pandas in the style of DataReader

bitcoin cryptocurrency data data-analysis dataset finance gdax pandas quant

Last synced: 28 Jan 2026

https://github.com/aifred-health/vulcan-old

Deprecated: A high level deep learning framework for quickly prototyping networks with added tools in data visualisation, model interpretability and performance metrics

data-analysis deep-learning deep-neural-networks lasagne machine-learning mental-health theano

Last synced: 01 Aug 2025

https://github.com/aiguofer/sql_connectors

A simple wrapper for SQL connections using SQLAlchemy and Pandas read_sql to standardize SQL workflow with multiple data sources.

data-analysis data-analytics data-exploration data-science pandas relational-databases sql sqlalchemy standardized-api

Last synced: 13 Oct 2025

https://github.com/amirhosseinhonardoust/workout-efficiency-benchmark

Streamlit + Python pipeline that benchmarks gym workout efficiency (kcal/min) using present sessions only. Generates sortable workout-type benchmarks, distribution plots, fairness-aware gap analysis with uncertainty/low-sample flags, and a data-quality report to prevent misleading comparisons.

analytics benchmarking bias-audit dashboard data-analysis data-quality data-science eda fairness fitness health-data pandas plotly python reporting reproducible-research statistics streamlit visualization workout

Last synced: 10 Jun 2026

https://github.com/tusharnankani/data-analysis-with-python

A complete introduction to data analysis covering the basics of Python, Numpy, Pandas, Data Visualization, and Exploratory Data Analysis.

data data-analysis data-visualization hacktoberfest jovian jovian-ml jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 07 May 2025

https://github.com/codex-crusader/le_market_intelligence_platform

An explainable market analysis system that combines technical indicators and news sentiment to generate clear buy/sell signals with reasoning through an interactive dashboard

cryptocurrency data-analysis data-pipeline financial-analysis market-analysis python sentiment-analysis stock-market streamlit trading-signals

Last synced: 06 Apr 2026

https://github.com/abeltavares/marketpipe

🛠 Containerized and configurable Airflow ETL pipeline for collecting and storing stock and cryptocurrency market data.

airflow aws ci-cd cryptocurrency data-analysis data-collection data-storage docker iac oop pgadmin pipeline postgresql python sql stocks unit-testing

Last synced: 22 Apr 2025

https://github.com/cschreib/vif

Easy, robust, and fast numerics in C++.

astronomy astrophysics c-plus-plus data-analysis library

Last synced: 06 May 2025

https://github.com/the-pulse-engine/pulse-engine_market_intelligence_platform

An explainable market analysis system that combines technical indicators and news sentiment to generate clear buy/sell signals with reasoning through an interactive dashboard

cryptocurrency data-analysis data-pipeline financial-analysis market-analysis python sentiment-analysis stock-market streamlit trading-signals

Last synced: 26 Apr 2026

https://github.com/iamgmujtaba/scholar_search

This project provides a tool for extracting and analyzing the quantity and distribution of scholarly articles related to a particular topic or field over a desired time span, using Google Scholar search results and built-in data visualization functionality.

academia academic academic-papers data-analysis data-visualization google google-scholar scholarly-articles

Last synced: 30 Apr 2025

https://github.com/xilinjia/xj-strategist

A powerful machine learning and AI system for constructing sustainable strategies for financial trading.

data-analysis data-science data-visualization julia machine-learning quantitative-analysis quantitative-finance quantitative-trading rest-api trading-algorithms trading-strategies

Last synced: 12 May 2025

https://github.com/PFund-Software-Ltd/pfeed

Data pipeline for algo-trading that helps traders get real-time and historical data and store it in a local data lake for quantitative research.

algo-trading backtesting data-analysis data-pipeline data-storage historical-data pandas

Last synced: 27 Feb 2025

https://github.com/selva221724/edasql

edaSQL is a Python library that bridges SQL and Exploratory Data Analysis: you connect to a database and run queries, and the query results can be passed to the EDA tool to give the user greater insight.

correlation data-analysis data-science data-visualization dataprofiling eda missing-values outlier-detection pandas python sql

Last synced: 10 Jun 2025

https://github.com/c0deta1ker/arpesgui

A MATLAB GUI used for the analysis of soft x-ray angle-resolved photoemission spectroscopy (SX-ARPES) experiments that give direct access to the electronic band-structure of a material. Designed to be directly compatible with the data format of SX-ARPES experiments at the ADRESS beamline, at the Swiss Light Source (SLS) in the Paul Scherrer Institute (PSI), but can be generalised to other data formats if required.

analysis analysis-package arpes data-analysis lcn matlab matlab-gui psi sls ucl

Last synced: 30 Oct 2025

https://github.com/cosmoduende/r-spotify-history-analysis

Explore your activity on Spotify with R and "spotifyr": How to analyze and visualize your streaming history and music tastes. Find out how and how much you consume from Spotify, using a copy of your personal data and the "spotifyr" package

analisis-de-data analytics data-analysis data-analysis-r data-analytics data-visualization r-language r-programming sentiment-analysis spotify-analysis spotify-api spotify-connect spotify-data spotify-playlist spotify-streaming-history spotify-web-api spotifyr streaming-history visualizacion-de-datos visualizaciones

Last synced: 11 Apr 2025

https://github.com/cool-japan/pandrs

DataFrame library for data analysis implemented in Rust. It has features and design inspired by Python's pandas library, combining fast data processing with type safety.

data-analysis data-science datafrane pandas rust rust-lang

Last synced: 04 Apr 2026

https://github.com/clima/climaanalysis.jl

An analysis library for ClimaDiagnostics (and, more generally, NetCDF files)

climate data-analysis julia visualization

Last synced: 18 Jul 2025

https://github.com/dennis-van-gils/python-fluidprop

Easy access to thermodynamic fluid properties as a function of temperature and pressure. With a minimal command-line interface.

command-line-tool coolprop data-analysis fluids thermodynamic-properties

Last synced: 05 Sep 2025

https://github.com/jl33-ai/dotplotlib

A basic extension library for creating tree dot plots, strip plots or dot charts w/ matplotlib or seaborn in Python

data-analysis data-science data-visualization dot-chart dotplot dotplots matplotlib-pyplot matplotlib-python python seaborn seaborn-plots strip-plots

Last synced: 07 Sep 2025

https://github.com/cosmoduende/r-google-location-history

Explore your activity on Google with R: How to analyze and visualize your Location History. Find out how and how much you have allowed Google to track you, using a copy of your personal data.

data-analysis data-analytics data-visualisation data-visualization data-viz geolocation-data google-data-analytics google-location-api google-location-history google-location-service google-takeout location-history maps-data r- r-analytics r-language r-programming r-stats

Last synced: 11 Apr 2025

https://github.com/shukkkur/exploring-the-history-of-lego

Using a variety of data manipulation techniques to explore different aspects of Lego's history.

bricks data-analysis data-manipulation data-visualization history jupyter-notebook lego python rebrickable-database

Last synced: 08 Sep 2025

https://github.com/vuthanhhai2302/apply-machine-learning-on-data-analytics

My project applying machine learning to data analytics, using pandas, NumPy, and scikit-learn to analyze data

data-analysis numpy pandas scikit-learn

Last synced: 28 Apr 2025

https://github.com/ichait/oloviz

🍟 Scrape your Foodpanda and Deliveroo orders

data-analysis data-visualization food-ordering scraping-websites tableau

Last synced: 24 Jan 2026

https://github.com/vtsaplin/datatalk-cli

Query CSV, Excel & Parquet files with natural language. Fast, local, DuckDB-powered.

ai-tools cli csv data-analysis data-engineering developer-tools duckdb excel gpt llm local-first natural-language-query openai parquet privacy python text-to-sql

Last synced: 04 Mar 2026

https://github.com/mrdandelion6/learn-to-code

This repository is a collection of my notes and code snippets as I journey through learning different programming languages and coding concepts.

c data-analysis data-science javascript learn-to-code machine-learning matlab python r react shell-script

Last synced: 11 Apr 2025

https://github.com/alluxio/k8s-operator

An operator for managing an Alluxio system on a Kubernetes cluster

alluxio data-analysis data-orchestration kubernetes kubernetes-operator machine-learning

Last synced: 15 Aug 2025

https://github.com/niklaspfister/adaxt

adaXT: tree-based machine learning in Python

data-analysis decision-trees machine-learning statistics tree-ensembles

Last synced: 31 Oct 2025

https://github.com/pnnl-comp-mass-spec/proteomics-data-analysis-tutorial

A comprehensive tutorial for proteomics data analysis in R that utilizes packages developed by researchers at PNNL and from Bioconductor.

data-analysis proteomics

Last synced: 18 Jan 2026

https://github.com/chivke/serveliza

Serveliza is an application to extract data of the Chilean Electoral Service (SERVEL) from different open sources.

chile chilean-rut data-analysis electoral-rolls osint pandas political-science python3 servel

Last synced: 14 Jan 2026

https://github.com/zjunlp/datamind

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study

agent artificial-intelligence data-analysis data-science language-model natural-language-processing

Last synced: 04 Oct 2025

https://github.com/tushar2704/my_homebrewed_notebooks_archived-account-kaggle.com-tusharaggarwal27

My_homebrewed_NOTEBOOKS is a GitHub repository that houses a collection of personal notebooks derived from various sources, including Kaggle and Jupyter Notebooks. This repository serves as a curated collection of notebooks created and customized by the repository owner, providing a valuable resource for learning and exploring different topics.

data-analysis data-science kaggle kaggle-competition kaggle-competition-notebooks kaggle-competiton kaggle-scripts machine-learning python

Last synced: 07 May 2025