An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/lironmiz/data.intro

Introductory data science course from the Cyber Education Center at Campus IL, covering both the theoretical and practical aspects of big data analysis in Python

big-data course data-analysis data-science data-visualization education jupyter-notebook learning-by-doing matplotlib numpy pandas-library python3 statistics

Last synced: 05 Jul 2025

https://github.com/cschreib/vif

Easy, robust, and fast numerics in C++.

astronomy astrophysics c-plus-plus data-analysis library

Last synced: 06 May 2025

https://github.com/abeltavares/marketpipe

🛠 Containerized and configurable Airflow ETL pipeline for collecting and storing stock and cryptocurrency market data.

airflow aws ci-cd cryptocurrency data-analysis data-collection data-storage docker iac oop pgadmin pipeline postgresql python sql stocks unit-testing

Last synced: 22 Apr 2025

https://github.com/iamgmujtaba/scholar_search

This project provides a tool for extracting and analyzing the quantity and distribution of scholarly articles related to a particular topic or field over a desired time span, using Google Scholar search results and built-in data visualization functionality.

academia academic academic-papers data-analysis data-visualization google google-scholar scholarly-articles

Last synced: 30 Apr 2025

https://github.com/sintel-dev/mtv

A Full-stack Platform for Multiple Time-series Visualization (MTV) and Anomaly Analysis.

anomaly-detection data-analysis visualization

Last synced: 10 Apr 2025

https://github.com/supercowpowers/scp-labs

SCP Labs (Open Source Team for SuperCowPowers)

data-analysis data-science pandas python scikit-learn security

Last synced: 06 May 2025

https://github.com/viper373/jd-comments

Scrapes product review data from JD.com

crawler-python data-analysis python spider

Last synced: 15 Apr 2025

https://github.com/aiguofer/sql_connectors

A simple wrapper for SQL connections using SQLAlchemy and Pandas read_sql to standardize SQL workflow with multiple data sources.

data-analysis data-analytics data-exploration data-science pandas relational-databases sql sqlalchemy standardized-api

Last synced: 13 Oct 2025

https://github.com/sahahn/bpt

The Brain Predictability toolbox (BPt) is a Python-based machine learning library designed primarily for tabular and neuroimaging data, but it can easily be generalized further.

bp bpt brain-predictability-toolbox data-analysis data-science machine-learning ml neuroimaging-data neuroscience neuroscience-methods pandas python sklearn

Last synced: 13 Apr 2025

https://github.com/jl33-ai/dotplotlib

A basic extension library for creating tree dot plots, strip plots, or dot charts with matplotlib or seaborn in Python

data-analysis data-science data-visualization dot-chart dotplot dotplots matplotlib-pyplot matplotlib-python python seaborn seaborn-plots strip-plots

Last synced: 07 Sep 2025

https://github.com/ichait/oloviz

🍟 Scrape your Foodpanda and Deliveroo orders

data-analysis data-visualization food-ordering scraping-websites tableau

Last synced: 24 Jan 2026

https://github.com/vuthanhhai2302/apply-machine-learning-on-data-analytics

My project of applied machine learning on data analytics, using pandas, numpy and scikit-learn to analyze data

data-analysis numpy pandas scikit-learn

Last synced: 28 Apr 2025

https://github.com/tushar2704/my_homebrewed_notebooks_archived-account-kaggle.com-tusharaggarwal27

My_homebrewed_NOTEBOOKS is a GitHub repository that houses a collection of personal notebooks derived from various sources, including Kaggle and Jupyter Notebooks. This repository serves as a curated collection of notebooks created and customized by the repository owner, providing a valuable resource for learning and exploring different topics.

data-analysis data-science kaggle kaggle-competition kaggle-competition-notebooks kaggle-competiton kaggle-scripts machine-learning python

Last synced: 07 May 2025

https://github.com/niklaspfister/adaxt

adaXT: tree-based machine learning in Python

data-analysis decision-trees machine-learning statistics tree-ensembles

Last synced: 31 Oct 2025

https://github.com/pnnl-comp-mass-spec/proteomics-data-analysis-tutorial

A comprehensive tutorial for proteomics data analysis in R that utilizes packages developed by researchers at PNNL and from Bioconductor.

data-analysis proteomics

Last synced: 18 Jan 2026

https://github.com/chalmerlowe/data-analysis-courseware

College course focused on intro to Python and Data Analysis, including pandas

analysis data-analysis python

Last synced: 10 Apr 2025

https://github.com/alluxio/k8s-operator

An operator for managing an Alluxio system on a Kubernetes cluster

alluxio data-analysis data-orchestration kubernetes kubernetes-operator machine-learning

Last synced: 15 Aug 2025

https://github.com/mrdandelion6/learn-to-code

This repository is a collection of my notes and code snippets as I journey through learning different programming languages and coding concepts.

c data-analysis data-science javascript learn-to-code machine-learning matlab python r react shell-script

Last synced: 11 Apr 2025

https://github.com/c0deta1ker/arpescape

ARPEScape is a MATLAB-based app that contains a set of tools and functions for analysing the electronic structure of materials using photoelectron spectroscopy (PES) techniques, such as X-ray photoelectron spectroscopy (XPS) and angle-resolved photoelectron spectroscopy (ARPES).

analysis analysis-package angle-resolved-photoemission angle-resolved-spectroscopy arpes condensed-matter-physics data-analysis lcn matlab photoelectron-spectra photoelectron-spectroscopy photoemission psi sls ucl xps

Last synced: 07 May 2025

https://github.com/zjunlp/datamind

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study

agent artificial-intelligence data-analysis data-science language-model natural-language-processing

Last synced: 04 Oct 2025

https://github.com/chivke/serveliza

Serveliza is an application to extract data of the Chilean Electoral Service (SERVEL) from different open sources.

chile chilean-rut data-analysis electoral-rolls osint pandas political-science python3 servel

Last synced: 14 Jan 2026

https://github.com/joshniemela/financetools.jl

Various tools to process financial time series

data-analysis finance julia julia-language quantitative-finance time-series

Last synced: 22 Apr 2025

https://github.com/29dch/ai_ml_dataanalysis_datavisualization_classic-examples

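Several classic examples covering AI, ML, DA, DV, and related topics, including traffic jam simulation (NagelSchreckenberg), the Monte Carlo queuing problem, face recognition (RecognitionFace), and genetic-algorithm image inference (IconGenetic)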

ai data-analysis data-visualization ml

Last synced: 29 Apr 2025

https://github.com/pottekkat/go-corona

Live viz and updates of COVID 19. Let us fight this together!

corona covid-19 covid-19-india data-analysis data-visualization health infographics spread tableau

Last synced: 30 Jul 2025

https://github.com/lotfiferaga/eda-python

Exploratory Data Analysis with Python

data-analysis pandas python visualization

Last synced: 27 Apr 2025

https://github.com/bovem/stock-tracker

An interactive data visualization application developed in Python

data data-analysis data-visualization iex-api plotly-dash python stock-data stock-tracker visualization

Last synced: 19 Sep 2025

https://github.com/data-forge/data-forge-fs

This library contains the file system extensions to Data-Forge that allow it to directly read and write CSV and JSON files in Node.js

csv data data-analysis data-cleaning data-cleansing data-forge data-management data-manipulation data-munging data-visualization data-wrangling javascript json linq nodejs pandas visualization

Last synced: 04 Sep 2025

https://github.com/jaumebonet/rosettasilenttoolbox

Python Toolbox For Rosetta Silent Files Processing

data-analysis data-visualization protein-design protein-sequence science

Last synced: 06 Sep 2025

https://github.com/contextlab/data-wrangler

Wrangle messy numerical, image, and text data into consistent well-organized formats

data data-analysis data-science data-wrangling hugging-face image-data machine-learning nlp numpy pandas python scikit-learn

Last synced: 10 Apr 2025

https://github.com/xilinjia/xj-strategist

A powerful machine learning and AI system for constructing sustainable strategies for financial trading.

data-analysis data-science data-visualization julia machine-learning quantitative-analysis quantitative-finance quantitative-trading rest-api trading-algorithms trading-strategies

Last synced: 12 May 2025

https://github.com/PFund-Software-Ltd/pfeed

Data pipeline for algo-trading, helping traders in getting real-time and historical data, and storing them in a local data lake for quantitative research.

algo-trading backtesting data-analysis data-pipeline data-storage historical-data pandas

Last synced: 27 Feb 2025

https://github.com/selva221724/edasql

edaSQL is a Python library that bridges SQL and Exploratory Data Analysis: you connect to a database and submit queries, and the query results can be passed to the EDA tool to give the user greater insight.

correlation data-analysis data-science data-visualization dataprofiling eda missing-values outlier-detection pandas python sql

Last synced: 10 Jun 2025

https://github.com/shukkkur/exploring-the-history-of-lego

Using a variety of data manipulation techniques to explore different aspects of Lego's history.

bricks data-analysis data-manipulation data-visualization history jupyter-notebook lego python rebrickable-database

Last synced: 08 Sep 2025

https://github.com/c0deta1ker/arpesgui

A MATLAB GUI used for the analysis of soft x-ray angle-resolved photoemission spectroscopy (SX-ARPES) experiments that give direct access to the electronic band-structure of a material. Designed to be directly compatible with the data format of SX-ARPES experiments at the ADRESS beamline, at the Swiss Light Source (SLS) in the Paul Scherrer Institute (PSI), but can be generalised to other data formats if required.

analysis analysis-package arpes data-analysis lcn matlab matlab-gui psi sls ucl

Last synced: 30 Oct 2025

https://github.com/cosmoduende/r-google-location-history

Explore your activity on Google with R: How to analyze and visualize your Location History. Find out how and how much you have allowed Google to track you, using a copy of your personal data.

data-analysis data-analytics data-visualisation data-visualization data-viz geolocation-data google-data-analytics google-location-api google-location-history google-location-service google-takeout location-history maps-data r- r-analytics r-language r-programming r-stats

Last synced: 11 Apr 2025

https://github.com/noduslabs/infranodus

A Node.Js / Neo4J tool that translates words and relations into network graphs and shows you how it all connects.

data-analysis datavisualization dataviz graph-visualization javascript neo4j network-analysis nodejs text-mining visualization

Last synced: 21 Jan 2026

https://github.com/clima/climaanalysis.jl

An analysis library for ClimaDiagnostics (and, more generally, NetCDF files)

climate data-analysis julia visualization

Last synced: 18 Jul 2025

https://github.com/cosmoduende/r-spotify-history-analysis

Explore your activity on Spotify with R and "spotifyr": How to analyze and visualize your streaming history and music tastes. Find out how and how much you consume from Spotify, using a copy of your personal data and the "spotifyr" package

analisis-de-data analytics data-analysis data-analysis-r data-analytics data-visualization r-language r-programming sentiment-analysis spotify-analysis spotify-api spotify-connect spotify-data spotify-playlist spotify-streaming-history spotify-web-api spotifyr streaming-history visualizacion-de-datos visualizaciones

Last synced: 11 Apr 2025

https://github.com/dennis-van-gils/python-fluidprop

Easy access to thermodynamic fluid properties as a function of temperature and pressure. With a minimal command-line interface.

command-line-tool coolprop data-analysis fluids thermodynamic-properties

Last synced: 05 Sep 2025

https://github.com/jonocarroll/runkeepr

Extract, plot, and analyse Runkeeper(TM) data.

data-analysis data-mining gis gpx rstats runkeeper

Last synced: 29 Oct 2025

https://github.com/amazingcoderpro/flight-delay-prediction

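Entry for the 2018 Global Programmer Competition. Building on the provided data and adding self-collected influencing factors such as aircraft and weather, it uses the SVM algorithm to predict flight delay rates.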

airline data-analysis python svm svm-classifier svm-model

Last synced: 07 May 2025

https://github.com/joanacmbarros/analysis-results-data-model

Data analysis produces data in the form of results. Although integrating and putting findings into context is a cornerstone of scientific work, data of this type is often neglected. The analysis results data model is a proposed solution to this issue that combines analysis standards with a common data model.

analysis-results automation clinical-data clinical-data-management data-analysis data-management data-model data-standards data-stewardship

Last synced: 16 Aug 2025

https://github.com/reycn/data-analytics-in-julia

Notebooks for data analysis in social science using Julia, replicating frequent analytical steps in Python & R.

data data-analysis data-science data-visualization julia

Last synced: 07 May 2025

https://github.com/nicejade/play-with-python

Learning Python and hands-on practice, to get better at working with Python-related skills and tools.

ai automation data-analysis excel python python3 sendmail spider visualization

Last synced: 08 Apr 2025

https://github.com/saisurajmatta/bike-sales-excel-dashboard-project

Bike Sales Excel Dashboard Project: Analyzed and visualized sales data, cleaned datasets, and created interactive dashboards in Excel.

data-analysis data-analytics data-cleaning data-visualization excel excel-dashboard excel-data-analytics pivot-tables

Last synced: 11 Aug 2025

https://github.com/pmgraham/datagrunt

Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.

csv csv-parser data-analysis data-engineering data-science data-wrangling dataframe duckdb open-source polars python python3

Last synced: 26 Aug 2025

https://github.com/tillbiskup/aspecd

Python framework for handling spectroscopic data focussing on reproducibility

data-analysis good-practices reproducible-research reproducible-science spectroscopy

Last synced: 06 Sep 2025

https://github.com/AurelienAubry/Spotlight

Spotlight is a Spotify dashboard that lets users visualize their listening habits.

backend bootstrap chartjs data data-analysis data-science data-visualization flask frontend javascript js pandas python python3 react react-bootstrap spotify

Last synced: 15 Apr 2025

https://github.com/shanisoni/tata-data-visualization-empowering-business-with-effective-insights

Within the confines of this repository, you will find an extensive compilation of the assignments that were integral to my participation in the TATA Data Visualization Empowering Business with Effective Insights Virtual Experience Program. 📊 📈 📉

analysis-and-reporting analytics-and-decision-science chart communications dashboards data-analysis data-cleanup data-interpretation data-storytelling data-visualizations graph insights power-bi tableau visual-basic visualizations

Last synced: 23 Feb 2025

https://github.com/antoniosbarotsis/coronabot

My attempt at data mining and analysis on Covid-19

chartjs coronavirus covid-19 data-analysis discord-bot graph hacktoberfest

Last synced: 28 Oct 2025

https://github.com/mchenryspagg/sql-portfolio-project

A portfolio project involving a detailed analysis of 37,997 high school/college student records to showcase key insights through effective visualizations, aimed at evaluating the factors affecting students' academic performance in high schools and colleges in the US.

data-analysis data-visualization database dataset datasets joins spreadsheet sql sqlite visualization

Last synced: 05 Sep 2025

https://github.com/port-zero/lens

Data structure traversal in the command line

data-analysis lens traversal

Last synced: 18 Jan 2026

https://github.com/ilyasmoutawwakil/fcc-data-analysis-with-python

My 5 Data Analysis projects that I've built as part of my freeCodeCamp assignment.

certification course-project data-analysis fcc fcc-assignment fcc-certification fcc-data python

Last synced: 27 Jul 2025

https://github.com/otherwa/csvdash

Analyze and visualize CSV data with ease using this Streamlit-powered data analytics tool

analytics data-analysis huggingface llama-index pandas python statistics streamlit

Last synced: 15 Apr 2025

https://github.com/lonelyhentxi/minellius

An experimental GitHub data analysis solution. Group project of comp-3002, 2018 fall, hitsz.

angular assignment data-analysis electron github nestjs ng-zorro-antd typescript

Last synced: 11 Apr 2025

https://github.com/cosmoduende/r-ggsoccer

StrangeR things: Visualizing Soccer Data with R… on a Soccer Pitch? How to analyze, visualize and report soccer data and strategies on a soccer pitch with the "ggsoccer" package

data-analysis data-analysis-in-r data-analytics ggsoccer package packages r-language r-package r-programming r-programming-projects r-studio soccer soccer-analytics soccer-data soccer-game soccer-matches soccer-simulation

Last synced: 11 Apr 2025

https://github.com/wahyudesu/predicting-hotel-booking-cancellations

This project will help hotel managers optimize their booking policies, reduce cancellations, and improve revenue.

data data-analysis data-science python

Last synced: 07 Jul 2025

https://github.com/meetpateltech/convelyze

Visualize your ChatGPT usage with interactive charts and insights.

ai chatgpt data-analysis openai

Last synced: 31 Mar 2025

https://github.com/ztjhz/sc1015-project

Predict the success of an anime using data science and machine learning (regression + classification)

anime data-analysis data-science machine-learning

Last synced: 22 Mar 2025

https://github.com/masasron/shraga

Private, powerful data analysis—right in your browser.

data-analysis frontend llm

Last synced: 14 Jul 2025

https://github.com/vidhi1290/machine-learning-pipeline

Explore a collection of Jupyter notebooks that guide you through various stages of the machine learning pipeline. From data analysis and feature engineering to model training and deployment, these notebooks provide practical insights for both beginners and experienced data enthusiasts. Let's dive into the world of data-driven decision-making! 📊🚀

data-analysis feature-engineering feature-selection jupyter jupyter-notebook machine-learning machine-learning-algorithms machine-learning-pipeline model-training new-dataset opensource python

Last synced: 10 Apr 2025

https://github.com/tawounfouet/mlops-specialiazation-duke

MLOps | Machine Learning Operations Specialization from Duke University: acquiring critical MLOps skills, including the use of Python and Rust, utilizing GitHub Copilot to enhance productivity, and leveraging platforms like Amazon SageMaker, Azure ML, and MLflow.

aws azure big-data cloud-computing data-analysis data-management devops mlops python rust

Last synced: 23 Apr 2025

https://github.com/alrza2003/binancetrader

AI-powered trading bot using Python, pandas, scikit-learn, NumPy, and TensorFlow. Interacts with Binance API for cryptocurrency trading based on ZigZag indicator and AI predictions.

binance ccxt cryptocurrency data-analysis numpy pandas python statistics tensorflow trading trading-algorithms trading-strategies tradingbot

Last synced: 22 Aug 2025

https://github.com/jhrcook/spotify-data-analysis

The analysis of my Spotify streaming data.

data-analysis data-analysis-r r rlang rstudio spotify-data

Last synced: 30 Oct 2025

https://github.com/super-lou/makaho

🥤 MAKAHO (for MAnn-Kendall Analysis of Hydrological Observations) is an interactive cartographic visualization system that calculates trends in data from hydrometric stations whose flows are little influenced by human activity

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae leaflet r shiny shiny-apps stationarity-test statistics

Last synced: 13 Apr 2025

https://github.com/darenr/report_creator

Tool to assemble HTML reports using python components with charts and diagrams.

data-analysis data-presentation exploratory-data-analysis html-report html-report-generation pandas python

Last synced: 17 Jan 2026

https://github.com/tushar2704/pyverse-exploring-python-frameworks

This repository is the ultimate guide to exploring and mastering Python libraries and frameworks: a collection of code and guides by me, Tushar!

artificial-intelligence data-analysis data-engineering data-science data-visualization machine-learning python streamlit-tushar2704 tushar2704 web-application

Last synced: 30 Oct 2025

https://github.com/akbaritabar/dask-duckdb-dbeaver

Parallelised, out-of-memory data analysis using Dask in Python and DuckDB and DBeaver in SQL, using the publicly accessible ORCID 2019 XML files as an example

data-analysis data-science pandas parallel-computing python

Last synced: 08 Aug 2025

https://github.com/graphcore-research/kg-topology-toolbox

A Python toolbox to compute topological metrics and statistics for Knowledge Graphs

data-analysis graph-topology knowledge-graph metrics-gathering

Last synced: 17 Jan 2026

https://github.com/kaustubhgupta/blogathon-analysis

Analytics Vidhya Blogathon Data Analysis: Python Data Extraction with PowerBI dashboard

data-analysis data-mining data-scraping excel pandas powerbi powerbi-report project python tqdm

Last synced: 12 Apr 2025

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as a part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Apr 2025

https://github.com/rasrea/dataviz

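This is a visualization project based on Java and Spring Boot that aims to help (novice) developers master the front-end/back-end separated development model. In the project, users can upload data files, preprocess the data, and display it through the front-end chart module. The project covers Spring Boot API design, front-end/back-end data interaction, and data processing and visualization, and it uses Echarts charts to display the data.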

data-analysis data-visualization spring-boot

Last synced: 05 Aug 2025

https://github.com/kaos599/vit-gpa-calculator

VIT-GPA-Calculator is a Python application that extracts course grade data from a PDF file to calculate your current CGPA and allows you to simulate grade improvements. It leverages Camelot for PDF table extraction and Pandas for data manipulation, making it a handy tool for students and educators alike.

calculator camelot cgpa cgpa-calculator cgpa-simulator data-analysis education pandas pdf python- vellore-institute-of-technology vit-bhopal vit-university

Last synced: 18 Nov 2025

https://github.com/thecoderpinar/samsung_stock_analysis_forecasting_and_volatility_analysis

A comprehensive analysis and forecasting project for Samsung stock data, utilizing historical data to build predictive models and analyze volatility.

data-analysis deep-learning financial-analysis forecasting machine-learning python stock-analysis volatility-forecasting

Last synced: 31 Jul 2025

https://github.com/thecoderpinar/house-price-prediction-project

🏠 This project focuses on predicting house prices using advanced regression techniques. It involves comprehensive data preprocessing, feature engineering, and model selection. The aim is to develop an accurate predictive model for real estate prices.

data-analysis data-preprocessing data-visualization deep-learning jupyter-notebook machine-learning neural-networks python regression regression-models

Last synced: 30 Apr 2025

https://github.com/fluhus/biostuff

Computational biology packages for Go.

bioinformatics biology computational-biology data-analysis go golang

Last synced: 12 Jan 2026