An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/joshniemela/financetools.jl

Various tools to process financial time series

data-analysis finance julia julia-language quantitative-finance time-series

Last synced: 22 Apr 2025

https://github.com/lcsrodriguez/hft-intradayvol-estimation

Intraday volatility estimation using High-Frequency Financial Data

data-analysis hft hft-data microstructure-noise python q realized-volatility

Last synced: 17 May 2026

https://github.com/selva221724/edasql

edaSQL is a Python library that bridges SQL and Exploratory Data Analysis: connect to a database, run your queries, and pass the results to the EDA tool for deeper insight.

correlation data-analysis data-science data-visualization dataprofiling eda missing-values outlier-detection pandas python sql

Last synced: 10 Jun 2025

https://github.com/29dch/ai_ml_dataanalysis_datavisualization_classic-examples

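Several classic examples in AI, ML, DA, DV, and related areas, including traffic jam simulation (NagelSchreckenberg), the Monte Carlo Queuing Problem, face recognition (RecognitionFace), and inferring images with a genetic algorithm (IconGenetic)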

ai data-analysis data-visualization ml

Last synced: 29 Apr 2025

https://github.com/cosmoduende/r-spotify-history-analysis

Explore your activity on Spotify with R and "spotifyr": How to analyze and visualize your streaming history and music tastes. Find out how and how much you consume from Spotify, using a copy of your personal data and the "spotifyr" package

analisis-de-data analytics data-analysis data-analysis-r data-analytics data-visualization r-language r-programming sentiment-analysis spotify-analysis spotify-api spotify-connect spotify-data spotify-playlist spotify-streaming-history spotify-web-api spotifyr streaming-history visualizacion-de-datos visualizaciones

Last synced: 11 Apr 2025

https://github.com/c0deta1ker/arpescape

ARPEScape is a MATLAB-based app that contains a set of tools and functions for analysing the electronic structure of materials using photoelectron spectroscopy (PES) techniques, such as X-ray photoelectron spectroscopy (XPS) and angle-resolved photoelectron spectroscopy (ARPES).

analysis analysis-package angle-resolved-photoemission angle-resolved-spectroscopy arpes condensed-matter-physics data-analysis lcn matlab photoelectron-spectra photoelectron-spectroscopy photoemission psi sls ucl xps

Last synced: 07 May 2025

https://github.com/chivke/serveliza

Serveliza is an application to extract data of the Chilean Electoral Service (SERVEL) from different open sources.

chile chilean-rut data-analysis electoral-rolls osint pandas political-science python3 servel

Last synced: 14 Jan 2026

https://github.com/chalmerlowe/data-analysis-courseware

College course focused on intro to Python and Data Analysis, including pandas

analysis data-analysis python

Last synced: 10 Apr 2025

https://github.com/zjunlp/datamind

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study

agent artificial-intelligence data-analysis data-science language-model natural-language-processing

Last synced: 04 Oct 2025

https://github.com/niklaspfister/adaxt

adaXT: tree-based machine learning in Python

data-analysis decision-trees machine-learning statistics tree-ensembles

Last synced: 31 Oct 2025

https://github.com/cosmoduende/r-google-location-history

Explore your activity on Google with R: How to analyze and visualize your Location History. Find out how and how much you have allowed Google to track you, using a copy of your personal data.

data-analysis data-analytics data-visualisation data-visualization data-viz geolocation-data google-data-analytics google-location-api google-location-history google-location-service google-takeout location-history maps-data r- r-analytics r-language r-programming r-stats

Last synced: 11 Apr 2025

https://github.com/contextlab/data-wrangler

Wrangle messy numerical, image, and text data into consistent well-organized formats

data data-analysis data-science data-wrangling hugging-face image-data machine-learning nlp numpy pandas python scikit-learn

Last synced: 10 Apr 2025

https://github.com/jaumebonet/rosettasilenttoolbox

Python Toolbox For Rosetta Silent Files Processing

data-analysis data-visualization protein-design protein-sequence science

Last synced: 06 Sep 2025

https://github.com/vandermeerlab/nept

Neuroelectrophysiology tools used for the analysis of neural recording data and related behaviors

data-analysis neuroscience python

Last synced: 11 Mar 2026

https://github.com/tawounfouet/mlops-specialiazation-duke

MLOps | Machine Learning Operations Specialization from Duke University: acquiring critical MLOps skills, including the use of Python and Rust, utilizing GitHub Copilot to enhance productivity, and leveraging platforms like Amazon SageMaker, Azure ML, and MLflow.

aws azure big-data cloud-computing data-analysis data-management devops mlops python rust

Last synced: 23 Apr 2025

https://github.com/shanisoni/tata-data-visualization-empowering-business-with-effective-insights

Within the confines of this repository, you will find an extensive compilation of the assignments that were integral to my participation in the TATA Data Visualization Empowering Business with Effective Insights Virtual Experience Program. 📊 📈 📉

analysis-and-reporting analytics-and-decision-science chart communications dashboards data-analysis data-cleanup data-interpretation data-storytelling data-visualizations graph insights power-bi tableau visual-basic visualizations

Last synced: 05 Mar 2026

https://github.com/masasron/shraga

Private, powerful data analysis—right in your browser.

data-analysis frontend llm

Last synced: 14 Jul 2025

https://github.com/jonocarroll/runkeepr

Extract, plot, and analyse Runkeeper(TM) data.

data-analysis data-mining gis gpx rstats runkeeper

Last synced: 29 Oct 2025

https://github.com/amazingcoderpro/flight-delay-prediction

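Entry for the 2018 global programmer competition. Builds on the provided data, adds self-collected influencing factors such as aircraft and weather, and uses the SVM algorithm to predict flight delay rates.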

airline data-analysis python svm svm-classifier svm-model

Last synced: 07 May 2025

https://github.com/reycn/data-analytics-in-julia

Notebooks for data analysis in social science using Julia, replicating frequent analytical steps in Python & R.

data data-analysis data-science data-visualization julia

Last synced: 07 May 2025

https://github.com/antoniosbarotsis/coronabot

My attempt at data mining and analysis on Covid-19

chartjs coronavirus covid-19 data-analysis discord-bot graph hacktoberfest

Last synced: 28 Oct 2025

https://github.com/meetpateltech/convelyze

Visualize your ChatGPT usage with interactive charts and insights.

ai chatgpt data-analysis openai

Last synced: 31 Mar 2025

https://github.com/ilyasmoutawwakil/fcc-data-analysis-with-python

My 5 Data Analysis projects that I've built as part of my freeCodeCamp assignment.

certification course-project data-analysis fcc fcc-assignment fcc-certification fcc-data python

Last synced: 27 Jul 2025

https://github.com/lonelyhentxi/minellius

An experimental GitHub data analysis solution. Group project of comp-3002, 2018 fall, hitsz.

angular assignment data-analysis electron github nestjs ng-zorro-antd typescript

Last synced: 11 Apr 2025

https://github.com/abhay557/fakedata

The fakedata package generates realistic synthetic user profiles for machine learning, deep learning, data analysis, and data science workflows.

abhay557 anime data data-analysis data-science deep-learning fake fake-data generator joke machine-learning mock mock-data

Last synced: 30 May 2026

https://github.com/vidhi1290/machine-learning-pipeline

Explore a collection of Jupyter notebooks that guide you through various stages of the machine learning pipeline. From data analysis and feature engineering to model training and deployment, these notebooks provide practical insights for both beginners and experienced data enthusiasts. Let's dive into the world of data-driven decision-making! 📊🚀

data-analysis feature-engineering feature-selection jupyter jupyter-notebook machine-learning machine-learning-algorithms machine-learning-pipeline model-training new-dataset opensource python

Last synced: 10 Apr 2025

https://github.com/ztjhz/sc1015-project

Predict the success of an anime using data science and machine learning (regression + classification)

anime data-analysis data-science machine-learning

Last synced: 22 Mar 2025

https://github.com/pmgraham/datagrunt

Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.

csv csv-parser data-analysis data-engineering data-science data-wrangling dataframe duckdb open-source polars python python3

Last synced: 26 Aug 2025

https://github.com/otherwa/csvdash

Analyze and visualize CSV data with ease using this Streamlit-powered data analytics tool

analytics data-analysis huggingface llama-index pandas python statistics streamlit

Last synced: 15 Apr 2025

https://github.com/AurelienAubry/Spotlight

Spotlight is a Spotify dashboard that lets users visualize their listening habits.

backend bootstrap chartjs data data-analysis data-science data-visualization flask frontend javascript js pandas python python3 react react-bootstrap spotify

Last synced: 15 Apr 2025

https://github.com/port-zero/lens

Data structure traversal in the command line

data-analysis lens traversal

Last synced: 18 Jan 2026

https://github.com/wahyudesu/predicting-hotel-booking-cancellations

This project will help hotel managers optimize their booking policies, reduce cancellations, and improve revenue.

data data-analysis data-science python

Last synced: 07 Jul 2025

https://github.com/cosmoduende/r-ggsoccer

StrangeR things: Visualizing Soccer Data with R… on a Soccer Pitch? How to analyze, visualize and report soccer data and strategies on a soccer pitch with the "ggsoccer" package

data-analysis data-analysis-in-r data-analytics ggsoccer package packages r-language r-package r-programming r-programming-projects r-studio soccer soccer-analytics soccer-data soccer-game soccer-matches soccer-simulation

Last synced: 11 Apr 2025

https://github.com/nicejade/play-with-python

Learning Python through hands-on practice, to get better at using Python-related skills and tools.

ai automation data-analysis excel python python3 sendmail spider visualization

Last synced: 08 Apr 2025

https://github.com/imartinezl/formula-one-viewer

Formula 1 standings and point scoring systems 🏎️🏁

data-analysis data-visualization formula1 interactive-visualizations js

Last synced: 28 Feb 2026

https://github.com/mchenryspagg/sql-portfolio-project

A portfolio project involving a detailed analysis of 37,997 high school/college student records, using visualizations to showcase key insights into the factors affecting students' academic performance in high schools and colleges in the US.

data-analysis data-visualization database dataset datasets joins spreadsheet sql sqlite visualization

Last synced: 05 Sep 2025

https://github.com/joanacmbarros/analysis-results-data-model

Data analysis produces data in the form of results. Although integrating and putting findings into context is a cornerstone of scientific work, data of this type is often neglected. The analysis results data model is a proposed solution to this issue that combines analysis standards with a common data model.

analysis-results automation clinical-data clinical-data-management data-analysis data-management data-model data-standards data-stewardship

Last synced: 16 Aug 2025

https://github.com/tbep-tech/tbeptools

R package for Tampa Bay Estuary Program functions

data-analysis package tampa-bay tbep water-quality

Last synced: 19 Feb 2026

https://github.com/saisurajmatta/bike-sales-excel-dashboard-project

Bike Sales Excel Dashboard Project: Analyzed and visualized sales data, cleaned datasets, and created interactive dashboards in Excel.

data-analysis data-analytics data-cleaning data-visualization excel excel-dashboard excel-data-analytics pivot-tables

Last synced: 11 Feb 2026

https://github.com/tillbiskup/aspecd

Python framework for handling spectroscopic data focussing on reproducibility

data-analysis good-practices reproducible-research reproducible-science spectroscopy

Last synced: 06 Sep 2025

https://github.com/shubham18024/census_analysis

This repository contains code and resources for a summer research project focused on statistical analysis of census data. The project aims to analyze demographic trends, population distributions, and other relevant metrics derived from census datasets.

census-data csv-files data-analysis data-visualization jupyter-notebook mitosheet python report statistics

Last synced: 15 Apr 2025

https://github.com/ahmed-maher77/wind-turbine-power-prediction-app-using-machine-learning

"Wind Power Predictor" is a machine learning project that forecasts turbine output using real-time data from Turkish wind farms. Its web app interface offers convenient access to predictions, enabling informed decisions for maximizing energy production and advancing renewable energy usage.

ai catboost data-analysis data-science flask html-css-javascript javascript machine-learning matplotlib numpy pandas predictive-modeling pwa python sklearn web web-development wind wind-turbine wind-turbine-operational-optimization

Last synced: 10 Apr 2025

https://github.com/fluhus/biostuff

Computational biology packages for Go.

bioinformatics biology computational-biology data-analysis go golang

Last synced: 12 Jan 2026

https://github.com/kaos599/vit-gpa-calculator

VIT-GPA-Calculator is a Python application that extracts course grade data from a PDF file to calculate your current CGPA and allows you to simulate grade improvements. It leverages Camelot for PDF table extraction and Pandas for data manipulation, making it a handy tool for students and educators alike.

calculator camelot cgpa cgpa-calculator cgpa-simulator data-analysis education pandas pdf python- vellore-institute-of-technology vit-bhopal vit-university

Last synced: 18 Nov 2025

https://github.com/tushar2704/pyverse-exploring-python-frameworks

This repository is the ultimate guide to exploring and mastering Python libraries & frameworks: a collection of code and guides by me, Tushar!

artificial-intelligence data-analysis data-engineering data-science data-visualization machine-learning python streamlit-tushar2704 tushar2704 web-application

Last synced: 30 Oct 2025

https://github.com/alrza2003/binancetrader

AI-powered trading bot using Python, pandas, scikit-learn, NumPy, and TensorFlow. Interacts with Binance API for cryptocurrency trading based on ZigZag indicator and AI predictions.

binance ccxt cryptocurrency data-analysis numpy pandas python statistics tensorflow trading trading-algorithms trading-strategies tradingbot

Last synced: 12 Apr 2026

https://github.com/akbaritabar/dask-duckdb-dbeaver

Parallelised and out-of-memory data analysis using Dask in Python and DuckDB and DBeaver in SQL, using the example of publicly accessible ORCID 2019 XML files

data-analysis data-science pandas parallel-computing python

Last synced: 08 Aug 2025

https://github.com/leosouliotis/odsc_python_da

My tutorial on "Introduction to Python for Data Analysis" at ODSCA 2022

data-analysis data-visualization pandas python

Last synced: 07 Sep 2025

https://github.com/ritvik19/vizard

Intuitive, Interactive, Easy and Quick Visualizations for Data Science Projects

data-analysis data-science data-visualization

Last synced: 10 Apr 2025

https://github.com/ayushanand18/climate-change-vs-agri

Visualizing trends and patterns in how climate change affects agriculture (Python, Jupyter Notebook)

climate-change data-analysis data-visualization matplotlib python visualization

Last synced: 28 Apr 2026

https://github.com/louis-heraut/makaho

🥤 MAKAHO (for MAnn-Kendall Analysis of Hydrological Observations) is an interactive cartographic visualization system for calculating trends in data from hydrometric stations whose flows are little influenced by human actions

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae leaflet r shiny shiny-apps stationarity-test statistics

Last synced: 08 Jul 2025

https://github.com/srohit0/ml-misc

Miscellaneous Machine Learning and Data Analysis Projects

colaboratory data-analysis data-science data-visualization google-colab machine-learning-algorithms

Last synced: 15 Apr 2025

https://github.com/stats-tests/statstests

Statstests: a Python package that provides a complement of process and statistical tests for statsmodels statistical models.

count-models data-analysis hyphotesis-tests python regression-models statistics statsmodels

Last synced: 17 Mar 2026

https://github.com/adrienc21/vulpes

Vulpes: Test many classification, regression models and clustering algorithms to see which one is most suitable for your dataset

automl data-analysis data-science machine-learning models package python scikit-learn statistics

Last synced: 25 Oct 2025

https://github.com/thecoderpinar/big-tech-financial-insights

🚀 A comprehensive project analyzing Big Tech stock prices using time series analysis, volatility modeling, and macroeconomic indicators. Featuring interactive dashboards and automated reporting! 📈💼

data-analysis data-science finance machine-learning macroeconomics stock-analysis time-series-analysis volatility-modeling

Last synced: 03 Apr 2025

https://github.com/super-lou/makaho

🥤 MAKAHO (for MAnn-Kendall Analysis of Hydrological Observations) is an interactive cartographic visualization system for calculating trends in data from hydrometric stations whose flows are little influenced by human actions

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae leaflet r shiny shiny-apps stationarity-test statistics

Last synced: 13 Apr 2025

https://github.com/kaustubhgupta/blogathon-analysis

Analytics Vidhya Blogathon Data Analysis: Python Data Extraction with PowerBI dashboard

data-analysis data-mining data-scraping excel pandas powerbi powerbi-report project python tqdm

Last synced: 12 Apr 2025

https://github.com/psyteachr/quant-fun-v2

Fundamentals of Quantitative Analysis

data-analysis data-visualization datawrangling r statistics

Last synced: 11 Oct 2025

https://github.com/thecoderpinar/house-price-prediction-project

🏠 This project focuses on predicting house prices using advanced regression techniques. It involves comprehensive data preprocessing, feature engineering, and model selection. The aim is to develop an accurate predictive model for real estate prices.

data-analysis data-preprocessing data-visualization deep-learning jupyter-notebook machine-learning neural-networks python regression regression-models

Last synced: 30 Apr 2025

https://github.com/thecoderpinar/samsung_stock_analysis_forecasting_and_volatility_analysis

A comprehensive analysis and forecasting project for Samsung stock data, utilizing historical data to build predictive models and analyze volatility.

data-analysis deep-learning financial-analysis forecasting machine-learning python stock-analysis volatility-forecasting

Last synced: 31 Jul 2025

https://github.com/mattcieslak/meap

Software for cardiovascular data analysis and visualization

data-analysis electrophysiology python statistics

Last synced: 18 Sep 2025

https://github.com/syamkakarla98/datascience_head_start

This repository focuses on building a path into data science.

data-analysis data-science data-visualization machine-learning machinelearning-python python3

Last synced: 03 May 2025

https://github.com/jhrcook/spotify-data-analysis

The analysis of my Spotify streaming data.

data-analysis data-analysis-r r rlang rstudio spotify-data

Last synced: 30 Oct 2025

https://github.com/daisybio/prone

R Package for preprocessing, normalizing, and analyzing proteomics data

data-analysis evaluation normalization proteomics

Last synced: 25 May 2026

https://github.com/rasrea/dataviz

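This is a visualization project based on Java and Spring Boot, intended to help (novice) developers learn the development model in which front end and back end are separated. In the project, users can upload data files, preprocess the data, and display it through the front-end chart module. The project covers Spring Boot API design, data exchange between front end and back end, and data processing and visualization, and it uses Echarts charts to display the data.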

data-analysis data-visualization spring-boot

Last synced: 05 Aug 2025

https://github.com/antoineaugusti/google-search-gdpr

Transform your Google Search GDPR export into CSV

data-analysis gdpr google-search python self-data

Last synced: 28 Oct 2025

https://github.com/dr-montasir/mnjs

MATH NODE JS (MNJS): A tiny math library for node.js & JavaScript on browser

data-analysis data-science javascript js jsdelivr library math nextjs npm react svelte sveltekit ts typescript yarn

Last synced: 26 Apr 2025

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as a part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Apr 2025

https://github.com/graphcore-research/kg-topology-toolbox

A Python toolbox to compute topological metrics and statistics for Knowledge Graphs

data-analysis graph-topology knowledge-graph metrics-gathering

Last synced: 17 Jan 2026

https://github.com/misaghmomenib/social-engagement-analysis

In this repository, I used a CSV file of social network interactions to produce a general analysis that you are free to use

data-analysis data-visualization git open-source pandas-python python

Last synced: 21 Nov 2025

https://github.com/shreeparab1890/fifa-wc-2022-qatar-data-analysis-eda

This is a Jupyter Notebook (IPython Notebook) with Data Analysis (EDA) on FIFA WC Qatar 2022 match data.

data-analysis data-analysis-python data-science data-visualization eda fifa matplotlib-pyplot numpy pandas plotly-express python-3

Last synced: 08 Mar 2026

https://github.com/dataforgeopenaihub/steam-sales-analysis

This repository features an ETL pipeline for retrieving, processing, validating, and ingesting game metadata and sales data from SteamSpy and Steam APIs. Data is stored in a MySQL database on Aiven Cloud and visualized using Tableau dashboards for insightful analysis of gaming trends and sales performance.

cloud-computing data-analysis data-engineering data-pipepline data-warehousing games mysql-database python steam-api tableau typer-cli

Last synced: 06 Feb 2026