An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/josmarcristello/goprotimeocr

Python-based OCR tool using EasyOCR and OpenCV for automated text extraction from images. Customizable image preprocessing steps and options for GPU acceleration make this a versatile and efficient solution for various OCR tasks.

computer-vision data-analysis easyocr gopro gopro-camera ocr opencv pytesseract python

Last synced: 09 May 2026

https://github.com/surajv311/data_analysis-food_recipes_ds

Data preprocessing, cleaning, analysis & plotting 📊 of the Food Recipes Dataset (from Kaggle). 🐍 Libraries used: Pandas, Matplotlib, Seaborn, Plotly. 📈

data-analysis kaggle-dataset matplotlib numpy pandas plotly seaborn

Last synced: 03 May 2026

https://github.com/avrtt/sentiment-analysis

Sentiment analysis of reviews with a simple web interface in Flask

data-analysis data-mining data-science flask machine-learning nlp parsing text-mining

Last synced: 01 May 2026

https://github.com/lussierc/foodborneillnessdataanalysis

A data analysis of foodborne illnesses using R Scripting methods.

data-analysis database foodborne-disease-outbreaks foodborne-illnesses rstudio

Last synced: 03 May 2026

https://github.com/felipeheide/technicalcryptobot

Python script to analyze and trade Bitcoin (BTC) based on technical indicators like RSI, MACD, MMS, and support/resistance levels. Fetches prices from CryptoCompare and CoinGecko APIs. Allows investing a specified amount and displays potential profit/loss. Runs continuously, updating every minute.

api bitcoin coingecko cryptocompare cryptocurrency data-analysis python trading-bot

Last synced: 10 May 2026

https://github.com/reiniiriarios/squirrel-table

Desktop application to run MySQL queries over SSH and generate CSV and XLSX files. Useful for QA where queries need to be run repeatedly and files handed off.

csv csv-files data-analysis desktop-application electron excel mariadb mysql nodejs qa quality-assurance quality-control quality-control-assurance sql xlsx

Last synced: 15 Apr 2026

https://github.com/walidalsafadi/house-prices

Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.

cross-validation data-analysis data-science data-visualization decision-trees eda house-price-prediction house-prices jupyter linear-regression machine-learning machine-learning-algorithms mlp-regressor plot python random-forest-regression regression svr xgboost-regression

Last synced: 11 May 2026

https://github.com/chanmeng666/water-quality-testing-data-analysis

Statistical analysis and predictive modeling of water quality parameters using Python, pandas, scikit-learn, and statsmodels

data-analysis data-science data-visualization environmental-monitoring jupyter-notebook machine-learning pandas python scikit-learn seaborn statistics water-quality

Last synced: 18 Apr 2026

https://github.com/y1-studio/statbooru

Statbooru is an analytics tool built on Danbooru metadata. It helps artists and researchers discover stable, recognizable characters to draw, identify subjects that may attract attention, compare character popularity, analyze popular formats and tag patterns, and track how popularity changes over time.

danbooru data-analysis dataset python

Last synced: 17 May 2026

https://github.com/iondv/metrics

IONDV. Framework application: Metrics collects and displays metrics data.

collecting data data-analysis iondv iondv-app metrics

Last synced: 10 Feb 2026

https://github.com/gallillio/webscraping-datacleaning-imdb_videogame_webscrapper

IMDb web scraper using Python. The scraped data is automatically cleaned to address missing or wrong values: the tool dynamically searches Steam and Google to fill in gaps or correct the data. It streamlines the process of aggregating video game data, facilitating analysis and insights for enthusiasts, researchers, and developers alike.

automation data-analysis data-cleaning data-wrangling jupyter-notebook python web-scraping webscraping

Last synced: 30 Apr 2026

https://github.com/lebrancconvas/personal-saved-datasets

Saved Datasets for doing stuff in the area of Statistics & Data Science.

data-analysis data-science database datasets excel kaggle kaggle-dataset microsoft-excel sample-dataset

Last synced: 12 May 2026

https://github.com/realbou17/wusf1

A Python-based GUI tool for visualizing F1 telemetry with multiple data representations.

data-analysis f1 fastf1 formula1 formula1-data-analysis python student-project

Last synced: 18 Apr 2026

https://github.com/dataket/dataket

Project proposal for the Datatón Anticorrupción 2021. Team Dataket.

corruption data-analysis dataviz open-government

Last synced: 02 Apr 2026

https://github.com/vandita2020/merra2_nasa_wind_speed_analysis

In this study, we aim to explore the vulnerability of power grids in the south-east region of the USA with the help of data analysis tools and machine learning algorithms.

data-analysis data-science machine-learning-algorithms python

Last synced: 09 Jun 2026

https://github.com/grand-27-master/data-science-course

One-stop repo for learning data science along with roadmap!

data-analysis data-science machine-learning python statistics

Last synced: 12 May 2026

https://github.com/qathom/crawlx

crawlx lets you analyze product data on Amazon. It is a simple and lightweight tool to act on product issues.

data-analysis electron es6-javascript vue vuejs2 vuex2 webpack

Last synced: 14 Feb 2026

https://github.com/joamag/pandas

Loads of pandas data from China with awesome data

data data-analysis jupyter notebook pandas

Last synced: 25 Apr 2026

https://github.com/iremaydas/base-r-skill

Provides base R programming guidance covering data structures, data wrangling, statistical modeling, visualization, and I/O

claude-skills data-analysis r skills

Last synced: 10 Jun 2026

https://github.com/arfc/pride

(P)lan for (R)ap(I)d (DE)carbonization

data-analysis quantitative-recommendations reactor

Last synced: 27 Mar 2026

https://github.com/josechirif/reviews-and-satisfaction-analysis-of-airbnb-brazil-and-mexico-from-june-2010-to-february-2021

This project analyzes the reviews and satisfaction of customers who used Airbnb services. It also studies whether there is a relationship with other variables.

data data-analysis data-visualization powerbi sql-server

Last synced: 25 Feb 2026

https://github.com/totonga/odsbox-jaquel-mcp

MCP server to work with odsbox to access ASAM ODS HTTP API.

ai-agents asam data-analysis data-science mcp ods

Last synced: 11 Apr 2026

https://github.com/virajbhutada/power-bi-resources

Comprehensive Power BI resources covering interview questions, group training materials, project portfolio ideas, and a cheat sheet for quick reference. Elevate your Power BI skills with curated content designed to enhance your proficiency and boost project success.

data-analysis data-modeling data-visualization dax-expression interview-preparation portfolio-projects powerbi quick-reference visualization-techniques

Last synced: 03 Mar 2026

https://github.com/rickyxume/data_statistic_analysis

National first-prize solution for the 2021 Data Statistics and Analysis Competition.

data-analysis data-mining data-science

Last synced: 15 Jun 2026

https://github.com/monteirooscar98/salario-minimo-brasil

Data extraction via the Banco Central API, web scraping of the Dieese website, and data analysis.

api brasil data-analysis opendata python webscraping

Last synced: 15 Jun 2026

https://github.com/brews/riverpca

Companion repository to Malevich S.B., and C.A. Woodhouse (2017), Pacific SSTs, mid-latitude atmospheric circulation, and widespread interannual anomalies in Western US streamflow, Geophys. Res. Lett., 44, doi:10.1002/2017GL073536.

analysis data-analysis paper pca python river streamflow visualization

Last synced: 28 Apr 2026

https://github.com/manmolecular/http-response-clustering

📉 Clustering of HTTP responses using k-means++ and the elbow method

data-analysis elbow-method elbow-plot jupyter k-means-plus-plus python3

Last synced: 29 Apr 2026

https://github.com/carocardenas0699/pi02-data-analysis

Individual Project 2 of the Data Science program. An analysis of homicides in road traffic accidents in the city of Buenos Aires. Includes: ETL, EDA, and an interactive dashboard with results.

data-analysis data-science data-visualization eda etl powerbi python

Last synced: 15 Jun 2026

https://github.com/sidhuk/metaboheatmap

An R/Shiny-based app for visualizing metabolomics data through heatmaps

data-analysis data-visualization heatmap metabolomics shiny

Last synced: 26 Feb 2026

https://github.com/pitmonticone/covid-italy

References for COVID-19 situation in Italy.

coronavirus covid-19 covid-19-italy data data-analysis documentation testing

Last synced: 05 Apr 2026

https://github.com/sabujxi/python-scraper-and-data-analysts-admin-panel-in-django

A data scraper for a Texas government site and a companion web app for managing, reviewing, and editing the data

analyst data data-analysis data-entry data-scraper django django-application python python-scraper real-estate regex scraper texas

Last synced: 30 Apr 2026

https://github.com/cworld1/da-learning

Notes and code from CWorld's study of data analysis

data-analysis data-science jupyter-book jupyter-notebook python r

Last synced: 18 Apr 2026

https://github.com/cai-lab-at-university-of-michigan/ncorrect

A toolkit for the correction and normalization of SWC files from neuron morphology experiments.

data-analysis neuron-morphology swc

Last synced: 05 May 2026

https://github.com/aliakbar-omidi/digikala-data-analysis

Analysis of the behavior of Digikala customers in shopping at different times

collection data-analysis matplotlib numpy pandas persiantools python seaborn

Last synced: 15 Apr 2026

https://github.com/mirokeimioniemi/optimizing-insulin-injection-timing

Data processing and analysis for "Determining the optimal timing for insulin injection to minimize glucose level variability after a meal in ideal conditions" - a research project for the IB Standard Level Mathematics Analysis and Approaches course inspired by my type 1 diabetes.

cgm data-analysis data-science dexcom dexcom-g6 diabetes exploration ib insulin insulin-timing international-baccalaureate mathematics optimization python type-1-diabetes

Last synced: 09 May 2026

https://github.com/1luvc0d3/metabase-mcp

MCP server connecting Claude to Metabase for natural language data analysis, dashboard management, and SQL queries

anthropic claude data-analysis mcp metabase model-context-protocol natural-language sql

Last synced: 21 Apr 2026

https://github.com/sevdanurgenc/data-modeling-techniques-lecture-notes

This repo contains the course contents of the Data Modelling Techniques training for Innova Technology, organized in cooperation with Academy Peak Information Technologies Training and Consultancy and scheduled for 25-26 January 2022.

data-analysis data-mining data-modeling data-science data-structure data-visualization

Last synced: 19 Mar 2026

https://github.com/mrjxtr/tokyo_airbnb_analysis_project

Full project case study and analysis showing potential opportunities to start an Airbnb business in Tokyo, Japan.

data-analysis data-cleaning data-science data-visualization pandas python3

Last synced: 24 Feb 2026

https://github.com/jku-vds-lab/marjorie

Marjorie is a web-based approach to visualize and explore patterns in type 1 diabetes data.

data-analysis diabetes pattern-recognition visualization

Last synced: 09 May 2026

https://github.com/mk2112/minicorpus

Reproducing, then improving MiniPile with PyTorch and HuggingFace

data-analysis huggingface pytorch subset-construction subset-selection

Last synced: 20 Apr 2026

https://github.com/alexandregazagnes/unilasalle-public-resources

UniLaSalle-Public-Ressources: This public repository contains the notebooks and data used for two courses: 2nd Year - Practical Statistical Tests, and 4th Year - Data Analysis with Python.

data data-analysis data-analytics data-cleaning data-storytelling education educational exploratory-data-analysis python python3 r r-programming rstudio statistics visualization

Last synced: 28 Apr 2026

https://github.com/iantomasinicola/portfoliodataanalyst

Data analysis project with Python, Microsoft SQL Server, and Excel

data-analysis excel python sql

Last synced: 12 May 2026

https://github.com/vultair/vultair-platform

An automated tool for forensic investigations of social media accounts. Supports platforms like Facebook, Twitter, Instagram, Telegram, WhatsApp, etc.

android automation data-analysis data-parsing forensics-tools investigation social-media

Last synced: 03 Jun 2026

https://github.com/gattiharishkumar/employee-attendance-leaves-analytics-dashboard

This project showcases a Power BI dashboard created to analyze employee attendance and leaves over a three-month period. The data was sourced from Excel datasets available on the Codebasics website.

dashboards data-analysis data-cleaning data-transformation data-visualization power-query-editor powerbi

Last synced: 19 Mar 2026

https://github.com/supertetelman/kaggle-public

A collection of Python and Matlab projects aimed at utilizing various machine learning techniques to solve big data problems.

cnn data-analysis deep-learning machine-learning matlab python

Last synced: 29 Apr 2026

https://github.com/saranshbansal/spam-detection-analytics-tool

A tool that reads chunks of SMS data from a CSV file and shows how different pre-implemented algorithms perform at identifying spam messages.

analytics data-analysis data-science data-visualization mysql spring-boot

Last synced: 01 May 2026

https://github.com/seyedhosseinzadeh/ws_tm

Weather web scraping and a time series model to predict temperature, humidity, and barometric pressure

data-analysis deep-learning lstm-model machine-learning prediction prediction-model weather web-scraping

Last synced: 10 Jun 2026

https://github.com/freepicheep/nu-salesforce

A nushell module to interact with Salesforce data through the Salesforce REST API.

data-analysis nu nushell salesforce salesforce-api scripting shell

Last synced: 03 Mar 2026

https://github.com/praveendecode/product_sentiment_analysis

This project employs NLTK, Prowebscraper, and Python for sentiment analysis on online product reviews. Through web scraping, EDA, and NLP, it evaluates user satisfaction by comparing actual ratings and sentiment scores.

data-analysis data-visualization natural-language-processing nltk-python product-analysis python sentiment-analysis

Last synced: 03 May 2026

https://github.com/roland045/road_quality_measurement_analysis

Novel ML-based road quality measurement system for cost-effective pavement monitoring

azure data-analysis data-engineering data-science machine-learning mlops model-deployment python sql unsupervised-learning

Last synced: 24 Jan 2026

https://github.com/shivamswarnkar/tesla-stock-prediction

Predicting closing prices of Tesla stock using different regression methods.

data-analysis data-visualization plotly regression regularization sklearn stock-price-prediction

Last synced: 05 May 2026

https://github.com/idaraabasiudoh/vehicle-co2emission_model

Predicts CO2 emissions from vehicle fuel consumption using a multiple linear regression model trained with sklearn, based on a dataset of engine sizes and corresponding CO2 emissions in Canada.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 06 May 2026

https://github.com/antononcube/wl-outlieridentifiers-paclet

Wolfram Language (aka Mathematica) paclet that provides outlier identifier functions.

data-analysis hampel outlier-detection outliers

Last synced: 20 Mar 2026

https://github.com/mahmoudparsian/data-management-for-business-analytics

Data Management for Business Analytics: This course focuses on database management systems and procedures with an emphasis on the design and development of efficient business information systems. MySQL is used to teach the basics of relational database systems, structures, and database queries by using SQL.

analytics business-analytics business-intelligence data-analysis data-visualization database mysql python-data-analysis relational-databases relational-model sql

Last synced: 26 Feb 2026

https://github.com/invictusaman/socioeconomic-indicators-in-chicago-sql-python

This project shows how to create a database connection in a notebook, update the database using Python, and run Python code and SQL queries together. It uses SQLite and a Chicago dataset for analysis.

data-analysis jupyter-notebook python sql sql-queries sqlite

Last synced: 12 Feb 2026

https://github.com/aisurjyasamantaray/sales-perfomance-analysis-dashboard

A comprehensive sales performance analysis dashboard built using Python and visualization tools. This project includes data cleaning, descriptive statistics, correlation analysis, and insights into sales trends, profitability, and the impact of discounts. Key features include interactive visualizations using Seaborn and Matplot

analytics annova data data-analysis data-visualization-project dataproject eda hypothesis-testing pandas-dataframe python sales-performance-analysis statistics

Last synced: 04 Apr 2026

https://github.com/camille-maslin/securecard-ai

🛡️ SecureCard-AI: A high-performance credit card fraud detection system implemented in a Jupyter Notebook, achieving 99.97% accuracy.

classification credit-card-fraud-detection data-analysis data-science fraud-detection jupyter-notebook machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 12 Feb 2026

https://github.com/mylethidiem/zero-to-hero

Project for learning and practicing coding: Python, SQL, C/C++, data science/data analysis, AI/machine learning

ai cpp data-analysis data-science deep-learning machine-learning mlops python sql

Last synced: 02 Mar 2026

https://github.com/andr3w03/bike-sharing-dashboard

Bike Sharing Data Analysis Streamlit Dashboard

dashboard data-analysis data-visualization python streamlit

Last synced: 01 May 2026

https://github.com/shashankbansal6/signal-analysis-for-patient-monitoring

A reliable patient monitoring system which analyzes the correlated physiological signals collected from the patient's body, and generates alarms for abnormalities.

data-analysis patient-monitoring

Last synced: 18 Mar 2026

https://github.com/quantumudit/analyzing-gamerevolution-games

This project focuses on scraping data related to video games from the GameRevolution website, performing the necessary transformations on the scraped data, and then analyzing and visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 27 Apr 2026

https://github.com/billy-enrizky/kimia-farma-sales-management-database-replica-project

SQL database management, then visualizing it in Tableau!

analytics data-analysis data-visualization sql

Last synced: 27 Feb 2026

https://github.com/revogati/ecommerce_consumer_behaviour

A full data analytics project, from data cleaning, preparation, and exploration through interpretation of insights to presentation of findings and recommendations.

data-analysis data-exploration ecommerce jupyter-notebook python sql tableau-public visualization

Last synced: 16 Apr 2026

https://github.com/kirrrto/amazon-market-research-dashboard

Amazon product research dashboard for spreadsheet imports, supplier product pages, specification matrices, gap analysis, requirement drafts and supplier follow-up exports.

amazon-product-research data-analysis ecommerce market-research pandas product-development python specification-analysis streamlit

Last synced: 17 Jun 2026

https://github.com/dcs-training/2024-11-18-cdcs-carpentry-social-sciences

This repo contains the material produced for a course run by the Centre in November 2024

data-analysis data-visualisation data-wrangling intro-to-programming r

Last synced: 14 Feb 2026

https://github.com/w-edward/youtube-keyword-popularity-analyzer

An effort to discover the top trending keywords on YouTube.

data-analysis node-js numpy python webscraping youtube-api

Last synced: 15 Apr 2026

https://github.com/zelosleone/finncorr

A .NET Core financial analysis tool/API for calculating correlations between time series data with interactive visualizations powered by ML.NET and Plotly.js.

aspnet-core correlation-analysis csv-parser data-analysis dotnet financial-analysis machine-learning ml-net plotly rest-api statistical-analysis swagger time-series visualization

Last synced: 09 Feb 2026

https://github.com/pratishtha-abrol/astronomy-dataanalysis

A key technique in Data Driven Astronomy

astronomy astropy crossmatch data-analysis

Last synced: 15 Jun 2026

https://github.com/juliasouz/dashboard-vendas

Interactive Xbox Game Pass sales dashboard, built in Excel for analyzing and visualizing subscription data.

business-intelligence dashboard data-analysis excel sales-data visualizacao-de-dados xbox-game-pass

Last synced: 31 Jan 2026

https://github.com/labrinyang/apple-health-analysis

Mayo Clinic-grade Apple Health data analysis — Claude Code skill with 20 peer-reviewed statistical methods and 35+ SVG visualizations

apple-health cgm claude-code claude-skill data-analysis health-analytics heart-rate statistical-methods

Last synced: 19 Apr 2026

https://github.com/karlyndiary/global-electronics-retailer-sales-and-customer-insights

Developed an analysis using Python, SQL, and Excel to examine sales and customer demographics for a Global Electronics Retailer. The findings aim to enhance business strategies and improve overall performance.

dashboard data-analysis data-cleaning-and-preprocessing data-pipeline data-visualization etl microsoft-excel microsoft-sql-server python sql

Last synced: 14 Feb 2026

https://github.com/verbasik/yandex.practicum.datascience

Portfolio of Data Science projects completed as part of professional retraining at Yandex.Practicum. Includes studies in finance, real estate, film distribution, and other areas, using statistics, machine learning, and data analysis.

data-analysis data-science machine-learning yandex-praktikum

Last synced: 29 Jan 2026