An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/jpquast/icp-ms-data-explorer

A shiny app for the exploration of ICP-MS data.

data-analysis icp-ms r shiny shiny-apps

Last synced: 17 Jan 2026

https://github.com/ivanildobarauna-dev/data-pipeline-sync-ingest

The "ETL Process for Currency Quotes Data" project is a complete solution dedicated to extracting, transforming, and loading (ETL) currency quote data. This project uses several advanced techniques and architectures to ensure the efficiency and robustness of the ETL process.

business-intelligence data-analysis data-analytics data-engineering data-pipeline data-visualization etl-pipeline python

Last synced: 28 Oct 2025

https://github.com/lastancientone/amd-vs-nvda

Analyzing 2 technology stocks using Master Analyst Program (MAP).

data data-analysis data-structures data-visualization excel forecasting time-series-analysis

Last synced: 15 May 2025

https://github.com/naso7y/students-performance-analysis

A project analyzing students' academic performance to identify trends and factors affecting outcomes. Built with Python, using data visualization and statistical techniques to derive actionable insights.

data-analysis data-visualization machine-learning python

Last synced: 23 Feb 2026

https://github.com/henrylin03/video-games

Using Python and SQL to clean, analyse and visualise video games' data from Metacritic. Includes scraping using BeautifulSoup.

analysis beautifulsoup beautifulsoup4 data data-analysis data-science eda jupyter-notebook pandas python sql sqlite3 video-game video-games

Last synced: 14 Apr 2026

https://github.com/its-kanii/predictive-maintenance-for-healthcare-equipment

Predictive Maintenance for Healthcare Equipment utilizes machine learning to analyze operational metrics and predict equipment failures. This project leverages a dataset of usage hours, temperature, and maintenance history to enhance equipment reliability and reduce downtime.

data-analysis data-science failure-prediction feature-engineering healthcare-equipment jupyter-notebook machine-learning predictive-maintenance python time-series-analysis

Last synced: 09 May 2026

https://github.com/llnl/hdtopology

High-dimensional topological data analysis library for NDDAV

analysis cpp data-analysis data-viz high-dimensional-data topological-data-analysis visualization

Last synced: 29 Apr 2025

https://github.com/mrjxtr/tokyo_airbnb_analysis_project

Full project case study and analysis showing potential opportunities to start an Airbnb business in Tokyo, Japan.

data-analysis data-cleaning data-science data-visualization pandas python3

Last synced: 24 Feb 2026

https://github.com/luizassimoes/fitness-report

Create a personalized Fitness Wrapped report using your Apple Health data with this Streamlit application. Generate comprehensive, detailed summaries of your annual fitness activities, providing valuable insights into your year-long progress and achievements.

data-analysis data-visualization python streamlit

Last synced: 12 Feb 2026

https://github.com/arzan101/ola-data-analytics

Ola - Identified the reasons and trends behind ride cancellations. Process - Cleaned and processed data from multiple sources, applied SQL queries, and visualized the data using Power BI. Motive - To reduce the cancellation rate.

dashboard data-analysis data-mining data-visualization dataanalytics excel powerbi sql

Last synced: 06 Jan 2026

https://github.com/fernandezfran/exma

A Python library with C extensions to analyze and manipulate molecular dynamics trajectories and electrochemical data

computational-physics data-analysis molecular-dynamics oop python science

Last synced: 16 Jan 2026

https://github.com/roland045/road_quality_measurement_analysis

Novel road quality measurement system for cost effective pavement monitoring, ML-based

azure data-analysis data-engineering data-science machine-learning mlops model-deployment python sql unsupervised-learning

Last synced: 24 Jan 2026

https://github.com/trybnetic/tu7-acceleration-sleep-wake-classification

Supporting material for the paper "Discrimination of sleep and wake periods from a hip-worn raw acceleration sensor using recurrent neural networks"

accelerometer accelerometry actigraphy data-analysis sensors sleep

Last synced: 01 Jun 2026

https://github.com/ashwinpn/visualization

Data Visualization using Matplotlib, Pandas Visualization, Seaborn, ggplot, and Plotly.

analysis data data-analysis data-science data-visualization graphs plots python python3 visualization

Last synced: 13 Apr 2026

https://github.com/jimbrig/eda

Exploratory Data Analysis R Package and Shiny App

data-analysis data-visualization eda r shiny

Last synced: 03 Jan 2026

https://github.com/louislefevre/sstubs-miner

Data mining and analysis for the ManySStuBs4J dataset.

data-analysis data-mining manysstubs4j-dataset msr

Last synced: 30 Mar 2025

https://github.com/devexpress-examples/web-forms-pivot-grid-bind-to-sql-data-source

This example demonstrates how to create an ASPxPivotGrid and bind it to data via code.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 06 Jul 2025

https://github.com/ernestaroozoo/memestocks.net

MemeStocks.net is a Python web app that tracks the historical popularity of specific stocks by monitoring Reddit mentions. Users can search for a stock symbol and view information such as the stock's name, price, and historical popularity data. The data is gathered using the Pushshift API and stored in a PostgreSQL database.

dashboard data-analysis financial-data meme-stock python reddit-scraper scraping sql stock streamlit

Last synced: 06 Mar 2026

https://github.com/virajbhutada/cliquebait-digital-marketing-analysis-using-sql

This GitHub repository contains the CliqueBait Digital Marketing Analysis project, utilizing SQL for comprehensive analysis of marketing campaigns, user engagement, product performance, and website interactions within the Clique Bait food app. The project offers actionable insights for optimizing marketing strategies in a competitive landscape.

campaign-website data-analysis data-extraction data-science digital-marketing food-store microsoft-excel mysql product-performance sql sql-database sql-project user-engagement website-analytics

Last synced: 27 Feb 2025

https://github.com/walidbosso/r_data_mining

Extract knowledge from data using different techniques, including Association Rules, Hierarchical Agglomerative Clustering (HAC), K-means Clustering, and Decision Trees.

association-rule-mining association-rules clustering data-analysis data-mining data-science data-visualization decision-tree-classifier decision-trees exportation extract-data hac hierarchical-clustering k-means k-means-clustering k-means-r r-programming r-studio

Last synced: 23 Mar 2025

https://github.com/martial2023/bank-performance-analysis

Analysis of banking data from the Berka Dataset (1993-1998) to compute and visualize key KPIs

dashboard data-analysis data-visualization nextjs pandas plotly-express pymongo python recharts-js sqlalchemy

Last synced: 26 Aug 2025

https://github.com/is-leeroy-jenkins/badger

A data science and analysis toolkit for federal analysts at the Environmental Protection Agency, based on WPF and .NET 8 and written in C#.

ai budget budget-management data-analysis data-science data-visualization federal-government large-language-models machine-learning

Last synced: 14 Oct 2025

https://github.com/mainakverse/ml-algorithms-starter

List of machine learning algorithms needed to get started with ML projects and lay a foundation in data science

data-analysis data-science jupyter-notebooks machine-learning-algorithms practice

Last synced: 19 Apr 2025

https://github.com/agungbudiwirawan/supermarket_sales_dashboard-sales-analysis-using-pivot-table-and-chart

The objective of this project is to analyze supermarket sales using pivot tables and charts in Microsoft Excel. After doing the analysis, I created an interactive dashboard that aims to make it easier for the audience to explore the data.

dashboard data-analysis data-science data-visualization excel sales-dashboard spreadsheet

Last synced: 08 Jan 2026

https://github.com/tillbiskup/cwepr

A Python package based on the ASpecD framework for handling cwEPR data.

continuous-wave data-analysis data-processing electron-paramagnetic-resonance reproducible-research reproducible-science

Last synced: 06 Sep 2025

https://github.com/gxelab/tutorials

Tutorials of frequently used software packages and libraries in the lab

bioinformatics data-analysis evolution genetics genomics julia python3 r-language statistics visualization

Last synced: 18 Jan 2026

https://github.com/tathithienthanh/dataanalysis_diagnosis-of-diabetes-based-on-data-set-of-blood-test-result

Applies data analysis and data mining techniques in a complete project on diagnosing diabetes from a data set of blood test results

blood-test classification clustering data-analysis data-processing decision-tree diabetes-prediction diagnosis exercise google-colab health hierarchical ipynb kmeans knn knns py python smote-sampling visualization

Last synced: 08 Jan 2026

https://github.com/ziaeemehr/itng_nest

NEST Simulator quick guides and examples, including adding a new model using NESTML

computational-neuroscience data-analysis nest-simulator neuroscience

Last synced: 30 Aug 2025

https://github.com/virajbhutada/music-recommendation-system

This project is designed to provide personalized music recommendations for relaxation and meditation. Leveraging ML and data analysis, the system suggests tracks based on user preferences such as tempo, energy, and genre. Join us in enhancing music discovery through advanced algorithms and community-driven contributions.

data-analysis data-science-projects data-visualization eda html machine-learning ml-algortihms model-deployment model-evaluation music-recommendation-system nlp pivot-table principal-component-analysis python python-library similarity-matrix spotify-data streamlit-web user-experience

Last synced: 24 Jan 2026

https://github.com/anil951/early-detection-of-mental-health

This project develops a predictive model to identify early signs of mental health issues in adolescents using social media activity, school performance, health records, and an AI chatbot. It analyzes emotional tone, academic changes, and health data, offering personalized recommendations and resources for mental wellness.

data-analysis deep-learning early-detection lstm mental-health sentiment-analysis social-media

Last synced: 28 Jan 2026

https://github.com/scailfin/flowserv-core

Reproducible and Reusable Data Analysis Workflow Server

benchmarks data-analysis reproducibility reusability workflows

Last synced: 14 Jan 2026

https://github.com/mustafah/dream-my-plots

Create visual plots in Python by text-prompting popular LLMs through LangChain

ai artificial-intelligence automation data-analysis data-visualization langchain llms machine-learning plotting python

Last synced: 13 Apr 2026

https://github.com/olow304/goboard

Python Data Analysis Dashboard using Public Dataset, Django

dashboard dashboard-templates data-analysis data-science django jupyter-notebook machine-learning python sklearn

Last synced: 11 Apr 2026

https://github.com/camille-maslin/simulfcimage

🔍 SimulFCImage: A professional multispectral image processing application developed for ImViA Laboratory.

academic-project computer-vision data-analysis gui-application image-processing image-viewer multispectral-images pyqt5 python scientific-visualization spectral-analysis

Last synced: 05 Feb 2026

https://github.com/code-jl/nfl-point-kicker-data-scraper

A Python-based web scraping toolkit that extracts and processes NFL kicking statistics from Pro-Football-Reference. This project automates the collection of comprehensive game data, with a particular focus on field goal attempts and environmental conditions.

automation beautifulsoup csv data-analysis data-collection field-goals football-statistics kicking-stats nfl python selenium sports-analysis statistics weather-data web-scraping

Last synced: 06 Sep 2025

https://github.com/jshinm/web-scrapper

Web scraper used to extract NeuroData GitHub repo stats

data-analysis web-scraping

Last synced: 04 Apr 2025

https://github.com/dcs-training/bayesian-statistics

Materials for the CDCS Introduction to Bayesian Statistics course. Go to the readme file

bayesian-statistics data-analysis r statistics

Last synced: 05 Feb 2026

https://github.com/luminati-io/Amazon-dataset-samples

A sample dataset of over 1,000 Amazon product listings, extracted using the Bright Data API, perfect for competitive analysis, market trends, and eCommerce insights.

amazon api data-analysis data-science dataset ecommerce products web-scraping

Last synced: 09 Apr 2025

https://github.com/aravind-selvam/covid_dashboard

A dashboard I created from Covid death and vaccine data.

covid-19 data-analysis data-science data-visualization tableau tableau-public visualization

Last synced: 08 Mar 2026

https://github.com/ronylpatil/whatsapplib

WhatsApp Group Chat Analysis Python Package.

data-analysis open-source pypi-package python-library python-package

Last synced: 02 Jan 2026

https://github.com/asifdotexe/timeseriesanalysis

This repository serves as a central hub for all of my projects related to time series analysis. Here, you'll find a collection of projects, code samples, and resources that explore various aspects of time series data and its analysis.

data-analysis feature-engineering jupyter-notebook pandas python time-series-analysis visualization

Last synced: 14 Apr 2026

https://github.com/yeisonmontoya1815/machine-learning_prediction_can_inflation

We aim to predict trends in the Canadian market basket using sentiment analysis techniques. Sentiment analysis involves analyzing text data to determine the sentiment expressed, whether positive, negative, or neutral.

algorithms-and-data-structures data data-analysis data-science data-visualization feature-engineering machine-learning matplotlib-pyplot numerical-analysis numpy pandas pipelines python sklearn structured-data super unsupervised-learning

Last synced: 05 Feb 2026

https://github.com/fabienarcellier/qjoin

qjoin is a data manipulation library that provides simple and efficient joining and collection processing functionality

composable data-analysis developer-tools functools python

Last synced: 01 Mar 2026

https://github.com/yahia3200/become-an-independent-data-scientist

My final project for the Applied Plotting, Charting & Data Representation in Python Course

data-analysis data-science data-visualization matplotlib

Last synced: 16 Mar 2025

https://github.com/visionkernel/centerspoke

Centerspoke is a data management and analysis tool that allows easy access to cloud databases. Say goodbye to using Excel for data management. This open-source CLI tool allows for the rapid processing and analysis of all your data, and makes it easy to upload your Excel files into your cloud databases.

cli cloud-database data-aggregation data-analysis data-analysis-python data-management data-science python python3

Last synced: 24 May 2026

https://github.com/a-r-j/npview

CLI tools for quickly inspecting CSV/TSV & NumPy (.npy) array files

cli csv data-analysis inspector npy numpy python tsv

Last synced: 18 Jan 2026

https://github.com/c0deta1ker/MatBaseX

MatBase provides access to an extensive database of material parameters, inelastic mean free paths (IMFP), photoionization binding energies, cross sections, and asymmetry parameters. Additionally, MatBase includes a suite of functions for users to load, process, model and fit their own data, making it an indispensable tool in the field.

cross-sections crystal-structure crystallography data-analysis data-fitting database electron imfp imfp-calculator-matlab material material-database matlab matlab-application matlab-gui matlab-toolbox pes-modelling photoelectron-spectroscopy photoionization simulation xps

Last synced: 23 Jul 2025

https://github.com/lafayettegabe/g2m-insight-for-cab-investment-firm

📊 Exploratory Data Analysis (EDA) on multiple datasets related to the cab industry in the US, to provide actionable insights and recommendations to a private firm looking to invest in the market. The analysis includes data cleaning, transformation, visualization, and hypothesis testing.

big-data data data-analysis data-science data-visualization eda gotomarket

Last synced: 13 Jun 2025

https://github.com/DCS-training/intromachinelearning

This course is aimed at providing an introduction to machine learning for those with some beginner level python/Rstudio skills. Go to the readme file

data-analysis data-wrangling machine-learning python statistics

Last synced: 25 Apr 2025

https://github.com/quantumudit/analyzing-goodreads-famous-quotes

This project focuses on scraping famous quotes and their related data from the GoodReads website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 20 May 2026

https://github.com/freekatz/english-reading

Postgraduate entrance exam English (10-19) dataset and related data analysis

data-analysis dataset

Last synced: 16 Jul 2025

https://github.com/parisaroozgarian/ibm-data-analyst-professional-certificate

The IBM Data Analyst Professional Certificate, consisting of 9 courses, equips learners with essential skills in Excel, SQL, Python, data visualization, and analysis techniques

big-data business-analysis business-communication communication data-analysis data-management data-structures data-visualization databases general-statistics human-resources planning python-programming spreedsheet sql

Last synced: 27 Jan 2026

https://github.com/kylekirkby/cardatasnatch

CarDataSnatch allows you to quickly find information about a car in the UK using a valid number plate. Grab an image of the car in question along with a multitude of other data. Compare two cars' data for fast and easy analysis.

beautifulsoup cars command-line-tool data data-analysis data-mining ethical-hacking python python3 requests scraper social-engineering

Last synced: 15 Apr 2025

https://github.com/rmnldwg/liver-smart

Data and analysis pipeline for a study on the potential advantages of daily adaptive liver SBRT performed at the University Hospital Zurich.

data-analysis fractionation jupyter-notebook liver-cancer metastasis radiation-oncology stereotactic

Last synced: 27 May 2026

https://github.com/sarincr/data-analytics-with-knime

Data Analytics with KNIME (Konstanz Information Miner), a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and use of JDBC allows assembly of nodes blending different data sources, including preprocessing (ETL: Extraction, Transformation, Loading), for modeling, data analysis and visualization without, or with only minimal, programming.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data-analysis data-mining data-science data-structures data-visualization database datascience deep-learning machine-intelligence machine-learning machine-learning-algorithms machinelearning mining mining-software

Last synced: 14 Mar 2025

https://github.com/quantumudit/uk-student-accommodation-analysis

This project focuses on scraping student property data from the UK Student Accommodation website, performing the necessary transformations on the scraped data, and then analyzing and visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 27 Apr 2026

https://github.com/i4ds/ecallisto_ng

Ecallisto NG is a Python package tailored for interacting with Ecallisto data.

data-analysis data-visualization e-callisto ecallisto-international-network numpy pandas python spectrometer

Last synced: 13 Oct 2025

https://github.com/souvik09-tech/walmart_sales_dataanalysis

This end-to-end data analysis project leverages Python for processing and SQL for advanced querying to extract key business insights from Walmart sales data. It's designed for data analysts to enhance skills in data manipulation, querying, and pipeline creation.

data-analysis end-to-end etl-pipeline jupyter-notebook mysql mysql-database pandas python

Last synced: 17 Feb 2026

https://github.com/marios-mamalis/mca-visualisation

A script for automatic visualisation of Multiple Correspondence Analysis (MCA) results from FactoMineR in 3 dimensions using Plotly (exported as html)

3d-scatterplots correspondence-analysis data-analysis factominer html mca multiple-correspondence-analysis plotly visualisation

Last synced: 02 Mar 2025

https://github.com/leonism/customer-predictive-analysis

Explore this repository, a comprehensive resource offering an in-depth guide to conducting customer predictive analysis using cutting-edge machine learning techniques, all within the intuitive framework of Dataiku.

data-analysis data-model data-science data-visualization dataiku machine-learning predictive-modeling

Last synced: 28 Mar 2025

https://github.com/jakebrehm/demesstify

📱Demystifies your messages and allows for easy analysis and visualization of conversations.

data-analysis data-science imessage messages messaging nlp pandas python sentiment-analysis visualization wordcloud

Last synced: 13 Apr 2025

https://github.com/beckversync/probability-and-statis_computer-parts-cpus-and-gpus-ics_

Probability and statistical analysis techniques are employed to explore data related to computer components, such as CPUs, GPUs, and Integrated Circuits (ICs). The objective is to uncover trends, identify patterns, and extract meaningful insights from real-world hardware data.

data-analysis r

Last synced: 18 Feb 2026

https://github.com/johntocci/nullaxe

Nullaxe is a powerful and user-friendly Python library designed for cleaning and preprocessing data. It works seamlessly with both pandas and polars DataFrames, making it a versatile tool for data scientists and developers.

data data-analysis data-science datacleaning pandas polars python

Last synced: 06 Apr 2026

https://github.com/rgalyeon/machine_learning_and_data_analysis

Machine Learning and Data Analysis specialization by Yandex and MIPT

coursera data-analysis data-science machine-learning mipt python yandex

Last synced: 03 Mar 2025

https://github.com/zachlagden/spotify-listening-analyzer

A comprehensive Python tool for analyzing your Spotify listening history data.

analytics data-analysis pandas python spotify-web-api spotipy

Last synced: 31 Jul 2025

https://github.com/elysian01/ml-eda-and-modelling-using-streamlit

Beautiful Web interface made using Streamlit for quick Exploratory Data Analysis and building classification models which are implemented from scratch.

data-analysis data-visualization eda exploratory-data-analysis knn-classification logistic-regression matplotlib ml-model-on-web ml-models naive-bayes-classifier pandas seaborn streamlit streamlit-webapp

Last synced: 12 Apr 2025

https://github.com/dcs-training/from-spss-to-r-how-to-make-your-statistical-analysis-reproducible

Comfortable/aware of how to run your stats in SPSS? Curious to learn how to run them in R? You've come to the right place. Go to the readme file

data-analysis data-visualisation data-wrangling good-practices-digital-research r rmarkdown spss statistics

Last synced: 25 Jan 2026

https://github.com/simoneas02/data-science

🐍 A planning study to become a data scientist and to improve my current skills. 🤘🏼🌻

data data-analysis data-science data-visualization deep-learning machine-learning pandas python3 r sql

Last synced: 12 Apr 2026

https://github.com/thisisashukla/survival-analysis

Hands-On Survival Analysis in Python

data-analysis data-science survival-analysis

Last synced: 28 Jul 2025

https://github.com/emptymalei/mini-lab

Some code snippets used to explain stuff to myself in my personal data science wiki

data-analysis data-mining data-science data-visualization datascience

Last synced: 07 Apr 2025

https://github.com/quantumudit/demographic-data-analysis

This project focuses on analyzing and finding correlations between three important metrics across 195 countries, i.e., birth rate, internet users, and income group.

data-analysis jupyter-notebook power-bi python

Last synced: 15 May 2026

https://github.com/poga/dat-ipynb-demo

Use an IPython notebook to analyze data in a dat archive

dat data-analysis distributed jupyter-notebook

Last synced: 17 Aug 2025

https://github.com/kevinyang372/san-francisco-crime-data-analysis

An ARIMA prediction model for forecasting potential crimes based on users' time and location

data-analysis machine-learning

Last synced: 29 Oct 2025

https://github.com/shramkoweb/bookbot

A Python-based text analyzer that counts words and character frequencies in any .txt file, providing a detailed, sorted report. Perfect for quick text insights and learning text processing basics!

automation beginner-friendly character-frequency data-analysis file-processing open-source python text-analysis text-parser text-processing word-count

Last synced: 02 Feb 2026

https://github.com/gabysbrain/purescript-dataframe

A data structure for row-based data and queries

data-analysis purescript

Last synced: 19 Feb 2026

https://github.com/shibam120302/black-friday-sales-data-analysis

This repository contains data analysis of Black Friday sales data using various regression ML algorithms

data-analysis eda machine-learning python random-forest regression

Last synced: 20 May 2026