An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/acs/ghtorrent

Experimental GitHub project analysis based on GHTorrent

data-analysis data-visualization ghtorrent github visualization

Last synced: 20 Mar 2025

https://github.com/abmenzel/messenger-wrapped

Explore your Facebook Messenger chat history, in a manner inspired by Spotify Wrapped

data-analysis data-visualization facebook meta nextjs reactjs spotify

Last synced: 18 Jul 2025

https://github.com/stappit/blog

I often post solutions to textbook exercises, including: Bayesian Data Analysis (BDA) by Gelman et al; Causal Inference in Statistics Primer (CISP) by Pearl et al; Purely Functional Data Structures (PFDS) by Okasaki.

bayesian-data-analysis blog data-analysis data-science gelman hakyll haskell pearl purely-functional-data-structures solutions stan static-site statistical-inference statistics

Last synced: 14 Mar 2025

https://github.com/coalio/Assistant

A data science library providing flexible dataframes for Lua 5.1+

data-analysis data-science data-structures dataframe lua

Last synced: 11 Apr 2025

https://github.com/dbeaudoinfortin/napsdataanalysis

Canadian National Air Pollution Surveillance Program (NAPS) data downloader, importer, extractor, analysis, and visualization toolbox.

air-pollution air-quality air-quality-data aqhi aqhi-canada canada cesi data-analysis eccc geoscience heatmap naps pollution

Last synced: 16 Mar 2025

https://github.com/tzerk/gammaspec

A collection of functions to analyse gamma spectra

data-analysis gamma-ray-spectrometry r

Last synced: 15 Apr 2025

https://github.com/drkostas/eestech-bigdata-challenge

EESTech Challenge is a brand new competition organized by EESTEC that aims to create opportunities for European students to gain knowledge in the field of EECS and develop a professional network. The technological topic of the 2017-2018 competition was Big Data. This is the code I submitted with my team (BFS), which consisted of 3 members in total.

apache-spark data-analysis data-visualization eestec machine-learning matplotlib python

Last synced: 25 Apr 2025

https://github.com/sanvishal/Exoplanet-Explore

An interactive data visualization of exoplanets

animation d3js data-analysis data-science exoplanet python space visualization

Last synced: 14 Apr 2025

https://github.com/bobleesj/cif-bond-analyzer

An interactive Python script that computes the minimum atomic bonding distances from sites, generating histograms and pair counts.

crystal-structure data-analysis high-throughput materials-infomatics solid-state

Last synced: 16 Jan 2026

https://github.com/globeandmail/upstartr

An R package powering core startr functionality

data data-analysis data-journalism data-visualization journalism news r r-package

Last synced: 03 Oct 2025

https://github.com/iamyajat/whatsapp-chat-analyzer-api

An API to analyse WhatsApp chats and generate insights

data-analysis data-science fastapi python whatsapp

Last synced: 17 Oct 2025

https://github.com/bydevmar/master_masd_fpo

This GitHub repository gathers all the courses, practical work (TP), tutorials (TD), projects, and exercises from my master's program in applied mathematics for data science. Browse it for a complete view of my academic path, offering a detailed perspective on my learning in this field.

acp afc algebra big-data-analytics dashboards data-analysis datascience economics english graph-theory latex linear-algebra non-linear-algebra probability prog python scientific-research software-package statistics

Last synced: 05 May 2025

https://github.com/erictleung/phyloseq-cheatsheet

Minimal cheatsheet for functions in the phyloseq R package

bioconductor bioinformatics cheatsheet data-analysis microbiome microbiota notes phyloseq r

Last synced: 25 Mar 2025

https://github.com/arv-anshul/yt-watch-history-v2

Analyse your YouTube watch history using ML with Graphs.

data-analysis docker fastapi machine-learning python streamlit youtube

Last synced: 01 Jul 2025

https://github.com/ogoodness/vbreaker-js

CSC 483 Project - Ciphers: Caesar, Multiplicative, Affine, Vigenere, Hill, Columnar Transposition

affine-cipher caesar-cipher columnar-transposition-cipher cryptography data-analysis decoder decryption encoder encryption hill-cipher parsing vigenere-cipher

Last synced: 11 Apr 2025

https://github.com/zeeshanahmad4/my-path-to-python

Templates and reference code in Python for web development, data science, analysis, visualization, cleaning and scraping, machine learning, and artificial intelligence

code data-analysis data-visualization machine-learning python refereance selenium template

Last synced: 12 Aug 2025

https://github.com/virajbhutada/spotify-track-analysis-and-recommendation

Experience a comprehensive exploration of Spotify's musical landscape seamlessly transitioned from Tableau visualizations to SQL analysis. Dive into track inventory, streaming metrics, and sonic trends via interactive dashboards, while leveraging SQL queries for deeper insights into KPIs and cross-platform rankings.

audio-analysis data-analysis data-analytics data-science data-visualization eda machine-learning-library ml-models mysql recommendation-system spotify spotify-data spotify-dataset sql-database sql-server streaming-metrics tableau tableau-public trends-analysis

Last synced: 28 Apr 2025

https://github.com/ankan24/machine-learning-data-analysis

This repository contains a collection of Jupyter Notebooks that demonstrate various machine learning and data analysis techniques. The project does not provide a detailed description or specific use cases, but the notebooks cover a range of topics related to machine learning and data analysis.

data-analysis jupiter-notebook machine-learning

Last synced: 19 Oct 2025

https://github.com/nicodupont/mooc

All my finished MOOCs, mainly on the subject of data science

data-analysis data-science data-visualization datacamp jupyter-notebook machine-learning mooc pandas python sas sql

Last synced: 28 Apr 2025

https://github.com/bpkaur/word-frequency-in-moby-dick

Finding the most frequent words in the novel Moby Dick using Python.

beautifulsoup data-analysis data-science moby-dick nltk notebook-jupyter python3

Last synced: 09 Nov 2025

https://github.com/nico-curti/data-analysis

Data Analysis Utilities (from Statistics to Machine Learning)

data-analysis deep-learning-algorithms linear-algebra machine-learning-algorithms statistics

Last synced: 07 Oct 2025

https://github.com/datadesk/extreme-heat-excess-deaths-analysis

A statistical analysis of excess deaths attributable to extreme heat in California's most populous counties

data-analysis data-journalism journalism news python r statistics

Last synced: 11 Oct 2025

https://github.com/cusyio/datenanalyse-in-python

Course on the automated preparation, summarization, and charting of tabular data with Python.

data-analysis pandas-python pandas-tutorial python

Last synced: 24 Apr 2025

https://github.com/djoshea/trial-data

Interfaces and utilities for analysis of neurophysiology and behavioral data

data-analysis electrophysiology neuroscience

Last synced: 23 Jan 2026

https://github.com/neelshah18/scopus-analysis-for-indian-researcher

Contains a data analysis of Indian researchers in Scopus journals from 2000 to 2016. It also includes the dataset.

artificial-intelligence data-analysis deep-learning india industry jupyter-notebook machine-learning scopus-analysis universities world

Last synced: 06 Jul 2025

https://github.com/invia-flights/blitzly

Lightning-fast way to get plots with Plotly ⚡️

data-analysis data-science plotly plotting-in-python python visualization

Last synced: 14 Jan 2026

https://github.com/iondv/report

IONDV. Framework: the Report module builds analytical reports.

analytics businessintelligence css data data-analysis data-visualization iondv iondv-module reporting

Last synced: 19 Oct 2025

https://github.com/patilni3/project_sql

Data Analysis using SQL

census-data data-analysis sql

Last synced: 08 Oct 2025

https://github.com/mahshaaban/intro_data_r

A gentle introduction to data analysis in R

data-analysis image-analysis qpcr-analysis r

Last synced: 28 Apr 2025

https://github.com/caleydo/coral

A web-based visual analysis tool for creating and characterizing cohorts.

cohort-analysis data-analysis data-visualization genomics web-application

Last synced: 19 Jan 2026

https://github.com/spacetelescope/stistools

Tools for HST/STIS.

astronomy data-analysis hst stis

Last synced: 10 Oct 2025

https://github.com/sayakpaul/analysis-of-college-database-of-2017-passouts

Contains my analysis of a database containing information about the students of an engineering college.

data-analysis data-visualization matplotlib python-3

Last synced: 12 Jun 2025

https://github.com/jthomperoo/holtwinters

Holt-Winters exponential smoothing implemented in Go.

data-analysis exponential-smoothing go golang holt-winters prediction prediction-algorithm

Last synced: 17 Jul 2025

https://github.com/sevdanurgenc/r-programming-for-data-science-lecture-notes

This repo holds the course contents of the R Programming for Data Science training, which will be given to Sigorta Bilgi ve Gözetim Merkezi in cooperation with Academy Peak Information Technologies Training and Consultancy between 21 and 23 March 2023.

data-analysis data-science data-visualization r r-programming r-programming-projects

Last synced: 11 Oct 2025

https://github.com/farahibrar/kpmg-job-simulation

This repository showcases my work from the KPMG Technology Job Simulation by Forage, focusing on Data Analytics and Cloud Engineering. Explore how I tackled real-world business challenges through sales data analysis, regional growth strategies, and AWS architecture design, highlighting my analytical and technical expertise.

aws-architecture business-intelligence cloud-engineering cloud-strategy-and-design data-analysis data-visualization fintech-solutions forage kpmg kpmg-careers python-for-data-analysis sales-data-insights sustainable-retail-analysis

Last synced: 24 Jan 2026

https://github.com/martinthoma/shell-history-analysis

Analyze how you use your shell

data-analysis python shell

Last synced: 24 Apr 2025

https://github.com/aglove2189/appias

Machine learning workflow toolkit ✨🦋✨

appias data-analysis data-science machine-learning pandas python sklearn workflow

Last synced: 19 Oct 2025

https://github.com/photosynq/photosynq-r

R package to conveniently access project data from the PhotosynQ website

data-analysis photosynq rstudio

Last synced: 22 Oct 2025

https://github.com/cafali/pathscan

PathScan exports information about the contents of directories and hard drives. With a single click, you can create a complete list of all files and paths within a specific folder or across an entire hard drive.

backup command-line data data-analysis data-migration data-mining data-recovery directory folder-management folders forensics hard-drive keyword-extraction logging pathfinding recovery string-search tools utility windows

Last synced: 10 Oct 2025

https://github.com/gcappon/py_agata

Official Python port of AGATA (Automated Glucose dATa Analysis), a toolbox to analyse glucose data.

continuous-glucose-monitoring data-analysis hacktoberfest python toolbox

Last synced: 22 Jan 2026

https://github.com/mjunaidca/sql-auditor-pro-gpt

Get quick insights from your data sources in tables, charts, and graphs with AI-driven audits, and optimize your SQL databases.

ai ai-agent auditing-data cloudflared compound-ai-systems custom-gpt data-analysis fastapi-template gpt sql sqlmodel

Last synced: 02 Apr 2025

https://github.com/virajbhutada/tableau-data-vizzes

Engage with a growing collection of Tableau dashboards covering financial trends, HR analytics, streaming service insights, real estate dynamics, and more. Meticulously crafted for valuable insights, this repository continues to expand with new and compelling visualizations.

business-analytics data-analysis data-visualization hr-analytics industry-trends netflix performance-metrics stock-market-analysis strategic-analytics tableau visual-insights

Last synced: 21 Nov 2025

https://github.com/m-barker/fibs-reporter

Automatically generate a PDF report containing feature importance, baseline modelling, spurious correlation detection, and more, from a single command line input for any given ML CSV file

audio-analysis audio-processing automation automl baseline-model data-analysis data-visualization feature-extraction feature-visualization machine-learning pdf-generation

Last synced: 14 Jan 2026

https://github.com/vishnu-t-r/data-analytics-portfolio-projects

This repository contains data analyst portfolio projects developed using various data analytics tools, including SQL, Python, Tableau, Looker, etc.

data data-analysis data-cleaning data-modeling data-visualization looker looker-studio python sql ssms tableau

Last synced: 23 Apr 2025

https://github.com/dermatologist/goscar-export

CSV to FHIR (For OSCAR EMR EForm Export)

data-analysis data-warehouse fhir fhir-r4 fhir-server hacktoberfest oscar-emr

Last synced: 10 Sep 2025

https://github.com/misaghmomenib/airport-flight-analysis

Flight data analysis project aimed at exploring and visualizing airport operations, flight patterns, and delay trends using Python. This project involves data cleaning, preprocessing, and statistical analysis with tools like Pandas, Matplotlib, and Scikit-learn to uncover insights and improve operational efficiency.

analysis data-analysis data-visualization git open-source python python3

Last synced: 30 Apr 2025

https://github.com/mmfava/personal-r-script-2015-2017

This repository contains a collection of R scripts that I developed over several courses that I attended or taught. They cover a wide range of topics in statistics, data analysis, and machine learning techniques, with a special focus on applications in ecology and environmental sciences.

data-analysis r

Last synced: 12 Apr 2025

https://github.com/yaricom/english-article-correction

An experiment in applying NLP to the correction of definite/indefinite articles in an English text corpus

data-analysis glove-vectors nlp nlp-machine-learning numpy pandas scikit-learn umbc-webbase-corpus

Last synced: 05 Apr 2025

https://github.com/danieldacosta/airbnb-analysis

Data analysis of Airbnb website history in the city of Rio de Janeiro

airbnb-analysis airbnb-website-history data-analysis

Last synced: 30 Apr 2025

https://github.com/hrolive/from-data-to-insights-with-google-cloud-platform

A four-course accelerated online specialization that teaches participants how to derive insights through data analysis and visualization using the Google Cloud Platform

data-analysis data-cleaning data-preparation data-visualization sql

Last synced: 12 May 2025

https://github.com/ankman007/cricket-statsguru

Streamlit-based Nepali cricket visualization dashboard that utilizes various python tools & libraries to display interactive and engaging charts & statistics.

data-analysis data-visualization machine-learning python streamlit

Last synced: 09 Jul 2025

https://github.com/rahul-jha98/restauranttrends.stats

Visualise the trends in food and restaurant choices of customers in a city by scraping data from Zomato.

data-analysis data-science visualization vuejs zomato zomato-api zomato-scraper

Last synced: 08 Jul 2025

https://github.com/palewire/baseball-notebooks

Python notebooks exploring Major League Baseball data

baseball baseball-statistics data-analysis jupyter-notebook pandas python

Last synced: 02 Mar 2025

https://github.com/haroldeustaquio/sql-coding-challenges

Repository dedicated to solving SQL problems from HackerRank, DataLemur and other challenges. Contains solutions to improve skills in database querying, optimization, and data manipulation.

challenge data-analysis database hackerrank-solutions mysql query sql sqlite t-sql-exercises

Last synced: 12 Jul 2025

https://github.com/pjagielski/worldcup

2018 World Cup match data analysis

clojure data-analysis repl world-cup-2018

Last synced: 30 Oct 2025

https://github.com/csiro/miriad

The CSIRO ATNF version of MIRIAD

data-analysis data-reduction radio-astronomy

Last synced: 16 Jan 2026

https://github.com/prajwalchapke055/accenture-data-analytics-and-visualization-forage

NAVIGATING NUMBERS - Apply your data analytics & visualization skills to advise a social media client on their content creation strategy as a Data Analyst at Accenture

accenture communication data-analysis data-modeling data-understanding data-visualization forage internship internship-task job-simulation presentation project-planning public-speaking storytelling strategy teamwork virtual-internship

Last synced: 28 Oct 2025

https://github.com/alandefreitas/scistats

High-Performance Descriptive Statistics and Hypothesis Tests in C++20

bayesian-statistics data-analysis descriptive-statistics hypothesis-testing performance-statistics statistics

Last synced: 11 Apr 2025

https://github.com/oghene-ella/airbnb_market

Meeting the high demand for temporary lodging for anywhere between a few nights to many months

data-analysis

Last synced: 30 Apr 2025

https://github.com/lpsm-dev/twitter-sentimental-analysis-covid

✔️ Twitter Sentiment Analysis Covid-19 (using TextBlob - Naive Bayes) + Python Backend Flask + Docker + Docker Compose + MongoDB

adminmongo alpine backend corona coronavirus-analysis covid covid-19 data-analysis docker docker-compose docker-compose-wait flask flask-restful mongodb mongoku python python-api sentiment-analysis twitter

Last synced: 30 Apr 2025

https://github.com/zvtvz/zvdata

An extensible library for recording and analyzing data

analyzing-data dash data-analysis pandas plotly-dash sqlalchemy

Last synced: 03 May 2025

https://github.com/vedadiyan/genql

GenQL is a generic querying language fully written in Go

data-analysis data-mapping data-processing data-science data-translation json json-data sql

Last synced: 22 Jun 2025

https://github.com/pythondeveloper6/matplotlib-for-beginners

How to visualize your data using Matplotlib

data-analysis matplotlib numpy pandas python visualization

Last synced: 11 Apr 2025

https://github.com/anaagg/data-installations-windows

Are you thinking about using the Linux subsystem on Windows? This is your repo. Throughout this repository you will find the steps needed to install what you need to work with Python, SQL, and Miniconda on Windows using the Linux subsystem.

conda conda-environment data-analysis hyper jupyter jupyter-notebook jupyter-notebooks linux miniconda pycharm pycharm-ide python sql vscode windows workbench

Last synced: 23 Oct 2025

https://github.com/yusufcinarci/web-scraping-projects

In these project files, I will host the web scraping examples that I will make day by day.

data-analysis data-science jupyter-notebook python web-scraping

Last synced: 01 May 2025

https://github.com/eikevons/pandas-paddles

Access the parent Pandas data frame in loc[], iloc[], assign(), and other Pandas helpers

data-analysis data-exploration data-science pandas pandas-dataframe pandas-library pandas-loc

Last synced: 16 Jun 2025

https://github.com/amrrs/iq18_workshop

Data Science in Python - Workshop Data and Notebook

data-analysis ipl python

Last synced: 22 Jul 2025

https://github.com/hevalhazalkurt/exploring_the_data_of_lego_history

A data exploration project on LEGO history in Python with pandas, matplotlib etc. (WIP)

data data-analysis data-science data-visualization datascience datasets lego lego-history matplotlib pandas python python3

Last synced: 13 Apr 2025

https://github.com/duhaime/douglasduhaime.com

Data analysis and visualization

data-analysis data-visualization website

Last synced: 12 Apr 2025

https://github.com/nafisalawalidris/911-call-analysis

The 911 Call Analysis project explores and visualises emergency call data to uncover patterns and trends. It includes data preparation, exploratory analysis, visualizing call volume and reasons, and generating heatmaps. Users can customize the code for their dataset. The project relies on libraries like Pandas, NumPy, Matplotlib, Seaborn, and SciPy.

cluster-analysis data-analysis data-visualization decision-making emergency-calls emergency-services exploratory-data-analysis heatmaps matplotlib numpy pandas patterns-and-trends resource-allocation scipy seaborn

Last synced: 26 Jun 2025

https://github.com/femtotrader/dukascopyticksreader.jl

A Julia library to download tick data from Dukascopy https://www.dukascopy.com/swiss/english/marketwatch/historical/

data data-analysis dataset dukascopy html julia stock-data

Last synced: 10 Apr 2025

https://github.com/mribeirodantas/vidente

R package to parse and preprocess the Surveillance, Epidemiology, and End Results (SEER) Program data from NIH/NCI

cancer-patients cancer-research data-analysis data-science data-structures r-package seer

Last synced: 14 Apr 2025

https://github.com/aim-harvard/faceage

Decoding biological age from face photographs using deep learning.

age-estimation biological-age cnn data-analysis deep-learning survival-analysis

Last synced: 13 Apr 2025

https://github.com/negativenagesh/whatsapp_chat_analyzer

An end-to-end WhatsApp chat analysis project, using my hostel WhatsApp group's chat data.

data-analysis data-science frontend modeling python streamlit

Last synced: 12 Apr 2025

https://github.com/sukanyabag/statistical-analysis-of-my-medium-articles

This repository contains an exploratory data analysis of my writer data at Medium. I use it to carry out data analysis once every 4 months to see audience and fan growth, and the topics they love!

data-analysis data-storytelling matplotlib pandas seaborn statistical-analysis sweetviz web-scraping

Last synced: 18 Jul 2025

https://github.com/ascender1729/iris-flower-classification-2024

An exploratory data analysis and machine learning project using the Iris dataset to classify flower species with a K-Nearest Neighbors classifier. It includes data visualization, feature scaling, model training, and evaluation with 100% accuracy on the test set.

classification data-analysis iris-dataset k-nearest-neighbors machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 20 Jul 2025