Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-07-02 00:07:33 UTC
https://github.com/wittline/data-analytics-with-r
Repository for data analytics course using R
cassandra-database cql data-analysis genetic-algorithm pentaho-data-integration r
Last synced: 07 Jul 2025
https://github.com/fabienarcellier/qjoin
qjoin is a data manipulation library that provides simple and efficient joining and collection processing functionality
composable data-analysis developer-tools functools python
Last synced: 01 Mar 2026
https://github.com/yahia3200/become-an-independent-data-scientist
My final project for the Applied Plotting, Charting & Data Representation in Python Course
data-analysis data-science data-visualization matplotlib
Last synced: 16 Mar 2025
https://github.com/arzan101/ola-data-analytics
Ola - Identified the reasons and trends for ride cancellation. Process - Cleaned and processed data from multiple sources, applied SQL queries, and visualized the data using Power BI. Motive - To reduce the cancellation rate.
dashboard data-analysis data-mining data-visualization dataanalytics excel powerbi sql
Last synced: 06 Jan 2026
https://github.com/llnl/hdtopology
High-dimensional topological data analysis library for NDDAV
analysis cpp data-analysis data-viz high-dimensional-data topological-data-analysis visualization
Last synced: 29 Apr 2025
https://github.com/its-kanii/predictive-maintenance-for-healthcare-equipment
Predictive Maintenance for Healthcare Equipment utilizes machine learning to analyze operational metrics and predict equipment failures. This project leverages a dataset of usage hours, temperature, and maintenance history to enhance equipment reliability and reduce downtime.
data-analysis data-science failure-prediction feature-engineering healthcare-equipment jupyter-notebook machine-learning predictive-maintenance python time-series-analysis
Last synced: 09 May 2026
https://github.com/henrylin03/video-games
Using Python and SQL to clean, analyse and visualise video games' data from Metacritic. Includes scraping using BeautifulSoup.
analysis beautifulsoup beautifulsoup4 data data-analysis data-science eda jupyter-notebook pandas python sql sqlite3 video-game video-games
Last synced: 14 Apr 2026
https://github.com/naso7y/students-performance-analysis
A project analyzing students' academic performance to identify trends and factors affecting outcomes. Built with Python, using data visualization and statistical techniques to derive actionable insights.
data-analysis data-visualization machine-learning python
Last synced: 23 Feb 2026
https://github.com/ekosaputro09/Data-Science-References
Some useful resources to learn about Data Science
cheatsheet data-analysis data-science data-visualization machine-learning statistical-learning
Last synced: 22 Nov 2025
https://github.com/ivanildobarauna-dev/data-pipeline-sync-ingest
The "ETL Process for Currency Quotes Data" project is a complete solution dedicated to extracting, transforming, and loading (ETL) currency quote data. It uses several advanced techniques and architectures to ensure the efficiency and robustness of the ETL process.
business-intelligence data-analysis data-analytics data-engineering data-pipeline data-visualization etl-pipeline python
Last synced: 28 Oct 2025
https://github.com/jpquast/icp-ms-data-explorer
A shiny app for the exploration of ICP-MS data.
data-analysis icp-ms r shiny shiny-apps
Last synced: 17 Jan 2026
https://github.com/ronylpatil/whatsapplib
WhatsApp Group Chat Analysis Python Package.
data-analysis open-source pypi-package python-library python-package
Last synced: 02 Jan 2026
https://github.com/emso-c/stream-analyser
A tool that analyses YouTube live streams.
cli data-analysis guessing highlights python youtube-video
Last synced: 18 Jan 2026
https://github.com/markmelnic/scalg
List scoring algorithm. Analyse data using a range-based percentage proximity algorithm.
algorithm data-analysis pypi pypi-package score scorer scoring scoring-algorithm
Last synced: 08 Oct 2025
https://github.com/markmelnic/carsen-desktop
A python dashboard app for scraping and tracking cars for sale on websites such as mobile.de
automation dashboard data-analysis interface scraping scraping-websites tkinter-python
Last synced: 08 Oct 2025
https://github.com/fenghaojiang/ethereum-etl
ETL (Extract, Transform, Load) for data from Ethereum-like EVM blockchains
Last synced: 14 Jan 2026
https://github.com/fernandezfran/exma
A Python library with C extensions to analyze and manipulate molecular dynamics trajectories and electrochemical data
computational-physics data-analysis molecular-dynamics oop python science
Last synced: 16 Jan 2026
https://github.com/ahmednasef3/titanic-full-eda
Simple EDA for Titanic Dataset.
data-analysis data-visualization eda exploratory-data-analysis matplotlib pandas seaborn titanic titanic-data-analytics
Last synced: 27 Jan 2026
https://github.com/elhaban3ro/thewildtool
TheWildTool is a tool developed with the main objective of saving time when working with audio datasets, whether to prepare them, obtain them, or train a model with them. 🤖
ai audio audio-processing data-analysis data-science dataset deeplearning python
Last synced: 03 Sep 2025
https://github.com/visionkernel/centerspoke
Centerspoke is a data management and analysis tool that allows easy access to cloud databases. Say goodbye to using Excel for data management. This open-source CLI tool allows for the rapid processing and analysis of all your data, and makes it easy to upload your Excel files into your cloud databases.
cli cloud-database data-aggregation data-analysis data-analysis-python data-management data-science python python3
Last synced: 24 May 2026
https://github.com/quantumudit/analyzing-goodreads-famous-quotes
This project focuses on scraping famous quotes and their related data from the GoodReads website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 20 May 2026
https://github.com/lacerbi/vbmc
Variational Bayesian Monte Carlo (VBMC) algorithm for posterior and model inference in MATLAB (old location)
bayesian-inference data-analysis gaussian-processes machine-learning matlab variational-inference
Last synced: 11 Oct 2025
https://github.com/jimbrig/eda
Exploratory Data Analysis R Package and Shiny App
data-analysis data-visualization eda r shiny
Last synced: 03 Jan 2026
https://github.com/louislefevre/sstubs-miner
Data mining and analysis for the ManySStuBs4J dataset.
data-analysis data-mining manysstubs4j-dataset msr
Last synced: 30 Mar 2025
https://github.com/kenvilar/data-analysis-using-python
Transforming location descriptions from analyzed CSV data using Pandas with Python 3
bs4 data-analysis jupyter pandas python python3 requests xlrd
Last synced: 04 Oct 2025
https://github.com/briatte/asr
Applied Stats with R and RStudio (first-year social-science tutorials)
course data-analysis data-science data-visualization r statistics
Last synced: 14 Apr 2026
https://github.com/jimut123/ultimate_date_finder
To find the best place for dating in your country
data-analysis data-science date-cluster dating geo location maps software
Last synced: 16 Jan 2026
https://github.com/erictleung/erictleung.github.io
:memo: Source code for my website, portfolio of projects, and more
bioinformatics blog data data-analysis data-science github-jekyll github-page jekyll lanyon open-science open-source software-engineering
Last synced: 21 Jan 2026
https://github.com/codebypinar/reta
🍃 Explore the world of renewable energy production, analyze historical data, and predict sustainable energy trends. Join us on the journey to a greener future!
arima clean-energy data-analysis data-science data-visualization energy-future forecasting-models innovation renewable-energy sustainability time-series
Last synced: 12 Oct 2025
https://github.com/1sumer/sql
This repository contains SQL scripts and data for various analytical and database management tasks. The project is designed to demonstrate SQL capabilities in handling complex queries, data analysis, and database design. It includes datasets related to e-commerce and streaming services, with a focus on real-world scenarios and use cases.
analytics data data-analysis data-storage sql vscode
Last synced: 19 Jan 2026
https://github.com/hvignolo87/ortex-programming-challenge
Coding challenges required for the Python Developer and Data Engineer job positions.
challenge data-analysis finance pandas python scripting sql sqlalchemy
Last synced: 17 May 2026
https://github.com/winter000boy/dsa-practice
This repository holds my solutions for LeetCode’s Pandas playlists. Each section includes code and notes on using Pandas to handle real-world data tasks efficiently. Perfect for anyone looking to deepen their understanding of data manipulation with Pandas.
data-analysis data-science leetcode leetcode-python pandas-python python3
Last synced: 06 Feb 2026
https://github.com/afondiel/ibm-data-science-professional-certificate-coursera
IBM Data Science Professional Certificate Coursera Notes
ai classification clustering coursera data-analysis data-engineering data-mining data-science data-science-challenges data-science-projects data-scientist data-visualization ibm ibm-certificate ibm-professional-certificate linear-algebra machine-learning python regression statistics
Last synced: 13 Oct 2025
https://github.com/walidalsafadi/titanic-disaster
In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e., name, age, gender, socio-economic class, etc.).
data-analysis data-science decision-trees eda gradient-boosting knearest-neighbors machine-learning-algorithms naive-bayes random-forest titanic-kaggle titanic-survival-prediction
Last synced: 16 Mar 2025
https://github.com/adirthaborgohain/community-data-analysis
Data and Visual Analysis on several different communities generated using Louvain Algorithm in Neo4j on the dblp dataset.
Last synced: 10 May 2026
https://github.com/kyle-wannacott/DataCamp-Projects
DataCamp project solutions.
data-analysis data-mining data-science datacamp-projects machine-learning python r
Last synced: 13 Oct 2025
https://github.com/jupyterphysscilab/documentation
Documentation for the Jupyter Physical Science Lab Suite of Packages
analog-to-digital-converter data-acquisition data-analysis education jupyter-notebooks pandas physical-sciences plotting python raspberry-pi
Last synced: 22 Jan 2026
https://github.com/pizofreude/data-career-navigator
An interactive dashboard providing deep insights into career opportunities for data-related roles, utilizing a comprehensive dataset sourced from LinkedIn. Features include analysis of experience levels, salaries, key skills, job locations, and industry trends, aiding job seekers and professionals in exploring and identifying optimal career paths.
codeinplace data-analysis data-visualization standford-university
Last synced: 13 Mar 2026
https://github.com/hayesall/babybear
🐼 It's like pandas, but tiny.
data-analysis data-analysis-python data-science dataframe python teaching teaching-tool
Last synced: 31 May 2026
https://github.com/quantumudit/alteryx-weekly-challenges
This repository contains Alteryx solutions to the weekly challenges published in Alteryx Community
alteryx alteryx-workflow data-analysis data-science data-transformation data-visualization etl
Last synced: 27 Jan 2026
https://github.com/gher-uliege/liege-colloquium-on-ocean-dynamics
Python tools and latex files for the Colloquium
data-analysis data-assimilation numerical-simulations ocean-modelling oceanography remote-sensing submesoscale turbulence
Last synced: 14 Oct 2025
https://github.com/parisaroozgarian/ibm-data-analyst-professional-certificate
The IBM Data Analyst Professional Certificate, consisting of 9 courses, equips learners with essential skills in Excel, SQL, Python, data visualization, and analysis techniques.
big-data business-analysis business-communication communication data-analysis data-management data-structures data-visualization databases general-statistics human-resources planning python-programming spreedsheet sql
Last synced: 27 Jan 2026
https://github.com/haloapping/malas-ngetik-clf
I'm too lazy to type, so I just made a Kaggle competition project template 😜. This template is specifically for classification problems.
data-analysis exploratory-data-analysis feature-engineering kaggle kaggle-competition machine-learning python3 scikit-learn
Last synced: 12 Apr 2026
https://github.com/yeisonmontoya1815/machine-learning_prediction_can_inflation
We aim to predict trends in the Canadian market basket using sentiment analysis techniques. Sentiment analysis involves analyzing text data to determine the sentiment expressed, whether positive, negative, or neutral.
algorithms-and-data-structures data data-analysis data-science data-visualization feature-engineering machine-learning matplotlib-pyplot numerical-analysis numpy pandas pipelines python sklearn structured-data super unsupervised-learning
Last synced: 05 Feb 2026
https://github.com/victor-lis/regression-ai-model-practice
ai data-analysis python regression-model
Last synced: 01 Apr 2025
https://github.com/martincastroalvarez/django-data-analytics
Data Analytics, PnL, LTV & retention analysis with Django
analytics beautifulsoup4 d3 d3js data-analysis django ltv rest-api visualization
Last synced: 06 May 2026
https://github.com/dcs-training/bayesian-statistics
Materials for the CDCS Introduction to Bayesian Statistics course. Go to the readme file
bayesian-statistics data-analysis r statistics
Last synced: 05 Feb 2026
https://github.com/gjbex/python-dashboards
Repository that contains material for training sessions on creating dashboards using Python.
dash dashboard data-analysis data-exploration data-science data-visualization panel python streamlit training training-materials visualization
Last synced: 13 Jul 2025
https://github.com/maciekmalachowski/crypto-charts-site
📊 Application that returns financial data for a selected cryptocurrency.
binance-api data-analysis jupyter-notebook matplotlib mplfinance numpy pandas python python-binance
Last synced: 12 Apr 2026
https://github.com/viper373/baidutieba
Scrapes Baidu Tieba (configurable forum name, start/end page numbers, and log output)
baidutieba-crawler bert data-analysis deep-learning python spider
Last synced: 30 Mar 2025
https://github.com/rvalla/chessevolution
Some code to analyze my chess games using the Lichess API.
chess data-analysis lichess lichess-api python
Last synced: 23 Oct 2025
https://github.com/trybnetic/tu7-acceleration-sleep-wake-classification
Supporting material for the paper "Discrimination of sleep and wake periods from a hip-worn raw acceleration sensor using recurrent neural networks"
accelerometer accelerometry actigraphy data-analysis sensors sleep
Last synced: 01 Jun 2026
https://github.com/kevinschoon/qviz
QViz Interactive Plotting
data-analysis data-visualization go gonum qframe yaegi
Last synced: 01 Jun 2026
https://github.com/dcs-training/r-qgisintegratingspatialanalysis
This was an intermediate course of three sessions with a focus on developing skills in data visualisation, analysis and integration using both R studio and QGIS. Go to the readme file
data-analysis data-visualisation data-wrangling gis qgis r spatial-analysis
Last synced: 28 Jan 2026
https://github.com/elfgk/ogretmenanalizantalya
OgretmenAnalizAntalya
analysis data-analysis data-science data-visualization ogretmenanaliz
Last synced: 08 Feb 2026
https://github.com/thennen/py-ivtools
A package for reproducible measurement and analysis of current-voltage characteristics of electronic devices.
current-voltage data-analysis data-visualization electrical-engineering emerging-technology instrumentation measurements
Last synced: 23 Jan 2026
https://github.com/roland045/road_quality_measurement_analysis
Novel ML-based road quality measurement system for cost-effective pavement monitoring
azure data-analysis data-engineering data-science machine-learning mlops model-deployment python sql unsupervised-learning
Last synced: 24 Jan 2026
https://github.com/alieymsxxn/sql_project_data_job_analysis
This project explores top-paying jobs, in-demand skills, and where high demand meets high salary in data analytics.
data-analysis postgresql sql sqlite
Last synced: 16 Apr 2025
https://github.com/mrjxtr/tokyo_airbnb_analysis_project
Full project case study and analysis to show potential opportunities to start an Airbnb business in Tokyo, Japan.
data-analysis data-cleaning data-science data-visualization pandas python3
Last synced: 24 Feb 2026
https://github.com/iamfoysal/data-analysis
This repository contains various examples and exercises to help learn data science using Python.
data-analysis data-science database jupyter-notebook python3
Last synced: 10 Feb 2026
https://github.com/anil951/early-detection-of-mental-health
This project develops a predictive model to identify early signs of mental health issues in adolescents using social media activity, school performance, health records, and an AI chatbot. It analyzes emotional tone, academic changes, and health data, offering personalized recommendations and resources for mental wellness.
data-analysis deep-learning early-detection lstm mental-health sentiment-analysis social-media
Last synced: 28 Jan 2026
https://github.com/narius2030/hive-datawarehouse-analysis
Implements a Hive data warehouse to store meaningful data and applies machine learning techniques such as clustering or regression to business problems
apache-hadoop apache-hive data-analysis etl-pipeline hiveql machine-learning statistics
Last synced: 01 Apr 2025
https://github.com/jpcadena/onemetric-plus
OneMetric+ is an analytical tool for demand forecasting and outlier detection
black-formatter data-analysis data-analytics data-science data-visualization demand-forecasting isort machine-learning matplotlib mypy numpy outlier-detection pandas pre-commit-hook pydantic python ruff scikit-learn seaborn solid-principles
Last synced: 10 Apr 2026
https://github.com/aditiiprasad/whatsstat
A fun and insightful WhatsApp chat analyzer that turns your conversations into beautiful stats, juicy graphs, and quirky insights.
chat-analyzer data-analysis data-visualization nlp streamlit text-processing whatsapp
Last synced: 02 Sep 2025
https://github.com/adriens/endoflife-date-snapshots
Daily consolidated and enriched snapshots of endoflife.date
apache-parquet csv csv-export data-analysis data-science database datavisualization dataviz duckdb duckdb-database end-of-life endoflife eol jupyter-notebook kaggle kaggle-notebook olap python release-policy release-schedule
Last synced: 11 Apr 2026
https://github.com/vishrut-b/end-to-end-data-analytics-with-python-and-sql
This project involves the data cleaning and SQL-based analytics of a retail orders dataset using Python and SQL. It focuses on preprocessing data, followed by detailed analytics to extract insights on sales trends and product performance.
data-analysis python retail sql sql-server sqlalchemy
Last synced: 07 Feb 2026
https://github.com/walidbosso/r_data_mining
Extract knowledge from data using different techniques, including Association Rules, Hierarchical Agglomerative Clustering (HAC), K-means Clustering, and Decision Trees
association-rule-mining association-rules clustering data-analysis data-mining data-science data-visualization decision-tree-classifier decision-trees exportation extract-data hac hierarchical-clustering k-means k-means-clustering k-means-r r-programming r-studio
Last synced: 23 Mar 2025
https://github.com/cosmoduende/r-arduino
Interoperability Data-IoT: How to send and receive data and take control of your Arduino, from R. How to establish interoperability between R and Arduino (Data and IoT) using a data flow between the two
arduino arduino-data arduino-dataflow arduino-serial arduino-serial-data arduino-serial-led arduino-uno data-analysis data-arduino data-cleaning data-iot data-visualization interoperability iot-rstudio r-analytics r-data-visualization r-iot rstudio-arduino serial-read serialport
Last synced: 24 Apr 2026
https://github.com/yusufcinarci/covid-19-data-analysis-visualization
The first project of our data visualization studies is the COVID-19 data analysis project. In this project, we analyzed data on the COVID-19 pandemic, which started in the first month of 2020 and still continues to affect the world, on a country-by-country basis. You can find brief details of the project, which we carried out in 3 stages, in the readme file. We have tried to explain the details of the project step by step below. We wish you healthy days.
covid-19-data-visualization data-analysis data-science data-visualization
Last synced: 22 Jul 2025
https://github.com/aad99bxp/whatsapp-chat-analyzer
A project intended for business owners and managers to analyze WhatsApp chats between their customer care executives and their customers.
data-analysis heroku-deployment python3
Last synced: 15 Mar 2025
https://github.com/martial2023/bank-performance-analysis
Analysis of banking data from the Berka Dataset (1993-1998) to compute and visualize key KPIs
dashboard data-analysis data-visualization nextjs pandas plotly-express pymongo python recharts-js sqlalchemy
Last synced: 26 Aug 2025
https://github.com/realkarthiknair/data-science-notes
Data science notes and programs
data-analysis data-science data-visualization
Last synced: 27 Aug 2025
https://github.com/shlokashah/student-depression-and-suicide-rate-prediction
https://shlokashah.github.io/Student-Depression-And-Suicide-Rate-Prediction/
data-analysis data-visualization machine-learning student suicide-rate-prediction
Last synced: 19 Nov 2025
https://github.com/akshat0427/spotify_history
Code to extract insights from Spotify streaming data (work in progress)
data-analysis data-visualization
Last synced: 04 Feb 2026
https://github.com/draym/covid19tracker
Coronavirus COVID-19 dashboard to track global cases
covid-19 covid19-tracker dashboard data-analysis
Last synced: 07 Jan 2026
https://github.com/cbg-ethz/scdna-pipe
Python data analysis pipeline for single cell copy number event history reconstruction
bioinformatics bioinformatics-pipeline data-analysis genomics python snakemake snakemake-workflows workflow
Last synced: 05 Jan 2026
https://github.com/gxelab/tutorials
Tutorials of frequently used software packages and libraries in the lab
bioinformatics data-analysis evolution genetics genomics julia python3 r-language statistics visualization
Last synced: 18 Jan 2026
https://github.com/arjo129/image-sorter
Sort through folders of videos and images. Root out blurred and overexposed images.
computational-photography data-analysis photo-browser photo-gallery photography uwp uwp-apps
Last synced: 25 Jul 2025
https://github.com/depressioncenter/data-and-design-core
Code developed by the EFDC Data and Design Core team to support mental health research.
data-analysis data-science efdc inference r statistical-analysis umich
Last synced: 19 May 2026
https://github.com/chrdek/linqdatacalc
📈 🎲 LINQ-based set of extensions for data statistics.
calculations calculator data-analysis data-analytics data-science data-statictics extension-methods extensions linq linq-extensions set-theory statistical-analysis statistics
Last synced: 27 Jun 2025
https://github.com/patex1987/temperature-calibration
Notebook for sensor calibration evaluation
calibration data-analysis jupyter-notebook sensor
Last synced: 20 Jun 2025
https://github.com/stimulsoft/samples-dashboards.js-for-html
JavaScript samples for Dashboards.JS data visualization tool for HTML and native JavaScript applications
analytics automation components dashboard-application dashboard-designer dashboard-viewer data-analysis embedded html5 indicators javascript js json-database native-javascript onepage panels pivot-tables simple-dashboard transformation website
Last synced: 20 Oct 2025
https://github.com/bala-ceg/digital-payment-index
This project aims to develop an index for the digital transactions of India
collaborate data-analysis fintech hacktoberfest machine-learning statistics
Last synced: 20 Jun 2025
https://github.com/ocramz/record-encode
Generic encoding of record types
categorical-data categorical-features data-analysis data-mining data-science generic-programming machine-learning one-hot-encode preprocessing
Last synced: 14 Apr 2025
https://github.com/elysian01/ml-eda-and-modelling-using-streamlit
Beautiful Web interface made using Streamlit for quick Exploratory Data Analysis and building classification models which are implemented from scratch.
data-analysis data-visualization eda exploratory-data-analysis knn-classification logistic-regression matplotlib ml-model-on-web ml-models naive-bayes-classifier pandas seaborn streamlit streamlit-webapp
Last synced: 12 Apr 2025
https://github.com/anonympins/data-primals-engine
Manage and automate your data at scale 🚀 With data-primals-engine you get workflows, dashboards, alerts, i18n, client integration & AI assistant — all open-source, all MongoDB powered.
api automation data data-analysis data-engineer data-visualization database expressjs low-code mongodb nodejs rest-api
Last synced: 07 Mar 2026
https://github.com/techytushar/india-odi-analysis
Analysis of the Indian team's ODI cricket matches
cricket data-analysis data-science pandas plotting python3
Last synced: 05 May 2026
https://github.com/rawsashimi1604/jobextract
Scrapes LinkedIn data. Conducts sentiment analysis on what traits and qualifications employers are looking for.
data data-analysis data-analytics data-cleaning linkedin mvc python webscraper
Last synced: 06 Nov 2025
https://github.com/mljar/mercury-notebook-apps
Amazing apps built from Python notebooks with Mercury
data-analysis data-science data-visualization jupyter jupyter-notebook jupyterlab mljar python
Last synced: 21 May 2026
https://github.com/gallillio/data_science-data_visualizer_tool
Supervised ML Helper is a Python application that streamlines exploratory data analysis (EDA) and preprocessing for supervised machine learning. Featuring a user-friendly Tkinter interface, it enables users to load CSV files, visualize data, and perform essential transformations, making data preparation accessible for all skill levels.
data-analysis data-science data-visualization matplotlib numpy pandas seaborn sklearn
Last synced: 17 Feb 2026
https://github.com/emptymalei/mini-lab
Some code snippets used to explain stuff to myself in my personal data science wiki
data-analysis data-mining data-science data-visualization datascience
Last synced: 07 Apr 2025
https://github.com/nafisalawalidris/analyzing-nobel-prize-dataset-demographics-and-trends
This project analyses a Nobel Prize dataset using Python and data analysis libraries. It explores the distribution of winners by category and country, examines the proportion of female winners over time, investigates the age of winners when they received the prize and identifies the oldest and youngest recipients.
age-at-award country-distribution data-analysis data-manipulation dataset demographics filtering gender-balance grouping nobel-prize notable-laureates python trends visualisation winners
Last synced: 19 May 2026
https://github.com/nabilalibou/uber_fare_prediction_explained
This repository documents a complete ML workflow to model Uber fares in Paris, from granular EDA and feature engineering to building and fine-tuning a stacking regressor on 10k real-world rides.
data-analysis data-science eda feature-engineering machine-learning predictive-analytics pricing-model python regression-model stacking-ensemble uber
Last synced: 30 Jun 2026
https://github.com/zachlagden/spotify-listening-analyzer
A comprehensive Python tool for analyzing your Spotify listening history data.
analytics data-analysis pandas python spotify-web-api spotipy
Last synced: 31 Jul 2025
https://github.com/zrkhadija/data-analysis-for-financial-time-series
In this notebook, we performed data analysis on financial time series data from Yahoo Finance for the US market. We examined seasonality, trends, stationarity, and other aspects such as outliers and correlations.
autocorrelation correlation-analysis data-analysis financial-analysis time-series-analysis timeseries-forecasting visualization
Last synced: 09 Feb 2026
https://github.com/simoneas02/data-science
🐍 A study plan to become a data scientist and to improve my current skills. 🤘🏼🌻
data data-analysis data-science data-visualization deep-learning machine-learning pandas python3 r sql
Last synced: 12 Apr 2026
https://github.com/avrtt/paysage
Pandas add-on library: find data quality issues and clean or improve dataframes in one line using a scikit-learn transformer
data-analysis data-cleaning data-compression data-profiling data-quality data-quality-checks data-reporting pandas pandas-dataframe schema-validation scikit-learn scikit-learn-transformer
Last synced: 14 May 2026
https://github.com/inphyt/inphyt.github.io
Special repository hosting the InPhyT website.
computational-epidemiology computational-modelling computational-neuroscience computational-social-science computational-socialscience computer-science data-analysis data-mining machine-learning mathematical-modelling mathematics modeling network-analysis physics scientific-computing scientific-machine-learning statistical-modeling statistical-physics
Last synced: 02 Feb 2026
https://github.com/beckversync/probability-and-statis_computer-parts-cpus-and-gpus-ics_
Probability and statistical analysis techniques are employed to explore data related to computer components, such as CPUs, GPUs, and Integrated Circuits (ICs). The objective is to uncover trends, identify patterns, and extract meaningful insights from real-world hardware data.
Last synced: 18 Feb 2026