Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-06-30 00:07:38 UTC
https://github.com/devlucho/modelos-predictivos
Predictive models using Linear Regression, Logistic Regression, and Decision Tree algorithms.
data-analysis jupyter-notebook python3
Last synced: 03 May 2026
https://github.com/ababic/dumpling
Fast, flexible, powerful static data anonymisation for SQL dumps
anonymisation cli data-analysis data-science pii pii-redaction postgres privacy rust rust-lang scrubber scrubbing security tooling
Last synced: 03 May 2026
https://github.com/yashsingh43/lung-cancer-biomarker-analysis
Gene expression analysis to identify biomarkers for early lung cancer detection (SCLC & NSCLC)
bioinformatics biomarkers cancer cytoscape data-analysis gene-expression gsea nsclc r sclc
Last synced: 11 Jun 2026
https://github.com/syed-m-nofel/python-data-science-fundamentals
Python notebooks for data manipulation (Pandas/NumPy) and API workflows – from basics to practical examples.
api beginner-friendly data-analysis data-science http-requests jupyter-notebook numpy pandas pandas-dataframe python tutorial
Last synced: 03 May 2026
https://github.com/obinnaokoye89/fraud-detection-monitoring
ML model monitoring for fraud detection using NannyML
analytics automation data-analysis fraud-detection jupyter-notebook machine-learning monitoring nannyml pandas python sciki-learn
Last synced: 03 May 2026
https://github.com/prakhargpt/sql-data-warehouse-project
Building a data warehouse project using SQL Server, including ETL processes, data modelling and analytics.
analytics data data-analysis data-cleaning data-engineering data-engineering-pipeline data-lakehouse data-science data-warehouse etl etl-job etl-pipeline medallion-architecture sql sql-server
Last synced: 12 Jun 2026
https://github.com/matteospanio/speed-analysis
A project to analyze internet speed
Last synced: 03 May 2026
https://github.com/syarwinaaa09/analyzing-crime-in-los-angeles
Exploratory data analysis of Los Angeles crime data with insights on temporal patterns, locations, and age demographics.
crime-data data-analysis eda los-angeles pandas public-safety python visualization
Last synced: 03 May 2026
https://github.com/bpkaur/whats-in-a-name
Exploring a dataset of first names of babies born in the US in order to uncover interesting stories
data-analysis datacamp numpy pandas python3
Last synced: 04 May 2026
https://github.com/mindlessmuse666/titanic-data-visualization
A data visualization project on Titanic passengers using the Python libraries Matplotlib, Seaborn, and Plotly.
data-analysis data-visualization matplotlib pandas plotly python seaborn titanic
Last synced: 04 May 2026
https://github.com/douglasvolcato/focus-report-ibov-direction-prediction-model
Brazilian index direction prediction model for the first hour of the day based on Focus reports
data-analysis finance financial-analysis financial-data machine-learning machine-learning-algorithms machinelearning-python prediction-model predictive-analytics predictive-modeling python python-lambda python-script python3 web-scraping web-scraping-python webscraping
Last synced: 04 May 2026
https://github.com/arv-anshul/ipl-api
IPL API using the Flask framework and an IPL dataset.
api data-analysis fast-api flask flask-api ipl ipl-api python3
Last synced: 04 May 2026
https://github.com/aaaa-source/us-stock-market-analysis-and-prediction
US Stock Market Analysis and Prediction
artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks classification clustering data-analysis finance financial-analysis python
Last synced: 09 Jun 2026
https://github.com/madrury/commute-times
Simulated Commute Times Data
data-analysis data-science data-visualization dataset
Last synced: 12 Jun 2026
https://github.com/marionchaff/real-estate-price-prediction-france
Real estate price prediction using French public database DVF
data-analysis dvf-data machine-learning price-prediction python real-estate scikit-learn
Last synced: 04 May 2026
https://github.com/hemangsharma/streamingcontentanalyzer
This Streamlit application provides an interactive dashboard for analyzing streaming content data. It allows users to explore movie and TV show ratings, distributions, temporal trends, and genre breakdowns through various visualizations and filters.
dashboard data-analysis data-science data-visualization python streamlit-dashboard streamlit-webapp
Last synced: 02 Apr 2025
https://github.com/hyperplasma/olympic-visualization-analysis
Multidimensional analysis and visualization of Olympic medals, economy, and happiness index.
data-analysis data-visualization matplotlib numpy pandas python wordcloud
Last synced: 04 May 2026
https://github.com/mishaa931/amazon-sales-dashboard-power-bi
This project features a dynamic Power BI dashboard built on dummy Amazon sales data. It visualizes key business metrics such as revenue trends, top-selling categories, discount impact, and geographic performance. The dashboard is designed to help stakeholders make data-driven decisions through clear, interactive visuals.
data-analysis data-quality data-visualization microsoftpowerbi
Last synced: 05 Feb 2026
https://github.com/vatshayan/youtube-user-analysis
Analysis of YouTube users' choices and preferences
data data-analysis data-mining data-science data-visualization dataset machine-learning machine-learning-algorithms
Last synced: 05 Feb 2026
https://github.com/joaquinmoron/airbnb-eda-python
Airbnb EDA: cleaning, exploration, and visualization in Python (pandas, matplotlib, seaborn).
airbnb data-analysis eda matplotlib pandas python seaborn
Last synced: 13 Apr 2026
https://github.com/saksham-jain177/cryptodataanalysis
A Python-powered project that fetches live cryptocurrency data from the CoinMarketCap API, analyzes it, and updates a live Excel sheet every 5 minutes.
api-integration coinmarketcap cryptocurrency data-analysis excel live-data python
Last synced: 12 Jun 2026
https://github.com/mxagar/data_science_udacity
My personal notes, code and projects of the Udacity Data Science Nanodegree.
dashboard data-analysis data-engineering data-science machine-learning-pipelines
Last synced: 09 Apr 2025
https://github.com/kittonn/data-analysis-freecodecamp
freecodecamp - data analysis projects.
Last synced: 05 Apr 2025
https://github.com/luminati-io/Indeed-dataset-samples
A sample dataset of over 1000 Indeed job listings, extracted using the Bright Data API, ideal for market analysis and growth.
api data-analysis datasets indeed jobs web-scraping
Last synced: 09 Apr 2025
https://github.com/luminati-io/Target-dataset-samples
A sample dataset of over 1000 Target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.
api data-analysis data-mining datasets target web-scraper web-scraping
Last synced: 09 Apr 2025
https://github.com/shivamsharma32/customer-churn-analysis-power-bi-
This project is about analyzing and visualizing customer churn data using Power BI. Customer churn is the percentage of customers who stop doing business with a company over a given period of time. It is an important metric for businesses to understand why customers leave and how to retain them.
data-analysis dataanalytics datavisualization powerbi
Last synced: 15 Jan 2026
https://github.com/deliprofesor/virtual-reality-in-education-impact-analysis-and-insights
This project examines the impact of Virtual Reality (VR) on education, focusing on its effects on student engagement, learning outcomes, and creativity. It uses data analysis techniques like descriptive statistics, correlation analysis, and clustering to assess VR's effectiveness in enhancing learning.
clustering data data-analysis data-science data-visualization exploratory-data-analysis hypothesis-testing machine-learning python regression-analysis virtual-reality
Last synced: 14 Jun 2025
https://github.com/jatin-mehra119/flight-price-prediction
This study aims to analyze flight booking data from "Ease My Trip" website, using statistical tests and linear regression to extract insights. By understanding this data, valuable information can be gained to benefit passengers using the platform.
data-analysis datacleaning datavisualization machine-learning preprocessing-data python sklearn-pipeline sklearn-regression-algorithm streamlit-webapp
Last synced: 04 May 2026
https://github.com/shellynagar27/good-cabs-data-analysis-project
This project is part of CodeBasics Challenge #13, where the goal was to provide actionable insights to the Chief of Operations at Goodcabs, a cab service provider in tier-2 cities of India. The project focused on analyzing key metrics like trip volume, repeat passenger rate, and passenger satisfaction.
critical-thinking data-analysis data-visualization excel exploratory-data-analysis power-bi presentation problem-solving sql storytelling
Last synced: 25 Jan 2026
https://github.com/code-jl/dna-sequence-analyzer
A robust Python-based bioinformatics tool for comprehensive DNA sequence analysis and manipulation.
bio-tools bioinformatics biological-data computational-biology data-analysis dna-analysis dna-sequencing fasta gc-content gene-detection genetics genomics molecular-biology motif-finding nucleotide-analysis python python3 scientific-computing sequence-analysis sequence-manipulation
Last synced: 11 Mar 2025
https://github.com/shellynagar27/business-insights-360-project
A comprehensive dashboard that gives a better understanding of the business's market standing, key focus areas for optimization, underperforming customers, and year-wise financial insights, aiding inventory planning and performance tracking. It can also be used to answer any number of "why" questions as situations arise.
dashboard data-analysis data-visualization dax-languague dax-studio excel performance-optimization power-bi reporting sql storage-manager
Last synced: 27 Jan 2026
https://github.com/iqbalmind/learn-python-data-scientist
IqbalMind playground for Python data scientists
data data-analysis data-visualization datascience datascientist datascientisttraining python python-playground
Last synced: 16 Mar 2025
https://github.com/docuvesta/youtube-api-fragrance-channel-analytics
Engagement metrics analysis of a perfume YouTube channel using the YouTube API 🎀
analysis beauty-products comments data-analysis data-analysis-python engagement-metrics insights jupyter-notebook likes-count marketing marketing-analytics perfume python views-count youtube youtube-api youtube-api-v3
Last synced: 03 May 2026
https://github.com/youssefyaser/scrape-the-imdb-site-for-the-top-250-movies
Web scraping the top 250 movies on the IMDb site.
data-analysis numpy pandas python
Last synced: 04 May 2026
https://github.com/mr-chang95/udacity_movie_project
Movie Data Analysis and Visualization Project for Udacity's Data Analyst Program. Using Python in Jupyter Notebook.
data-analysis data-visualization jupyter-notebook movie python
Last synced: 13 Apr 2026
https://github.com/athari22/investigating-netflix-movies-and-guest-stars-in-the-office
Apply basic Python skills from Introduction to Python and Intermediate Python by processing and visualizing film and television data.
data-analysis data-science data-visualization loop loops matplotlib matplotlib-pyplot netflix numpy office pandas python
Last synced: 11 Apr 2026
https://github.com/pratanup/bank-customer-churn
Prediction models based on ML and DL, with a comparison of their performance at identifying churned customers
adaboost-classifier ann churn-prediction data-analysis data-visualization decision-tree-classifier deep-learning deep-learning-algorithms gaussian-naive-bayes-classification gradient-boosting-classifier k-nearest-neighbours logistic-regression machine-learning machine-learning-algorithms random-forest-classifier svc svm-classifier xgboost-classifier
Last synced: 10 Mar 2026
https://github.com/arturo2r/dashboard
Dashboard of New House Index Pricing
colombia dashboard data-analysis forecasting forecasting-models forecasting-time-series prices r
Last synced: 25 Mar 2025
https://github.com/paul0vinicius/ad2
Repository for the Data Analysis II course (Análise de Dados 2)
Last synced: 08 Jan 2026
https://github.com/chiragkumargohil/co2-emissions-data-analysis
A Python programme that analyses CO2 emission data from 1997 to 2010. The programme prints the data, provides a brief summary of a given year, displays and compares Year vs. Emission graphs for chosen countries, and generates a separate data file for chosen countries. It was a self-paced project that Guru 99 provided.
co2-emission data-analysis matplotlib python
Last synced: 28 Aug 2025
https://github.com/pawlo77/smarty
End-to-End Data Science tool
data-analysis data-processing pandas pipeline
Last synced: 08 May 2026
https://github.com/dbriane208/python-for-data-science
Machine Learning and Data Science repository. Love crafting Machine Learning models.
data-analysis data-science data-visualization machine-learning numpy pandas python seaborn
Last synced: 13 Apr 2026
https://github.com/rishitabansal9/adult-census-income-prediction
A project for data analysis and income prediction using a random forest classifier with 91% accuracy.
data data-analysis data-science feature-engineering random-forest-classifier
Last synced: 25 Mar 2025
https://github.com/diligencefrozen/dcinside-data
Analyzing the Dcinside Frozen Gallery Dataset. #디시
Last synced: 30 May 2026
https://github.com/surayasumona/test_bowlers_analysis
Data Analysis with Python
data-analysis data-manipulation data-preprocessing numpy pandas
Last synced: 04 May 2026
https://github.com/extwiii/datascience-jhu
Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera
biostatistics data-analysis data-science linear-regression multivariate-regression r r-programming toolbox visualization
Last synced: 05 Jul 2025
https://github.com/1401dev/customer-lifetime-value-prediction
A data science project leveraging Python and Scikit-Learn to build predictive models that estimate customer lifetime value (CLV). Includes data cleaning, feature engineering, and model selection to identify key drivers of CLV, supporting strategic decision-making in customer retention and marketing.
clv clv-analysis customer-retention data-analysis dataprocessing feature-engineering machine-learning marketing-analytics predictive-modeling python regression-analysis scikit-learn
Last synced: 06 May 2026
https://github.com/nature40/casestudies
Case studies for testing the functionality of database systems, sensors, etc
casestudies data-analysis data-visualization database
Last synced: 02 May 2026
https://github.com/amoghkori/working-with-apache-spark-mllib
Implemented Apache Spark MLLib to analyze a large car dataset, predict car selling prices, and gain insights into the car market.
amazon-web-services data-analysis data-visualization exploratory-data-analysis linear-regression machine-learning model-selection pyspark python random-forest sagemaker spark
Last synced: 13 Apr 2026
https://github.com/flytomarsz/bike-sharing-system-analysis
This analysis project aims to identify bike rental behavior in 2012 from the Capital Bikeshare system, Washington D.C., USA. This project is part of my Data Analysis study at Dicoding.
data-analysis data-visualization jupyter-notebook python streamlit
Last synced: 04 May 2026
https://github.com/isaacmaffeis/imad-2023
Model Identification and Data Analysis (IMAD) | University course
data data-analysis data-science model model-identification
Last synced: 09 May 2026
https://github.com/marielachirinosr/pandas-weather-project
Pandas Weather Data. Explore straightforward Python scripts for weather information analysis.
Last synced: 29 Apr 2026
https://github.com/abhisek-13/whatsapp-chat-analyzer
The WhatsApp Chat Analyzer is a data analysis project that provides insights into WhatsApp chats. It analyzes chat data to show metrics like the number of lines, most used letter, chatting duration, media files shared, most used emojis, and group member activity. The results are displayed on a user-friendly dashboard built with Streamlit.
data-analysis data-mining data-visualization eda machine-learning machine-learning-algorithms matplotlib numpy pandas python seaborn sklearn
Last synced: 13 Apr 2026
https://github.com/shrikantnaidu/sql-for-data-analysis
SQL for Data Analysis
data-analysis parch-and-posey postgresql
Last synced: 27 Feb 2025
https://github.com/lucalullo/monitoring-healthcare-waiting-times-puglia
Monitoring and analysis of public healthcare waiting times in Puglia (Italy), 2024 — based on official open data
data-analysis healthcare italy jupyter-notebook kaggle open-data pandas public-data puglia time-series waiting-times
Last synced: 08 Jan 2026
https://github.com/tiagocavalcante/nesfit
NES 2024 Practical and Research Work - Group 2
Last synced: 09 Jun 2026
https://github.com/balajimohan18/milk-production-time-series-forecasting-datascience-project
This project uses time series forecasting to predict future milk production. The data used in this project is monthly milk production data from January 1962 to December 1975. The ARIMA (autoregressive integrated moving average) model is used to forecast milk production. The model is evaluated using various metrics.
acf adf data-analysis data-cleaning data-science data-visualization eda exploratory-data-analysis machine-learning pacf seasonality time-series trends
Last synced: 30 May 2026
https://github.com/giorgossideris/athens_weather_analysis
Analyse the data of Athens' weather.
Last synced: 16 Mar 2025
https://github.com/analysisbyvivek/Crime-data
Analyzes crime patterns across different areas, exploring factors such as crime type, weapon usage, demographic influences, and geographic distribution to uncover trends in frequency, correlations, and hotspots.
apache-superset data-analysis eda jupyter-notebook python
Last synced: 29 Jan 2026
https://github.com/jasontan22/aefes-time-series-forecasting
This project uses deep learning methods (LSTM, BiLSTM, CNN+LSTM) to forecast future closing prices from market data for Anadolu Efes Biracılık ve Malt Sanayii A.Ş. (AEFES). The data preprocessing, model training, and evaluation steps are described in detail.
bilstm cnn-lstm data-analysis deep-learning financial-forecasting lstm machine-learning python stock-price-prediction tensorflow
Last synced: 09 Aug 2025
https://github.com/srinibas-masanta/electric-vehicle-analysis-dashboard
This repository features an interactive Tableau dashboard that visualizes electric vehicle (EV) adoption trends in the U.S. 🚗⚡ Explore EV growth, top manufacturers, regional distribution, and the impact of incentives—all in one dynamic view. 📊 Use filters to dive deeper into the data and uncover key insights! 🚀
dashboards data-analysis data-visualization tableau
Last synced: 15 Jan 2026
https://github.com/aimin-nur/visualisasi_bikestore
Data Analyst - Dashboard Bike Store
data-analysis sql visualization
Last synced: 29 Jan 2026
https://github.com/nurulashraf/polynomial-regression-manufacturing
A Python project implementing polynomial regression to analyse and predict manufacturing-related data. Features include data preprocessing, model training, and visualisation of results. Ideal for exploring machine learning applications in manufacturing process optimisation.
data-analysis data-visualization machine-learning manufacturing polynomial-regression predictive-modeling process-optimization python regression-models scikit-learn
Last synced: 16 Apr 2026
https://github.com/dhruvsrikanth/basic-data-science
A short Data Science Project I took up for fun! This is a data analysis based on a dataset I created to predict the distribution of wealth within an economy as well as several characteristics of each class within society!
analysis data-analysis data-pipeline data-science data-visualization machine-learning matplotlib pandas python seaborn sklearn
Last synced: 05 May 2026
https://github.com/shridhar1504/peerloankart-loan-fraud-detection-datascience-project
This project uses machine learning to predict whether a loan applicant will repay their loan. The project uses a dataset of historical loan data from PeerLoanKart, a peer-to-peer lending platform.
classification-model data-analysis data-analytics data-cleaning data-science data-visualization dimensional-analysis eda exploratory-data-analysis feature-engineering gradient-boosting-classifier hyperparameter-tuning jupyter-notebook machine-learning machine-learning-algorithms predictive-modeling python supervised-learning
Last synced: 30 Apr 2026
https://github.com/anas436/predictive-modelling-urban-growth-ai
Predictive Modelling for Urban Growth using AI
artificial-intelligence dashboard data-analysis data-collection data-preprocessing data-science data-visualization deep-learning deployment jupyterlab machine-learning python3 remote-sensing streamlit webapplication webscraping
Last synced: 05 Sep 2025
https://github.com/deypadma2020/sql_project
✏️ A collection of practical SQL case studies and solutions exploring real-world business scenarios: car showroom analysis, esports tournament, customer insights, finance analysis, pricing strategy, and marketing analytics.
business-intelligence case-study data-analysis database mysql queries sql
Last synced: 30 May 2026
https://github.com/abhijeet107/final-project
Final project summary: internship projects (2 weeks)
data-analysis data-cleaning-and-preprocessing excel mysql-database python tableau-public
Last synced: 23 Feb 2026
https://github.com/anniefib/otherprojects
Powering Data Dreams: From Orchestration to Analytics with Cloud Precision
airflow-etl-orchestration aws cloud-native-data-solutions data-analysis data-visualization database datamodelling datawarehousing eda end-to-end-data-pipelines machine-learning-models pgadmin4 spark-analytics sql
Last synced: 07 May 2026
https://github.com/avratanubiswas/fluorpenplugin
A MATLAB user interface for analysing OJIP curve datasets from the FluorPen instrument. It serves as an additional plug-in for "quick categorical analysis".
data-analysis fluorpen ojip-curve
Last synced: 18 Mar 2026
https://github.com/fbarffmann/nosql-challenge
Analyzed 28,000+ UK restaurant records using MongoDB and PyMongo. Queried hygiene scores, location data, and customer ratings.
data-analysis data-cleaning database-analysis json mongodb nosql pymongo python restaurant-data
Last synced: 13 Apr 2026
https://github.com/fbarffmann/python-challenge
Automated financial and election data analysis using Python. Cleaned and transformed large CSV datasets, calculated key business metrics, and generated automated reports for stakeholders.
automation csv data-analysis data-cleaning election-analysis financial-analysis python reporting
Last synced: 24 Apr 2025
https://github.com/anudeepkaddala/bankds
This repository contains a Python-based solution for cleaning, matching, and formatting bank data. The primary goal is to match banks from two datasets based on their names and associate each bank with its respective asset size. The final output is a cleaned dataset with asset sizes in Indian-style currency format.
data-analysis data-science fuzzy-matching pandas python
Last synced: 12 Apr 2026
https://github.com/shrinidhi857/simpledataanalysisonstartups
The Indian startup ecosystem has experienced remarkable growth over the past decade, becoming a hotbed of innovation and entrepreneurship. In this data analysis we segregate fields and look for new insights.
data-analysis data-science data-visualization indian-startups
Last synced: 17 Sep 2025
https://github.com/samwhaaa/da_portfolio
Showcasing some of my Data Analytics projects
data-analysis data-analytics data-visualization jupyter jupyter-notebook python
Last synced: 01 Mar 2025
https://github.com/wilfordaf/dataanalyst-test
Test task for Junior Data Analyst position
data-analysis pandas python trading-data
Last synced: 28 Feb 2025
https://github.com/smoeding/jmeterplugin-datasketches
A JMeter listener using DataSketches to estimate response time quantiles and histograms
data-analysis jmeter jmeter-listeners jmeter-plugin
Last synced: 06 Mar 2025
https://github.com/LipunKumarDalai/Youtube-Analysis
A simple data analysis project on YouTube data.
apache-superset beautifulsoup bootstrap5 data-analysis data-visualization django html jupyter-notebook postgresql-database python scraping selenium-webdriver sqlite-database youtube-api
Last synced: 30 Dec 2025
https://github.com/ymorsi7/caliwageanalysis
California employment and wage analysis on data from the past decade.
data-analysis data-science ipynb jupyter-notebook
Last synced: 21 Jan 2026
https://github.com/ryanbbrown/volleyball-analysis-project
Analyzes 10 years of self-collected men's NCAA volleyball player height and team wins data to determine the importance of height for success.
data-analysis data-visualization python volleyball
Last synced: 31 May 2026
https://github.com/firetyrant/sql-portfolio-projects
Documenting my SQL learning journey with hands-on projects focused on data cleaning, analysis, and optimization.
bigquery data-analysis databases etl learning portfolio query-optimization sql
Last synced: 19 Apr 2026
https://github.com/roshaka/samplr
Samplr is a Python decorator for selecting a subset of items from a list, with options for customisation and informative console printouts.
data data-analysis data-engineering decorators list python sampling
Last synced: 14 Jan 2026
https://github.com/luizassimoes/q5ga-latency-and-throughput
Quick 5G Analyser: PyQT5 software developed to help with simple graphical analysis and chart generating for ping and iperf3 tests.
data-analysis data-visualization pyqt5 python
Last synced: 13 Jun 2026
https://github.com/parthshah02/customer_churn_dashboard
This repository features a comprehensive project showcasing data analysis and an interactive dashboard using Python
data-analysis matplotlib numpy pandas python
Last synced: 13 Apr 2026
https://github.com/kwokhing/pencils-of-promise
Data for A Cause - Pencils of Promise
data-analysis data-cleaning data-visualization paired-t-test r r-markdown t-test
Last synced: 25 Mar 2025
https://github.com/nimomach/cafe-sales
This analysis focuses on evaluating the sales performance of a cafe by examining key metrics such as total revenue, sales by product category, peak sales times, and many more.
cafe data-analysis data-visualization sales
Last synced: 12 Mar 2026
https://github.com/aalekhpatel07/statcan
StatCAN dataset fetcher and cleaner.
census data-analysis data-science statcan
Last synced: 02 Apr 2025
https://github.com/deliprofesor/behavioral-insights-and-data-exploration
This project analyzes Spanish speech data, focusing on acoustic features and demographics. It includes data cleaning, outlier detection, clustering, and time series modeling (ARIMA, Holt-Winters) to uncover patterns in speech duration and word frequency.
acoustic-features arima clustering data-analysis holt-winters k-means machine-learning speech-analysis time-series-analysis
Last synced: 10 Apr 2025
https://github.com/beaprogrammer02345/python_data_analysis
Sales Analysis using Python
data-analysis data-visualization python
Last synced: 05 May 2026
https://github.com/zafir100100/cancer-stage-prediction
This code predicts cancer data using various regression models, calculates their average R-squared scores, and prints the best model.
cross-validation data-analysis data-preprocessing decision-trees gradient-boosting linear-regression machine-learning-algorithms numpy pandas random-forest regression scikit-learn
Last synced: 05 May 2026
https://github.com/luciocolonna/cyclistic-bikesharing-2023
Case study on public data from Chicago's Divvy bikeshare, using R
bikesharing capstone-project cyclistic cyclistic-bikshare data-analysis data-visualization geojson ggplot2 google-data-analytics google-data-analytics-capstone-project google-data-analytics-professional leaflet r sf tidyverse
Last synced: 02 Apr 2025
https://github.com/prajjwol09/data-cleaning-project
This project cleans and standardizes a dataset and handles null values from a CSV file named "layoffs" using MySQL, with MySQL Workbench as the workspace environment. The goal is to prepare the data for analysis.
cleaning-data columns data-analysis database duplicates mysql rows standard
Last synced: 20 Apr 2026
https://github.com/astropenguin/optimap
Optimized integrated intensity map method for spectral cubes
astronomy data-analysis data-science python python3 radio-astronomy spectral-cubes
Last synced: 09 Apr 2025
https://github.com/ilke-kas/multivariate-data-analysis
A curated collection of R-based data analysis projects applying regression modeling, clustering, dimensionality reduction, multivariate statistics, and classification. Each project showcases practical data science techniques, interpretability, and domain insights using real-world and academic datasets.
classification data-analysis data-visualization dimensionality-reduction machine-learning multivariate-analysis r regression statistics
Last synced: 05 Oct 2025
https://github.com/harkishen/Agriculture-DS
An agriculture-based M.Tech data science project that predicts the growth of crops based on previous years' records.
Last synced: 11 Dec 2025