Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-01-31 00:07:48 UTC
https://github.com/kiranmayi5/r-projects
This repository showcases R projects designed to tackle real-world problems through data-driven solutions.
data-analysis exploratory-data-analysis predictive-modeling r statistical-analysis
Last synced: 25 Jun 2025
https://github.com/hyperspy/exspy-demos
eXSpy Jupyter Notebook demos
data-analysis data-visualization eds edx eels electron-energy-loss-spectroscopy hyperspy life-sciences materials-science multi-dimensional physical-sciences spectroscopy tutorial x-ray-spectroscopy xrf
Last synced: 13 May 2025
https://github.com/ifibla/adsdb-project
Algorithms, Data Structures and Databases Project
data-analysis data-engineering python
Last synced: 30 Dec 2025
https://github.com/grypesc/graduateadmissions
Visualization, analysis and predictive modeling of a Kaggle graduate admissions dataset.
data-analysis data-mining data-science data-visualization dataset
Last synced: 08 Jul 2025
https://github.com/dina-hosny/explore-us-bike-share-data-project
Explore US Bike Share Data project - FWD Data Analysis Professional Track. In this project, I used Python to explore data related to bike share systems for three major cities in the United States and answer questions about it by computing descriptive statistics.
data-analysis data-science numpy pandas python
Last synced: 19 Jul 2025
https://github.com/morphclue/godot-trend
R-Code and data for game engines on itch.io
data-analysis game-engines trends
Last synced: 05 Apr 2025
https://github.com/shriram-vibhute/digit_classification
This project demonstrates various machine learning techniques for classifying handwritten digits from the MNIST dataset. It covers data preprocessing, model training, evaluation, and advanced classification strategies.
classification data-analysis data-visualization machine-learning matplotlib numpy pandas sk-learn
Last synced: 28 Oct 2025
https://github.com/gad-dimnt-cptec/scanplot
A simple plotting system for SCANTEC
data-analysis jupyter-notebook pandas python scantec
Last synced: 17 Jan 2026
https://github.com/sedatdikbas/traditional-machine-learning
My work with traditional machine learning methods
classification confusion-matrix data-analysis data-visualization decision-tree machine-learning naive-bayes python random-forest svm traditional-machine-learning
Last synced: 11 Apr 2025
https://github.com/timmymatten/spikeball-stat-tracker
Spikeball stat tracking web app built with Streamlit and Python, designed to easily log and analyze player performance over multiple games.
data data-analysis data-visualization dataset matplotlib-pyplot multipage python spikeball statistics streamlit
Last synced: 22 Jul 2025
https://github.com/busraozdemir0/data_analysis_apps
Data analysis and data visualization applications with python
data-analysis data-analysis-python data-visualization matplotlib matplotlib-figures matplotlib-pyplot numpy numpy-python pandas pandas-dataframe pandas-library
Last synced: 28 Mar 2025
https://github.com/sunnybibyan/random_data_generation
A project that generates a dataset using various statistical distributions (Normal, Uniform, Exponential, Random Integers, and Binomial) and performs data analysis. Includes visualizations and an option to export the data as a CSV file.
data-analysis data-visualization python random-data-generation statistics streamlit-webapp
Last synced: 23 Feb 2025
https://github.com/swarchal/morar
Processing phenotypic screening data
biology data data-analysis drug-discovery hts phenotypic
Last synced: 19 Jun 2025
https://github.com/antononcube/wl-datareshapers-paclet
Wolfram Language (aka Mathematica) paclet for data reshaping functions, such as long-form and wide-form conversion and cross tabulation.
contingency-table cross-tabulation data-analysis data-transformation long-form wide-form
Last synced: 16 Jan 2026
https://github.com/as16082023/nashville-housing-data-cleaning-project
This project involved using MySQL to clean and optimize a Nashville housing dataset, addressing key data quality issues to ensure it was ready for accurate analysis.
data-analysis data-cleaning mysql nashville-housing-data
Last synced: 10 Apr 2025
https://github.com/thevinh-ha-1710/diabetes-predictive-model
This project aims to train a predictive model to diagnose diabetes in female patients.
data-analysis data-science data-visualization model-training-and-evaluation python
Last synced: 17 Jun 2025
https://github.com/alcestide/scianalytics
Data Analysis and Visualization for Research and Scientific Purposes with Pandas and Plotly.
csv data-analysis data-science data-visualization pandas plotly python science-research statistics
Last synced: 29 Oct 2025
https://github.com/mijisu0103/ukhsa-dashboard-project
Simple dashboard that downloads and displays the data about infectious diseases (Influenza, Rhinovirus and COVID-19) from the UK Health Security Agency (UKHSA) dashboard.
data-analysis data-visualisation ipywidgets python voila-dashboard
Last synced: 17 Jun 2025
https://github.com/shoyebmd424/design-and-analysis-algorithm
algorithm daa data-analysis data-structures
Last synced: 10 Sep 2025
https://github.com/phillbertnevinemmanuel/automotivesalesdataanalysis
This marks my inaugural venture into personal data analysis, employing SQL and Python for Correlation Analysis. I've sourced the dataset from Kaggle, specifically focusing on automotive sales. You can find the dataset linked on my website below. I'm excited to share that I've independently managed the majority of tasks involved in this project.
data-analysis dataset microsoft-sql-server python python-lambda sql ssms tsql
Last synced: 24 Dec 2025
https://github.com/steciuk/ium-recommendation-system
Evaluation and comparison of three recommendation models for a simulated web shopping service.
data-analysis model-evaluation recomendation-system
Last synced: 29 Oct 2025
https://github.com/sedatdikbas/aefes-time-series-forecasting
This project uses deep learning methods (LSTM, BiLSTM, CNN+LSTM) to forecast future closing prices from market data for Anadolu Efes Biracılık ve Malt Sanayii A.Ş. (AEFES). It details the data preprocessing, model training, and evaluation steps.
bilstm cnn-lstm data-analysis deep-learning financial-forecasting lstm machine-learning python stock-price-prediction tensorflow
Last synced: 12 Jun 2025
https://github.com/archie-cm/a-b-testing-mobile-games
This project examines what happens when the first gate in the game is moved from level 30 to level 40. When a player installed the game, he or she was randomly assigned to either gate30 or gate40.
abtesting data-analysis python retention-rate
Last synced: 29 Dec 2025
https://github.com/thecoderpinar/telecommunication-customer-churn-analysis-and-prediction
📊 This project focuses on customer churn analysis and prediction in the telecommunications sector. Using data analysis, modeling, and predictive techniques, it aims to understand and mitigate customer loss by developing strategies.
churn churn-prediction classification customer data-analysis data-science deep-learning machine-learning neural-network telecom
Last synced: 07 Aug 2025
https://github.com/maskedsyntax/taskit
A simple web based Task Tracker for better focus
charts data-analysis python3 streamlit task-tracker-app todo-list
Last synced: 29 Mar 2025
https://github.com/harshmule1/store-sales-analysis
Sales Analysis Using Power BI
Last synced: 07 Jan 2026
https://github.com/dina-hosny/telco-customer-churn-analysis-using-power-bi
An interactive dashboard to represent some analysis of "Telco customer churn" data and the reasons that made customers churn using Microsoft Power BI.
business-intelligence data-analysis data-modeling data-visualization power-bi powerbi
Last synced: 03 Mar 2025
https://github.com/rafiulgits/data-analysis
Data analysis with the Python programming language
classification data-analysis data-mining data-visualization machine-learning mglearn regression regression-models sklearn
Last synced: 17 Mar 2025
https://github.com/aliciagilmatute/analisis-multinivel-bayesiano
This study explores multilevel analysis from a Bayesian approach to assess the variability in mathematics performance across 10 schools
bayesian-statistics cmdstanr data-analysis hierarchical-models multilevel-models rstats rstudio stan
Last synced: 30 Oct 2025
https://github.com/madhuresh2011/amazon-sales-report-analysis-using-python
This project focuses on analyzing Amazon sales data using Python to uncover insights into sales performance, customer behavior, and product trends
charts cleaning-data data-analysis jupyter-notebook matplotlib numpy pandas python seaborn visualization
Last synced: 28 Mar 2025
https://github.com/drisskhattabi6/data-analysis-and-ml-app
A Python desktop application using CustomTkinter for data analysis and machine learning.
custom-tkinter data-analysis data-processing data-visualization desktop-application machine-learning machine-learning-models machine-learning-pipeline tkinter
Last synced: 21 Mar 2025
https://github.com/hrosicka/czechpopulationestimation
This GitHub repository contains Python code for data analysis and population prediction in the Czech Republic up to the year 2050. The code is written in Python and utilizes the Pandas and Matplotlib libraries.
data-analysis data-visualization matplotlib matplotlib-figures matplotlib-pyplot pandas pandas-dataframe pandas-library pandas-python python python3
Last synced: 21 Mar 2025
https://github.com/sandk21/detection_faux_billets
Algorithm for detecting counterfeit banknotes from their geometric dimensions, with a web application for generating predictions
data-analysis data-science data-visualization machine-learning pandas python scipy sklearn streamlit
Last synced: 30 Dec 2025
https://github.com/ansh420/mcdonald_case-study
A McDonald's case study based on market segment analysis.
algorithms-implemented data-analysis python3 segmentation
Last synced: 24 Dec 2025
https://github.com/mleidel/sqlcel
Python GUI to run SQL SELECT queries on spreadsheets, CSV files, and SQLite database tables
csv-export csv-import data data-analysis data-science datascience excel-export excel-import pandas python3 sql sqlalchemy sqlite3
Last synced: 26 Jul 2025
https://github.com/deepanshkhurana/cloudsimplifier
Simple helper functions to fetch and read data from various formats stored on Amazon AWS S3 Buckets. Most functions are essentially wrapping over cloudyR.
amazon aws cloudyr data-analysis data-fetching data-science package r rpackage s3
Last synced: 16 Jul 2025
https://github.com/ysayaovong/stockroom_management
The Stockroom Management project is a comprehensive tool that automates and simplifies the process of managing inventory in stockrooms. By incorporating features like real-time updates, report generation, and low-stock alerts, it helps businesses save time, reduce errors, and optimize their inventory operations.
business-applications data-analysis data-visualization database-management inventory-control inventory-management logistics sql warehouse warehouse-management
Last synced: 09 Jul 2025
https://github.com/fdtomasi/regain-applications
Containers for notebooks and data where REGAIN has been used.
algorithms data-analysis latent-variable-models machine-learning minimization network-inference regain sklearn time-series
Last synced: 14 Sep 2025
https://github.com/jpcadena/car-sales-etl
ETL process for a Car Sales project.
asyncpg car-sales data-analysis data-engineering data-visualization database etl etl-pipeline postgresql python sqlalchemy
Last synced: 29 Oct 2025
https://github.com/sayantanidalui/student-mental-health-analysis
A SQL-based analysis project exploring student mental health, stress, and lifestyle patterns. Uncovers key insights using joins, CTEs, and window functions — no other tools used.
data-analysis mental-health mysql sql studentdata
Last synced: 07 Jul 2025
https://github.com/allanotieno254/employee-performance-tracker-excel-
An Excel-based tool to track and evaluate employee performance, compliance, and skills assessments with summary statistics and visual charts
compliance-tracker data-analysis employee-performance-analysis excel human-resources
Last synced: 24 Jan 2026
https://github.com/iguptashubham/pizzahut-analysis-sql
Best dataset for data analysis. Pizza Hut data analysis done by Shubham Gupta in MySQL. The dataset was provided by a friend of mine who interned at Pizza Hut, where it was used for training and practice questions. The data does not reveal anything about Pizza Hut and is safe to share.
data-analysis data-analytics database dataset datasets mysql mysql-database pizzahut
Last synced: 03 Mar 2025
https://github.com/carmoreno/analisisaccidentalidadbogota
Data analysis of traffic accidents in Bogotá, Colombia.
data-analysis data-science jupyer-notebook matplotlib numpy pandas scikit-learn
Last synced: 23 Feb 2025
https://github.com/mgobeaalcoba/analisis_con_r
Analysis work done in the R language
data-analysis data-science dataset r r-package r-programming r-studio
Last synced: 29 Dec 2025
https://github.com/mengyaohuang/data-manipulation-and-analysis
Data processing implementation with tools in Python
data-analysis nlp-machine-learning pandas-dataframe python
Last synced: 26 Mar 2025
https://github.com/sejalmankar1012/yuvaco_data_analysis_assessment
This assignment involves writing a Python script to calculate the cost of package deliveries based on provided data and a cost grid. The script takes package details such as weight, distance, and delivery type, applies the cost calculation rules, and saves the results in an output file. You can also run the script in Google Colab for convenience.
csv-file-handling data-analysis google-colab package-delivery python python-scripting
Last synced: 23 Jun 2025
https://github.com/kumaranand05/suicide-rate-analysis
Analysis of WHO mortality data, with visualization in Power BI
analytics data-analysis data-visualization mortality-rates powerbi python suicide-dataset suicide-rate
Last synced: 23 Jun 2025
https://github.com/rmnldwg/liver-smart
Data and analysis pipeline for a study on the potential advantages of daily adaptive liver SBRT performed at the University Hospital Zurich.
data-analysis fractionation jupyter-notebook liver-cancer metastasis radiation-oncology stereotactic
Last synced: 24 Dec 2025
https://github.com/monish-nallagondalla/diamondpriceprediction
Diamond Price Prediction is an end-to-end machine learning project that predicts diamond prices based on attributes like carat, cut, color, clarity, and dimensions. It features a Flask web application for real-time predictions and utilizes models such as Linear Regression, Lasso, and Ridge.
data-analysis data-science flask jupyter-notebooks machine-learning predictive-modeling python
Last synced: 05 Apr 2025
https://github.com/anshmnsoni/pizza-market-analysis
data-analysis powerbi-visuals powerbidashboard
Last synced: 09 Jul 2025
https://github.com/nirmalvatsyayan/data-analyst-nanodegree
Udacity data analyst nanodegree project submissions and learning
data-analysis numpy pandas python statistics udacity-data-analyst-nanodegree
Last synced: 02 Mar 2025
https://github.com/mgobeaalcoba/matplotlib_y_seaborn
Here I will post visualization work done with both Python libraries.
data-analysis data-science data-visualization dataset matplotlib numpy pandas python seaborn
Last synced: 31 Dec 2025
https://github.com/gurpreetkaurjethra/ai-data-visualization-agent
This Streamlit application creates an interactive Data Visualization Assistant that can understand Natural Language Queries and generate appropriate Visualizations using LLMs.
aiagents aichatbot aidevelopment artificial-intelligence data-analysis data-visualization generative-ai llms
Last synced: 25 Jun 2025
https://github.com/prashver/dashboard-gallery
These dashboards provide insights across diverse domains, including cryptocurrency sales, workforce challenges, disease impact analysis, and retail trends. Leveraging tools like Power BI and Excel, they offer actionable insights for decision-making.
cryptocurrency dashboards data-analysis data-profession data-visualization market-segmentation-analysis microsoft-excel monkey-pox powerbi product-analysis retail-trends
Last synced: 12 Sep 2025
https://github.com/dcs-training/introcausalinference
This is a repository for the Introduction to Causal Inference course provided by Chris Oldnall for the CDCS. See the README file.
data-analysis python r statistics
Last synced: 25 Feb 2025
https://github.com/avikdatta/python_data_docker_files
A repository for docker files for data analysis using Python and Hadoop
data-analysis dockerfile python-docker raspbian spark ubuntu1604
Last synced: 06 May 2025
https://github.com/gauranshgoel123/predictive-demand-analysis
Demand Forecasting Project: a web application for predicting future demand for part numbers based on historical data. Built with React for the frontend and FastAPI with Python for the backend, this application visualizes demand trends and allows users to input additional data for improved accuracy. On Render, "analyzer" is the frontend and "analysis" is the backend.
chartjs data-analysis data-science data-visualization dataset deployment full-stack machine-learning numpy pandas predictive-analysis prophet-model python reactjs render
Last synced: 01 Nov 2025
https://github.com/patilni3/numpy-in-depth
Python's NumPy Library for Data Analysis, Machine Learning, Data Science and many more...
data-analysis data-engineering data-science machine-learning numpy pandas
Last synced: 03 Apr 2025
https://github.com/flexycode/biof-101
💫 BIOINFORMATICS for Drug Development
bioinformatics biology classification clustering-algorithm computer-science data-analysis drugs-dataset machine-learning matplotlib pymol python rdkit
Last synced: 02 Mar 2025
https://github.com/pzim-devdata/data-developer
All my DATA developer projects
correlation data-analysis data-mining data-science data-visualization database folium folium-maps mongodb mysql python spark sql
Last synced: 30 Dec 2025
https://github.com/patilni3/seaborn-in-depth
Python's Seaborn Library for Data Analysis, Machine Learning, Data Science and many more...
data-analysis data-reporting data-representation data-science data-visualization plots-in-python powerbi seaborn sns
Last synced: 03 Apr 2025
https://github.com/makosai/covid19datachart
A basic chart for checking corona data. Written in a single HTML file for convenience. Grab the single file and run it anywhere. Or visit the webpage.
chart chartjs corona coronavirus coronavirus-analysis covid-19 covid-2019 covid19 covid19-data data data-analysis datasets
Last synced: 24 Feb 2025
https://github.com/hemanthkumarsunkari27/pmay_analysis_project
Built for the 1st AI for Good Hackathon by Snowflake, this project uses data analytics and AI to explore housing and sanitation trends in India under PMAY. Using Snowflake and Streamlit, it provides interactive insights into regional disparities, helping guide sustainable infrastructure development.
data-analysis data-visualization pmay-analysis sanitation-coverage snowflake-integration streamlit-dashboard sustainable-development
Last synced: 26 Mar 2025
https://github.com/naruaika/eruo-data-studio
A powerful yet friendly ETL tool powered by Polars backend
data-analysis data-science desktop-app gnome-desktop gtk4 proof-of-concept python spreadsheet
Last synced: 18 Jul 2025
https://github.com/simranjeet97/google-cloud-access-using-python
Google Drive access using Python: interact with Drive programmatically and manipulate its contents
data-analysis data-science data-structures data-visualization gcp gcp-cloud-functions gcp-compute gcp-compute-engine gcp-projects gcp-storage google googlecloud googlecloudplatform python python3 visualization
Last synced: 03 Mar 2025
https://github.com/neemiasbsilva/datascience-portfolio
Welcome to my Data Science Portfolio. It gathers what I have learned along my journey, including case studies, papers, and code. Please check the README.
case-study churn-prediction code-challenges data-analysis data-science deep-learning forecasting fundamental-of-statistics health-care image-recognition machine-learnin machine-learning math mathematics pattern-recognition portfolio programming-skills speech-emotion-detection statistics voice-activity-detection
Last synced: 23 Feb 2025
https://github.com/nafisalawalidris/investigating-netflix-movies-and-guest-stars-in-the-office
Dive into the world of Netflix and explore the average duration of movies. Netflix, being the largest entertainment company, offers a wide range of movies for its viewers. In this project, we analyse movie durations using pandas, creating a DataFrame from a dictionary and examining average durations from 2011 to 2020.
average-duration csv-files data-analysis data-visualization dataframe filtering movie-durations movie-length-distribution netflix pandas python trends
Last synced: 16 Mar 2025
https://github.com/msthamizh/phonepe-pulse-data-visualization-and-exploration
Developing a Streamlit application that allows users to explore and analyze transaction data from the PhonePe Pulse dataset. The project aims to provide insights into digital payment trends across India.
data-analysis data-visualization dataframe mysql pandas plotly python streamlit
Last synced: 05 Sep 2025
https://github.com/idaraabasiudoh/knn-customer-classification
Assigns telecommunication customers to groups to determine the service type each customer requires.
data-analysis jupyter-notebook machine-learning pyhton3 scikit-learn
Last synced: 04 Mar 2025
https://github.com/shubhamgoyal575/ecommerce-product-categorization
This project classifies e-commerce products into predefined categories using machine learning. It includes preprocessing steps like stopword removal, punctuation cleaning, and feature extraction. Models, including LSTM, are implemented and evaluated to improve accuracy.
accuracy-score artificial-neural-networks confusion-matrix data-analysis data-cleaning data-preprocessing data-science data-visualization deep-learning exploratory-data-analysis hyperparameter-tuning logistic-regression long-short-term-memory machine-learning machine-learning-algorithms naive-bayes-algorithm natural-language-processing precision-score random-forest-classifier
Last synced: 30 Aug 2025
https://github.com/apache/cloudberry-gpbackup-s3-plugin
S3 plugin for Apache Cloudberry (Incubating) backup utility
ai big-data cloudberry data-analysis data-warehouse database distributed-database gpbackup greenplum mpp olap postgres postgresql s3plugin
Last synced: 12 Sep 2025
https://github.com/maheshthedev/twitter-analysis
Analysis on Various Topics with Twitter Data
data-analysis twitter-analysis
Last synced: 18 Jul 2025
https://github.com/salman-khan-mohammed/predicting-the-intent-of-online-shoppers
This project aims to predict online shoppers' purchase intentions using browsing history and user data from e-commerce sites. By analyzing clickstream and session information, the goal is to create a machine learning model that accurately forecasts customers' likelihood of making a purchase.
cluster-analysis data-analysis data-pre eda outliers prediction
Last synced: 31 Oct 2025
https://github.com/adityav42/deloitte-forage-virtual-internship
Submission for Deloitte's STEM Virtual Program on Forage, focusing on data analysis, forensic technology, and cybersecurity.
coding cybersecurity data-analysis deloitte development forage forensics-technology virtual-program
Last synced: 29 Oct 2025
https://github.com/shubhamgoyal575/spam_detective
This project uses machine learning to classify messages as spam or ham based on text analysis. It includes data preprocessing, feature extraction (TF-IDF), and classification models like Logistic Regression and Naive Bayes for accurate spam detection. Built with Python and Scikit-Learn. 🚀
count-vectorizer data-analysis data-analytics data-cleaning data-preprocessing data-science data-visualization data-wrangling exploratory-data-analysis logistic-regression machine-learning machine-learning-algorithms naive-bayes natural-language-processing spam-detection tfidf-vectorizer
Last synced: 02 Jul 2025
https://github.com/carocardenas0699/pi02-data-analysis
Individual Project 2 of the Data Science program. An analysis of homicides in road accidents in the city of Buenos Aires. Includes: ETL, EDA, and an interactive dashboard with results
data-analysis data-science data-visualization eda etl powerbi python
Last synced: 24 Feb 2025
https://github.com/denko5/sales-analysis
A complete SQL-based sales analysis project covering Africa, showcasing data cleaning, exploratory analysis, insights, and lessons learned. The project highlights sales trends, regional performances, and marketing effectiveness across multiple platforms.
africa data data-analysis data-science exploratory-data-analysis insights kenya sales sql
Last synced: 24 Jan 2026
https://github.com/wiseaidev/truth-guard
Analyzing a 79k Dataset of Misinformation and Fake News
data-analysis fastapi lstm machine-learning python supervised-learning
Last synced: 19 Jan 2026
https://github.com/jasoncobra3/whatsapp_chat_analyzer
WhatsApp Chat Analyzer is a powerful tool that provides insightful analytics from your WhatsApp conversations. Whether you're curious about your chatting habits, want to analyze group dynamics, or need to extract meaningful data from your conversations, this tool has got you covered!
data-analysis data-science data-visualization machine-learning streamlit streamlit-webapp whatsapp-chat whatsapp-chat-analyzer
Last synced: 31 Jan 2026
https://github.com/pranabdas/suvtools
Python library for analyzing and visualizing SSLS SUV Beamline data.
data-analysis data-visualisation python
Last synced: 07 May 2025
https://github.com/mikasenghaas/covid19-analysis
Analysis of the correlation between COVID-19 infection numbers and weather data from the beginning of the pandemic until April 2021
data-analysis statistical-analysis
Last synced: 13 Sep 2025
https://github.com/ahmednurabdii/data-analytics-portfolio-superstore
My first portfolio project showcasing data cleaning, analysis, and visualization of Superstore sales data.
data-analysis data-visualization jupyter-notebook matplotlib numpy pandas portfolio-project python sales-analysis scipy seaborn superstore-dataset
Last synced: 30 Dec 2025
https://github.com/sd7campeon/yelp-sentiment-analysis-with-python-bs4-and-llm
A scalable pipeline for automated extraction, preprocessing, and sentiment analysis of Yelp reviews. Uses advanced HTTP requests, HTML parsing, and text normalization (tokenization, stopword removal, lemmatization) to enable precise polarity and subjectivity analysis for consumer insights and business analytics.
beautifulsoup beautifulsoup4 business-analytics cuda data-analysis nlp-machine-learning nltk opinion-mining pandas python python3 requests-library-python sentiment-analysis text-preprocessing textblob torch web-scraping yelp-reviews
Last synced: 18 Oct 2025
https://github.com/scarblase/portfolioprojects
A collection of data analysis and business intelligence projects using SQL, Python, and visualization tools to uncover insights from real-world datasets. 🚀📊
csv data-analysis data-engineering data-mining data-science data-visualization matplotlib matplotlib-pyplot pandas python python3 seaborn sql
Last synced: 12 Mar 2025
https://github.com/csoren66/diabetics_prediction
Predicting whether a patient has diabetes based on the features provided to our machine learning model.
data-analysis machine-learning python svm
Last synced: 03 Mar 2025
https://github.com/yard1/linearordering
An R package. Provides various methods of linear ordering of data. Supports weights and positive/negative impacts.
data-analysis data-analysis-in-r data-analysis-r data-science r
Last synced: 29 Dec 2025
https://github.com/mariam-badr-mb/gtc-ml-project2-diabetes-prediction
This project is part of the GTC Machine Learning Program. It demonstrates the end-to-end ML workflow by building a predictive model for diabetes detection
classification-algorithm data-analysis data-visualization diabetes-prediction gridsearchcv hyperparameter-tuning machine-learning python
Last synced: 13 Sep 2025
https://github.com/discdiver/new-belgium-ratings
Find the most popular New Belgium beers of all time!
beautifulsoup data-analysis pandas python seaborn webscraping
Last synced: 31 Dec 2025
https://github.com/priyanshubiswas-tech/data-analysis-with-python
This repository showcases Python projects completed for a Data Analysis with Python certification, demonstrating skills in data manipulation, visualization, and statistical analysis using libraries like NumPy, Pandas, Matplotlib, Seaborn, and SciPy.
data-analysis demographic-data-analyzer mean-variance-standard-deviation-calculator medical-data-visualizer page-view-time-series-visualizer python scipy-stats sea-level-predictor seaborn
Last synced: 07 May 2025
https://github.com/priyanshubiswas-tech/airflow_dbt_superset_project
End-to-end ITSM data engineering pipeline using PostgreSQL, DBT, Airflow, and Superset. Covers ingestion, cleaning, transformation, orchestration, and visualization, validated across Docker Toolbox and Docker Desktop environments.
apache-airflow apache-superset dags data-analysis dbt docker etl etl-automation etl-pipeline postgresql
Last synced: 07 May 2025
https://github.com/mehulcode12/atliq-bank_creditcard_transaction_analysis
The credit card project at Atliq Bank comprises two key phases: market identification and trial. This initiative aims to leverage mathematical and statistical concepts to analyze data related to demographics, income, credit scores, and spending patterns in order to identify the target audience for the credit card.
codebasics data-analysis data-science data-visualization mathematics python python3 statistics
Last synced: 11 Apr 2025
https://github.com/danhenriquex/final-project-ia
Artificial Intelligence Project - Sentiment analysis of news that affects share prices.
data-analysis machine-learning supervised-learning
Last synced: 25 Jun 2025
https://github.com/nurfakhri/e-commerce-data-analyst
E-commerce data analysis supported by data wrangling, EDA, and web dashboard
dashboard data-analysis e-commerce flask-application python
Last synced: 30 Apr 2025
https://github.com/jayita11/atliqo-bank-credit-card-launch-eda
This project involves exploratory data analysis and statistical testing for AtliQo Bank's new credit card launch. Key insights include targeting high-income occupations and the 18-25 age group. Recommendations focus on tailored marketing campaigns, education, and incentives to enhance credit card adoption and usage among young adults.
data-analysis hypothesis-testing matplotlib p-value pandas python seaborn statistics z-test
Last synced: 31 Dec 2025
https://github.com/rayyan9477/multiple-disease-prediction-system
This repository contains a Multiple Disease Prediction System leveraging machine learning techniques for accurate predictions. It utilizes Python, Pandas, Scikit-learn, and Flask for data preprocessing, model building, and web deployment. Explore the project and connect on LinkedIn for collaborations.
data-analysis data-science machine-learning python streamlit
Last synced: 31 Dec 2025
https://github.com/smahala02/calorimtery
A calorimetry lab project involving Python and Excel for computing heat transfer from experimental data.
calorimetry chemistry data-analysis excel jupyter-notebook python thermodynamics
Last synced: 11 Jul 2025