Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-06-30 00:07:38 UTC
- JSON Representation
https://github.com/ak-pydev/python_practice
Documenting my learning journey from python -> ML -> DL -> LLM/GenAI -> Agents exercises solved daily from Udemy/Kaggle/YouTube.
data-analysis data-science feature-engineering llms machine-learning mlflow mlops-workflow modeling python3 streamlit uvicorn
Last synced: 20 Apr 2026
https://github.com/robinmillford/hr-analytics-employee-performance-analysis
HR Analytics: Unveiling Employee Performance - A comprehensive exploration of employee data using SQL and Power BI, uncovering key insights for strategic HR decision-making.
data-analysis data-visualization jupyter-notebook powerbi python3 sql
Last synced: 20 Apr 2026
https://github.com/profasem/logistics-performance-analysis
Power BI dashboard analyzing logistics performance, delivery delays, carrier efficiency, and regional risk.
business-intelligence dashboard data-analysis logistics powerbi python supply-chain
Last synced: 21 Apr 2026
https://github.com/mahmoudwal27/e-commerce-data-analysis
A collection of data analysis and visualization projects focused on ecommerce datasets. Using Python in Google Colab for analysis and Excel for exploration, these projects uncover key insights and trends, showcasing expertise in data manipulation and visualization to inform business decisions.
analytics data-analysis data-analysis-python data-set google-cloud python
Last synced: 21 Apr 2026
https://github.com/danpoynor/pet-shelter-data-analysis-notebook
Demonstration of skills analyzing data from a pet shelter. The CSV data contains tables detailing the incoming and outgoing animals and I use my knowledge of Pandas to gather and present the requested information.
csv data-analysis data-cleaning data-science jupyter-notebook matplotlib numpy pandas pet-shelter tabular-data
Last synced: 21 Apr 2026
https://github.com/florence-nyokabi/house-power-consumption
Machine Learning: Exploring Regression Analysis
data-analysis data-cleaning data-science data-visualization feature-engineering jupyter-notebook jupyterlab machine-learning pandas-python regression-analysis regression-models
Last synced: 05 Jun 2026
https://github.com/nxion/sql-data-warehouse-project
Building a modern data warehouse with MS SQL server, ETL processes, data modeling and analyitics.
data data-analysis data-analytics data-engineering data-lakehouse data-warehouse datalake datascience etl etl-job medallion-architecture ms mssql sql sql-query sql-server
Last synced: 05 Jun 2026
https://github.com/martinkalema/power-distribution-modelling
Power Distribution Modelling for cea and cel algorithms
data-analysis python synthetic-dataset
Last synced: 21 Apr 2026
https://github.com/rahulpatel0615/sales-analysis-project
Sales Data Analysis Dashboard with Python, Pandas, and Matplotlib. Features 12+ visualizations and comprehensive insights.
data data-analysis data-visualization matplotlib pandas portfolio python
Last synced: 21 Apr 2026
https://github.com/meerantajalli/networksecuritydefense
This Network Security defense systems acts as an indicator against SMP Floods, UDP Floods, ICMP Floods. This model is trained using packets from wireshark and can easily differentiate between normal network traffic and traffic that has been targetted on the machine by an attacker using the rate of packets transfer and using the source IP.
anomaly-detection classification cyber-security data-analysis ddos-detection icmp-flood intrusion-detection machine-learning network-security packet-analysis python random-forest security smp-flood udp-flood wireshark
Last synced: 21 Apr 2026
https://github.com/nikhilfuke1/a-b-testing-and-regression-analysis-python
Python Statistical Project involves data analysis, visualization, A/B testing, and regression analysis to determine the best-performing platform.
ab-testing data-analysis hypothesis-testing libraries python regression-analysis statistics visualization
Last synced: 21 Apr 2026
https://github.com/mhuwaimel/data-analysis-of-students-results-in-qiyas
Analysis of student performance data from Qiyas (قياس), the Saudi Arabian National Center for Assessment
data-analysis jupyter-notebook python
Last synced: 22 Apr 2026
https://github.com/tmmvn/analytics-notebooks
A bunch of data analytics notebooks done testing out JetBrains DataLore
ai algorithms data-analysis datalore elements-of-ai helsinki-university-mooc python
Last synced: 22 Apr 2026
https://github.com/kgelli/apple-data-analysis---apache-spark
Modular ETL pipeline for analyzing Apple product purchase patterns using Apache Spark on Databricks with factory design patterns.
apache-spark data-analysis databricks delta-lake etl-pipeline factory-pattern pyspark
Last synced: 22 Apr 2026
https://github.com/thinogueiras/jornada-python
Jornada Python - Hashtag Programação.
data-analysis data-science inteligencia-artificial python rpa
Last synced: 22 Apr 2026
https://github.com/rorrell/lifeexpectancy
A Jupyter Notebook where I create a chart with two line plots on it to check out the life expectancy of men vs. women from 1900-2018
data-analysis data-visualization jupyter-notebook python3
Last synced: 22 Apr 2026
https://github.com/niaid/training-schedule
bcbb-training bioinformatics computational-biology data-analysis data-science pandas python
Last synced: 22 Apr 2026
https://github.com/ayushi-gajendra/restaurant-order-analysis-sql
End-to-end SQL analysis of 12,266 restaurant transactions to identify high-performing menu items, revenue concentration, bulk ordering behavior, and strategic growth opportunities.
analytics-portfolio business-intelligence case-study customer-segmentation data-analysis data-analytics database-analysis menu-engineering mysql revenue-analysis sql sql-project
Last synced: 05 Jun 2026
https://github.com/al-ogr/sf_pr1_job_analysis_hh
SkillFactory DataScience PROJECT-1. Анализ резюме из HeadHunter
data-analysis data-science ipynb plotly python
Last synced: 23 Apr 2026
https://github.com/tranngoca5039/bigquery-a5y
📊 Streamline your data analysis with bigquery-a5y, a powerful tool for optimizing BigQuery performance and improving query efficiency.
analytics api big-data bigquery cloud-computing data-analysis data-integration data-management data-pipeline data-visualization data-warehouse google-cloud machine-learning serverless sql
Last synced: 05 Jun 2026
https://github.com/thc1006/nycu_timtable_crawler
🎓 NYCU Course Data Crawler & Timetable System | 國立陽明交通大學課程爬蟲與選課系統 - Python web scraper for course schedules, syllabi & educational data analysis. Crawls 18K+ courses with 98% success rate. Features: interactive timetable, JSON API, Google Colab support, batch processing, resume capability.
academic course course-selection crawler data-analysis education educational-data google-colab json-api nycu open-data python schedule student-tools syllabus taiwan timetable university web-automation web-scraping
Last synced: 24 Apr 2026
https://github.com/ihnokim/datk
Data Analysis Toolkit (DATK)
data-analysis data-engineering data-science deep-learning image-processing pandas signal-processing
Last synced: 24 Apr 2026
https://github.com/strixion/demoversion_ai
The demoversion of StrixionAI
ai csv data-analysis data-analytics json python txt
Last synced: 24 Apr 2026
https://github.com/arunabhagit/inventory-misalignment-and-revenue-loss-in-multi-store-bike-retail
This project focuses on identifying the inventory and demand mismatch causing stagnant sales and lost revenue in a bike retail chain. By analyzing store-level performance and regional customer preferences, the project aims to detect underperforming products.
data-analysis data-visualization powerbi python
Last synced: 24 Apr 2026
https://github.com/henriquetourinho/s.i.g.m.a
Plataforma de busca e análise de arquivos para Linux, com GUI avançada em PySide6 e foco em metadados ricos para investigações profundas.
data-analysis developer-tools file-search metadata open-source pyqt pyside6 python python-brasil qt6 sysadmin-tools
Last synced: 24 Apr 2026
https://github.com/voidnire/redditviralmysteryposts
Análise de posts de subreddits de mistério. O que define um post viral neste tipo de sub?
data-analysis data-visualization mysteries mystery nlms python-3 reddit
Last synced: 24 Apr 2026
https://github.com/yuvrajsaraogi/-iris-flower-classification
Iris flower has three species; setosa, versicolor, and virginica, which differs according to their measurements. Now assume that you have the measurements of the iris flowers according to their species, and the task is to train a machine learning model that can learn from the measurements of the iris species and classify them.
classification data data-analysis data-science data-visualization flower flower-classification iris iris-classification iris-flower iris-flower-classification knn knn-classification machine-learning machine-learning-algorithms ml natural-language-processing nlp python
Last synced: 24 Apr 2026
https://github.com/lightning-chart/lcjs-example-0507-dashboardfiberanalysis
A demo application showcasing using LightningChart JS to visualize fiber analysis data.
area-plot area-series chart charts dashboard data-analysis demo heatmap javascript lcjs lightningchart-js performance visualization webgl
Last synced: 24 Apr 2026
https://github.com/cyberoctane29/python-for-data-analysis
A repository dedicated to learning Python for data analysis, data science, and data analytics. This collection of Jupyter notebooks covers practical exercises and concepts from the Google Advanced Data Analytics Professional Certificate program.
data data-analysis data-analytics data-science python
Last synced: 24 Apr 2026
https://github.com/pedrohdosanjos/economic-data-analysis
This project aims to analyze the export data from various states in the United States to Brazil over time. The data is sourced from the FRED (Federal Reserve Economic Data) API and processed to identify the top 5 exporting states for each year, as well as the states with the highest total export value across all years.
api data-analysis data-visualization jupyter-notebook python
Last synced: 24 Apr 2026
https://github.com/mariann95/sql_data_warehouse_and_analytics_project
Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics. This repository also contains a collection of SQL scripts demonstrating various analytical techniques, such as changes over time, cumulative, performance, data segmentation, part-to-whole analysis.
data-analysis data-analytics data-cleaning data-engineering data-lakehouse data-science data-science-portfolio data-warehouse data-warehousing datalake datawarehouse datawarehousing etl etl-job etl-pipeline medallion-architecture sql sql-query sql-server sqlserver
Last synced: 06 Jun 2026
https://github.com/mehmetkahya0/gallstone_dataset_analysis_project
Safra Taşı Hastalığı (Gallstone-1) Veri Seti Analizi (https://archive.ics.uci.edu/dataset/1150/gallstone-1)
analysis analytics data data-analysis data-science data-visualization database graph matplotlib python
Last synced: 25 Apr 2026
https://github.com/fbarffmann/belly-button-challenge
Built an interactive JavaScript dashboard to visualize bacterial biodiversity from belly button samples. Analyzed data from 153 participants and identified OTU 1167 as the most common bacteria.
biodiversity dashboard data-analysis data-visualization interactive-charts javascript json plotly
Last synced: 25 Apr 2026
https://github.com/rubix982/product-quality-classification
This is an implementation for the CIKM AnalytiCup 2017, around the topic of "Product Title Quality". The goal is to take SKUs and rank its title's clarity and conciseness. Referenced papers are attached to this repository. And as such, the aim is to craft ensemble models that either try to replicate results or find new methods for classification.
data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp
Last synced: 25 Apr 2026
https://github.com/xjwllmsx/hacker-news-engagement
Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.
data data-analysis jupyter python
Last synced: 25 Apr 2026
https://github.com/avilash1212/sales-dashboard-using-google-looker-
Sales dashboard project using Google looker studio
data-analysis data-visualization jupyter-notebook python sql
Last synced: 25 Apr 2026
https://github.com/ddihora1604/iit_patna
A multifaceted project involving applying ML models like Ridge Classifier, RNN, RIDOR, Rotation Forest and RUSBoost, integrating SMOTE for class balancing, and handling diverse datasets including those for seating arrangement tasks.
data-analysis data-visualization datamodelling machine-learning-algorithms python
Last synced: 25 Apr 2026
https://github.com/marielachirinosr/bellabeat-wellness-data-trends
Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.
data data-analysis data-visualization pandas python python3 tableau tableau-public
Last synced: 25 Apr 2026
https://github.com/aastopher/mma_outcome
Simple exploratory analysis of UFC Fights and Vegas fight odds from 1993 to 2021
data-analysis data-visualization
Last synced: 06 Jun 2026
https://github.com/mysto-007/world_population_growth_analysis
World Population Growth Analysis
data-analysis data-science data-visualization kaggle matplotlib
Last synced: 25 Apr 2026
https://github.com/chandansoren/customer-personality-analysis
Predict how different customer segments will respond for a particular product or service.
data-analysis data-visualization python
Last synced: 26 Apr 2026
https://github.com/dcs-training/2023-10-22-carpentry-social-science
Go to https://dcs-training.github.io/2023-10-22-Carpentry-Social-Science/ to follow along the material
data-analysis data-visualisation data-wrangling intro-to-programming r
Last synced: 06 Jun 2026
https://github.com/rociobenitez/happiness-index-data-processing
Repository for Big Data Processing - Contains Jupyter Notebooks and Datasets for data analysis and processing tasks related to Big Data.
big-data big-data-processing data-analysis data-processing happiness-index happiness-report jupyter-notebook matplotlib pandas seaborn
Last synced: 15 May 2026
https://github.com/deliprofesor/cinematic-data-analytics-and-recommendation-platform
This project analyzes a movie dataset using machine learning algorithms to predict success, explore revenue-popularity relationships, and develop recommendation systems. It employs techniques like K-Means, DBSCAN, GMM, decision trees, PCA, and NLP for insights and personalized suggestions.
clustering content-based-recommendation data-analysis data-visualization decision-tree gmm k-means machine-learning natural-language-processing nlp pca predictive-modeling python recommendation-system scikit-learn user-based-recommendation
Last synced: 26 Apr 2026
https://github.com/swapnanildutta/prediction-with-python
The projects are made using Jupyter Notebook
data-analysis jupyter-notebook machine-learning prediction python regression-models
Last synced: 27 Apr 2026
https://github.com/pararang/nams-thesis-fuzzy
A specialized data processing tool designed to help with Fuzzy Delphi Method calculations for thesis research data analysis. Then extended with some new features for data processing with different method.
data-analysis dematel hacktoberfest hacktoberfest-accepted house-of-quality python sustainability vibecoding
Last synced: 27 Apr 2026
https://github.com/arush-codes/paris-olympic-de
data engineering project on paris olympics 2024
azure data-analysis data-engineering microsoft-azure olympics2024 pipeline
Last synced: 27 Apr 2026
https://github.com/odinleepro/airbnbnewyorkcityanalysis
AirbnbNewYorkCityAnalysis is a comprehensive data analysis and visualization project exploring short-term Airbnb rental trends across New York City (2008–2022). Using open source Airbnb data, the project combines data cleaning, statistical summaries, and Tableau dashboards to uncover pricing patterns, borough level distribution, and insights.
airbnb analytics-project data-analysis data-cleaning data-science data-visualization new-york-city real-estate-analytics tableau urban-analysis
Last synced: 27 Apr 2026
https://github.com/malexandersalazar/covid-19-peru-estimacion-oxigeno-requerido
Análisis técnico de casos confirmados por COVID-19 en Perú para la estimación de oxígeno medicinal requerido.
covid-19 data-analysis data-science peru python
Last synced: 27 Apr 2026
https://github.com/as16082023/project-portfolio
A guide to all my projects
dashboard data-analysis data-cleaning data-visualization excel mysql power-bi python sql tableau
Last synced: 27 Apr 2026
https://github.com/busesimsek/sql-projects
A collection of my SQL projects with insights into real-world datasets.
data-analysis data-analytics mysql sql
Last synced: 07 Jun 2026
https://github.com/manasashetty01/regulatory-affairs-of-road-accidents
Regulatory Affairs of Road Accidents in Million-Plus Cities (India, 2020)
data-analysis data-science data-visualization exploratory-data-visualizations jupyter-notebook numpy pandas python
Last synced: 27 Apr 2026
https://github.com/edanur-y/laptop-price-prediction-with-regression-models
Comparing the performances of multi-layer perceptron, k-nearest neighbors, random forest, gradient boosting and extreme gradient boosting regression and on laptop data to predict the price.
data-analysis data-transformation feature-importance hyperparameter-tuning python
Last synced: 27 Apr 2026
https://github.com/elakkiya-u/digital-marketing-campaign-conversion-prediction
A Predictive Modelling whether a customer will convert based on digital marketing campaign data.
campaign-analytics churn-prediction data-analysis deployment digital-marketing-analytics machine-learning power-bi predictive-modelling presentation-slides python
Last synced: 27 Apr 2026
https://github.com/hfzdzakii/dicoding-airqualityanalysisdata
This repo is a master submission for my Dicoding Final Project. Air Quality Dataset is being used to fulfill the submission. Feel free to explore and I hope my work give you some insight!
data-analysis data-visualization streamlit
Last synced: 27 Apr 2026
https://github.com/airdac/ml-palmerpenguins
Classification and analysis of the palmerpenguins dataset in Python. Team project from UPC's Master's Degree in Data Science
classification data-analysis data-science machine-learning palmer-penguin python upc
Last synced: 07 Jun 2026
https://github.com/lotfiferaga/hotel-reviews-sentiment-analysis
Efficient Python-driven sentiment analysis for hotel reviews, providing insightful evaluations.
data-analysis data-visualization nlp python
Last synced: 07 Jun 2026
https://github.com/josedanielchg/1990s-netflix-movie-insight
Small exploratory analysis of Netflix movie data from the 1990s. This project is part of the DataCamp Associate Data Scientist in Python program and focuses on filtering, visualizing, and extracting insights from a dataset using Python. Analyze trends in movie durations and count short action films to practice key data science skills!
Last synced: 27 Apr 2026
https://github.com/bheemisme/employee-attrition-analysis
A Dashboard on employee-attrition-analysis
dashboard data-analysis data-science plotly plotly-dash python
Last synced: 28 Apr 2026
https://github.com/tillscode/personal-finance-ml-analysis
Machine learning analysis of personal financial data with predictive modeling and interactive dashboard
dashboard data-analysis finance machine-learning python scikit-learn
Last synced: 28 Apr 2026
https://github.com/sferez/simple_linear_regression
Simple Linear Regression using Python
data-analysis data-science linear-regression python regression
Last synced: 28 Apr 2026
https://github.com/sweta2501/netflix_dataanalysis
With the help of Netflix Data, I have done some Data Analysis.
data-analysis data-science jupyter-notebook python
Last synced: 28 Apr 2026
https://github.com/sujata-adhikari/data-analysis
Data analysis of Market sales data using PowerBi, created dashboard to show analysis.
data-analysis excel pandas powerbi
Last synced: 12 Jun 2026
https://github.com/stefagnone/movies-dataset-analysis-project
Comprehensive analysis of the Movies dataset, exploring genre trends, comparisons, and qualitative insights using Python, Pandas, and visualizations. Designed to uncover actionable findings for stakeholders.
data-analysis data-visualization exploratory-data-analysis matplotlib movies-analysis pandas python seaborn storytelling-with-data
Last synced: 28 Apr 2026
https://github.com/datalopes1/warehouse_rfv
Neste projeto será realizada uma análise do tipo RFV (Recência, Frequência e Valor) com dados que encontrei neste video no Youtube do canal Jie Jenn.
analise-rfv data-analysis data-science kmeans python rfm-analysis
Last synced: 28 Apr 2026
https://github.com/hadson0/chess-live-ratings-data
A study project focused on web scraping the live chess ratings from chess.com, with data analysis and visualization on nearly 5000 players in the classical world ranking.
beautifulsoup chess data-analysis data-visualization numpy pandas python seaborn web-scraping
Last synced: 28 Apr 2026
https://github.com/emmanuelletocs/steam-game-recommender
A powerful recommendation system for Steam games, combining Content-Based and Collaborative Filtering techniques. Built with Python, Scikit-learn, and Streamlit to deliver accurate, real-time game recommendations. Perfect for gamers and data scientists interested in building intelligent recommendation engines.
als-algorithm data-analysis gaming-industry knn machine-learning mds mysql ncf neural-network pyspark recommendation-engine recommendation-system scikit-learn spark
Last synced: 28 Apr 2026
https://github.com/rajivaleaakash/customer-churn-prediction
A machine learning project focused on predicting customer churn using various data analysis and modeling techniques. The repository includes data preprocessing, feature engineering, exploratory data analysis (EDA), model training, evaluation, and visualization to help businesses identify customers at risk of leaving.
churn-prediction classification customer-churn data-analysis data-science gridsearchcv imblearn machine-learning numpy pandas pyhton randomsearchcv scikit-learn
Last synced: 28 Apr 2026
https://github.com/abdeldjalilchafai/us-flight-delay-eda
Structured EDA on 2015 US flight delay data. Clean, reproducible notebook using a 6-step data analysis framework for real-world datasets.
data-analysis data-cleaning eda exploratory-data-analysis flight-delays kaggle matplotlib numpy pandas python seaborn
Last synced: 28 Apr 2026
https://github.com/sufyan14/weather-data-analysis
A Streamlit dashboard that forecasts 30-day weather trends using uploaded CSV data and Facebook Prophet.
data-analysis python streamlit
Last synced: 28 Apr 2026
https://github.com/shreeparab1890/indian-elections-2019-analysis-eda
This ipython notebook is the Exploratory data analysis (EDA) of the Indian Lok Sabha Elections 2019.
data data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib numpy pandas plotly python python3 visualization
Last synced: 28 Apr 2026
https://github.com/delonnewman/relational
Relational programming for Ruby
csv csv-import data data-analysis database export json relational relational-algebra relational-database relational-model relational-programming reporting reports ruby yaml
Last synced: 28 Apr 2026
https://github.com/hayatiyrtgl/arima_linearregression_xgboost_time_series_analysis
This Python script conducts various data processing, visualization, and modeling tasks on a dataset.
arima arima-forecasting arima-model data-analysis data-visualization linear-models linear-regression machine-learning machine-learning-algorithms pandas python xgboost xgboost-regression
Last synced: 28 Apr 2026
https://github.com/matheusafonseca/python-data-visualization-matplotlib-seaborn-masterclass-udemy
This repository is dedicated to storing the code developed during the "Python Data Visualization: Matplotlib & Seaborn Masterclass" course on Udemy.
charts data-analysis data-analysis-python data-science data-visualization database graphics graphics-programming jupyter-notebook matplotlib matplotlib-plots python python3 seaborn seaborn-plots
Last synced: 28 Apr 2026
https://github.com/manalisbhavsar/stock-price-prediction
Stock Price Prediction model using Machine Learning and LSTM to forecast future stock prices based on historical data. Achieved a low error rate of 3.2% by leveraging moving averages and deep learning techniques, ensuring accurate predictions.
data-analysis deep-learning lstm machine-learning matplotlib numpy pandas python
Last synced: 28 Apr 2026
https://github.com/abhi227070/car-price-prediction
This project implements a machine learning model to predict the price of cars based on various features such as mileage, manufacturing date, fuel type, and more. Users can input car information, and the model will estimate the price of the car based on the provided data. This tool can be useful for both car buyers and sellers to estimate car price.
data-analysis machine-learning machine-learning-algorithms machinelearning python3 regression regression-models scikit-learn scikitlearn-machine-learning
Last synced: 28 Apr 2026
https://github.com/buabaj/fortran-assignment
code repository for fortran and python climatology assignment.
big-data climatology data-analysis data-visualization fortran90 python
Last synced: 28 Apr 2026
https://github.com/priyanshubiswas-tech/e-commerce_data_analysis
Analyzes 9,994 e-commerce transactions to uncover insights on sales trends, customer behavior, profitability, and logistics using EDA and visualization. Identifies top products, customer segments, and shipping efficiencies to optimize marketing, inventory, and operations, making it valuable for retail, finance, and logistics.
data data-analysis data-visualization pandas pandas-dataframe plotly-analytics-projects plotly-express python
Last synced: 28 Apr 2026
https://github.com/szapp/candyanalysis
Case study: Analyze the candy power ranking to identify and recommend popular candy characteristics
data-analysis data-visualization feature-selection interaction-terms
Last synced: 28 Apr 2026
https://github.com/tanzeelgcuf/medical-information-rule-based-prediction-model-with-api
a rule based system, that learns to make new rules, for medical information that will take the first 11 text fields and predict the last 2 text fields - the diagnosis and disposition. I needs to show the key words used to make the predicted diagnosis and disposition.
data-analysis django machine-learning-algorithms openpyxl python python3 rest-api
Last synced: 28 Apr 2026
https://github.com/leosimoes/alura-7daysofcode-dados
Desafios das Trilhas de Dados - Ciência de Dados, Machine Learning e Python Pandas.
data-analysis data-science jupyter-notebook machine-learning python
Last synced: 28 Apr 2026
https://github.com/ricram2/column-name-extractor
Jupyter Notebook. Takes Folder with one or more CSV and gives back one CSV with a compendium of column names and 3 example values (first, random, random)
Last synced: 29 Apr 2026
https://github.com/prady2309/sales-prediction-using-python
Implemented using Multiple Linear Regression
data-analysis data-science machine-learning python
Last synced: 29 Apr 2026
https://github.com/emircanakyuzz/veri_gorsellestirilmesi_ve_analizi-analysis_and_visualization_of_dataset
Bu çalışmada numpy, pandas, seaborn ve matplotlib gibi veri biliminde çokca bilinen modülleri kullanarak analiz ve görselleştirme işlemleri gerçekleştirdim.
data-analysis data-science data-visualization jupyter-notebook python
Last synced: 29 Apr 2026
https://github.com/thanaraklee/pyspark-dataframe-operations
This project focuses on utilizing PySpark DataFrames to analyze and visualize data sourced from external datasets, such as CSV files. It provides a practical example of how to manipulate, transform, and gain insights from large datasets using the PySpark framework.
data-analysis dataframe pyspark python
Last synced: 29 Apr 2026
https://github.com/kawshik-khan/fake-news-analysis
A fake news detection ML model. It utilizes the Bag of Words model for text vectorization and a Multinomial Naive Bayes classifier to predict whether news articles are real or fake. The project covers data preprocessing, model training, and performance evaluation with accuracy metrics and a confusion matrix.
data-analysis data-science machine-learning ml python3
Last synced: 08 Jun 2026
https://github.com/marcinz20/anomaly-detection-in-credo-dataset
University project, which goal is to build a system, that detects anomalies in CREDO dataset
credo data-analysis data-science encoder-decoder-model jupiter-notebook pca-analysis python3
Last synced: 29 Apr 2026
https://github.com/prateek5525/yt-analysis-project
This project utilizes the YouTube Data API to analyze channel and video performance, offering insights into subscriber counts, views, video metrics, and monthly trends. It generates visual reports and exports data in CSV format, aiding in effective decision-making and performance tracking.
data-analysis jupyter-notebook python3 seaborn-plots youtube-api
Last synced: 29 Apr 2026
https://github.com/nivasharmaa/spiderverse
A comprehensive Java program for analyzing and managing events and data points within a fictional spiderverse. Features event handling, anomaly detection, cluster management, and robust file I/O operations.
advanced-algorithms anomaly-detection clustering data-analysis file-io object-oriented-programming
Last synced: 29 Apr 2026
https://github.com/kasraskari/learn-r-codes
A learning repository for R programming, covering data manipulation, visualization, and statistical analysis. (Work in progress!) 🚧
data-analysis data-analysis-r data-visualization r r-examples r-graphics r-statistics statistics
Last synced: 08 Jun 2026
https://github.com/jakebrehm/ezpz-plotting
📈 Easily visualize and manipulate plots from multiple data files.
data-analysis data-visualization engineering matplotlib matplotlib-pyplot pandas plotting python python-3 software software-engineering tkinter tkinter-gui
Last synced: 29 Apr 2026
https://github.com/mumtaz4118/scraping-medium-and-data-analytics
The file DataExtraction.py extracts information from the json files scrapped by the scrapper medium_scrapper_post.py. To extract information from json files scrapped by medium_scrapper_tag_archive.py (scrapping from tags archive) then use Data_Extraction_Archive_Tags.py
data data-analysis data-analytics data-extraction data-preprocessing data-science data-scraping deep-learning machine-learning python
Last synced: 29 Apr 2026
https://github.com/anilyigitsel/istanbul-rental-apartments-analysis
This project analyzes the Istanbul Rental Apartments Dataset (2025), which includes rental apartment listings from Istanbul, Turkey.
data-analysis data-visualization jupyter-notebook matplotlib pandas python rental-housing
Last synced: 29 Apr 2026
https://github.com/eco786786/restaurant_orders
This analysis seeks to uncover patterns in customer behaviour by examining restaurant order data.
data-analysis git postgresql tableau
Last synced: 29 Apr 2026
https://github.com/saroshfarhan/kaggle-playground-s4e11
Kaggle old competirion just for practice
data-analysis data-science data-visualization jupiter-notebook python3
Last synced: 29 Apr 2026
https://github.com/rafgpereira/obmep-analise
Código que analisa a retrospectiva das premiações da Obmep em determinada localidade e escola
data-analysis excel pandas python
Last synced: 29 Apr 2026
https://github.com/findmyway/dataframe-in-julia
A quick introduction of DataFrame in Julia for users from Python
data-analysis dataframe julia jupyter-notebook
Last synced: 29 Apr 2026
https://github.com/lankesathwik7/sql-query-assistant
Natural language to SQL query converter using Groq LLM. Ask questions in plain English and get SQL queries, visualized results, and natural language explanations. Built with Streamlit and PostgreSQL.
data-analysis database groq llm natural-language-processing python sql
Last synced: 29 Apr 2026
https://github.com/fatihilhan42/starbucks_analysis_turkey_and_world_with_python
In this project, firstly the brands for coffee in the world and then these brands in Turkey were examined. The data from the dataset, which you can find in the repo, was first organized using data cleaning algorithms. These cleaned data were then graphically extracted using data visualization algorithms.
data-analysis data-cleaning data-science data-visualization jupyter-notebook python
Last synced: 29 Apr 2026