Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-07-01 00:07:23 UTC
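As a rough illustration of the four activities named in the definition above (inspecting, cleansing, transforming, and modeling), the following minimal sketch uses only the Python standard library. The records, field names, and function names are hypothetical and invented for this example; they are not taken from any repository listed here, and the "model" is deliberately reduced to a per-group mean.

```python
from statistics import mean

# Hypothetical raw records; field names and values are invented for illustration.
raw = [
    {"city": "Oslo", "temp_c": "4.5"},
    {"city": "oslo ", "temp_c": "5.5"},   # inconsistent casing and whitespace
    {"city": "Lima", "temp_c": ""},       # missing value
    {"city": "Lima", "temp_c": "19.0"},
]

def inspect(rows):
    """Inspect: count rows whose temperature field is empty."""
    return sum(1 for r in rows if not r["temp_c"].strip())

def cleanse(rows):
    """Cleanse: drop rows with missing temperatures and normalize city names."""
    return [
        {"city": r["city"].strip().title(), "temp_c": r["temp_c"]}
        for r in rows
        if r["temp_c"].strip()
    ]

def transform(rows):
    """Transform: convert temperature strings to floats."""
    return [{"city": r["city"], "temp_c": float(r["temp_c"])} for r in rows]

def model(rows):
    """Model: summarize the data as the mean temperature per city."""
    groups = {}
    for r in rows:
        groups.setdefault(r["city"], []).append(r["temp_c"])
    return {city: mean(values) for city, values in groups.items()}

missing = inspect(raw)
summary = model(transform(cleanse(raw)))
print(missing, summary)
```

Real projects usually replace each step with a library such as pandas or scikit-learn, but the order of the steps tends to stay the same.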
https://github.com/xjwllmsx/hacker-news-engagement
Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.
data data-analysis jupyter python
Last synced: 25 Apr 2026
https://github.com/badranalyst/student-tests-data-analysis-application
Python-based analysis of student test scores in math, reading, and writing, examining correlations with parental education, lunch type, and test preparation. Includes data cleaning, visualization, and statistical insights into factors influencing academic performance.
data-analysis data-visualization dataset matplotlib numpy pandas python sklearn
Last synced: 05 May 2026
https://github.com/andersoncrs/aprendizaje_no_supervisado_kmeans_customers
This repository contains an analysis of shopping-mall customer data using unsupervised learning techniques, specifically K-Means and hierarchical clustering. The goal of the project is to segment customers into homogeneous groups to better understand their behaviors and characteristics.
data-analysis kmeans-clustering matplotlib numpy seaborn visualization
Last synced: 10 May 2026
https://github.com/marielachirinosr/bellabeat-wellness-data-trends
Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.
data data-analysis data-visualization pandas python python3 tableau tableau-public
Last synced: 25 Apr 2026
https://github.com/m-biriulova/python-job-market-analysis
Web scraping, data analysis, and visualization of Python developer vacancies in the Czech Republic.
automation beautifulsoup data-analysis data-visualization portfolio-project python selenium web-scraping
Last synced: 25 Apr 2026
https://github.com/sarangs1621/weather-prediction
Weather Prediction Using Machine Learning is a project that leverages machine learning algorithms to predict weather conditions based on historical data. It evaluates three popular ML models (Decision Tree, KNN, and Logistic Regression) and provides performance insights through metrics and visualizations.
data-analysis decision-tree jupyter-notebook knn logistic-regression machine-learning predictive-modeling python scikit-learn weather-prediction
Last synced: 25 Apr 2026
https://github.com/docuvesta/youtube-api-fragrance-channel-analytics
Engagement metrics analysis of a perfume YouTube channel using the YouTube API 🎀
analysis beauty-products comments data-analysis data-analysis-python engagement-metrics insights jupyter-notebook likes-count marketing marketing-analytics perfume python views-count youtube youtube-api youtube-api-v3
Last synced: 03 May 2026
https://github.com/aastopher/mma_outcome
Simple exploratory analysis of UFC Fights and Vegas fight odds from 1993 to 2021
data-analysis data-visualization
Last synced: 06 Jun 2026
https://github.com/marielachirinosr/hotel-data-analysis
Pandas & Matplotlib Learning Analysis. Repository featuring data analysis projects using Pandas and Matplotlib libraries
data data-analysis matplotlib pandas python
Last synced: 25 Apr 2026
https://github.com/devexpress-examples/winforms-create-a-custom-exporter-for-pivotgridcontrol-with-xtrareport
This example illustrates how to dynamically create a custom report based on PivotGridControl content in WinForms.
data-analysis dotnet pivot-grid pivot-grid-for-winforms winforms
Last synced: 26 Apr 2026
https://github.com/chandansoren/customer-personality-analysis
Predict how different customer segments will respond to a particular product or service.
data-analysis data-visualization python
Last synced: 26 Apr 2026
https://github.com/dcs-training/2023-10-22-carpentry-social-science
Go to https://dcs-training.github.io/2023-10-22-Carpentry-Social-Science/ to follow along with the material.
data-analysis data-visualisation data-wrangling intro-to-programming r
Last synced: 06 Jun 2026
https://github.com/mozeel-v/spam-detection
ML-powered SMS Spam Classifier using NLP and Scikit-learn. Detects and filters spam messages with interactive Streamlit UI.
classification data-analysis mnb streamlit
Last synced: 10 May 2026
https://github.com/rociobenitez/happiness-index-data-processing
Repository for Big Data Processing - Contains Jupyter Notebooks and Datasets for data analysis and processing tasks related to Big Data.
big-data big-data-processing data-analysis data-processing happiness-index happiness-report jupyter-notebook matplotlib pandas seaborn
Last synced: 15 May 2026
https://github.com/crazy-dot/covid-19-analysis
This project performs an in-depth analysis and visualization of COVID-19 data, focusing on India and its states/union territories.
covid-19-india data-analysis jupyter-notebook matplotlib pandas python3 seaborn
Last synced: 10 May 2026
https://github.com/ys1f/geothermal_project
Geothermal Data Analysis & Visualization for Texas – well data, temperature gradients & zone mapping
bht bottom-hole-temperature data-analysis folium geopandas geospatial geothermal gis interpolation irena jupyter-notebook mapping python rasterio spatial-analysis temperature-gradient texas visualization well-data zone-mapping
Last synced: 26 Apr 2026
https://github.com/deliprofesor/cinematic-data-analytics-and-recommendation-platform
This project analyzes a movie dataset using machine learning algorithms to predict success, explore revenue-popularity relationships, and develop recommendation systems. It employs techniques like K-Means, DBSCAN, GMM, decision trees, PCA, and NLP for insights and personalized suggestions.
clustering content-based-recommendation data-analysis data-visualization decision-tree gmm k-means machine-learning natural-language-processing nlp pca predictive-modeling python recommendation-system scikit-learn user-based-recommendation
Last synced: 26 Apr 2026
https://github.com/swapnanildutta/prediction-with-python
The projects are made using Jupyter Notebook
data-analysis jupyter-notebook machine-learning prediction python regression-models
Last synced: 27 Apr 2026
https://github.com/moshora99/sql-data-warehouse-project
Build a modern data warehouse with MySQL, including ETL processes, data modeling, and analytics.
data-analysis data-engineering data-science database datawarehouse datawarehousing etl scheme sql sql-query sql-server
Last synced: 27 Apr 2026
https://github.com/akashvarma26/data-analysis-on-imbd-using-sqlite3
Data Analysis on IMDb dataset using sqlite3 and Pandas in Jupyter notebook.
data-analysis jupyter-notebook pandas-dataframe sqlite
Last synced: 27 Apr 2026
https://github.com/shreyaamenon/data-analysis-aiml-mini-projects
Mini projects to help me grow my skills in data analysis, artificial intelligence, and machine learning.
ai data-analysis jupyter-notebook machine-learning python
Last synced: 11 Apr 2026
https://github.com/mumtaz4118/amazon-iphone-12-data-scrapped
Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.
data-analysis data-extraction data-science data-scraping html mark-up python
Last synced: 27 Apr 2026
https://github.com/aksoni07/movie-recommendation
A hybrid movie recommendation system designed to deliver personalized and accurate suggestions by combining user preferences, item attributes, and collaborative patterns, ensuring a seamless and engaging experience.
clustering content-based-filtering data-analysis embeddings jupyter-notebook numpy ollaborative-filtering pandas personalization python recommendation-systems scikit-learn user-item-interactions
Last synced: 11 Apr 2026
https://github.com/malexandersalazar/covid-19-peru-estimacion-oxigeno-requerido
Technical analysis of confirmed COVID-19 cases in Peru to estimate the medical oxygen required.
covid-19 data-analysis data-science peru python
Last synced: 27 Apr 2026
https://github.com/as16082023/project-portfolio
A guide to all my projects
dashboard data-analysis data-cleaning data-visualization excel mysql power-bi python sql tableau
Last synced: 27 Apr 2026
https://github.com/manasashetty01/regulatory-affairs-of-road-accidents
Regulatory Affairs of Road Accidents in Million-Plus Cities (India, 2020)
data-analysis data-science data-visualization exploratory-data-visualizations jupyter-notebook numpy pandas python
Last synced: 27 Apr 2026
https://github.com/garcane/exodus_analysis
This project analyses cryptocurrency transaction data exported from the Exodus wallet. The goal is to explore and visualize the inflows and outflows of assets, the types of transactions, and other key metrics over time.
bitcoin btc crypto cryptocurrencies cryptocurrency data-analysis data-visualization eth ethereum pandas seaborn
Last synced: 27 Apr 2026
https://github.com/elakkiya-u/digital-marketing-campaign-conversion-prediction
Predictive modelling of whether a customer will convert, based on digital marketing campaign data.
campaign-analytics churn-prediction data-analysis deployment digital-marketing-analytics machine-learning power-bi predictive-modelling presentation-slides python
Last synced: 27 Apr 2026
https://github.com/mnkanout/patients_medication_prediction
The aim of the project is to create a model that can help medical professionals select the proper medication for patients based on their symptoms. The model uses historical data of other patients to predict what could be the most suitable medication based on the patient's symptoms.
data data-analysis data-science data-visualization decision-tree-classifier machine-learning python3
Last synced: 29 Jun 2025
https://github.com/caesaredia/food-app-user-behavior-analysis
Analyze user behavior and optimize app experience in a food-tech startup through funnel analysis and A/A/B testing. Includes data prep, visualization, and statistical testing in Python.
a-b-testing chi-square data-analysis data-visualization funnel-analysis python statistical-testing user-behavior
Last synced: 27 Apr 2026
https://github.com/banyc/dfplot
Summarize a data frame by plotting. `cargo install --git https://github.com/Banyc/dfplot.git`.
csv data-analysis plotly plotting statistics
Last synced: 27 Apr 2026
https://github.com/luca-02/credit-card-fraud-detection
This is a small master's degree project for New Generation Data Models and DBMSs course (academic year 2024/25).
data-analysis database nosql python
Last synced: 10 Jun 2026
https://github.com/audy21/datacamp
Learning portfolio documenting my progress, while taking Data Analyst & Data Science certifications from DataCamp.
data-analysis data-science machine-learning matplotlib numpy pandas python scikit-learn seaborn
Last synced: 11 Apr 2026
https://github.com/lotfiferaga/hotel-reviews-sentiment-analysis
Efficient Python-driven sentiment analysis for hotel reviews, providing insightful evaluations.
data-analysis data-visualization nlp python
Last synced: 07 Jun 2026
https://github.com/josedanielchg/1990s-netflix-movie-insight
Small exploratory analysis of Netflix movie data from the 1990s. This project is part of the DataCamp Associate Data Scientist in Python program and focuses on filtering, visualizing, and extracting insights from a dataset using Python. Analyze trends in movie durations and count short action films to practice key data science skills!
Last synced: 27 Apr 2026
https://github.com/bheemisme/employee-attrition-analysis
A dashboard for employee attrition analysis
dashboard data-analysis data-science plotly plotly-dash python
Last synced: 28 Apr 2026
https://github.com/sferez/simple_linear_regression
Simple Linear Regression using Python
data-analysis data-science linear-regression python regression
Last synced: 28 Apr 2026
https://github.com/codingvangogh/data-science
Data Science, Machine Learning, Data Exploration, Big Data etc
data-analysis datascience decision-tree-classifier decision-tree-regression heatmap jupyter-notebook machinelearning python python3 ridge-regression seaborn sklearn svm-classifier
Last synced: 11 May 2026
https://github.com/sujata-adhikari/data-analysis
Data analysis of market sales data using Power BI, with a dashboard presenting the results.
data-analysis excel pandas powerbi
Last synced: 12 Jun 2026
https://github.com/farhad-here/median-performance-comparison
Benchmarking the performance of median calculation using vanilla Python vs NumPy.
data-analysis matplotlib numpy python
Last synced: 18 Apr 2026
https://github.com/stefagnone/movies-dataset-analysis-project
Comprehensive analysis of the Movies dataset, exploring genre trends, comparisons, and qualitative insights using Python, Pandas, and visualizations. Designed to uncover actionable findings for stakeholders.
data-analysis data-visualization exploratory-data-analysis matplotlib movies-analysis pandas python seaborn storytelling-with-data
Last synced: 28 Apr 2026
https://github.com/melissaantunes/ibm-data-analyst-professional
IBM Data Analyst Professional Certificate
analyze-data data-analysis data-analyst data-manipulation data-science data-visualization ibm-data-analyst-professional pandas python
Last synced: 11 May 2026
https://github.com/haseebn19/urban-housing-demand
A full-stack web application for visualizing housing and labour market data
data-analysis data-visualization docker full-stack gradle statistics web webapp
Last synced: 22 Jun 2026
https://github.com/shinie19/sql-data-warehouse-project
Build a modern Data Warehouse from scratch with SQL Server, including ETL processes, data modeling and analytics.
data-analysis data-analytics data-cleaning data-engineering data-lake data-lakehouse data-modeling data-normalization data-science data-standardization data-warehouse etl-pipeline medallion-architecture sql-server
Last synced: 29 Jun 2026
https://github.com/rajivaleaakash/customer-churn-prediction
A machine learning project focused on predicting customer churn using various data analysis and modeling techniques. The repository includes data preprocessing, feature engineering, exploratory data analysis (EDA), model training, evaluation, and visualization to help businesses identify customers at risk of leaving.
churn-prediction classification customer-churn data-analysis data-science gridsearchcv imblearn machine-learning numpy pandas pyhton randomsearchcv scikit-learn
Last synced: 28 Apr 2026
https://github.com/elmezianech/autoinventory
This project is an end-to-end, fully automated warehouse management solution designed to tackle real-world inventory challenges in the FMCG sector. From real-time data ingestion and predictive analytics to interactive dashboards, this project combines cutting-edge technologies and an event-driven architecture to simulate a business-ready system.
automation dashboard data-analysis data-engineering-pipeline docker etl glue-job inventory-management kafka kpis lambda-functions lstm ml-pipeline mlflow power-bi pytorch redshift s3 streamlit warehouse-management
Last synced: 28 Apr 2026
https://github.com/abdeldjalilchafai/us-flight-delay-eda
Structured EDA on 2015 US flight delay data. Clean, reproducible notebook using a 6-step data analysis framework for real-world datasets.
data-analysis data-cleaning eda exploratory-data-analysis flight-delays kaggle matplotlib numpy pandas python seaborn
Last synced: 28 Apr 2026
https://github.com/sufyan14/weather-data-analysis
A Streamlit dashboard that forecasts 30-day weather trends using uploaded CSV data and Facebook Prophet.
data-analysis python streamlit
Last synced: 28 Apr 2026
https://github.com/shreeparab1890/indian-elections-2019-analysis-eda
This IPython notebook contains an exploratory data analysis (EDA) of the Indian Lok Sabha Elections 2019.
data data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib numpy pandas plotly python python3 visualization
Last synced: 28 Apr 2026
https://github.com/delonnewman/relational
Relational programming for Ruby
csv csv-import data data-analysis database export json relational relational-algebra relational-database relational-model relational-programming reporting reports ruby yaml
Last synced: 28 Apr 2026
https://github.com/27ahmad/ibm-data-science-capstone
The Capstone is the final course in the IBM Data Science Professional Certificate program. It's a project that combines all the skills and knowledge you've gained throughout the specialization.
data-analysis data-science folium-maps machine-learning plotly-dash python sql
Last synced: 26 May 2026
https://github.com/manalisbhavsar/stock-price-prediction
Stock Price Prediction model using Machine Learning and LSTM to forecast future stock prices based on historical data. Achieved a low error rate of 3.2% by leveraging moving averages and deep learning techniques, ensuring accurate predictions.
data-analysis deep-learning lstm machine-learning matplotlib numpy pandas python
Last synced: 28 Apr 2026
https://github.com/zimmi48/nixpkgs-issues
Analysis on nixpkgs issue lifetime.
data-analysis github-api nixpkgs
Last synced: 10 May 2026
https://github.com/delabrov/jwstoolkit
A python package for handling JWST observations
astronomy-astrophysics data-analysis data-cube data-visualization imagery jwst python3 spectroscopic-data
Last synced: 26 May 2026
https://github.com/sisolieri/prova_ds_saloocupacio2024
Admission challenge for the Hackató Saló Ocupació by Barcelona Activa
arima barcelona catboost data-analysis data-visualizations forecasting machine-learning pandas public-funding python scikit-learn time-series xgboost
Last synced: 10 Apr 2026
https://github.com/ericdataplus/kaggle-airbnb-nyc
NYC Airbnb Market Analysis: Multi-source from 2 Kaggle datasets (151K listings)
airbnb data-analysis kaggle nyc python visualization
Last synced: 28 Apr 2026
https://github.com/szapp/candyanalysis
Case study: Analyze the candy power ranking to identify and recommend popular candy characteristics
data-analysis data-visualization feature-selection interaction-terms
Last synced: 28 Apr 2026
https://github.com/wei-rongrong2/openfoodfactclustering
A project that explores clustering food products based on nutritional attributes using K-Means, Fuzzy C-Means, and DBSCAN algorithms, with a Streamlit dashboard for visualizing results.
clustering dashboard data-analysis dbscan food-products fuzzy-cmeans k-means machine-learning nutrition nutrition-clustering open-food-facts streamlit
Last synced: 28 Apr 2026
https://github.com/haideratgh/sql-data-analytics-project
This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.
analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-query sql-server window-functions-in-sql
Last synced: 29 Jun 2025
https://github.com/stas1f1/methods-and-models-for-multivariate-data-analysis
Completed tasks for the course on methods of multivariate data analysis, first year of the master's program, FDT ITMO
data-analysis multivariate-analysis python
Last synced: 10 Mar 2026
https://github.com/analysisbyvivek/crime-data
Analyzes crime patterns across different areas, exploring factors such as crime type, weapon usage, demographic influences, and geographic distribution to uncover trends in frequency, correlations, and hotspots.
apache-superset data-analysis eda jupyter-notebook python
Last synced: 11 May 2026
https://github.com/josedanielchg/efficient-data-storage-for-predictive-modeling
DataCamp project from the Associate Data Scientist track, focusing on optimizing dataset storage by transforming data types and filtering. Prepares data for efficient machine learning workflows
cleaning-dataset data-analysis jupyter-notebook python
Last synced: 28 Apr 2026
https://github.com/bala-1409/titanic-survived-prediction-datascience-classification-project
This project predicts whether a passenger on the Titanic survived, using machine learning algorithms applied to the given passenger data.
classification-algorithm data-analysis data-cleaning data-preprocessing data-science data-visualization eda exploratory-data-analysis gradient-boosting jupyter-notebook machine-learning-algorithms matplotlib predictive-modeling python3 seaborn
Last synced: 28 Apr 2026
https://github.com/kisaa-fatima/data-visualization-with-tableauleu
Conducted exploratory data analysis (EDA) on the Berkeley Earth dataset, a large-scale dataset featuring high-resolution land and ocean time series data. Created interactive dashboards using Tableau to visualize and highlight trends and patterns within the data.
data-analysis data-science exploratory-data-analysis insights python tableau visualizations
Last synced: 29 Apr 2026
https://github.com/prady2309/sales-prediction-using-python
Implemented using Multiple Linear Regression
data-analysis data-science machine-learning python
Last synced: 29 Apr 2026
https://github.com/devexpress-examples/web-forms-pivot-grid-change-summary-display-mode
This example shows how to use different summary display modes in Pivot Grid for Web Forms.
asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms
Last synced: 29 Apr 2026
https://github.com/emircanakyuzz/veri_gorsellestirilmesi_ve_analizi-analysis_and_visualization_of_dataset
In this work, I performed analysis and visualization using modules that are widely known in data science, such as numpy, pandas, seaborn, and matplotlib.
data-analysis data-science data-visualization jupyter-notebook python
Last synced: 29 Apr 2026
https://github.com/thanaraklee/pyspark-dataframe-operations
This project focuses on utilizing PySpark DataFrames to analyze and visualize data sourced from external datasets, such as CSV files. It provides a practical example of how to manipulate, transform, and gain insights from large datasets using the PySpark framework.
data-analysis dataframe pyspark python
Last synced: 29 Apr 2026
https://github.com/kawshik-khan/fake-news-analysis
A fake news detection ML model. It utilizes the Bag of Words model for text vectorization and a Multinomial Naive Bayes classifier to predict whether news articles are real or fake. The project covers data preprocessing, model training, and performance evaluation with accuracy metrics and a confusion matrix.
data-analysis data-science machine-learning ml python3
Last synced: 08 Jun 2026
https://github.com/devexpress-examples/winforms-visualize-pivot-grid-data-in-chart
The following example shows how to integrate the Pivot Grid with the Chart control.
charting data-analysis dotnet pivot-grid-for-winforms winforms
Last synced: 29 Apr 2026
https://github.com/nivasharmaa/spiderverse
A comprehensive Java program for analyzing and managing events and data points within a fictional spiderverse. Features event handling, anomaly detection, cluster management, and robust file I/O operations.
advanced-algorithms anomaly-detection clustering data-analysis file-io object-oriented-programming
Last synced: 29 Apr 2026
https://github.com/rodrigojunqueiradev/data-exploration-and-cleaning
Credit Analysis Data: Foundations for Cleaning and Exploration
data-analysis data-engineering data-science data-visualization datascience matplotlib matplotlib-pyplot numpy pandas python python-3 python3
Last synced: 13 Apr 2026
https://github.com/kasraskari/learn-r-codes
A learning repository for R programming, covering data manipulation, visualization, and statistical analysis. (Work in progress!) 🚧
data-analysis data-analysis-r data-visualization r r-examples r-graphics r-statistics statistics
Last synced: 08 Jun 2026
https://github.com/satyacoder29/crowdfunding-in-sql
Crowdfunding is a method of raising funds for projects or causes by collecting small contributions from a large group of people, usually through online platforms. It enables individuals, startups, and nonprofits to secure funding, offering rewards or recognition in exchange, and helps bring ideas to life without traditional financing.
data-analysis data-cleaning database-management mysql-database quries sql sql-functions sql-server views
Last synced: 29 Apr 2026
https://github.com/ceia-prefeitura/urban-lit-tracker-etl
UrbanLitTracker collects academic articles on urban change via the OpenAlex API, then processes them and stores them in MongoDB. It offers an interactive dashboard built with Dash that shows data such as the most relevant works, authors, and frequent keywords, making it easier to analyze and visualize the urban literature.
academic-research bibliometrics data-analysis data-pipeline data-visualization etl openalex-api urban-studies
Last synced: 11 May 2026
https://github.com/mdaffailhami/king_county_home_sales_analysis
This repository contains code and analysis for exploring home sales data in King County, featuring geospatial mapping to visualize trends and factors influencing housing prices, including location, size, and various property features, using Python and popular data analysis libraries.
data-analysis data-science folium-maps geospatial python
Last synced: 29 Apr 2026
https://github.com/deliprofesor/amazon-movie-analysis-and-visualization
"Amazon Movie Analysis and Visualization" is a Python project that analyzes and visualizes movie data from Amazon.com, including ratings, directors, actors, release years, MPAA ratings, and pricing. The project provides insights into movie trends and popular films, helping users explore key patterns through interactive visualizations.
data-analysis data-visualization matplotlib pandas python
Last synced: 12 May 2026
https://github.com/eco786786/restaurant_orders
This analysis seeks to uncover patterns in customer behaviour by examining restaurant order data.
data-analysis git postgresql tableau
Last synced: 29 Apr 2026
https://github.com/dcs-training/network-analyisis-python
Course material for introducing data visualization with Altair and network analysis with NetworkX (in Python). Go to the readme file
data-analysis data-visualisation network-analysis python text-analysis
Last synced: 29 Apr 2026
https://github.com/mrjxtr/ossph_2025_survey_analysis
OSSPH_2025_Survey_Analysis
data-analysis data-visualization matplotlib nltk pandas python sentiment-analysis
Last synced: 29 Apr 2026
https://github.com/saroshfarhan/kaggle-playground-s4e11
An old Kaggle competition, just for practice
data-analysis data-science data-visualization jupiter-notebook python3
Last synced: 29 Apr 2026
https://github.com/i7t5/sentimentnlp
Sentiment analysis for COMP 435 Introduction to Machine Learning, Spring 2025
data-analysis jupyter-notebook machine-learning nlp python sentiment-analysis
Last synced: 29 Apr 2026
https://github.com/OdessaZ/Portfolio-Projects
This is a repository I have created to showcase skills, share projects and track my progress in Data Analytics and Data Science
applied-mathematics data-analysis data-science excel jupyter-notebook matplotlib-pyplot pandas portfolio python r r-studio seaborn sql statistics
Last synced: 12 May 2026
https://github.com/regmibijay/opencarp-analyzer
Reads trace files created by OpenCARP models and exports data for easy plotting with the built-in plotter script.
bioinformatics data-analysis opencarp
Last synced: 16 Jan 2026
https://github.com/lankesathwik7/sql-query-assistant
Natural language to SQL query converter using Groq LLM. Ask questions in plain English and get SQL queries, visualized results, and natural language explanations. Built with Streamlit and PostgreSQL.
data-analysis database groq llm natural-language-processing python sql
Last synced: 29 Apr 2026
https://github.com/jbalooshie/pyber_analysis
Analysis of ride share data using Matplotlib and pandas, executed in Jupyter Notebook. Breakdowns are provided based on the city size, average fare, and number of rides taken.
data-analysis data-science data-visualization jupyter-notebook matplotlib pandas python
Last synced: 12 May 2026
https://github.com/taljindergill78/yelp-arizona-analysis
This project analyzes the Yelp dataset for the state of Arizona to extract insights about restaurant businesses and user behavior. Using Apache Spark and PySpark for distributed data processing, the project demonstrates how big data tools can be used to uncover patterns in customer reviews, business performance, and user engagement.
big-data data-analysis data-engineering distributed-computing pyspark spark sql yelp-dataset
Last synced: 29 Apr 2026
https://github.com/carlos-edulira/mbabigdata-projeto
Project submission for the Unipe MBA in Big Data BI
data-analysis delta minio python spark
Last synced: 29 Apr 2026
https://github.com/ak-alien/combobullet
ComboBullet is a versatile log processing and credential extraction toolkit for Windows. It offers multiple features to filter, extract, and manage credentials and cookie data from raw .txt files. This tool is particularly useful for combo scrapers, data analysts, and penetration testers.
combo-extraction cookie-extraction credential-management data-analysis log-processing penetration-testing
Last synced: 30 Jun 2025
https://github.com/mfakhriazhar/python-data-analyst-tutorial
A collection of my Python learning files for data analysis. Covers fundamental to advanced topics such as data exploration, visualization, statistical analysis, and the use of popular libraries like Pandas, NumPy, Matplotlib, and Seaborn. Suitable for personal documentation or shared learning references.
data-analysis data-science data-visualization exploratory-data-analysis portfolio python
Last synced: 29 Apr 2026
https://github.com/roland045/bike-share-dataset-analysis
User behaviour analysis on a public bike-share dataset
data-analysis data-visualization python time-series-analysis user-behavior-analytics
Last synced: 29 Apr 2026
https://github.com/teja-1403/forage-standard-bank-data-science
This repository contains solutions to the 4 different tasks that must be performed during the Data Science virtual internship provided by Standard Bank via Forage.
automl communication-skills data-analysis data-science machine-learning python sql
Last synced: 29 Apr 2026
https://github.com/farhad-here/textprepx
A Multilingual Text Preprocessing Tool for English and Persian.
cleantext contractions data-analysis deep-learning emoji nlp nltk opp parsivar regex streamlit text-preprocessing textblob
Last synced: 29 Apr 2026
https://github.com/danpoynor/python-number-guessing-game-with-stats
A number guessing game written in Python 3 that presents median, mode, and mean statistics
console-game data-analysis number-guessing-game python3 statistics
Last synced: 26 May 2026
https://github.com/srinibas-masanta/yelp-business-reviews-analysis
This project analyzes Yelp business reviews using Python, Snowflake, and SQL, focusing on efficient data ingestion, transformation, and analysis. We preprocess JSON data, optimize ingestion via Amazon S3, classify sentiments with Python UDFs, and extract insights using SQL queries—showcasing a streamlined end-to-end workflow.
amazon-s3 data-analysis json python snowflake sql
Last synced: 29 Apr 2026
https://github.com/valikmorinko/ecommerce-sales-analysis
E-commerce sales analysis: data, visualizations, and analytical conclusions.
data-analysis e-commerce jupyter matplotlib pandas python seaborn
Last synced: 29 Apr 2026
https://github.com/ggarciajavier/udacity-dalf-project2-wrangle-openstreetmap-data
Work performed for the 2nd project of Udacity Data Analyst Nanodegree: OpenStreetMap data wrangling and analysis.
data-analysis openstreetmap python sql
Last synced: 12 May 2026
https://github.com/sdley/cas_pratique-del_annuel
Del-Annuel is annual deliberation software for higher-education schools and universities.
data-analysis pandas python tkinter-gui
Last synced: 29 Apr 2026