Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/robinmillford/environmental-monitoring-and-analysis

In this comprehensive project, I undertook an in-depth exploration of environmental data, seeking to understand the intricate relationship between Liquefied Petroleum Gas (LPG) consumption and Carbon Monoxide (CO) emissions.

data-analysis data-visualization environmental-monitoring iot jupyter-notebook machine-learning prediction python3 sql tableau

Last synced: 17 Nov 2024

https://github.com/robinmillford/predicting-diabetes-a-machine-learning-approach-to-early-intervention

The goal of this project was to develop a predictive model for diabetes using a dataset containing various health-related features

data-analysis data-science diabetes-prediction jupyter-notebook machine-learning smote

Last synced: 17 Nov 2024

https://github.com/robinmillford/animeinsights-user-feedback-analysis

In this project, I leveraged SQL queries to analyze and extract valuable insights from an "anime" dataset. The dataset includes information such as titles, scores, episode counts, genres, and popularity rankings for various anime series and movies.

anime data-analysis data-cleaning mysql

Last synced: 17 Nov 2024

https://github.com/robinmillford/optimizing-treatment-plans-through-data-analysis

The primary focus was on understanding customer health, treatment, and associated charges over multiple years.

data-analysis data-visualization healthcare mysql powerbi sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/loanalytics-investigating-financial-trends-with-world-bank-data

The project aimed to explore and analyze World Bank Loan Data, leveraging Python for data preprocessing and SQL for in-depth queries

data-analysis data-visualization jupyter-notebook mysql tableau world-bank

Last synced: 17 Nov 2024

https://github.com/robinmillford/playstore-app-insights-uncovering-app-market-trends

In my Playstore App analysis, I uncovered valuable insights about app market trends. I discovered the top-rated apps, identified popular app categories, and explored user sentiments. My findings provide a comprehensive understanding of the app landscape, aiding in informed decision-making and strategy development for app developers and marketers.

data-analysis data-cleaning data-visualization jupyter-notebook python3 sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/hr-analytics-employee-performance-analysis

HR Analytics: Unveiling Employee Performance - A comprehensive exploration of employee data using SQL and Power BI, uncovering key insights for strategic HR decision-making.

data-analysis data-visualization jupyter-notebook powerbi python3 sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/instagram-user-insights-analyzing-user-behavior

In this project, I delved into the dynamics of a popular photo-sharing website using SQL queries

data-analysis data-visualization instragram powerbi sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/customer_personality_analysis

Consumer personality analysis is a thorough examination of a business's ideal clients. This improved understanding of its customers lets a company more easily adapt its goods to the unique wants, behaviours, and concerns of various consumer types.

data-analysis data-science machine-learning python3

Last synced: 17 Nov 2024

https://github.com/robinmillford/data-professional-survey-power-bi

I worked on a Power BI project called 'Data Professional Survey Breakdown'.

data-analysis data-visualization powerbi

Last synced: 17 Nov 2024

https://github.com/robinmillford/analyzing-spotify-streaming-data

The goal of this project was to analyze a dataset of Spotify streaming data, spanning from 2014 to 2022, and extract meaningful insights related to song popularity, artists, and streaming patterns.

data-analysis jupyter-notebook python spotify sqlite3

Last synced: 17 Nov 2024

https://github.com/robinmillford/india-s-covid-19-journey-a-case-study-analysis

In this extensive project, I embarked on a profound exploration of India's journey through the COVID-19 pandemic. This endeavor involved a multi-faceted approach, encompassing data preprocessing with Python, data analysis with SQL queries, and data visualization using Power BI.

covid-19 data-analysis data-cleaning-and-preprocessing data-visualization jupyter-notebook powerbi pythin3 sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/analyzing-e-commerce-transactions---data-cleaning-cohort-analysis-and-sql

In this project, I aimed to analyze the profitability of products in an e-commerce dataset. I performed various SQL queries to extract valuable insights about product profitability, including the identification of the top 5 products with the highest profit margin, and unique combinations of brands and product lines with the highest profitability.

cohort-analysis data-analysis data-visualization excel jupyter-notebook powerbi python3 sql

Last synced: 17 Nov 2024

https://github.com/hasnathjami/data-analysis-of-covid-19

An Oracle PL/SQL-based project on COVID-19 data analysis. It is my CSE 4.1 project for the Distributed Database Management System lab.

data-analysis naive-bayes-classifier oracle-database probability-statistics sqlplus

Last synced: 17 Nov 2024

https://github.com/faizantkhan/automated-eda

This repository showcases tools for automatic Exploratory Data Analysis (EDA) in Python. These tools help you quickly understand your datasets and generate insightful reports.

automatic automation autoviz data-analysis data-analysis-python data-science data-visualization dtale dtale-library eda exploratory-data-analysis ml pandas pandas-profiling python python-library sweetviz

Last synced: 15 Nov 2024

https://github.com/faizantkhan/python_matplotlib

Matplotlib is a powerful Python library for creating visualizations and plots. It’s widely used for data representation, making complex information more accessible and interpretable. It offers various types of plots, including line graphs, scatter plots, bar charts, histograms, and more

data-analysis data-analytics data-engineering data-science data-visualization deep-learning graphs line machine-learning machine-learning-algorithms matplotlib matplotlib-pyplot matplotlib-python python

Last synced: 15 Nov 2024

https://github.com/vaishnavipaithane/cyclistic-bike-share-analysis-case-study

This capstone project was done as part of the Google Data Analytics Professional Certificate course.

data-analysis r-programming-language rstudio

Last synced: 15 Nov 2024

https://github.com/shriram-vibhute/digit_classification

This project demonstrates various machine learning techniques for classifying handwritten digits from the MNIST dataset. It covers data preprocessing, model training, evaluation, and advanced classification strategies.

classification data-analysis data-visualization machine-learning matplotlib numpy pandas sk-learn

Last synced: 15 Nov 2024

https://github.com/gabrielagodek/webscraper

The project was developed during master's studies. It is based on the Python library Scrapy.

data-analysis python scraper scrapy

Last synced: 17 Nov 2024

https://github.com/rayanwaked/wildfire-analysis

The project aims to visualize wildfire activity in Oregon, exploring related data to create visualizations and tables to analyze the historical patterns.

data data-analysis data-visualization jupyter jupyter-notebook oregon portland-state-university python wildfire-data-visualization

Last synced: 15 Nov 2024

https://github.com/rayanwaked/data-analysis

This data-analysis project looks into a large banking dataset from a Portuguese institution, attempting to uncover linear and non-linear correlations to answer a core business question.

data data-analysis data-science data-visualization jupyter-notebook python

Last synced: 15 Nov 2024

https://github.com/mikhaelmounay/salty-med

Salty Mediterranean - Grade 12 Capstone Project

data-analysis data-visualization

Last synced: 15 Nov 2024

https://github.com/mpoojithavigneswari/bangalore-house-price-prediction

This project involves creating a website that predicts Bangalore house prices with 94.65% accuracy using a machine learning algorithm.

data-analysis data-science flask-server machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 15 Nov 2024

https://github.com/prakashjha1/whatsapp-chat-analyzer

WhatsApp Analyzer analyzes WhatsApp group activity. It tracks conversations and measures how much time we spend (or "waste") on WhatsApp.

data-analysis data-science natural-language-processing pandas pyhton regular-expression

Last synced: 15 Nov 2024

https://github.com/prakashjha1/stock-investment-analysis

The Stock Investment Analysis project can help investors select better-performing stocks.

data-analysis data-science numpy pandas pandas-datareader parallel-programming python

Last synced: 15 Nov 2024

https://github.com/ahmad-ali-rafique/handwritten-digit-recognition-mnist

This project demonstrates a complete pipeline for recognizing handwritten digits using the MNIST dataset. The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation of a Fully Connected Neural Network (FCNN).

ai artificial-intelligence data data-analysis datascience deep-learning deep-neural-networks fcnn fully-connected-network machine-learning machine-learning-algorithms ml modeling

Last synced: 15 Nov 2024

https://github.com/ahmad-ali-rafique/decision-tree-regressor-modeling

Comprehensive exploration of decision tree regressors, including data cleaning, model building, and performance evaluation on various datasets.

artificial-intelligence data data-analysis dataanalytics decision-trees decisiontreeregressor modeling models regression-models

Last synced: 15 Nov 2024

https://github.com/ahmad-ali-rafique/weather-prediction-fcnn

This project demonstrates a complete pipeline for weather prediction using a Fully Connected Neural Network (FCNN). The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation.

ai artificial-intelligence data-analysis data-science deep-learning deep-neural-networks fully-connected-network machine-learning machine-learning-algorithms weather-information

Last synced: 15 Nov 2024

https://github.com/ribin-baby/the-sparks-foundation-data-science-internship

This repository contains tasks and solutions assigned as part of an internship program, including workbooks on data analysis and model building.

data-analysis eda python3

Last synced: 15 Nov 2024

https://github.com/hanzopgp/lolanalysis

League Of Legends game data engineering, analysis, visualization and machine learning. Business intelligence project.

data-analysis data-cleaning data-engineering data-visualization dataiku deep-learning etl machine-learning scraping university

Last synced: 15 Nov 2024

https://github.com/w-edward/youtube-keyword-popularity-analyzer

An effort to discover the top trending keywords on YouTube.

data-analysis node-js numpy python webscraping youtube-api

Last synced: 15 Nov 2024

https://github.com/kunalpisolkar24/winequalityprediction

Predicting wine quality using machine learning with matplotlib, numpy, pandas, and seaborn for insightful data analysis. 🍇🤖📊

data-analysis data-science data-visualization machine-learning prediction-model

Last synced: 15 Nov 2024

https://github.com/codewithmayank-py/box-office-analysis-with-seaborn-and-python

This repository contains Python code and datasets for analyzing box office data. Explore trends, patterns, and factors influencing movie performance.

analysis box-office-data-analysis data-analysis data-visualization dataset jupyter-notebook matplotlib pandas python3 seaborn

Last synced: 17 Nov 2024

https://github.com/fhdsl/seattlestatsummer_r

A 4-day introduction to R programming, focused on Fred Hutch Research Interns

beginner beginner-friendly data-analysis data-science introduction-to-programming r-programming tidyverse

Last synced: 15 Nov 2024

https://github.com/alchemine/analysis-tools

Analysis tools for machine learning projects

data-analysis explanatory-data-analysis machine-learning python

Last synced: 15 Nov 2024

https://github.com/alanjamlu34/bike-dataset

This is the final project for the Dicoding class "Becoming a Data Analyst".

data-analysis streamlit-dashboard

Last synced: 17 Nov 2024

https://github.com/michalspano/maturitna-skuska-proj

Maturita (school-leaving exam) 2021/2022 - objective data processing and analysis

data-analysis

Last synced: 17 Nov 2024

https://github.com/shivakumarhl/digital-music-store-analysis

Digital Music Store Data Analysis using SQL

data-analysis sql

Last synced: 17 Nov 2024

https://github.com/bhavanachitragar/data-analysis-using-pyspark

Uses the PySpark module in Python within the Google Colab environment to run queries on the dataset. The dataset consists of two CSV files, listening.csv and genre.csv. Query results are visualized with Matplotlib.

data-analysis google-colab pyspark-sql

Last synced: 17 Nov 2024

https://github.com/smohanta23/ev-trendanalytics-24

This Tableau project analyzes EV adoption trends using data up to May 2024. Visualizations cover growth, geography, market share, CAFV eligibility, and consumer preferences, supporting data-driven decisions with detailed drill-downs. The data is meticulously cleaned, offering stakeholders valuable insights into EV market dynamics and future trends.

business-intelligence data-analysis data-engineering electric-vehicles feature-engineering kpianalysis predictive-analytics tableau trendanalysis

Last synced: 17 Nov 2024

https://github.com/kevingastelum/mydataanalysis

My DataAnalyst Projects | Python, SQL, Excel, PowerBI & Tableau

data-analysis python sql visualization

Last synced: 17 Nov 2024

https://github.com/umutsevdi/hr-management

HR Management, Analytics and Salary Determination System

analytics data-analysis java java17 postgresql python spring spring-boot vaadin vaadin-flow

Last synced: 17 Nov 2024

https://github.com/sajjad425/edaipl

The dataset covers the Indian Premier League (IPL) with details on matches (date, teams, venue, results), player stats (runs, wickets), team stats (wins, losses), season summaries, and umpire info. The EDA reveals patterns and insights, highlighting dominant teams, star players, and trends across seasons.

data-analysis eda exploratory-data-analysis ipl python

Last synced: 17 Nov 2024

https://github.com/spacebakery/nba-trends-project

Data Science Foundations I | Exploratory Data Analysis in Python | Summarizing Relationship Between Two Features

categorical-variables data-analysis data-visualization matplotlib nba-dataset quantitative-variables scipy seaborn subset summary-statistics

Last synced: 17 Nov 2024

https://github.com/lc-rezende/eqx_boston_dataset

Exploratory data analysis, clustering, and forecasting on Boston crime data (2011-2015), revealing key crime trends, hotspots, and temporal patterns to support data-driven insights for urban safety and policing strategies.

data-analysis exploratory-data-analysis jupyter-notebook kmeans matplotlib numpy pandas prophet-facebook python scikit-learn seaborn

Last synced: 17 Nov 2024

https://github.com/adeebkhan25/dataset_suicide_susceptible

The "Student Suicide Risk Factors Dataset" is a comprehensive collection of data aimed at understanding and mitigating the factors contributing to student suicides.

data-analysis dataset machine-learning supervised-learning

Last synced: 17 Nov 2024

https://github.com/sarthakmishraa/bike_rental_predictor

Bike Sharing Dataset: This dataset contains the hourly and daily counts of rental bikes between 2011 and 2012 in the Capital Bikeshare system, with the corresponding weather and seasonal information.

data-analysis machine-learning python xgboost

Last synced: 17 Nov 2024

https://github.com/sehgal-vishal/sql-nyc-collision-analysis

This analysis is based on collisions (accidents) that happened in New York City. I used SQL Server for EDA (Exploratory Data Analysis).

data-analysis database eda sql-server

Last synced: 17 Nov 2024

https://github.com/aavishkarmahajan/sql

SQL code assignments and practice questions from SQL courses, SQL data analysis

data-analysis sql sql-server

Last synced: 17 Nov 2024

https://github.com/shubham200137/spotify-listening-habits-analytics

Spotify Listening Habits Analytics is a project aimed at analyzing personalized Spotify listening habits and music trends. It involves Exploratory Data Analysis (EDA) with Python Pandas, data processing using SQL Server, and creating visualizations with Power BI. The goal is to uncover insights into listening patterns, track popularity, and artist.

data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas power-bi-dashboard sqlserver

Last synced: 17 Nov 2024

https://github.com/krzysikd/apartment-prices-in-poland-analysis-and-visualization

Data Analyst portfolio project that involves cleaning, transforming, and visualizing data to create an insightful dashboard. The project uses SSIS for ETL processes, SSMS for database management and queries, and Power BI for data visualization, focusing on the analysis of rental and sales apartment prices in Poland.

data-analysis data-cleaning data-visualizations powerbi sql sqlserver ssis

Last synced: 17 Nov 2024

https://github.com/analyticslover/salifort-motors-turnover-project

The Salifort Motors H.R. Project serves as the capstone for the Google Advanced Analytics Program on Coursera. This project presents a business scenario and a problem within that scenario: employee turnover. Essential techniques such as EDA and data modeling are used to analyze and predict employee turnover rates at the company.

data data-analysis datamodeling eda machine-learning pandas python sklearn

Last synced: 17 Nov 2024

https://github.com/edwinrlambert/investigating-netflix-movies

Demonstrates data analysis and visualization techniques for Netflix movies using Python in a Jupyter notebook. This is a DataCamp project.

data-analysis data-analysis-python netflix python

Last synced: 17 Nov 2024

https://github.com/edwinrlambert/exploring-airbnb-market-trends

Dive into NYC's Airbnb market trends through detailed analysis of listings data, including prices, types, and review dates.

airbnb data-analysis jupyter-notebook market-trends python

Last synced: 17 Nov 2024

https://github.com/karlyndiary/adidas-sales-analysis

Analyzing Adidas' sales performance and profitability across US retailers by exploring regional trends, product performance, and sales channels, using Python for data cleaning, SQL for querying, and Excel for visualizations.

adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python

Last synced: 17 Nov 2024

https://github.com/karlyndiary/smartphone-price-analytics

A data pipeline for analyzing smartphone pricing by retrieving data from Flipkart using RapidAPI, transforming it, and visualizing insights using SQL Server and Excel.

beautifulsoup data-analysis data-pipeline data-visualization data-visualization-dashboard etl microsoft microsoft-excel microsoft-sql-server python smartphone-price-analysis

Last synced: 17 Nov 2024

https://github.com/1401dev/iowa-liquor-retail-sales-analysis

This repository contains the analysis of Iowa liquor retail sales data, aimed at uncovering sales trends and forecasting future sales patterns. The project involves data cleaning, preparation, and advanced time series analysis using Microsoft SQL Server and Google Colab.

customer-behavior data-analysis data-cleaning data-science data-visualization exploratory-data-analysis forecasting google-colab machine-learning microsoft-sql-server pandas prophet python retail-analytics retail-sales sales-forecasting sales-performance sql statsmodels time-series-analysis

Last synced: 17 Nov 2024

https://github.com/lewismakau/portfolio-projects

This repository contains data files and SQL files for the projects in my portfolio.

data-analysis data-cleaning data-structures data-visualization database google-analytics microsoft-sql-server mysql powerbi tableau

Last synced: 17 Nov 2024

https://github.com/balajimohan18/sql-projects

The repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through the query language.

data-analysis data-mining data-science eta microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 17 Nov 2024

https://github.com/bala-1409/sql-projects

The repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through the query language.

data-analysis data-mining data-science data-transformation database eda etl-framework exploratory-data-analysis microsoft-sql-server query-language sql sql-server sql-server-database sql-server-management-studio

Last synced: 17 Nov 2024

https://github.com/shridhar1504/sql-projects

The repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through the query language.

data-analysis data-mining data-science data-transformation eda etl-framework microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 17 Nov 2024

https://github.com/navp7/pizzasales_powerbi

This project involves creating a comprehensive sales performance dashboard using Power BI to visualize and analyze the sales data of an Italian pizza company.

data-analysis ms-sql-server ms-word powerbi visualization

Last synced: 17 Nov 2024

https://github.com/chandkund/loan-eligibility-prediction

This project is designed to predict the eligibility of loan applicants based on various factors such as income, credit history, and marital status. By analyzing historical loan application data, the model helps to determine whether a loan application should be approved or not.

data-analysis data-science data-visualization machine-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 17 Nov 2024

https://github.com/shervinnd/bazar_app_store_eda

Bazar app data analysis code to find the most installed and most popular apps

data data-analysis data-science dataanalysis eda python

Last synced: 17 Nov 2024

https://github.com/jovicdev97/financial-data-analytics

Using NumPy and pandas to analyze a synthetic loan dataset with Python

data-analysis matlabplot numpy pandas plotting python seaborn

Last synced: 18 Nov 2024

https://github.com/bnvulpe/regression-and-time-series

This work centers on assessing and comparing predictive models for regression and time series prediction using specific datasets, with the goal of selecting the most effective methodology for unseen test data.

colab data-analysis data-analysis-python data-science data-visualization forecasting jupyter-notebook machine-learning model-evaluation predictive-modeling python regression sarima sarimax time-series-analysis time-series-analysis-and-forecasting

Last synced: 18 Nov 2024

https://github.com/azaz9026/loan_approval_prediction

Welcome to the Loan Approval Prediction repository! This project aims to build a predictive model that can determine whether a loan application should be approved or denied based on various features. Purpose: The goal of this repository is to develop a machine learning model that can accurately predict loan approval decisions.

data data-analysis data-visualization eda machine-learning numpy pandas python statistics

Last synced: 18 Nov 2024

https://github.com/azaz9026/car_price_prediction_model

This repository contains a machine learning model designed to predict car prices based on various features. Using historical data on car attributes such as make, model, year, mileage, and other relevant factors, the model aims to provide accurate and reliable price estimates for used cars.

data-analysis data-engineering liner-regestion machine-learning modeling numpy pandas python3 rendering

Last synced: 18 Nov 2024

https://github.com/aishwaryagm1999/california-housing-prices-data-analysis

Performed data cleaning and analysis of the California Housing Prices dataset to find the relationship between housing prices in a block and the features recorded in the dataset, such as total number of rooms and ocean proximity.

data-analysis data-visualization matplotlib numpy pandas python seaborn

Last synced: 18 Nov 2024

https://github.com/wikidata/purdue-data-mine-2024

Program materials for WMDE's 2024 Purdue Data Mine project

analytics data-analysis data-quality data-science etl open-data python wikidata wikimedia

Last synced: 18 Nov 2024

https://github.com/reddyprasade/r-program

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

data-analysis data-science r-programming

Last synced: 18 Nov 2024

https://github.com/nivasharmaa/friskwatch

A Java program for analyzing stop-and-frisk data from the NYPD. Features data import, organization, and statistical analysis to compare occurrences during and after policy implementation.

data-analysis data-visualization dataprocessing datascience file-io java java-oop nypd-data

Last synced: 18 Nov 2024

https://github.com/nivasharmaa/spiderverse

A comprehensive Java program for analyzing and managing events and data points within a fictional spiderverse. Features event handling, anomaly detection, cluster management, and robust file I/O operations.

advanced-algorithms anomaly-detection clustering data-analysis file-io object-oriented-programming

Last synced: 18 Nov 2024

https://github.com/curtisalexander/cramisc

Personal R functions for data analysis

data-analysis r r-pkg

Last synced: 18 Nov 2024

https://github.com/andrewzgheib/premier-league-analysis

A PowerBI dashboard analysing various KPIs of the Premier League

data-analysis kpi powerbi

Last synced: 18 Nov 2024

https://github.com/prajjwol09/data-cleaning-project

This project is dedicated to cleaning and standardizing a dataset and handling null values from a CSV file named "layoffs" using MySQL, with MySQL Workbench as the workspace environment. The goal is to prepare the data for analysis.

cleaning-data columns data-analysis database duplicates mysql rows standard

Last synced: 18 Nov 2024

https://github.com/sweta2501/netflix_dataanalysis

A data analysis project that I carried out on Netflix data.

data-analysis data-science jupyter-notebook python

Last synced: 18 Nov 2024