Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mpoojithavigneswari/bangalore-house-price-prediction

This project involves creating a website that predicts Bangalore house prices with 94.65% accuracy using a machine learning algorithm.

data-analysis data-science flask-server machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 15 Nov 2024

https://github.com/hevalhazalkurt/word_analyser

A web app developed in Python and Django that analyzes given text mathematically and sentimentally.

analyzer analyzes content data-analysis django emotion python python3 sentiment sentiment-analyser sentiment-analysis text text-analysis

Last synced: 20 Nov 2024

https://github.com/ljadhav25/knn-algorithm-data-science-

This repository contains a project demonstrating the implementation and application of the K-Nearest Neighbors (K-NN) algorithm in Data Science. The objective is to provide a comprehensive understanding of the K-NN algorithm, including data preprocessing, model training, evaluation, and visualization of results. This project is ideal for beginners

data-analysis data-science knn-classification machine-learning matplotlib-pyplot numpy pandas-library seaborn

Last synced: 18 Nov 2024
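For readers new to the K-NN technique named in the entry above, here is a minimal sketch of the core idea: classify a query point by majority vote among its k closest training points. It is not taken from the listed repository; the function name `knn_predict`, the Euclidean distance metric, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query.

    Hypothetical helper for illustration; assumes numeric feature tuples of
    equal length and Euclidean distance.
    """
    if len(train_points) != len(train_labels):
        raise ValueError("train_points and train_labels must have equal length")
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of training points")

    # Pair each training label with its distance to the query point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only, so labels never need to be comparable.
    distances.sort(key=lambda pair: pair[0])

    # Majority vote over the k nearest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up toy data: two well-separated clusters.
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (0.2, 0.1), k=3))
```

In practice, features are usually scaled before computing distances, since a feature with a large numeric range would otherwise dominate the vote.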

https://github.com/mikhaelmounay/salty-med

Salty Mediterranean - Grade 12 Capstone Project

data-analysis data-visualization

Last synced: 15 Nov 2024

https://github.com/smohanta23/ev-trendanalytics-24

This Tableau project analyzes EV adoption trends using data up to May 2024. Visualizations cover growth, geography, market share, CAFV eligibility, and consumer preferences, supporting data-driven decisions with detailed drill-downs. Data is meticulously cleaned, offering stakeholders valuable insights into EV market dynamics and future trends.

business-intelligence data-analysis data-engineering electric-vehicles feature-engineering kpianalysis predictive-analytics tableau trendanalysis

Last synced: 17 Nov 2024

https://github.com/kevingastelum/mydataanalysis

My DataAnalyst Projects | Python, SQL, Excel, PowerBI & Tableau

data-analysis python sql visualization

Last synced: 17 Nov 2024

https://github.com/umutsevdi/hr-management

HR Management, Analytics and Salary Determination System

analytics data-analysis java java17 postgresql python spring spring-boot vaadin vaadin-flow

Last synced: 17 Nov 2024

https://github.com/aneeshmurali-n/global-superstore-sales-dashboard---power-bi-stunning-dark-theme

This Power BI dashboard provides a comprehensive view of sales data, enabling users to analyze sales trends, identify top-performing regions, and gain insights into customer behavior.

dark-theme dashboard data-analysis data-science data-visualization powerbi salesdashboard

Last synced: 20 Nov 2024

https://github.com/rayanwaked/data-analysis

This data-analysis project looks into a large banking dataset from a Portuguese institution, attempting to uncover linear and non-linear correlations to answer a core business question.

data data-analysis data-science data-visualization jupyter-notebook python

Last synced: 15 Nov 2024

https://github.com/rayanwaked/wildfire-analysis

The project aims to visualize wildfire activity in Oregon, exploring related data to create visualizations and tables to analyze the historical patterns.

data data-analysis data-visualization jupyter jupyter-notebook oregon portland-state-university python wildfire-data-visualization

Last synced: 15 Nov 2024

https://github.com/findmyway/dataframe-in-julia

A quick introduction to DataFrames in Julia for users coming from Python

data-analysis dataframe julia jupyter-notebook

Last synced: 20 Nov 2024

https://github.com/aim-harvard/faceage

Decoding biological age from face photographs using deep learning.

age-estimation biological-age cnn data-analysis deep-learning survival-analysis

Last synced: 20 Nov 2024

https://github.com/gabrielagodek/webscraper

The project was developed during master's studies. It is based on the Python library Scrapy.

data-analysis python scraper scrapy

Last synced: 17 Nov 2024

https://github.com/shz-code/diwali_sales_data_analysis

Customer Product Purchase Behavior Analysis

behavior-analysis data-analysis matplotlib ml sales seaborn

Last synced: 20 Nov 2024

https://github.com/sajjad425/edaipl

The dataset covers the Indian Premier League (IPL) with details on matches (date, teams, venue, results), player stats (runs, wickets), team stats (wins, losses), season summaries, and umpire info. The EDA reveals patterns and insights, highlighting dominant teams, star players, and trends across seasons.

data-analysis eda exploratory-data-analysis ipl python

Last synced: 17 Nov 2024

https://github.com/spacebakery/nba-trends-project

Data Science Foundations I | Exploratory Data Analysis in Python | Summarizing Relationship Between Two Features

categorical-variables data-analysis data-visualization matplotlib nba-dataset quantitative-variables scipy seaborn subset summary-statistics

Last synced: 17 Nov 2024

https://github.com/shriram-vibhute/digit_classification

This project demonstrates various machine learning techniques for classifying handwritten digits from the MNIST dataset. It covers data preprocessing, model training, evaluation, and advanced classification strategies.

classification data-analysis data-visualization machine-learning matplotlib numpy pandas sk-learn

Last synced: 15 Nov 2024

https://github.com/lc-rezende/eqx_boston_dataset

Exploratory data analysis, clustering, and forecasting on Boston crime data (2011-2015), revealing key crime trends, hotspots, and temporal patterns to support data-driven insights for urban safety and policing strategies.

data-analysis exploratory-data-analysis jupyter-notebook kmeans matplotlib numpy pandas prophet-facebook python scikit-learn seaborn

Last synced: 17 Nov 2024

https://github.com/adeebkhan25/dataset_suicide_susceptible

The "Student Suicide Risk Factors Dataset" is a comprehensive collection of data aimed at understanding and mitigating the factors contributing to student suicides.

data-analysis dataset machine-learning supervised-learning

Last synced: 17 Nov 2024

https://github.com/sarthakmishraa/bike_rental_predictor

Bike Sharing Dataset: This dataset contains the hourly and daily counts of rental bikes in the Capital Bikeshare system for the years 2011 and 2012, with the corresponding weather and seasonal information.

data-analysis machine-learning python xgboost

Last synced: 17 Nov 2024

https://github.com/vaishnavipaithane/cyclistic-bike-share-analysis-case-study

This capstone project was done as a part of Google Data Analytics Professional Certificate course.

data-analysis r-programming-language rstudio

Last synced: 15 Nov 2024

https://github.com/ljadhav25/linear_regression_data_science

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.

data-analysis data-science linear-regression machine-learning

Last synced: 18 Nov 2024
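The relationship between the independent and dependent variable described in the entry above can be sketched with an ordinary least-squares fit on a single predictor. This is an illustrative example, not code from the listed repository; the function names `fit_simple_linear_regression` and `predict` and the sample numbers are assumptions.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    xs holds the independent variable, ys the dependent variable.
    Hypothetical helper for illustration only.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the dependent variable for a new value of x."""
    return intercept + slope * x


if __name__ == "__main__":
    # Made-up data lying exactly on the line y = 1 + 2x.
    slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept, predict(slope, intercept, 5))
```

With more than one independent variable, the same idea generalizes to multiple linear regression, which is normally solved with a linear algebra library instead of the closed-form sums used here.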

https://github.com/sehgal-vishal/sql-nyc-collision-analysis

This analysis is based on the collisions (accidents) that happened in New York City. I have used SQL Server for EDA (Exploratory Data Analysis).

data-analysis database eda sql-server

Last synced: 17 Nov 2024

https://github.com/aavishkarmahajan/sql

SQL code assignments and practice questions from SQL courses, SQL data analysis

data-analysis sql sql-server

Last synced: 17 Nov 2024

https://github.com/ljadhav25/logistic-regression-data-science-

Logistic regression estimates the probability of an event occurring, such as voted or didn’t vote, based on a given data set of independent variables.

data-analysis data-science data-visualization logestic-regression machine-learning

Last synced: 18 Nov 2024
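To make the probability estimate described in the entry above concrete, the sketch below shows how a logistic regression model, once its coefficients are known, turns a weighted sum of independent variables into a probability through the sigmoid function. It is not code from the listed repository; the names `sigmoid` and `predict_probability` and the example weights are assumptions, and fitting the coefficients is out of scope here.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    # Branch on the sign to avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(weights, bias, features):
    """Estimated probability of the event given already-fitted coefficients.

    Hypothetical helper: weights and bias are assumed to come from a
    previously trained model.
    """
    if len(weights) != len(features):
        raise ValueError("weights and features must have equal length")
    # Linear score: bias plus the weighted sum of the independent variables.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


if __name__ == "__main__":
    # Made-up coefficients for two features.
    p = predict_probability([0.8, -0.4], 0.1, [2.0, 1.0])
    print("voted" if p >= 0.5 else "did not vote", p)
```

The 0.5 cut-off in the usage lines is a common default for turning the probability into a yes/no prediction, not a fixed part of the method.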

https://github.com/shubham200137/spotify-listening-habits-analytics

Spotify Listening Habits Analytics is a project aimed at analyzing personalized Spotify listening habits and music trends. It involves Exploratory Data Analysis (EDA) with Python Pandas, data processing using SQL Server, and creating visualizations with Power BI. The goal is to uncover insights into listening patterns, track popularity, and artist.

data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas power-bi-dashboard sqlserver

Last synced: 17 Nov 2024

https://github.com/krzysikd/apartment-prices-in-poland-analysis-and-visualization

Data Analyst portfolio project that involves cleaning, transforming, and visualizing data to create an insightful dashboard. The project uses SSIS for ETL processes, SSMS for database management and queries, and Power BI for data visualization, focusing on the analysis of rental and sales apartment prices in Poland.

data-analysis data-cleaning data-visualizations powerbi sql sqlserver ssis

Last synced: 17 Nov 2024

https://github.com/faizantkhan/python_matplotlib

Matplotlib is a powerful Python library for creating visualizations and plots. It’s widely used for data representation, making complex information more accessible and interpretable. It offers various types of plots, including line graphs, scatter plots, bar charts, histograms, and more

data-analysis data-analytics data-engineering data-science data-visualization deep-learning graphs line machine-learning machine-learning-algorithms matplotlib matplotlib-pyplot matplotlib-python python

Last synced: 15 Nov 2024

https://github.com/analyticslover/salifort-motors-turnover-project

The Salifort Motors H.R. Project serves as the capstone for the Google Advanced Analytics Program on Coursera. This project presents a business scenario and a problem within that scenario: employee turnover. Essential techniques such as EDA and data modeling are used to analyze and predict employee turnover rates in the company.

data data-analysis datamodeling eda machine-learning pandas python sklearn

Last synced: 17 Nov 2024

https://github.com/edwinrlambert/investigating-netflix-movies

Demonstrates data analysis and visualization techniques for Netflix movies using Python in a Jupyter notebook. This is a DataCamp project.

data-analysis data-analysis-python netflix python

Last synced: 17 Nov 2024

https://github.com/edwinrlambert/exploring-airbnb-market-trends

Dive into NYC's Airbnb market trends through detailed analysis of listings data, including prices, types, and review dates.

airbnb data-analysis jupyter-notebook market-trends python

Last synced: 17 Nov 2024

https://github.com/karlyndiary/adidas-sales-analysis

Analyzing Adidas' sales performance and profitability across US retailers by exploring regional trends, product performance, and sales channels, using Python for data cleaning, SQL for querying, and Excel for visualizations.

adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python

Last synced: 17 Nov 2024

https://github.com/karlyndiary/smartphone-price-analytics

A data pipeline for analyzing smartphone pricing by retrieving data from Flipkart using RapidAPI, transforming it, and visualizing insights using SQL Server and Excel.

beautifulsoup data-analysis data-pipeline data-visualization data-visualization-dashboard etl microsoft microsoft-excel microsoft-sql-server python smartphone-price-analysis

Last synced: 17 Nov 2024

https://github.com/1401dev/iowa-liquor-retail-sales-analysis

This repository contains the analysis of Iowa liquor retail sales data, aimed at uncovering sales trends and forecasting future sales patterns. The project involves data cleaning, preparation, and advanced time series analysis using Microsoft SQL Server and Google Colab.

customer-behavior data-analysis data-cleaning data-science data-visualization exploratory-data-analysis forecasting google-colab machine-learning microsoft-sql-server pandas prophet python retail-analytics retail-sales sales-forecasting sales-performance sql statsmodels time-series-analysis

Last synced: 17 Nov 2024

https://github.com/lewismakau/portfolio-projects

This repository contains file data and SQL files for projects used for my Portfolio.

data-analysis data-cleaning data-structures data-visualization database google-analytics microsoft-sql-server mysql powerbi tableau

Last synced: 17 Nov 2024

https://github.com/balajimohan18/sql-projects

The repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through the query language.

data-analysis data-mining data-science eta microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 17 Nov 2024

https://github.com/bala-1409/sql-projects

The repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through the query language.

data-analysis data-mining data-science data-transformation database eda etl-framework exploratory-data-analysis microsoft-sql-server query-language sql sql-server sql-server-database sql-server-management-studio

Last synced: 17 Nov 2024

https://github.com/shridhar1504/sql-projects

The repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through the query language.

data-analysis data-mining data-science data-transformation eda etl-framework microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 17 Nov 2024

https://github.com/navp7/pizzasales_powerbi

This project involves creating a comprehensive sales performance dashboard using Power BI to visualize and analyze the sales data of an Italian pizza company.

data-analysis ms-sql-server ms-word powerbi visualization

Last synced: 17 Nov 2024

https://github.com/faizantkhan/automated-eda

This repository showcases tools for automatic Exploratory Data Analysis (EDA) in Python. These tools help you quickly understand your datasets and generate insightful reports.

automatic automation autoviz data-analysis data-analysis-python data-science data-visualization dtale dtale-library eda exploratory-data-analysis ml pandas pandas-profiling python python-library sweetviz

Last synced: 15 Nov 2024

https://github.com/chandkund/loan-eligibility-prediction

This project is designed to predict the eligibility of loan applicants based on various factors such as income, credit history, and marital status. By analyzing historical loan application data, the model helps to determine whether a loan application should be approved or not.

data-analysis data-science data-visualization machine-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 17 Nov 2024

https://github.com/muhammadadilnaeem/machine-learning-project-student-performance-indicator

This project explores how students' performance (exam scores) is influenced by variables such as Gender, Ethnicity, Parental Level of Education, Lunch Type, and Test Preparation Course.

data-analysis data-science data-visualization dataset flas html-css machine-learning machine-learning-algorithms python

Last synced: 16 Nov 2024

https://github.com/shervinnd/bazar_app_store_eda

Data analysis code for the Bazar app store to find the most-installed and most popular apps

data data-analysis data-science dataanalysis eda python

Last synced: 17 Nov 2024

https://github.com/hasnathjami/data-analysis-of-covid-19

An Oracle PL/SQL-based project on COVID-19 data analysis. It is my CSE 4.1 project of Distributive Database Management System LAB.

data-analysis naive-bayes-classifier oracle-database probability-statistics sqlplus

Last synced: 17 Nov 2024

https://github.com/jovicdev97/financial-data-analytics

Using NumPy and pandas to analyze a synthetic loan dataset with Python

data-analysis matlabplot numpy pandas plotting python seaborn

Last synced: 18 Nov 2024

https://github.com/bnvulpe/regression-and-time-series

This work centers on assessing and comparing predictive models for regression and time series prediction using specific datasets, with the goal of selecting the most effective methodology for unseen test data.

colab data-analysis data-analysis-python data-science data-visualization forecasting jupyter-notebook machine-learning model-evaluation predictive-modeling python regression sarima sarimax time-series-analysis time-series-analysis-and-forecasting

Last synced: 18 Nov 2024

https://github.com/azaz9026/loan_approval_prediction

Welcome to the Loan Approval Prediction repository! This project aims to build a predictive model that can determine whether a loan application should be approved or denied based on various features. Purpose: The goal of this repository is to develop a machine learning model that can accurately predict loan approval decisio

data data-analysis data-visualization eda machine-learning numpy pandas python statistics

Last synced: 18 Nov 2024

https://github.com/azaz9026/car_price_prediction_model

This repository contains a machine learning model designed to predict car prices based on various features. Using historical data on car attributes such as make, model, year, mileage, and other relevant factors, the model aims to provide accurate and reliable price estimates for used cars.

data-analysis data-engineering liner-regestion machine-learning modeling numpy pandas python3 rendering

Last synced: 18 Nov 2024

https://github.com/aishwaryagm1999/california-housing-prices-data-analysis

Performed Data Cleaning and Data Analysis of the California Housing Prices Dataset to find the relation between the housing prices at a block and the amenities and facilities stated in the dataset such as total number of rooms, ocean proximity etc.

data-analysis data-visualization matplotlib numpy pandas python seaborn

Last synced: 18 Nov 2024

https://github.com/robinmillford/analyzing-e-commerce-transactions---data-cleaning-cohort-analysis-and-sql

In this project, I aimed to analyze the profitability of products in an e-commerce dataset. I performed various SQL queries to extract valuable insights about product profitability, including the identification of the top 5 products with the highest profit margin, and unique combinations of brands and product lines with the highest profitability.

cohort-analysis data-analysis data-visualization excel jupyter-notebook powerbi python3 sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/india-s-covid-19-journey-a-case-study-analysis

In this extensive project, I embarked on a profound exploration of India's journey through the COVID-19 pandemic. This endeavor involved a multi-faceted approach, encompassing data preprocessing with Python, data analysis with SQL queries, and data visualization using Power BI.

covid-19 data-analysis data-cleaning-and-preprocessing data-visualization jupyter-notebook powerbi pythin3 sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/analyzing-spotify-streaming-data

The goal of this project was to analyze a dataset of Spotify streaming data, spanning from 2014 to 2022, and extract meaningful insights related to song popularity, artists, and streaming patterns.

data-analysis jupyter-notebook python spotify sqlite3

Last synced: 17 Nov 2024

https://github.com/wikidata/purdue-data-mine-2024

Program materials for WMDE's 2024 Purdue Data Mine project

analytics data-analysis data-quality data-science etl open-data python wikidata wikimedia

Last synced: 18 Nov 2024

https://github.com/robinmillford/data-professional-survey-power-bi

I worked on a Power BI project called 'Data Professional Survey Breakdown'.

data-analysis data-visualization powerbi

Last synced: 17 Nov 2024

https://github.com/robinmillford/customer_personality_analysis

Consumer personality analysis is a thorough examination of a business's ideal clients. Thanks to this improved understanding of its customers, a company can more easily adapt its goods to meet the unique wants, behaviours, and concerns of various consumer types.

data-analysis data-science machine-learning python3

Last synced: 17 Nov 2024

https://github.com/reddyprasade/r-program

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

data-analysis data-science r-programming

Last synced: 18 Nov 2024

https://github.com/robinmillford/instagram-user-insights-analyzing-user-behavior

In this project, I delved into the dynamics of a popular photo-sharing website using SQL queries

data-analysis data-visualization instragram powerbi sql

Last synced: 17 Nov 2024

https://github.com/nivasharmaa/friskwatch

A Java program for analyzing stop-and-frisk data from the NYPD. Features data import, organization, and statistical analysis to compare occurrences during and after policy implementation.

data-analysis data-visualization dataprocessing datascience file-io java java-oop nypd-data

Last synced: 18 Nov 2024

https://github.com/nivasharmaa/spiderverse

A comprehensive Java program for analyzing and managing events and data points within a fictional spiderverse. Features event handling, anomaly detection, cluster management, and robust file I/O operations.

advanced-algorithms anomaly-detection clustering data-analysis file-io object-oriented-programming

Last synced: 18 Nov 2024

https://github.com/jatin-mehra119/flight-price-prediction

This study aims to analyze flight booking data from the "Ease My Trip" website, using statistical tests and linear regression to extract insights that can benefit passengers using the platform.

data-analysis datacleaning datavisualization machine-learning preprocessing-data python sklearn-pipeline sklearn-regression-algorithm streamlit-webapp

Last synced: 16 Nov 2024

https://github.com/curtisalexander/cramisc

Personal R functions for data analysis

data-analysis r r-pkg

Last synced: 18 Nov 2024

https://github.com/jatin-mehra119/roe-prediction-modeling

A web app for predicting Return on Equity using a machine learning model.

data-analysis data-science forecasting machine-learning regression-models scikit-learn

Last synced: 16 Nov 2024

https://github.com/andrewzgheib/premier-league-analysis

A PowerBI dashboard analysing various KPIs of the Premier League

data-analysis kpi powerbi

Last synced: 18 Nov 2024

https://github.com/robinmillford/hr-analytics-employee-performance-analysis

HR Analytics: Unveiling Employee Performance - A comprehensive exploration of employee data using SQL and Power BI, uncovering key insights for strategic HR decision-making.

data-analysis data-visualization jupyter-notebook powerbi python3 sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/playstore-app-insights-uncovering-app-market-trends

In my Playstore App analysis, I uncovered valuable insights about app market trends. I discovered the top-rated apps, identified popular app categories, and explored user sentiments. My findings provide a comprehensive understanding of the app landscape, aiding in informed decision-making and strategy development for app developers and marketers.

data-analysis data-cleaning data-visualization jupyter-notebook python3 sql

Last synced: 17 Nov 2024

https://github.com/robinmillford/loanalytics-investigating-financial-trends-with-world-bank-data

The project aimed to explore and analyze World Bank Loan Data, leveraging Python for data preprocessing and SQL for in-depth queries

data-analysis data-visualization jupyter-notebook mysql tableau world-bank

Last synced: 17 Nov 2024

https://github.com/prajjwol09/data-cleaning-project

This project is dedicated to cleaning and standardizing a dataset and handling null values in a CSV file named "layoffs", using MySQL with MySQL Workbench as the workspace environment. The goal is to prepare the data for analysis.

cleaning-data columns data-analysis database duplicates mysql rows standard

Last synced: 18 Nov 2024

https://github.com/sweta2501/netflix_dataanalysis

With the help of Netflix Data, I have done some Data Analysis.

data-analysis data-science jupyter-notebook python

Last synced: 18 Nov 2024

https://github.com/robinmillford/optimizing-treatment-plans-through-data-analysis

The primary focus was on understanding customer health, treatment, and associated charges over multiple years.

data-analysis data-visualization healthcare mysql powerbi sql

Last synced: 17 Nov 2024

https://github.com/arction/lcjs-example-0507-dashboardfiberanalysis

A demo application showcasing using LightningChart JS to visualize fiber analysis data.

area-plot area-series chart charts dashboard data-analysis demo heatmap javascript lcjs lightningchart-js performance visualization webgl

Last synced: 18 Nov 2024

https://github.com/robinmillford/animeinsights-user-feedback-analysis

In this project, I leveraged SQL queries to analyze and extract valuable insights from an "anime" dataset. The dataset includes information such as titles, scores, episode counts, genres, and popularity rankings for various anime series and movies.

anime data-analysis data-cleaning mysql

Last synced: 17 Nov 2024

https://github.com/robinmillford/predicting-diabetes-a-machine-learning-approach-to-early-intervention

The goal of this project was to develop a predictive model for diabetes using a dataset containing various health-related features

data-analysis data-science diabetes-prediction jupyter-notebook machine-learning smote

Last synced: 17 Nov 2024