Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/hyperentangledqubit/shellplot

shellplot -- Generate plots directly from the terminal via matplotlib or ggplot2 (plotnine)!

data-analysis ggplot2 graphics matplotlib plotnine plotting pyplot terminal

Last synced: 10 Jan 2025

https://github.com/prateek5525/online-shopping-analytics-project

The Online Shopping Analytics Project analyzed product trends and regional sales using SQL and Tableau. Insights from the Sales and Location Dashboards highlighted key trends in demographics, product popularity, and regional performance. These findings help businesses optimize strategies, enhance marketing, and improve inventory management.

data-analysis excel kaggle-dataset sql tableau

Last synced: 29 Jan 2025

https://github.com/deliprofesor/customerseg-customer-segmentation-and-shopping-analysis

This project performs data exploration, segmentation, and modeling of wholesale customer data using clustering algorithms, PCA, and decision trees to analyze purchasing behavior and predict customer channel preferences.

clustering customer-segmentation data-analysis data-visualization dbscan decision-tree gmm kmeans machine-learning pca

Last synced: 15 Feb 2025

https://github.com/aalekhpatel07/statcan

StatCAN dataset fetcher and cleaner.

census data-analysis data-science statcan

Last synced: 08 Feb 2025

https://github.com/juliuspinsker/bioconductor-learning-container

🧬 Containerized development environment for Harvard's Professional Certificate in Data Analysis for Genomics (PH525.x series). Streamlined setup for Bioconductor, R, and genomic data analysis with RStudio and DevContainer support.

bioconductor bioinformatics chip-seq data-analysis data-science devcontainer dna-methylation docker edx functional-genomics genomics harvard harvardx ph525 ph525x r reproducible-research rna-seq rstudio single-cell-rna-seq

Last synced: 29 Jan 2025

https://github.com/deeksha-dhawan/pizza-outlet-analysis-using-sql

This project analyzes pizza sales data to gain insights into customer behavior and revenue patterns. Key analyses include customer insights, popular pizza types and sizes, revenue generation, and order trends. The findings help optimize menu offerings, staffing, and marketing strategies to boost overall business performance.

coding-challenge data-analysis data-science microsoft my portfolio-project programming project projects sql sql-analysis sql-project sqlproject sqlserver

Last synced: 29 Jan 2025

https://github.com/banner-19/extraction-and-analysis-of-text

The objective is to analyze text content from a list of URLs. This involves extracting article titles and text, then performing natural language processing to generate metrics like sentiment, readability, and word usage. Finally, the results are stored for further analysis or visualization.

data-analysis data-analytics data-science nlp nltk python3 text-analysis text-extraction

Last synced: 21 Feb 2025

https://github.com/supertetelman/frc-data-analysis

A collection of R, Matlab, and Bash scripts that were developed in real time from the stands of an FRC competition. The scripts gathered data from various online sources, parsed it, and ran some basic analysis on it to calculate ratings and make basic match predictions. Results were made public and hosted live via AWS. Developed as a student teaching tool under poor Internet connectivity with minimal access to real-time match data.

bash data-analysis matlab r teaching

Last synced: 28 Jan 2025

https://github.com/rijul007/market-basket-analysis-using-r

Market Basket Analysis using association rules, leveraging R’s powerful tools for data-driven retail strategies.

data-analysis data-science r

Last synced: 08 Feb 2025
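
As a rough illustration of what association rules measure in a market basket analysis, the sketch below computes support, confidence, and lift for a single rule. It is written in plain Python rather than the repository's R, and the baskets and the `rule_metrics` helper are hypothetical names invented for this example, not taken from the project.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)
    # Count baskets containing the antecedent, the consequent, and both.
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_both = sum(1 for t in transactions if (a | c) <= t)
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Hypothetical baskets, for illustration only.
BASKETS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

support, confidence, lift = rule_metrics(BASKETS, {"bread"}, {"milk"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift below 1, as in this toy data, suggests the two items appear together less often than independence would predict; real analyses typically mine many candidate rules and filter them by minimum support and confidence.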

https://github.com/82luli02/sakila_dvd_rental_database_analysis

Analysis of the Sakila DVD Rental database using SQL

data data-analysis data-science data-visualization sql

Last synced: 17 Jan 2025

https://github.com/mateusoliveira30/top-intelligent-people

This project performs an exploratory analysis of the top_intelligent_people_in_the_world_5000.csv dataset, featuring some of the world's most intelligent individuals. Using pandas and matplotlib, the analysis includes checking for missing values, describing variables, and visualizing data.

data-analysis graphics kaggle-dataset python3

Last synced: 02 Jan 2025

https://github.com/madusales/powerbi-etl-elt

Through the DIO Bootcamp on Data Analytics & Power BI, I have been studying the use of SQL to create BI solutions. This repository is dedicated to recording what I have learned so far about what BI is, types of analysis, ETL, and ELT.

big-data business-intelligence data-analysis powerbi

Last synced: 10 Jan 2025

https://github.com/netesf13d/expt-sequence-analysis

Data processing, analysis and visualization package for atomic physics experiments in the single-atom regime.

cold-atoms data-analysis data-visualization optical-tweezers

Last synced: 10 Jan 2025

https://github.com/quocduyenanhnguyen/roi-modeling-and-analysis-of-sports-dataset

This project contains my ROI model for retirement savings and a PowerPoint presentation of that model, as well as my data analysis and visualization of a Sports Ticket Sales dataset, which I concluded with a group-written PDF report.

data-analysis data-visualization microsoft-excel rate-of-return-modeling sports-ticket-sales-dataset

Last synced: 02 Jan 2025

https://github.com/laudebugs/fec-data-analysis-2020

The project aimed to determine the total sum of contributions to the candidate committees as well as the number of contributions made by individuals.

data-analysis fec presidential-candidates

Last synced: 09 Jan 2025

https://github.com/deanlogan/data-analysis-course

Code created when completing the Data Analysis with Python Course on freecodecamp.org

course data-analysis numpy pandas python python3

Last synced: 23 Jan 2025

https://github.com/lulloooo/bizdata-nexus

Collection of my Business & Data Analysis projects, from professional/academic endeavors to passion-driven explorations 📊

business-analysis data-analysis economics etl excel finance mysql python r risk-analysis

Last synced: 08 Feb 2025

https://github.com/astropenguin/optimap

Optimized integrated intensity map method for spectral cubes

astronomy data-analysis data-science python python3 radio-astronomy spectral-cubes

Last synced: 15 Feb 2025

https://github.com/satyacoder29/crowdfunding-in-sql

Crowdfunding is a method of raising funds for projects or causes by collecting small contributions from a large group of people, usually through online platforms. It enables individuals, startups, and nonprofits to secure funding, offering rewards or recognition in exchange, and helps bring ideas to life without traditional financing.

data-analysis data-cleaning database-management mysql-database quries sql sql-functions sql-server views

Last synced: 19 Feb 2025

https://github.com/rociobenitez/airbnb-data-mining

Detailed analysis and predictive modeling of accommodations in Madrid using Big Data techniques and statistics in R, focused on data optimization and prediction of property characteristics.

airbnb data-analysis data-mining estadistica prediction-model predictive-analytics predictive-modeling qmd r rstudio

Last synced: 23 Jan 2025

https://github.com/vriv06/btk-trials-data-analysis

Data analysis of Bioteksa plant nutrition trials to measure nutrient efficacy, resistance against biotic and abiotic factors, etc.

agriculture-research confluence crops data-analysis quarto r

Last synced: 29 Jan 2025

https://github.com/nitins17/tableauvisualizations

Visualizations I created while learning to work with Tableau

data-analysis data-science data-visualization tableau visualization

Last synced: 08 Feb 2025

https://github.com/ibrahimhabibeg/national-university-of-singapore-sms-analysis

Analysis of SMS messages collected by the National University of Singapore

analytics data-analysis data-science nlp python

Last synced: 14 Feb 2025

https://github.com/vishnu-vamshii/data-science-jobs-salaries

Created an interactive dashboard to analyze data science job salaries by region of the world, experience level, average salary in USD, and type of employment, along with a geographical visual.

data-analysis data-science data-visualization tableau tableau-dashboard

Last synced: 29 Jan 2025

https://github.com/navp7/hr_analysis_excel

This project uses Microsoft Excel to conduct a comprehensive analysis of HR data, focusing on identifying the various reasons for employee attrition and evaluating job satisfaction.

dashboards data-analysis excel visualization

Last synced: 17 Feb 2025

https://github.com/jakobzmrzlikar/trg-dela

Data analysis of student job offers.

data-analysis ipython-notebook web-scraping

Last synced: 17 Feb 2025

https://github.com/antononcube/wl-tilestats-paclet

Wolfram Language (aka Mathematica) paclet for statistics over 2D tilings. (Tile binning, application of aggregation functions, etc.)

2d-data data-analysis geospatial-data mathematica wolfram-language

Last synced: 08 Feb 2025

https://github.com/antononcube/wl-mosaicplot-paclet

Wolfram Language (aka Mathematica) paclet for mosaic plots over datasets or lists of records.

data-analysis machine-learning mosaic mosaic-plots

Last synced: 08 Feb 2025

https://github.com/vishnu-vamshii/layoffs-data-analysis-in-sql

This project focuses on the cleaning and exploratory analysis of a dataset containing layoff information. It includes data deduplication, standardization of columns, handling null and blank values, and analyzing layoffs by company, industry, country, and date. Various SQL queries are used to explore trends and patterns in layoffs over time.

data-analysis eda mysql

Last synced: 29 Jan 2025
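
The cleaning steps named in this description (deduplication, column standardization, null and blank handling, then aggregation) can be sketched roughly as follows. This is a minimal illustration using Python's built-in `sqlite3` module instead of the project's MySQL; the table names, column names, and sample rows are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical sample rows: (company, industry, country, laid_off, date).
ROWS = [
    ("Acme ", "Tech", "United States", 100, "2023-01-15"),
    ("Acme ", "Tech", "United States", 100, "2023-01-15"),  # exact duplicate
    ("Globex", "", "Canada", 50, "2023-02-01"),             # blank industry
    ("Initech", "Finance", "United States.", None, "2023-02-10"),
]


def clean_layoffs(rows):
    """Load raw rows into an in-memory database and build a cleaned table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE layoffs_raw "
        "(company TEXT, industry TEXT, country TEXT, laid_off INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO layoffs_raw VALUES (?, ?, ?, ?, ?)", rows)
    # Standardize text columns, turn blanks into NULL, then drop duplicates.
    conn.execute(
        """
        CREATE TABLE layoffs_clean AS
        SELECT DISTINCT
            TRIM(company)              AS company,
            NULLIF(TRIM(industry), '') AS industry,
            RTRIM(TRIM(country), '.')  AS country,
            laid_off,
            date
        FROM layoffs_raw
        """
    )
    return conn


def layoffs_by_country(conn):
    """Total layoffs per country; SUM skips NULL counts."""
    return conn.execute(
        "SELECT country, SUM(laid_off) FROM layoffs_clean "
        "GROUP BY country ORDER BY country"
    ).fetchall()


conn = clean_layoffs(ROWS)
print(layoffs_by_country(conn))
```

Keeping the raw table untouched and writing the cleaned rows to a separate table is a common design choice, since it lets each cleaning step be re-run or audited against the original data.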

https://github.com/robinmillford/strategic-insights-unveiling-the-dynamics-of-ipl-2022-auction

This project involves a comprehensive analysis of the IPL 2022 Auction. The goal was to gain insights into the auction dynamics, player characteristics, and spending patterns of different teams.

data-analysis data-visualization ipl powerbi sql

Last synced: 17 Jan 2025

https://github.com/robinmillford/precision-marketing-for-personal-loans

In this project, I conducted an in-depth analysis of a dataset containing personal loan information.

data-analysis data-visualization kaggle loan mysql tableau

Last synced: 17 Jan 2025

https://github.com/robinmillford/loanalytics-investigating-financial-trends-with-world-bank-data

The project aimed to explore and analyze World Bank Loan Data, leveraging Python for data preprocessing and SQL for in-depth queries.

data-analysis data-visualization jupyter-notebook mysql tableau world-bank

Last synced: 17 Jan 2025

https://github.com/robinmillford/playstore-app-insights-uncovering-app-market-trends

In my Playstore App analysis, I uncovered valuable insights about app market trends. I discovered the top-rated apps, identified popular app categories, and explored user sentiments. My findings provide a comprehensive understanding of the app landscape, aiding in informed decision-making and strategy development for app developers and marketers.

data-analysis data-cleaning data-visualization jupyter-notebook python3 sql

Last synced: 17 Jan 2025

https://github.com/robinmillford/customer_personality_analysis

Consumer personality analysis is a thorough examination of a business's ideal clients. With this improved understanding of its customers, a company can more easily adapt its goods to meet the unique wants, behaviours, and concerns of various consumer types.

data-analysis data-science machine-learning python3

Last synced: 17 Jan 2025

https://github.com/robinmillford/analyzing-spotify-streaming-data

The goal of this project was to analyze a dataset of Spotify streaming data, spanning from 2014 to 2022, and extract meaningful insights related to song popularity, artists, and streaming patterns.

data-analysis jupyter-notebook python spotify sqlite3

Last synced: 17 Jan 2025

https://github.com/robinmillford/instagram-user-insights-analyzing-user-behavior

In this project, I delved into the dynamics of a popular photo-sharing website using SQL queries.

data-analysis data-visualization instragram powerbi sql

Last synced: 17 Jan 2025

https://github.com/robinmillford/data-professional-survey-power-bi

I worked on a Power BI project called 'Data Professional Survey Breakdown'.

data-analysis data-visualization powerbi

Last synced: 17 Jan 2025

https://github.com/robinmillford/animeinsights-user-feedback-analysis

In this project, I leveraged SQL queries to analyze and extract valuable insights from an "anime" dataset. The dataset includes information such as titles, scores, episode counts, genres, and popularity rankings for various anime series and movies.

anime data-analysis data-cleaning mysql

Last synced: 17 Jan 2025

https://github.com/robinmillford/optimizing-treatment-plans-through-data-analysis

The primary focus was on understanding customer health, treatment, and associated charges over multiple years.

data-analysis data-visualization healthcare mysql powerbi sql

Last synced: 17 Jan 2025

https://github.com/robinmillford/hr-analytics-employee-performance-analysis

HR Analytics: Unveiling Employee Performance - A comprehensive exploration of employee data using SQL and Power BI, uncovering key insights for strategic HR decision-making.

data-analysis data-visualization jupyter-notebook powerbi python3 sql

Last synced: 17 Jan 2025

https://github.com/robinmillford/india-s-covid-19-journey-a-case-study-analysis

In this extensive project, I embarked on a profound exploration of India's journey through the COVID-19 pandemic. This endeavor involved a multi-faceted approach, encompassing data preprocessing with Python, data analysis with SQL queries, and data visualization using Power BI.

covid-19 data-analysis data-cleaning-and-preprocessing data-visualization jupyter-notebook powerbi pythin3 sql

Last synced: 17 Jan 2025

https://github.com/robinmillford/analyzing-e-commerce-transactions---data-cleaning-cohort-analysis-and-sql

In this project, I aimed to analyze the profitability of products in an e-commerce dataset. I performed various SQL queries to extract valuable insights about product profitability, including the identification of the top 5 products with the highest profit margin, and unique combinations of brands and product lines with the highest profitability.

cohort-analysis data-analysis data-visualization excel jupyter-notebook powerbi python3 sql

Last synced: 17 Jan 2025

https://github.com/tomijuarez/lemmatisation

Lemmatisation fully implemented in Java.

algorithms data-analysis data-science java-8 lemmatization oop

Last synced: 14 Feb 2025

https://github.com/usheninte/py-notebooks

Jupyter Notebooks holding Data Science projects

data-analysis data-science data-visualization datasets jupyter-notebooks python

Last synced: 08 Feb 2025

https://github.com/deliprofesor/cinematic-data-analytics-and-recommendation-platform

This project analyzes a movie dataset using machine learning algorithms to predict success, explore revenue-popularity relationships, and develop recommendation systems. It employs techniques like K-Means, DBSCAN, GMM, decision trees, PCA, and NLP for insights and personalized suggestions.

clustering content-based-recommendation data-analysis data-visualization decision-tree gmm k-means machine-learning natural-language-processing nlp pca predictive-modeling python recommendation-system scikit-learn user-based-recommendation

Last synced: 15 Feb 2025

https://github.com/lucaspadoni/9-11-hijackers-social-network-analysis

Social Network Analysis focused on the events of 9/11/2001. By examining publicly available data through SNA techniques, we gain insights into the organizational structure of the terrorist network, offering valuable perspectives on key relationships and connections.

9-11 data-analysis data-analytics graph-theory hijacking network-analysis sna social-network-analysis terrorism terrorist-attacks

Last synced: 02 Jan 2025

https://github.com/pablo1785/receipt-rs

Receipt processing backend built with Shuttle.rs, Axum and Azure Form Recognizer API

api-rest axum azure backend cognitive-services computer-vision data-analysis rust shuttle-rs sqlx

Last synced: 14 Feb 2025

https://github.com/bhushan148/finance-domain-bank-loan-report-tableau

I analyzed 🏦 bank loan data to reveal trends, KPIs, and insights. Using Tableau 📈 for dashboards and SQL 🗃️ for data extraction, I visualized loan applications, borrower profiles, and repayment behaviors 💡.

bussiness-intelligence dashboard-design data-analysis data-visualization excel figma sql sqlqueries tableau

Last synced: 14 Feb 2025

https://github.com/mirdan08/crafty

Data analysis project I developed for the web scraping course.

blockchain data-analysis webscraping

Last synced: 30 Jan 2025

https://github.com/deliprofesor/behavioral-insights-and-data-exploration

This project analyzes Spanish speech data, focusing on acoustic features and demographics. It includes data cleaning, outlier detection, clustering, and time series modeling (ARIMA, Holt-Winters) to uncover patterns in speech duration and word frequency.

acoustic-features arima clustering data-analysis holt-winters k-means machine-learning speech-analysis time-series-analysis

Last synced: 15 Feb 2025

https://github.com/michalspano/maturitna-skuska-proj

Maturita (school-leaving exam) 2021/2022 - objective data processing and analysis

data-analysis

Last synced: 18 Jan 2025

https://github.com/riborings/uranouchi42microdiversity

This repository contains the Bash, R, and Julia scripts used to explore the microdiversity of the prokaryotic community at Uranouchi Inlet (42-sample time series) by means of metagenomic shotgun sequencing, under the supervision of the Ogata Lab.

big-data data-analysis data-visualisation diversity-analysis marine-ecology marine-ecosystem metagenomics microbiome-analysis prokaryotic-genomes

Last synced: 10 Feb 2025

https://github.com/hasnathjami/data-analysis-of-covid-19

An Oracle PL/SQL-based project on COVID-19 data analysis. It is my CSE 4.1 project of Distributive Database Management System LAB.

data-analysis naive-bayes-classifier oracle-database probability-statistics sqlplus

Last synced: 17 Jan 2025

https://github.com/syedanimrafatima/ecommerce-store-sales-analysis-powerbi

The Sales Analysis Dashboard is designed to help an e-commerce business review its sales performance throughout the year. It includes a report and visualizations that cover sales performance, customer segmentation, product analysis, and more.

business-intelligence csv dashboard data-analysis data-cleaning data-visualization excel powerbi sales-analysis-dashboard storytelling

Last synced: 03 Jan 2025

https://github.com/samruddhi3012/screen-time-analysis

Hi! This repo demonstrates a Python project on Screen Time Analysis.

data-analysis data-visualization python

Last synced: 04 Feb 2025

https://github.com/allanotieno254/bank-loan-analysis-dashboard-power-bi

An interactive Power BI dashboard that analyzes bank loan data to provide insights into approval trends, default risks, and customer profiles. Designed to assist financial institutions in making data-driven lending decisions.

bank-loans business-intelligence dashboard data-analysis financial-analysis power-bi risk-assessment

Last synced: 05 Feb 2025

https://github.com/ninadpatil09/heart_disease_detection_analysis

The Heart Disease Detection Analysis aims to create a predictive model for identifying individuals at risk of heart disease. Using a dataset with attributes like age, sex, and health metrics, the project focuses on distinguishing patients with and without heart disease.

data-analysis data-cleaning data-science data-visualization machine-learning

Last synced: 03 Jan 2025

https://github.com/ninadpatil09/hospital_emergency_room_analysis

This comprehensive analysis delves into the performance and characteristics of the hospital's emergency room over the past year. By scrutinizing key metrics and patient demographics, this study aims to provide valuable insights for optimizing patient care, resource allocation, and overall operational efficiency.

data-analysis tableau-public visualization

Last synced: 03 Jan 2025

https://github.com/dthung1602/goodread-bestbook-prediction

Data analysis - trying to predict the results of the Goodreads Choice Awards

data-analysis goodreads pca python r xgboost

Last synced: 03 Jan 2025

https://github.com/davidzajac1/four-percent-rule-pandas-analysis

Analysis of the 4% Personal Finance Rule of Thumb

data-analysis data-visualization pandas python

Last synced: 03 Jan 2025

https://github.com/matteospanio/speed-analysis

A project to analyze internet speed

bash-script data-analysis

Last synced: 03 Jan 2025

https://github.com/charlenry/python_pour_la_data_science

The notebooks in this repository are my course notes on Python for Data Science.

data-analysis data-science data-visualisation dataframes jupyter-notebooks kaggle matplotlib-pyplot numpy pandas python seaborn

Last synced: 03 Jan 2025

https://github.com/mirokeimioniemi/classifying-software-pirates

Exploring the factors driving people into software piracy by training two machine learning models to predict whether a person with certain characteristics and sentiments is likely to possess any pirated software or not using a dataset collected via a survey targeting users of music production software.

data-analysis data-science decision-tree-classifier logistic-regression machine-learning piracy python software-piracy survey

Last synced: 10 Jan 2025

https://github.com/gabrielagodek/webscraper

The project was developed during master's studies. It is based on the Python library Scrapy.

data-analysis python scraper scrapy

Last synced: 17 Jan 2025

https://github.com/balajimohan18/loan-clustering-datascience-project

This project uses machine learning to cluster loans based on their similarities. It uses a dataset of loan applications that includes information about the loan amount and balance, then applies a clustering algorithm to group similar loans together.

clustering-algorithm data-analysis data-science data-visualization eda kmeans-clustering machine-learning sql unsupervised-learning

Last synced: 14 Jan 2025

https://github.com/noorulhudaajmal/business-performance-analytics

Python-Streamlit based interactive dashboard to analyze and visualize key business metrics for an eBay seller store.

business-analytics dashboard data-analysis python-streamlit

Last synced: 08 Feb 2025

https://github.com/allanotieno254/powerbi-dax-filter-context

This repository contains a Power BI project that explores DAX Filter Context, a crucial concept in DAX calculations. The project focuses on Bank Loan Analysis, demonstrating how different filter contexts affect DAX formulas.

business-intelligence data data-analysis dax dax-functions powerbi powerbi-visuals visualization

Last synced: 05 Feb 2025

https://github.com/samruddhi3012/health-care-analytics

Hi! This repo covers healthcare analytics using advanced Microsoft Excel.

dashboard data-analysis data-visualization healthcare microsoft-excel pivot-chart pivot-tables vlookup

Last synced: 04 Feb 2025

https://github.com/thanhngan22/data-analyst-fundamental

🧩 data analyst fundamental | Knowledge relevant to a datathon | materials

analyzing-data-using-pandas data-analysis datathon tensorflow

Last synced: 03 Jan 2025