Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/phillbertnevinemmanuel/movieindustryanalysis-correlation

This project is a comprehensive data analysis endeavor within the Movie Industry, spanning from Data Cleaning to Exploratory Data Analysis, Correlation Analysis, and Temporal Analysis. The dataset was sourced from Kaggle, purportedly scraped using the IMDb API. Python was the primary tool utilized for analysis.

data-analysis data-cleaning python

Last synced: 23 Dec 2024

https://github.com/vhtua/group4_data_analysis

Hierarchical Cluster Analysis: Movie Genres Preferences

data-analysis hierarchical-clustering r unsupervised-learning

Last synced: 10 Dec 2024

https://github.com/nafisalawalidris/buybuy-e-commerce-company

The BuyBuy E-commerce Company repository is a comprehensive hub for the company's e-commerce platform. It includes source code, documentation, and data analysis insights, providing a data-driven approach to improve customer experience, drive revenue, and inform decision-making.

buybuy cleaning-data company customer-experience data data-analysis decision-making documentation e-commerce excel insights postgresql repository revenue source-code sql

Last synced: 22 Nov 2024

https://github.com/bcko/ud-da-stroopeffect

Udacity Data Analyst Nanodegree Project : Test a Perceptual Phenomenon (Stroop Effect)

data-analysis data-analyst-nanodegree stroop-effect udacity udacity-data-analyst-nanodegree

Last synced: 26 Nov 2024

https://github.com/tim-hub/python-course

A new Python Course, a new trial to offer MOOC style learning resources and content for python learners

data-analysis learning python

Last synced: 23 Nov 2024

https://github.com/nafisalawalidris/dr.-semmelweis-and-the-discovery-of-handwashing

Uncover the revolutionary impact of handwashing on mortality rates in healthcare. Explore the story of Dr. Semmelweis and his groundbreaking findings.

data data-analysis handwashing healthcare-analysis medical-breakthrough mortality-rates

Last synced: 22 Nov 2024

https://github.com/nafisalawalidris/international-breweries

This GitHub readme provides an overview of data analysis using SQL on the International Breweries dataset, including dataset description, analysis questions, example SQL queries, and key insights derived from the analysis.

data-analysis insights international-breweries-dataset queries sql

Last synced: 22 Nov 2024

https://github.com/souravsuvarna/whatsapp-chat-analyzer-api

The WhatsApp Chat Analyzer API is a public api specifically designed for frontend enthusiasts who are interested in building a WhatsApp Chat Data Visualizer project. Built on FastAPI, this API offers a seamless and efficient method to process chat data and returns the processed result data in JSON format.

api data-analysis data-science fastapi publicapi python

Last synced: 05 Jan 2025

https://github.com/maciekmalachowski/crypto-charts-site

📊Application which return financial data for selected cryptocurrency.

binance-api data-analysis jupyter-notebook matplotlib mplfinance numpy pandas python python-binance

Last synced: 15 Dec 2024

https://github.com/vara-co/sql-challenge

EmployeeSQL "Data modeling, data engineering, and data analysis."

data-analysis data-engineering data-modeling employee-database erd erdiagram postgres postgresql schema sql

Last synced: 07 Dec 2024

https://github.com/titanscouting/tra-superscript

The Red Alliance data analysis package

data-analysis frc-scouting hacktoberfest python

Last synced: 22 Nov 2024

https://github.com/ibnaleem/cyberchef-discord

A versatile Discord bot that implements CyberChef's features for encoding, decoding, encrypting, compressing, analysing data directly and more in your Discord server

compression cti cyberchef cybersecurity data-analysis data-manipulation discord-bot discord-js encoding encryption hashing infosec parsing redteam

Last synced: 07 Dec 2024

https://github.com/vara-co/pandas-challenge

PyCitySchools - Analysis between budget and academic performance in schools

budget-analysis data-analysis jupiter-notebook pandas-dataframe python school-performances

Last synced: 07 Dec 2024

https://github.com/abeltavares/finstockdash

A streamlit web app for retrieving and analyzing financial data for a stock ticker.

dashboard data-analysis finance financial-analysis financial-reporting financial-statements investments python stocks streamlit web-app

Last synced: 05 Jan 2025

https://github.com/antononcube/wl-datareshapers-paclet

Wolfram Language (aka Mathematica) paclet for data reshaping functions, like, long- and wide form, cross tabulation, etc.

contingency-table cross-tabulation data-analysis data-transformation long-form wide-form

Last synced: 15 Dec 2024

https://github.com/seekinginfiniteloop/fedcal

A feature-rich Python calendar that enables time series analyses of changes in federal workforce schedules and shifts in executive department funding status.

data-analysis data-science econometrics economic-data economics federal federal-government hr pandas pandas-library pandas-python pydata python

Last synced: 14 Oct 2024

https://github.com/nafisalawalidris/hici-african-foods

HiCi African Foods: Excel dashboard & pivot table analysis of EU food rejection data to identify risks & recommend focus areas for market expansion.

data-analysis data-cleaning data-visualization eu-food-rejection excel-dashboard hici-african-foods market-expansion pivot-tables

Last synced: 22 Nov 2024

https://github.com/narius2030/sakila-datawarehouse-analysis

Implement a Hive data warehouse to store meaningful data, apply Machine Learning like Clustering or Regression for dealing with business problems

apache-hadoop apache-hive data-analysis etl-pipeline hiveql machine-learning statistics

Last synced: 14 Dec 2024

https://github.com/antononcube/wl-quantileregression-paclet

Wolfram Language (aka Mathematica) paclet that provides various Quantile Regression functions.

data-analysis machine-learning quantile-regression time-series time-series-analysis

Last synced: 15 Dec 2024

https://github.com/akash1070/data-science-virtual-internship-by-accenture

data merging and data cleaning in python as well as data visulaisation with dashboard in Tableau.

data-analysis data-cleaning data-science python3 tableau visualization

Last synced: 01 Dec 2024

https://github.com/nafisalawalidris/data-analysis-with-python

This repo features Jupyter Notebook labs for learning data analysis with Python. Explore data acquisition, wrangling, visualization, modeling, and evaluation. Enhance your skills in Python data analysis.

data-acquisition data-analysis data-science data-wrangling exploratory-data-analysis feature-engineering machine-learning model-development model-evaluation-and-refinement pandas

Last synced: 22 Nov 2024

https://github.com/vara-co/space-missions

Space Missions Over Time (1957-2022): Successes vs Failures, and Rocket Usage

data-analysis data-analysis-python history matplotlib pandas pandas-python space space-race spaceships team-project

Last synced: 07 Dec 2024

https://github.com/nafisalawalidris/tools-for-data-science

It covers popular languages (Python, R, SQL) and libraries (NumPy, Pandas) used in the field. The author shares their objectives of teaching data analysis, web development, and critical thinking skills. The repository also includes code examples, explanations of arithmetic expressions, and contact information for the author.

arithmetic-expressions data-analysis data-science data-visualization languages libraries matplotlib numpy pandas programming python r sql tools web-development

Last synced: 22 Nov 2024

https://github.com/colburncodes/se_pudding_2023

This project is a React app designed to showcase research conducted by a team of data scientists and data analysts. The app is utilizing React and React-Chartjs-2

chartjs-2 data-analysis data-science data-visualization react-chartjs-2 reactjs

Last synced: 22 Nov 2024

https://github.com/jen-uis/loan-status-prediction

This repository contains project materials for the Winter STAT 206 class, University of California, Riverside, A. Gary Anderson School of Management.

data data-analysis data-analytics data-cleaning data-visualization descriptive-analytics julia julia-language jupyter-notebook predictive-analytics predictive-modeling team-collaboration

Last synced: 21 Nov 2024

https://github.com/al-ghaly/airline-company-data-warehouse

Data Warehouse modeling, design, implementation, and analysis for an Airline Company.

data-analysis data-warehousing database-modeling sql-server

Last synced: 21 Nov 2024

https://github.com/mindful-ai-assistants/movierevenueanalysis

🎬💰 Analyze movie companies' revenue, release strategies, and financial performance using statistical techniques for actionable insights. This project explores data on total revenue, number of releases, and lifetime gross to uncover patterns that can drive strategic decisions in the film industry.

data-analysis data-science python statistical-analysis statistics

Last synced: 21 Nov 2024

https://github.com/nafisalawalidris/logistic-regression-model-for-breast-cancer-recurrence-prediction

Predicting Breast Cancer Recurrence - A logistic regression model using patient attributes to classify recurrence risk. Dataset analysis and model evaluation. Contributions welcome.

breast-cancer classification-model data-analysis data-science healthcare logistic-regression machine-learning python recurrence-prediction scikit-learn

Last synced: 22 Nov 2024

https://github.com/akash1070/project---applied-statistics-

To dive deep into this data & find some valuable insights.

data-analysis data-science python statistics

Last synced: 01 Dec 2024

https://github.com/ondrejhruby/countries-of-the-world

Explore global data with this repository, featuring insights, visualizations, and Python code examples on countries worldwide—perfect for enhancing your data analysis and visualization skills.

data-analysis data-science data-visualization geography jupyter-notebook machine-learning matplotlib pandas python statistics

Last synced: 21 Nov 2024

https://github.com/frankelavsky/political-polarization-challenge

I had 8 hours to build a solution to the research claim that "politics have become more divided in the past 50 years." You can navigate views of congressional voting patterns using arrows. I used d3, require, MVC pattern, and vanilla js. Pre-processed the data in node.js. Data is from DW-NOMINATE: ftp://k7moa.com/junkord/HANDSL01114A20_STAND_ALONE_30.DAT

client-side css d3 d3js data-analysis data-visualization frontend frontend-app html interactive interactive-visualizations javascript modular nodejs political-science politics requirejs research single-page-app visualization

Last synced: 18 Nov 2024

https://github.com/numbersprotocol/dyda

Dynamic data pipeline framework

ai artificial-neural-networks data-analysis data-science

Last synced: 27 Dec 2024

https://github.com/nafisalawalidris/sales-performance-dashboard

Sales Performance Dashboard: Analyze and visualize sales data using Power BI. Gain insights into trends, customer segments, product performance, and geographic distribution. Make data-driven decisions to optimize sales strategies and maximize revenue.

analytics-revenue dashboard-power-bi data data-analysis intelligence-sales optimization performance sales visualization-business

Last synced: 22 Nov 2024

https://github.com/aadityatamrakar/futures_spread_chart

Cash Market & Futures Daily Spread Chart - NSE Stocks

data data-analysis data-mining expressjs nodejs requests

Last synced: 28 Nov 2024

https://github.com/nafisalawalidris/predicting-credit-card-approvals

Explore credit card approval prediction through data analysis and machine learning. Preprocess data, train logistic regression models, and optimize hyperparameters. Learn data preprocessing, feature engineering, model training, and evaluation. Dive into the world of machine learning with Python and popular libraries.

approval-prediction credit-card data-analysis data-preprocessing feature-engineering hyperparameter-optimization libraries logistic-regression machine-learning model-evaluation model-training python python3

Last synced: 22 Nov 2024

https://github.com/csoren66/diabetics_prediction

Predicting that whether the patient has diabetes or not on the basis of the features we will provide to our machine learning model.

data-analysis machine-learning python svm

Last synced: 13 Jan 2025

https://github.com/kalebers/data_streams_parametric_t-sne

Research for Parametric T-SNE in high to low dimensional data stream, published in 2021 by Kalebe Rodrigues Szlachta and Andre de Macedo Wlodkowski, oriented by Jean Paul Barddal, Computer Science graduation from Pontifical Catholic University of Parana (PUCPR)

classifier data-analysis data-science data-visualization machinelearning parametric parametric-tsne python tsne-algorithm tsne-visualization

Last synced: 20 Nov 2024

https://github.com/akash1070/data-science-virtual-internship-by-anz

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.

data-analysis data-science predictive-analytics presentation-slides

Last synced: 01 Dec 2024

https://github.com/thennen/py-ivtools

This is a package for measurement and analysis of current-voltage characteristics of electronic devices.

current-voltage data-analysis data-visualization electrical-engineering emerging-technology instrumentation measurements

Last synced: 24 Nov 2024

https://github.com/nafisalawalidris/springforth-university-foodbank

Springforth University Food Bank: A collaborative initiative with UNESCO to address student food insecurity. Contains code and resources for the web application, data analysis, and insights into the prevalence and impact of food insecurity on academic performance.

academic-performance collaborative-initiative data-analysis data-visualization excel pivot-tables powerbi springforth-university-food-bank student-food-insecurity unesco

Last synced: 22 Nov 2024

https://github.com/nafisalawalidris/northwind-traders-sales-analysis

Northwind Traders Sales Analysis project, which analyses sales data for a fictitious company. It utilises the Northwind Database and includes SQL queries to provide insights on employees, products, suppliers and revenue. The project aims to help the company gain valuable information for business decision-making.

business-insights data-analysis database northwind-traders sales sql

Last synced: 22 Nov 2024

https://github.com/fatihilhan42/web_scraping_football_statistics_per_game_data-main

In this notebook I will describe the process of scraping data from web portal understat.com that has a lot of statistical information about all games in top 5 European football leagues.

data-analysis data-manipulation data-science data-scraping data-visualization jupyter-notebook python

Last synced: 01 Dec 2024

https://github.com/montanaz0r/testing-if-mma-math-deduction-works-using-ufc-fighters-data

The probabilistic reasoning about phenomenon called MMA math using UFC fighters data and Python.

bayesian-inference data-analysis data-science graphviz jupyter-notebook pandas python scipy statistics

Last synced: 14 Dec 2024

https://github.com/olgapavlova/agile-health-hackathon

Визуализируем здоровье спринтов разработки по сырым данным

data-analysis data-visualization figma google-sheets matplotlib pandas python sql

Last synced: 18 Nov 2024

https://github.com/valeriopagliarino/electronics-2021-unito-public

Data analysis and simulations for the course "Electronics laboratory" held at Physics Dep. - University of Turin, 2021

data-analysis electronics physics

Last synced: 06 Dec 2024

https://github.com/sumitkumargiri/machine-learning-project

This repository contain all the best practices for managing Github repository.

data-analysis github machine-learning opensource project python

Last synced: 18 Nov 2024

https://github.com/gracysapra/r-in-data-science

This repository contains essential guides for data analysis using R, covering topics like data preparation, data reshaping, and data visualization. Each file focuses on fundamental techniques to manipulate, clean, and visualize data effectively using R programming.

data-analysis data-preparation data-reshaping data-science data-visualization data-visualizations ggplot r r-for-data-science

Last synced: 15 Dec 2024

https://github.com/prangonghose/analysis_of_bangladesh_economic_complexity

In this project a brief analysis has been done by our team in the export economy of Bangldesh for the past three decades.

data-analysis data-science data-visualization inequalipy matplotlib pandas plotly

Last synced: 18 Nov 2024

https://github.com/raccoon-hero/gender-equality-tracker

A web application visualizing gender equality metrics with a focus on Ukraine. Built with Flask, it's powered by live data from global open sources, with dynamic research insights and analysis.

chartjs css dashboard data-analysis data-visualization flask frontend gender-equality global-metrics html linked-data openalex opendata python representation semantic-web ukraine webapp wikidata world-bank-api

Last synced: 27 Dec 2024

https://github.com/nafisalawalidris/investigating-netflix-movies-and-guest-stars-in-the-office

Dive into the world of Netflix and explore the average duration of movies. Netflix, being the largest entertainment company, offers a wide range of movies for its viewers. In this project, we analyse movie durations using pandas and create a DataFrame from a dictionary. By examining average durations from 2011 to 2020.

average-duration csv-files data-analysis data-visualization dataframe filtering movie-durations movie-length-distribution netflix pandas python trends

Last synced: 22 Nov 2024

https://github.com/hfxbse/dhbw-data-analysis

Exploratory data analysis R notebook for the module T3INF4333 "Grundlagen Data Science" held in 2024 by Lothar B. Blum at the DHBW Stuttgart.

data-analysis data-science dhbw dhbw-stuttgart ggplot2 r r-notebook

Last synced: 20 Dec 2024

https://github.com/akash1070/data-science-advanced-analytics-virtual-experience-program

The BCG Open-Access Data Science & Advanced Analytics Virtual Experience Program

data-analysis data-science machine-learning-algorithms

Last synced: 01 Dec 2024

https://github.com/nafisalawalidris/analyzing-nobel-prize-dataset-demographics-and-trends

This project analyses a Nobel Prize dataset using Python and data analysis libraries. It explores the distribution of winners by category and country, examines the proportion of female winners over time, investigates the age of winners when they received the prize and identifies the oldest and youngest recipients.

age-at-award country-distribution data-analysis data-manipulation dataset demographics filtering gender-balance grouping nobel-prize notable-laureates python trends visualisation winners

Last synced: 22 Nov 2024

https://github.com/yard1/linearordering

An R package. Provides various methods of linear ordering of data. Supports weights and positive/negative impacts.

data-analysis data-analysis-in-r data-analysis-r data-science r

Last synced: 18 Nov 2024

https://github.com/alejo1630/chicago_crimes

A Jupyter Notebook with the data analysis and data visualization of crimes in Chicago from 2017 to 2023 using libraries such as seaborn and folium

data-analysis data-visualization folium pandas python seaborn

Last synced: 31 Dec 2024

https://github.com/nafisalawalidris/building-a-clustering-model-for-customer-segmentation

Customer Segmentation Using Clustering: This repo applies clustering algorithms to a customer transaction dataset, grouping similar customers together based on their purchasing behavior. Targeted marketing strategies can be developed by analyzing distinct customer segments.

clustering customer-segmentation data-analysis data-visualization k-means machine-learning marketing-analytics unsupervised-learning

Last synced: 22 Nov 2024

https://github.com/ayobami6/tweet-data-analysis

WeRateDogs Tweets Scrape using twitter Api

data-analysis data-science twitter webscraping

Last synced: 13 Jan 2025

https://github.com/okwilkins/retailanalysis

A comprehensive exploratory analysis and implementation of kmeans/hierarchical clustering on online retail data.

data-analysis data-science machine-learning statistics

Last synced: 20 Nov 2024

https://github.com/ansh420/mcdonald_case-study

It is basically depend on the market Segment Analysis. It is a case study of mcDonald.

algorithms-implemented data-analysis python3 segmentation

Last synced: 18 Nov 2024

https://github.com/pheithar/socialdata_madridcentral

Social data and visualization course at DTU - 2022. Effectiveness of Madrid Central

data-analysis data-visualization jupyer-notebook madrid python

Last synced: 30 Nov 2024

https://github.com/sevdanurgenc/python-for-data-science-lecture-notes

In this repo, I have the course contents of Python for Data Science training, which will be given to Siemens by the cooperation of Academy Peak Information Technologies Training and Consultancy between 28 June - 1 July 2022.

data-analysis data-mining data-modeling data-science data-structure data-visualization matplotlib-tutorial numpy-tutorial pandas-tutorial

Last synced: 30 Nov 2024

https://github.com/ituvtu/datamining-ab-testing

This project focuses on conducting A/B testing to evaluate the effectiveness of two marketing campaigns. Using statistical analysis and hypothesis testing, we determine which campaign is more effective in improving conversion rates.

a-b-testing data-analysis data-analysis-python data-mining ipynb jupyter jupyter-notebook python

Last synced: 16 Jan 2025

https://github.com/pranavarora1895/proteintypeprediction

Data Analysis on Protein Type Prediction

bioinformatics data-analysis supervised-learning

Last synced: 18 Nov 2024

https://github.com/sevdanurgenc/data-modeling-techniques-lecture-notes

In this repo, I have the course contents of Data Modelling Techniques training, which will be given to Innova Technology by the cooperation of Academy Peak Information Technologies Training and Consultancy between 25 - 26 January 2022.

data-analysis data-mining data-modeling data-science data-structure data-visualization

Last synced: 30 Nov 2024

https://github.com/alejo1630/sport_stats

Data analysis of information from the summer and winter Olympic games over the years. UC Davis SQL Specialization Final Project

data-analysis jupyter-notebook olympics-dataset plotly python seaborn sql

Last synced: 31 Dec 2024

https://github.com/aisurjyasamantaray/sales-perfomance-analysis-dashboard

A comprehensive sales performance analysis dashboard built using Python, and visualization tools. This project includes data cleaning, descriptive statistics, correlation analysis, and insights into sales trends, profitability, and the impact of discounts. Key features include interactive visualizations using Seaborn, and Matplot

analytics annova data data-analysis data-visualization-project dataproject eda hypothesis-testing pandas-dataframe python sales-performance-analysis statistics

Last synced: 18 Nov 2024

https://github.com/greed2411/ndl

Numbers Don't Lie, attempt on Data Analysis using pandas and matplotlib.

cities data-analysis data-science data-visualization india kaggle

Last synced: 17 Nov 2024

https://github.com/phillbertnevinemmanuel/automotivesalesdataanalysis

This marks my inaugural venture into personal data analysis, employing SQL and Python for Correlation Analysis. I've sourced the dataset from Kaggle, specifically focusing on automotive sales. You can find the dataset linked on my website below. I'm excited to share that I've independently managed the majority of tasks involved in this project.

data-analysis dataset microsoft-sql-server python python-lambda sql ssms tsql

Last synced: 17 Nov 2024

https://github.com/githubuseraccountamazing/the-amari-project

a project in which I attempted to push some of the limits of stable-diffusion while taking some data along the way

ai ai-generated-images bash data-analysis machine-learning stable-diffusion textual-inversion

Last synced: 19 Nov 2024

https://github.com/harshmule1/school-data-analysis-

School Data Analysis Using SQL

data-analysis mssql sql

Last synced: 17 Nov 2024

https://github.com/mohamedhany99/human-voice-identifier-counter

the application developed in (KIVY) it can identify the users imported into the dataset based on the support vector machine training model it has two features ( Importing new voice - Detection to detect the human voices and count them)

android android-app android-application automation automation-framework data data-analysis data-mining data-science data-visualization datascience kivy kivy-framework machine-learning python

Last synced: 17 Nov 2024

https://github.com/mynenik/xyplot-win32

XYPLOT Plotting and Data Analysis Program for 32-bit Windows

cpp data-analysis data-manipulation data-visualization forth mfc windows-app

Last synced: 24 Nov 2024

https://github.com/ajwad-shaikh/sristi-sanshodh-collect

SRISTI Sanshodh Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments. Contribute and make the world a better place! ✨📋✨ https://docs.opendatakit.org/collect-…

collect data-analysis data-collection javarosa odk opendatakit

Last synced: 17 Dec 2024

https://github.com/john-science/data_science_by_example

Examples of Data Science Tools & Libraries

data-analysis data-science ipython pandas

Last synced: 18 Nov 2024

https://github.com/shibam120302/heart-disease-data-analysis-by-shibam

You can read more on the heart disease statistics and causes for self-understanding. This project covers manual exploratory data analysis

analysis data-analysis scraper

Last synced: 20 Nov 2024