Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/dcs-training/introcausalinference

This is a repository for the Introduction to Causal Inference course provided by Chris Oldnall for the CDCS. Go to the readme file

data-analysis python r statistics

Last synced: 07 Jan 2025

https://github.com/nafisalawalidris/tools-for-data-science

It covers popular languages (Python, R, SQL) and libraries (NumPy, Pandas) used in the field. The author shares their objectives of teaching data analysis, web development, and critical thinking skills. The repository also includes code examples, explanations of arithmetic expressions, and contact information for the author.

arithmetic-expressions data-analysis data-science data-visualization languages libraries matplotlib numpy pandas programming python r sql tools web-development

Last synced: 23 Jan 2025

https://github.com/dcs-training/good-data-visualisation-with-r

Our guide on how we create data visualisations through R. Go to the readme file

data-analysis data-visualisation r rmarkdown

Last synced: 07 Jan 2025

https://github.com/nafisalawalidris/international-breweries

This GitHub readme provides an overview of data analysis using SQL on the International Breweries dataset, including dataset description, analysis questions, example SQL queries, and key insights derived from the analysis.

data-analysis insights international-breweries-dataset queries sql

Last synced: 23 Jan 2025

https://github.com/mohammadreza-mohammadi94/data-analysis-and-machine-learning-projects

A comprehensive collection of data analysis and machine learning projects, showcasing techniques and models for various data challenges. Dive in to explore code examples, analyses, and machine learning workflows.

data-analysis data-science dataframes exploratory-data-analysis pandas python scikit-learn visualization

Last synced: 07 Nov 2024

https://github.com/mathieu2301/pbsc-tracker

An experiment in tracking self-service bikes that run on PBSC.

ai data-analysis data-mining data-science data-visualization libelo machine-learning pbsc valence velib-tracker

Last synced: 15 Jan 2025

https://github.com/ibnaleem/cyberchef-discord

A versatile Discord bot that implements CyberChef's features for encoding, decoding, encrypting, compressing, and analysing data, and more, directly in your Discord server.

compression cti cyberchef cybersecurity data-analysis data-manipulation discord-bot discord-js encoding encryption hashing infosec parsing redteam

Last synced: 02 Feb 2025

https://github.com/madhuresh2011/amazon-sales-report-analysis-using-python

This project focuses on analyzing Amazon sales data using Python to uncover insights into sales performance, customer behavior, and product trends

charts cleaning-data data-analysis jupyter-notebook matplotlib numpy pandas python seaborn visualization

Last synced: 02 Feb 2025

https://github.com/shadan100/sales-prediction-analysis

The aim is to build a predictive model and find out the sales of each product at a particular store. Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.

artificial-intelligence data-analysis data-science django django-framework jupyter-notebook machine-learning matplotlib pandas predictive-modeling python sales-prediction

Last synced: 12 Feb 2025

https://github.com/shadan100/stroke-prediction-analysis

A web-based application that predicts whether a patient is likely to have a stroke, based on input parameters such as gender, age, various diseases, and smoking status. Each row in the data provides relevant information about the patient.

artificial-intelligence data-analysis data-science django django-framework jupyter-notebook machine-learning matplotlib pandas predictive-modeling python stroke-prediction web-application

Last synced: 12 Feb 2025

https://github.com/cs-joy/pandasv2.0.3

Learn data analysis with pandas.

data-analysis pandas pandas-learning

Last synced: 05 Jan 2025

https://github.com/hafeez-urrehman/mobile-price-classification

In the Mobile Price Classification project, I built a predictive model to categorize mobile phones into different price ranges based on their features by applying machine learning techniques.

data-analysis linear-regression machine-learning mobile-price-prediction model-save-and-load predictive-modeling

Last synced: 08 Jan 2025

https://github.com/mubassim-khan/stack-overflow-developer-survey-2023

This repository contains the code for a data analysis of the Stack Overflow Developer Survey 2023, including a digital representation of the most used languages and much more. See the README for a more detailed overview of the repository.

data-analysis data-analysis-python matplotlib-pyplot numpy pandas-python

Last synced: 15 Jan 2025

https://github.com/sumitgirwal/procoder-public

ProCoder is a web-based application providing massive open online courses for both professionals and students. It aims to offer a platform for learning coding skills online, accessible to anyone interested in learning programming or improving their coding knowledge. ProCoder provides courses on various programming languages and tools.

blog-platform bootstrap-4 chat-application css3 data-analysis django-crud django-project html5 javascript numpy-library pandas-library python3

Last synced: 06 Jan 2025

https://github.com/stastnypremysl/lsql-csv

lsql-csv is a tool for querying small CSV files from a shell with short queries. It makes it possible to work with small CSV files as if they were read-only relational databases. The tool implements a new language, LSQL, similar to SQL and designed specifically for working with CSV files in a shell.

csv data-analysis data-processing haskell language linux-shell lsql lsql-csv new-language query-language relational-database sql unix-command unix-philosophy unix-shell

Last synced: 02 Feb 2025

https://github.com/githubuseraccountamazing/the-amari-project

A project in which I attempted to push some of the limits of Stable Diffusion while collecting some data along the way.

ai ai-generated-images bash data-analysis machine-learning stable-diffusion textual-inversion

Last synced: 20 Jan 2025

https://github.com/tiwarishubham635/uber-data-analysis-using-r

Analyzes the Uber Cab data using plots, heatmaps and dataframes

data-analysis data-visualization r

Last synced: 15 Jan 2025

https://github.com/jrschmidtt/csv-to-html

Convert a CSV file to an HTML table in JavaScript.

body-parser csv data-analysis javascript nodejs oop

Last synced: 03 Jan 2025
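
The CSV-to-HTML conversion described in the entry above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the repository's own JavaScript code; the function name `csv_to_html_table` and the choice to treat the first row as a header are assumptions made for the example.

```python
import csv
import html
import io


def csv_to_html_table(csv_text: str) -> str:
    """Render CSV text as an HTML table (hypothetical helper).

    Assumption: the first CSV row is a header row.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"

    header, *body = rows
    # Escape every cell so characters such as < and & cannot break the markup.
    head_cells = "".join(f"<th>{html.escape(cell)}</th>" for cell in header)
    lines = ["<table>", f"  <tr>{head_cells}</tr>"]
    for row in body:
        cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_html_table("name,score\nAda,10\nLinus,9\n"))
```

Parsing with the `csv` module rather than splitting on commas keeps quoted fields that contain commas intact.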

https://github.com/pradeepchegur/seamantic_web_design

We designed a semantic web for Instagram on the Wix platform.

data-analysis framework instagram semantic-web website-design wix

Last synced: 22 Jan 2025

https://github.com/walkerdustin/vergleich-von-messmethoden-fuer-punktwolken

When a physical space is surveyed, the result is a point cloud. This point cloud describes selected points in the space, for example on the walls and the ceiling. When these points are captured in two separate measurements, perhaps even by different devices, the goal afterwards is to determine how closely the point clouds agree. There are two fundamentally different methods for doing this, and they are compared here.

3d-models accuracy-metrics data-analysis data-visualization kaggle measure-distance numpy point-cloud pointcloudprocessing punkte python science-research simulation statistics

Last synced: 30 Dec 2024

https://github.com/sunnybibyan/call_centre_power_bi_dashboard

Create a dashboard in Power BI to visualize relevant KPIs and metrics that will help the call center manager understand trends.

call-centre-analysis dashboard data-analysis data-visualization powerbi

Last synced: 05 Jan 2025

https://github.com/eslamdyab21/imdb-data-analysis

This data set contains information about 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue

data-analysis pandas python udacity-data-analyst-nanodegree

Last synced: 22 Jan 2025

https://github.com/willie-conway/datavista

A robust 🐍Python application for data analysis that provides a wide range of tools for 🔃loading, 🧹cleaning, and 🔃preprocessing data. It includes features for 📈statistical analysis, 👨🏿‍🔬hypothesis testing, 🦾machine learning, clustering, ⏳time series forecasting, and 📊data visualization, all designed to enhance your analytical workflow.

analytics big-data command-line data-analysis data-cleaning data-driven data-mining data-pipeline data-preprocessing data-science data-scientist data-visualization data-wrangling exploratory-data-analysis machine-learning pandas predictive-analytics python statistics visualization-tools

Last synced: 10 Jan 2025

https://github.com/airscholar/data_analysis_with_ai

A repository showing how to use AI and ChatGPT for Data Analysis with Pandas and Python

chatgpt data-analysis gpt4 openai pandas pandasai python

Last synced: 13 Jan 2025

https://github.com/sunnybibyan/random_data_generation

A project that generates a dataset using various statistical distributions (Normal, Uniform, Exponential, Random Integers, and Binomial) and performs data analysis. Includes visualizations and an option to export the data as a CSV file.

data-analysis data-visualization python random-data-generation statistics streamlit-webapp

Last synced: 05 Jan 2025

https://github.com/prernarohra/heart-disease-prediction

This project develops a machine learning model to predict heart disease risk based on symptoms and medical history. The best accuracy was achieved with Logistic Regression, which works well for binary classification problems.

artificial-intelligence data-analysis data-science dataset heartdisease-prediction machine-learning models

Last synced: 27 Dec 2024

https://github.com/ryanfranklin237/data-visualization-spreadsheets

Data visualization done with Microsoft Excel and Google Spreadsheets

data-analysis data-science data-visualization google-spreadsheets microsoft-excel

Last synced: 10 Jan 2025

https://github.com/abhi-lab2/ipl-data-analysis

IPL data analysis for future predictions

data-analysis data-science python

Last synced: 29 Dec 2024

https://github.com/santiagortiiz/snowflake-data-warehousing

Snowflake University. Snowflake Data Warehousing. Fundamentals

big-data data-analysis data-warehouse olap snowflake

Last synced: 08 Jan 2025

https://github.com/shibam120302/heart-disease-data-analysis-by-shibam

You can read more about heart disease statistics and causes for your own understanding. This project covers manual exploratory data analysis.

analysis data-analysis scraper

Last synced: 21 Jan 2025

https://github.com/shuklayash02/data_analysis_using_r

COVID-19 data cleaning and analysis, in which age at death and deaths by gender are cleaned and analysed.

analysis cleaning-data data-analysis data-visualization rprogramming

Last synced: 15 Feb 2025

https://github.com/vidhi1290/zomato-data-analysis

Zomato Data Analysis - Explore the world of Zomato restaurant data through Python and data analysis. Uncover trends and insights using Pandas for data manipulation and Matplotlib for visualization. Join us in this journey to reveal the hidden stories within the data!

data-analysis data-analysis-python data-science data-visualization dataprocessing machine-learning machine-learning-algorithms matplotlib numpy pandas python scikit-learn zomato-data-analysis

Last synced: 02 Feb 2025

https://github.com/nomadsdev/financial-trend-analyzer

FinancialTrendAnalyzer helps analyze and visualize sales data to uncover financial trends. It uses Python to calculate total sales, track changes, and generate insightful charts for better decision-making.

business-intelligence data-analysis data-visualization financial-analysis matplotlib numpy pandas python revenue-trends sales-data seaborn time-series-analysis

Last synced: 12 Feb 2025

https://github.com/karatechop/noaa-storm-database-data-analysis

Analysis of population health and economic consequences of events documented in the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.

data-analysis knitr r rmarkdown

Last synced: 21 Jan 2025

https://github.com/sunnybibyan/marketing_campaign_analysis_power_bi_dashboard

Campaign Performance Analysis: this project analyzes the performance of Spring, Summer, and Fall marketing campaigns, revealing key insights and actionable recommendations.

data-analysis data-visualization dax marketing-campaign powerbi

Last synced: 05 Jan 2025

https://github.com/aymane-maghouti/sentiment-analysis-for-jumia-reviews-and-smartphone-price-prediction-system

The project focuses on customer sentiment analysis for Jumia, aiding informed online decisions. It collects and analyzes product comments to determine sentiments and implements a decision-making algorithm. Additionally, it includes a product price prediction system using regression techniques.

beutifulsoup data-analysis data-cleaning data-collection data-preprocessing data-scraping data-visualization eda falsk machine-learning python web-application

Last synced: 17 Jan 2025

https://github.com/whis99/userfunnelanalysis

An ecommerce user funnel conversion data analysis with matplotlib & python.

data-analysis data-analysis-python data-analyst data-visualization google-colab jupyter-notebook matplotlib python

Last synced: 13 Jan 2025

https://github.com/leocornus/leocornus-visualdata

JavaScript libraries to make data visualization simpler and easier.

data-analysis data-mining data-visualization data-visualization-simpler javascript-library

Last synced: 08 Jan 2025

https://github.com/akankshaaa013/30-day-machine-learning-deep-learning

A practical effort to learn, explore, and share my insights on the libraries and tools that power machine learning.

data-analysis machine-learning python

Last synced: 21 Jan 2025

https://github.com/mahdi-eth/covid-analysis

COVID-19 data analysis project using Python, NumPy, pandas, and Matplotlib

data-analysis data-science python

Last synced: 08 Jan 2025

https://github.com/dina-hosny/telco-customer-churn-analysis-using-power-bi

An interactive dashboard to represent some analysis of "Telco customer churn" data and the reasons that made customers churn using Microsoft Power BI.

business-intelligence data-analysis data-modeling data-visualization power-bi powerbi

Last synced: 13 Jan 2025

https://github.com/dina-hosny/explore-us-bike-share-data-project

Explore US Bike Share Data project - FWD Data Analysis Professional Track. In this project, I used Python to explore data related to bike share systems for three major cities in the United States and answer questions about it by computing descriptive statistics.

data-analysis data-science numpy pandas python

Last synced: 13 Jan 2025

https://github.com/antononcube/wl-datareshapers-paclet

Wolfram Language (aka Mathematica) paclet for data reshaping functions, such as conversion to long and wide form, cross tabulation, etc.

contingency-table cross-tabulation data-analysis data-transformation long-form wide-form

Last synced: 08 Feb 2025

https://github.com/nikhilash45/power-bi-vsualisation-of-joins

In this Power BI report, users can visualise joins by themselves, which makes joins easy to understand.

business-analytics business-intelligence data data-analysis data-visualization joins powerbi sql visualization

Last synced: 11 Jan 2025

https://github.com/asifdotexe/flipkart-electric-scooter-data-analysis

In this project, I scraped electric scooter data from Flipkart and turned it into a CSV file for further analysis.

beautifulsoup4 data-analysis data-science flipkart webscraping

Last synced: 15 Jan 2025

https://github.com/asifdotexe/quickvu

Quick VU: a no-code data cleaning, analysis, and visualization tool built on Streamlit. Quickly clean, visualize, explore, and understand data relationships and correlations with ease. Perfect for analysts, business users, and anyone looking to gain data insights without writing a single line of code.

automation data-analysis data-cleaning data-visualization python3 streamlit-application toolkit

Last synced: 15 Jan 2025

https://github.com/asifdotexe/air-quality-analysis-aqa

AQA is a data-driven project focused on analyzing air quality data sourced from data.gov.in. The project encompasses data preprocessing, analysis, and visualization to gain insights into air pollution levels across various locations in India. By examining six key pollutants, the project aims to raise awareness about the environmental issues

aqi-analysis data-analysis data-preprocessing data-science data-visualization presentation

Last synced: 15 Jan 2025

https://github.com/manwithacap/by-the-metric-match

🎲🃏 A game data tracker for your board/card/video games!

data-analysis data-visualization games jupyter-notebook python utility

Last synced: 08 Jan 2025

https://github.com/markmusic27/data-statistics-calculator

💣 This method (made in JavaScript / Python) can find the mean, median, mode, range, and standard deviation.

data-analysis standard-deviation statistics statistics-calculator

Last synced: 05 Jan 2025
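
The five statistics named in the entry above (mean, median, mode, range, and standard deviation) can be sketched with Python's standard `statistics` module. This is an illustrative example only, not the repository's code; the function name `describe` is hypothetical, and the use of the population standard deviation rather than the sample version is an assumption.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a non-empty list of numbers.

    Hypothetical helper for illustration.
    """
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # statistics.mode returns the first most common value on ties (Python 3.8+).
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        # Assumption: population standard deviation. Swap in statistics.stdev
        # when the values are a sample from a larger population.
        "stdev": statistics.pstdev(values),
    }


if __name__ == "__main__":
    for name, value in describe([2, 4, 4, 4, 5, 5, 7, 9]).items():
        print(f"{name}: {value}")
```

Whether the population or the sample standard deviation is appropriate depends on the data being summarised, so the choice is worth stating explicitly in any real calculator.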

https://github.com/2003harsh/house-price-prediction-using-machine-learning

This project features a web app that predicts house prices using a linear regression model. Users can input details like location, square footage, bathrooms, and bedrooms through an HTML form. I've added a CI/CD pipeline with GitHub Actions, unit testing with pytest, and automated Docker containerization to improve deployment and robustness.

ci-cd data-analysis docker-image flask linear-regression machine-learning matplotlib mlops-workflow requests scikit-learn

Last synced: 09 Feb 2025

https://github.com/csoren66/diabetics_prediction

Predicting whether a patient has diabetes based on the features provided to a machine learning model.

data-analysis machine-learning python svm

Last synced: 13 Jan 2025

https://github.com/henrylin03/china-gdp

Analysis and visualisation of China GDP data using Python.

data data-analysis data-visualisation dataset kaggle pandas

Last synced: 14 Jan 2025

https://github.com/garciparedes/castile-and-leon-crops

Data Analysis of Castile and Leon Crops Area over the last years

castile-and-leon crops data-analysis data-science jupyter jupyter-notebook notebook spain

Last synced: 16 Jan 2025

https://github.com/akshat0427/python_youtube_history

A set of data science operations performed on YouTube history data

data-analysis data-science extracting-features

Last synced: 11 Jan 2025

https://github.com/richardwarepam16/rental_analysis_using_python_and_sql

Maximizing Rental Profits: Data-Driven Strategies for a Movie Rental Store

data-analysis data-analytics python3 rental-management sakila-db sqlite3

Last synced: 22 Jan 2025

https://github.com/siddharthbadal/sql-case-studies-data-analysis

Data Analysis case studies on various databases using SQL

data-analysis sql sql-query sql-server sqlserver

Last synced: 26 Jan 2025

https://github.com/sarincr/data-analytics-with-knime

Data Analytics with KNIME (Konstanz Information Miner), a free and open-source data analytics, reporting, and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and the use of JDBC allow the assembly of nodes that blend different data sources, including preprocessing (ETL: Extraction, Transformation, Loading), for modeling, data analysis, and visualization with little or no programming.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data-analysis data-mining data-science data-structures data-visualization database datascience deep-learning machine-intelligence machine-learning machine-learning-algorithms machinelearning mining mining-software

Last synced: 21 Jan 2025

https://github.com/ganeshkumartk/ncov-2019

[EDA] Statistical modelling of the novel coronavirus (nCoV-2019) outbreak

corona data-analysis ncov ncov-2019 statistics wuhan wuhan-coronavirus wuhan-virus

Last synced: 17 Jan 2025

https://github.com/shahaf-f-s/feature-space

A modular framework for combining pandas series features

data-analysis data-science feature-engineering

Last synced: 08 Jan 2025

https://github.com/jabhij/tableau_dashboards

Contains brief information about all of my Tableau dashboards, the insights I drew from them, and the outcomes of analyzing those visualizations.

data-analysis data-analytics data-science data-visualization tableau visualisation

Last synced: 17 Jan 2025

https://github.com/jabhij/eda_experiments

In this repo I'll use different types of datasets to explore and implement various Exploratory Data Analysis (EDA) approaches.

ames-housing analysis battery-life blackfriday-analysis data-analysis data-science data-visualization eda matplotlib-pyplot numpy pandas python seaborn visualization zomato-data-analysis

Last synced: 17 Jan 2025

https://github.com/jabhij/fbi_nics-firearm-background-checks

This project is an attempt to showcase the use of guns across the US.

data-analysis data-analytics data-science data-visualization tableau

Last synced: 17 Jan 2025

https://github.com/adagio/ivoox_episodes

iVoox Episodes: Scraping & Analysis

beautifulsoup4 data-analysis ivoox pandas python python3 scraping

Last synced: 27 Dec 2024

https://github.com/muneeb1030/dataannotation

This streamlines the process of annotating data for machine learning tasks, making it easier and more efficient for teams to create labeled datasets by leveraging Label Studio and Bulk

bulk data-analysis data-annotation label-studio python

Last synced: 11 Jan 2025

https://github.com/harmanveer-2546/supply-chain

Supply chain analytics is a valuable part of data-driven decision-making in various industries such as manufacturing, retail, healthcare, and logistics. It is the process of collecting, analyzing and interpreting data related to the movement of products and services from suppliers to customers.

customer-segmentation-analysis data data-analysis data-cleaning data-insights ggplot2 numpy pandas performance-evaluation predictive-analytics-for-business python risk-assessment sales-analysis statistical-analysis supply-chain tidyverse trend-analysis

Last synced: 11 Jan 2025

https://github.com/jamiemagee/rhi

Collating the data on the Renewable Heat Incentive scheme, and presenting it in a more readable format.

data-analysis open-data open-government rhi

Last synced: 08 Jan 2025

https://github.com/kirkalyn13/opensignal_autogenerate_report

Script used to generate results/summary, including the trends of flagged provinces, from the raw excel data file,

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 16 Jan 2025

https://github.com/vi/rendercsv

Tool to convert CSV table to a picture.

animation csv csv2pic csv2png data-analysis picture png table table-renderer visualization

Last synced: 16 Oct 2024

https://github.com/zpreisler/modules

Python libraries and modules for processing simulation outputs

data-analysis python scripts tensorflow

Last synced: 11 Jan 2025

https://github.com/ahmad-ali-rafique/handwritten-digit-recognition-mnist

This project demonstrates a complete pipeline for recognizing handwritten digits using the MNIST dataset. The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation of a Fully Connected Neural Network (FCNN).

ai artificial-intelligence data data-analysis datascience deep-learning deep-neural-networks fcnn fully-connected-network machine-learning machine-learning-algorithms ml modeling

Last synced: 16 Jan 2025

https://github.com/ahmad-ali-rafique/weather-prediction-fcnn

This project demonstrates a complete pipeline for weather prediction using a Fully Connected Neural Network (FCNN). The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation.

ai artificial-intelligence data-analysis data-science deep-learning deep-neural-networks fully-connected-network machine-learning machine-learning-algorithms weather-information

Last synced: 16 Jan 2025

https://github.com/ayaanjawaid/google_playstore_data_analysis

This project provides an in-depth analysis of Google Play Store apps and user reviews, focusing on understanding app performance, user sentiment, and key trends in app categories. Using Python, I performed data cleaning, feature engineering, and exploratory data analysis (EDA) on app data and reviews.

data-analysis eda html numpy pandas-dataframe plotly python vizualisation

Last synced: 31 Oct 2024

https://github.com/ysayaovong/portfolio

Explore my portfolio showcasing projects in data engineering, cybersecurity, software development, and cloud computing. Highlights include SQL tutorials, automation tools, cybersecurity assessments, and innovative Python applications. Dive into my work and see my expertise in action.

api-integration automation aws cloud-computing cybersecurity cybersecurity-risk-assessment data-analysis data-engineering data-science database-management etl linux project-management python scripting security-policy software-development sql system-optimization visualization

Last synced: 30 Jan 2025

https://github.com/flyingfathead/neurograph-framework

A versatile tool for visualizing entropy loss in TensorFlow-based neural network training, providing insightful scatter plots with annotations.

data-analysis data-analysis-python data-visualization entropy graph graphs neural-network neural-networks neural-networks-visualization nn python python3 tensorflow tensorflow2 training visualization visualization-tools

Last synced: 11 Jan 2025

https://github.com/viseshrp/community_health_indicator

Android app to fetch, organize, and represent NYC health data

android data-analysis data-visualization health

Last synced: 13 Jan 2025

https://github.com/alfikiafan/air-quality-analysis

This repository contains a comprehensive data analysis project on Air Quality Dataset, covering the complete data analysis process from data gathering, cleaning, exploratory data analysis (EDA), to building a fully interactive dashboard using Streamlit.

air-quality data-analysis dicoding

Last synced: 17 Jan 2025

https://github.com/lightbridge-ks/zoominterface

A Shiny app for analysing report files from the Zoom program.

data-analysis r shiny-apps zoom-class zoom-meetings

Last synced: 16 Jan 2025

https://github.com/tddschn/whatsapp-chat-analyze

Command Line Tool to Generate Pretty Charts from WhatsApp Exported Chats

data-analysis data-visualization plotly python whatsapp whatsapp-data

Last synced: 17 Jan 2025

https://github.com/hafeez-urrehman/mental-health-analyzer

Mental-Health-Analyzer is an AI-Based project for predicting mental health disorders such as stress, anxiety, depression, and loneliness. By applying machine learning techniques, this project analyzes user inputs and behavioral data to provide accurate predictions, aiming to support mental well-being and early intervention.

data-analysis data-science early-diagnonosis machine-learning mental-health mental-wellbeing predictive-modeling python

Last synced: 08 Jan 2025