Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/jabhij/tableau_dashboards

Brief information about all of my Tableau dashboards, the insights I got out of them, and the outcomes of analyzing those visualizations.

data-analysis data-analytics data-science data-visualization tableau visualisation

Last synced: 17 Jan 2025

https://github.com/jabhij/eda_experiments

In this repo I'll use different types of datasets to explore and implement various Exploratory Data Analysis (EDA) approaches.

ames-housing analysis battery-life blackfriday-analysis data-analysis data-science data-visualization eda matplotlib-pyplot numpy pandas python seaborn visualization zomato-data-analysis

Last synced: 17 Jan 2025

https://github.com/jabhij/fbi_nics-firearm-background-checks

This project is an attempt to showcase the use of guns across the US.

data-analysis data-analytics data-science data-visualization tableau

Last synced: 17 Jan 2025

https://github.com/mgobeaalcoba/matplotlib_y_seaborn

Here I will post visualization work done with both Python libraries.

data-analysis data-science data-visualization dataset matplotlib numpy pandas python seaborn

Last synced: 20 Jan 2025

https://github.com/mgobeaalcoba/analisis_con_r

Analysis work done in the R language.

data-analysis data-science dataset r r-package r-programming r-studio

Last synced: 20 Jan 2025

https://github.com/avinesh-masih/data-analytics-assignment

Comprehensive repository of data analytics assignments covering Python, EDA, data cleaning, visualization, machine learning, statistics, SQL, Power BI, and more. Includes practical projects and examples to build skills in tools like NumPy, Pandas, and business intelligence.

ai api data-analysis data-science data-visualization eda flask hypothesis-testing jupyter-notebook machine-learning matplotlib numpy pandas python seaborn sql statistics

Last synced: 11 Feb 2025

https://github.com/muneeb1030/dataannotation

This streamlines the process of annotating data for machine learning tasks, making it easier and more efficient for teams to create labeled datasets by leveraging Label Studio and Bulk

bulk data-analysis data-annotation label-studio python

Last synced: 11 Jan 2025

https://github.com/harmanveer-2546/supply-chain

Supply chain analytics is a valuable part of data-driven decision-making in various industries such as manufacturing, retail, healthcare, and logistics. It is the process of collecting, analyzing and interpreting data related to the movement of products and services from suppliers to customers.

customer-segmentation-analysis data data-analysis data-cleaning data-insights ggplot2 numpy pandas performance-evaluation predictive-analytics-for-business python risk-assessment sales-analysis statistical-analysis supply-chain tidyverse trend-analysis

Last synced: 11 Jan 2025

https://github.com/sarincr/data-analytics-with-knime

Data Analytics with KNIME (Konstanz Information Miner), a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and use of JDBC allows assembly of nodes blending different data sources, including preprocessing (ETL: Extraction, Transformation, Loading), for modeling, data analysis and visualization without, or with only minimal, programming.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data-analysis data-mining data-science data-structures data-visualization database datascience deep-learning machine-intelligence machine-learning machine-learning-algorithms machinelearning mining mining-software

Last synced: 21 Jan 2025

https://github.com/kirkalyn13/opensignal_autogenerate_report

Script used to generate results/summary, including the trends of flagged provinces, from the raw excel data file,

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 16 Jan 2025

https://github.com/daniel1kp/openrtb-dashboard

This is a demo project designed to illustrate using Rill to analyze programmatic bid logs using the canonical open RTB framework.

data-analysis openrtb real-time-bidding rill

Last synced: 15 Jan 2025

https://github.com/pradipece/weather_forecast_data_analysis

Using decision tree and random forest algorithms to solve real-world data analysis problems. "sklearn_decision_trees_random_forests"

data-analysis data-science data-visualization git github python python3

Last synced: 02 Feb 2025

https://github.com/nirmit27/book-recommender-system

This is a book recommendation system that uses a memory-based, item-based collaborative filtering model, created using Flask.

data-analysis data-science flask python python3 recommender-system render

Last synced: 08 Jan 2025

https://github.com/greed2411/ndl

Numbers Don't Lie: an attempt at data analysis using pandas and matplotlib.

cities data-analysis data-science data-visualization india kaggle

Last synced: 18 Jan 2025

https://github.com/saidsef/ff18

A complete catalog of all the players in FIFA 2018 and their complete statistics.

data-analysis data-visualization fifa18 machine-learning machine-learning-algorithms world-cup-2018 world-cup-ranking

Last synced: 15 Jan 2025

https://github.com/airdac/sim-telco_customer_churn

Prediction of customer churn with logistic regression in R. Team project from UPC's Master's Degree in Data Science

classification data-analysis data-science logistic-regression r statistical-models upc

Last synced: 15 Jan 2025

https://github.com/mohammadreza-mohammadi94/data-analysis-and-machine-learning-projects

A comprehensive collection of data analysis and machine learning projects, showcasing techniques and models for various data challenges. Dive in to explore code examples, analyses, and machine learning workflows.

data-analysis data-science dataframes exploratory-data-analysis pandas python scikit-learn visualization

Last synced: 07 Nov 2024

https://github.com/jamesnw/wtb-data

Explore beer addition and style info from WhatToBrew.com

data-analysis homebrewing jupyter-notebook python3

Last synced: 27 Jan 2025

https://github.com/gowthamsundaresan/eigenscan

Block explorer for EigenLayer.

crypto data-analysis eigenlayer nextjs web3

Last synced: 08 Jan 2025

https://github.com/odeyiany2/flit-apprenticeship-data-science-projects

This repo contains all my projects for my FLiT Apprenticeship

data-analysis data-science data-visualization machine-learning sql

Last synced: 02 Jan 2025

https://github.com/ondrejhruby/countries-of-the-world

Explore global data with this repository, featuring insights, visualizations, and Python code examples on countries worldwide—perfect for enhancing your data analysis and visualization skills.

data-analysis data-science data-visualization geography jupyter-notebook machine-learning matplotlib pandas python statistics

Last synced: 21 Jan 2025

https://github.com/haloapping/pisangijo

A collection of Python-based libraries and frameworks for data analysis, data science, machine learning, deep learning, and much more 🐍.

belajar data-analysis data-science deep-learning forecasting libraries machine-learning perkakas pustaka python3 recommender-system referensi tools

Last synced: 06 Jan 2025

https://github.com/zpreisler/modules

Python libraries and modules for processing simulation outputs

data-analysis python scripts tensorflow

Last synced: 11 Jan 2025

https://github.com/ahmad-ali-rafique/handwritten-digit-recognition-mnist

This project demonstrates a complete pipeline for recognizing handwritten digits using the MNIST dataset. The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation of a Fully Connected Neural Network (FCNN).

ai artificial-intelligence data data-analysis datascience deep-learning deep-neural-networks fcnn fully-connected-network machine-learning machine-learning-algorithms ml modeling

Last synced: 16 Jan 2025

https://github.com/sathyasris27/data-analysis-on-adult-smoking-patterns-in-the-uk

The aim of this analysis is to understand the smoking patterns among adults in the UK.

data data-analysis data-visualization python3

Last synced: 10 Jan 2025

https://github.com/ahmad-ali-rafique/weather-prediction-fcnn

This project demonstrates a complete pipeline for weather prediction using a Fully Connected Neural Network (FCNN). The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation.

ai artificial-intelligence data-analysis data-science deep-learning deep-neural-networks fully-connected-network machine-learning machine-learning-algorithms weather-information

Last synced: 16 Jan 2025

https://github.com/arv-anshul/easy-analysis

A python package to perform Data Analysis easily. (Not Recommended)

arv-dumped data-analysis data-science easy-analysis eda pypi pypi-package python3

Last synced: 17 Feb 2025

https://github.com/bala-1409/foreign-exchange-rate-time-series-data-science-project

This project uses time series analysis to forecast the exchange rate between the euro and the US dollar. It applies a variety of statistical techniques, such as ARIMA, to model the data and forecast the exchange rate.

data-analysis data-science data-visualization datapreprocessing eda exploratory-data-analysis forecasting machine-learning-algorithms model modelfitting predictive-modeling python3 scikit-learn statsmodels time-series time-series-analysis

Last synced: 27 Jan 2025

https://github.com/ysayaovong/portfolio

Explore my portfolio showcasing projects in data engineering, cybersecurity, software development, and cloud computing. Highlights include SQL tutorials, automation tools, cybersecurity assessments, and innovative Python applications. Dive into my work and see my expertise in action.

api-integration automation aws cloud-computing cybersecurity cybersecurity-risk-assessment data-analysis data-engineering data-science database-management etl linux project-management python scripting security-policy software-development sql system-optimization visualization

Last synced: 30 Jan 2025

https://github.com/carusel02/sequential-data-processing-and-analysis

Sequential data processing and analysis using a linked list in C.

data-analysis data-processing linked-list

Last synced: 09 Feb 2025

https://github.com/mindgamesnl/yanderestats

https://mindgamesnl.github.io/YandereStats/

data-analysis reporting-pipeline yandere yandere-sim

Last synced: 01 Jan 2025

https://github.com/alfikiafan/air-quality-analysis

This repository contains a comprehensive data analysis project on Air Quality Dataset, covering the complete data analysis process from data gathering, cleaning, exploratory data analysis (EDA), to building a fully interactive dashboard using Streamlit.

air-quality data-analysis dicoding

Last synced: 17 Jan 2025

https://github.com/mg380/ibm-applied-data-science-capstone

This capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization; it summarises, in the form of a project, all the material learned during the specialization.

capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql

Last synced: 09 Feb 2025

https://github.com/lightbridge-ks/zoominterface

A Shiny app for analyzing report files from the Zoom program.

data-analysis r shiny-apps zoom-class zoom-meetings

Last synced: 16 Jan 2025

https://github.com/talha-1010/imdb-data-analysis

A data analysis project made with python using pandas

data-analysis data-visualization jupyter-notebook pandas pandas-dataframe

Last synced: 14 Jan 2025

https://github.com/mr-chang95/loan_data_visualization

Data Visualization Project for Udacity's Data Analyst Program. Using Python in Jupyter Notebook.

data-analysis data-visualization jupyter-notebook loans python udacity-data-analyst-nanodegree

Last synced: 26 Jan 2025

https://github.com/rickcontreras/modelos1

Classification model to predict student performance on the Saber Pro tests (Pruebas Saber Pro) in Colombia. Includes exploratory data analysis, preprocessing, and machine learning models.

classification colombia data-analysis data-science education educational-assessment exploratory-data-analysis jupyter-notebook machine-learning python saber-pro scikit-learn student-performance

Last synced: 09 Feb 2025

https://github.com/parthds02/customer-segmentation-with-kmeans-clustering

Analyze customer behavior using Python and KMeans Clustering on transactional data. Features RFM analysis, data cleaning, clustering insights, and actionable visualizations to support business decision-making.

data-analysis data-visualization feature-engineering kmeans-clustering numpy pandas vscode

Last synced: 30 Dec 2024

https://github.com/zimmi48/nixpkgs-issues

Analysis on nixpkgs issue lifetime.

data-analysis github-api nixpkgs

Last synced: 16 Feb 2025

https://github.com/rodrigojunqueiradev/exploracao-e-limpeza-de-dados

Repository used for studying "Data Exploration and Cleaning" (Exploração e Limpeza de Dados), following the book "Projetos de Ciência de Dados com Python" as a guide.

data-analysis data-engineering data-science data-visualization datascience matplotlib matplotlib-pyplot numpy pandas python python-3 python3

Last synced: 20 Jan 2025

https://github.com/rodrigojunqueiradev/python-exercises

Repository for storing and organizing exercises done in the Python language.

data-analysis data-science data-structures data-visualization database math pandas pandas-python python python-3 python3 sql statistics

Last synced: 20 Jan 2025

https://github.com/devbigboy/excel-advanced-formulas-and-functions

how to develop your own style working with formulas and functions. Next, Oz covers a variety of formulas such as the XLOOKUP/VLOOKUP and INDEX functions, counting and statistical functions, text functions, and date/time, array, math, and information functions.

data-analysis excel

Last synced: 18 Feb 2025

https://github.com/josafary-ds/curso_dnc

Repository for storing study files and projects from the DNC Data Scientist (Cientista de Dados) course.

data-analysis data-science data-visualization machine-learning powerbi python

Last synced: 20 Jan 2025

https://github.com/archie-cm/credit_risk_model_vix_id-x_partners

The objective of the project is to reduce the company's losses from bad loans by up to 30% by creating a machine learning system that assists in automating loan assessments.

credit-risk data-analysis data-visualization machine-learning scorecard

Last synced: 20 Jan 2025

https://github.com/archie-cm/a-b-testing-mobile-games

This project aims to examine what happens when the first gate in the game is moved from level 30 to level 40. When a player installed the game, he or she was randomly assigned to either gate30 or gate40.

abtesting data-analysis python retention-rate

Last synced: 20 Jan 2025

https://github.com/banyc/dfplot

Summarize a data frame by plotting. `cargo install --git https://github.com/Banyc/dfplot.git`.

csv data-analysis plotly plotting statistics

Last synced: 20 Jan 2025

https://github.com/banyc/csv_logger

Long-term logger for data analysis

csv data-analysis logging

Last synced: 20 Jan 2025

https://github.com/colindean/allegheny_voter_reg_analysis

Allegheny County Voter Registration Analysis Tools

data-analysis data-science elections pandas polars python voting

Last synced: 09 Feb 2025

https://github.com/agustin-caceres/proyecto-data-analyst

Data analyst project on telecommunications services in Argentina.

business-analytics business-intelligence data-analysis data-visualization database postgresql python streamlit

Last synced: 11 Nov 2024

https://github.com/kunalkumar2001/sales-project-using-excel-and-sql

Comprehensive sales analysis using SQL, Excel, and PowerPoint to uncover insights on top-sellers, peak times, and branch performance.

data-analysis data-analytics excel mssql sql

Last synced: 18 Feb 2025

https://github.com/percival33/machine-learning-engineering

University project about enhancing a fictional music streaming service by developing machine learning models to generate popular playlists.

data-analysis data-science machine-learning python

Last synced: 22 Nov 2024

https://github.com/dionixius7/titanic-disaster-ml-model

This project predicts the survival of passengers on the Titanic using the Kaggle Titanic Disaster Dataset. The dataset contains information about passengers, such as age, gender, and class. Several machine learning algorithms are applied to predict each passenger's chances of survival as accurately as possible.

data-analysis data-science data-visualization eda knn-classifier machine-learning neural-network python scikit-learn svm tensorflow titanic-kaggle titanic-survival-prediction

Last synced: 18 Jan 2025

https://github.com/mohnish88/e-commerce-data-analysis

I analyzed sales data to identify trends and patterns, which significantly enhanced decision-making processes. Additionally, I created interactive visualizations to present these insights clearly and effectively, facilitating better understanding and communication of the data's implications.

data-analysis data-cleaning jupyter-notebook pandas plotly python python-library sales sales-analysis visulaization

Last synced: 20 Jan 2025

https://github.com/hosseinkarimi128/zed-one

An AI-powered assistant that analyzes CSV data using natural language queries to generate pandas code and visualizations.

ai-data-analysis automated-pandas automated-pandas-queries csv data-analysis fastapi langchain machine-learning matplotlib nlp openai pandas restful-api summarization visualization-tools

Last synced: 01 Feb 2025

https://github.com/charlescro/reddit-classification-nlp

Analyzing subreddit language via Reddit API and NLP techniques.

data-analysis data-science data-visualization nlp-machine-learning reddit-api scikit-learn

Last synced: 09 Feb 2025

https://github.com/rakeshkanneeswaran/project-titanic-machine-learning-from-disaster

The Titanic Survival Prediction project uses a Decision Tree algorithm combining both regression and classification to predict passenger survival.

data-analysis data-science data-visualization decision-tree-classifier decision-trees supervised-machine-learning

Last synced: 12 Jan 2025

https://github.com/soumya-thoutam/covid-19-impact-on-u.s.-states-and-colleges

Covid-19 analysis and impact on United States Colleges and States using SQL and Tableau.

covid-19 dashboard data-analysis data-visualization dataset sql sql-server tableau

Last synced: 11 Jan 2025

https://github.com/fabioassuncao/desafio-tecnico-iede

This project was developed as part of the IEDE Technical Challenge (Desafio Técnico IEDE).

challenge cloudflare data-analysis docker laravel php

Last synced: 26 Jan 2025

https://github.com/ssoehdata/sql_for_data_science_specialization_course

Materials and Certifications from the SQL for DataScience Course

data-analysis data-science database databricks postgresql sql sqlite

Last synced: 26 Jan 2025

https://github.com/ahmedkhaled404/data-cleaning-and-eda-layoffs-mysql

This project involves cleaning a dataset containing information about layoffs from companies around the world.

data data-analysis data-cleaning data-preprocessing datacleaning eda exploratory-data-analysis mysql sql

Last synced: 12 Jan 2025

https://github.com/dhruvsrikanth/basic-data-science

A short Data Science Project I took up for fun! This is a data analysis based on a dataset I created to predict the distribution of wealth within an economy as well as several characteristics of each class within society!

analysis data-analysis data-pipeline data-science data-visualization machine-learning matplotlib pandas python seaborn sklearn

Last synced: 16 Feb 2025

https://github.com/thanaraklee/pyspark-dataframe-operations

This project focuses on utilizing PySpark DataFrames to analyze and visualize data sourced from external datasets, such as CSV files. It provides a practical example of how to manipulate, transform, and gain insights from large datasets using the PySpark framework.

data-analysis dataframe pyspark python

Last synced: 16 Feb 2025

https://github.com/giatraskon/sandbox.bio-solutions

Bash scripts replicating the commands from sandbox.bio's interactive bioinformatics tutorials, organized by categories such as Data Exploration, File Formats, Quality Control, and Data Analysis.

bam-files bash bed-files bioinformatics bioinformatics-workflows command-line-tools computational-biology data-analysis data-exploration data-wrangling fasta-files fastq-files file-formats genomic-data quality-control sandbox-bio sandbox-bio-tutorials sequence-alignment unix-shell variant-calling

Last synced: 06 Feb 2025

https://github.com/edoaltamura/rotational-ksz-macsis

Repository for supplementary material from my publication on the rotational kinetic SZ effect in MACSIS.

cosmology data-analysis galaxy-clusters high-performance-computing hydrodynamics

Last synced: 05 Jan 2025

https://github.com/ayushbaid/football_stats

Analysing the competitiveness in different European football leagues

data-analysis football

Last synced: 09 Feb 2025

https://github.com/greenpau/esqrunner

Run Elasticsearch queries and create metrics based on the results of those queries in an Elasticsearch database.

data-analysis elasticsearch query-builder querydsl

Last synced: 26 Jan 2025

https://github.com/brownred/python-and-sql

Python and SQL (PostgreSQL & MySQL) for data analysis.

data-analysis databases python3 sql

Last synced: 26 Jan 2025

https://github.com/aran203/fluxease

Python package for eddy flux data post-processing.

data-analysis data-science eddy-covariance python

Last synced: 09 Feb 2025

https://github.com/iamber12/stack-overflow-analysis-using-stack-exchange-api

This Python-based project uses the Stack Exchange API to analyze Stack Overflow data, focusing on the 'R' and 'Dot Net' programming tags.

data-analysis data-visualization python stack-exchange-api

Last synced: 09 Feb 2025

https://github.com/macorisd/instagram-fake-account-analysis

A project in R focused on detecting fake Instagram accounts. It includes exploratory data analysis, data visualization, and analysis using three techniques: association rules, formal concept analysis, and regression. The results are presented in an interactive Quarto book.

data-analysis data-science data-visualization r

Last synced: 09 Feb 2025

https://github.com/cescedes/diagnosing-diabetes

Inspect, clean, and validate data from the Pima Indians Diabetes Database. Explores how certain diagnostic factors affect the diabetes outcome of female patients.

data-analysis data-science exploratory-data-analysis python

Last synced: 01 Feb 2025

https://github.com/nemat-al/multivariate_data_analysis

Tasks for Multivariate Data Analysis Course @ ITMO University

data-analysis multivariate-analysis python

Last synced: 23 Jan 2025

https://github.com/cescedes/medical-insurance-costs-with-python

Investigate how different factors affect the prediction of medical insurance costs while practicing many Python concepts. From the Codecademy Data Science Career Path.

data-analysis python python-dictionaries python-functions python-lists python-loops python-strings

Last synced: 01 Feb 2025

https://github.com/jendives2000/regressions

A linear regression analysis to determine the strength of the relationship between the number of reviews and sales for a retail company.

data-analysis linear-regression pearson-correlation-coefficient regression

Last synced: 26 Jan 2025

https://github.com/matteofasulo/cdc-finf

Project for Fundamentals of Computer Science.

data-analysis data-science data-visualization numpy pandas python python3

Last synced: 20 Jan 2025

https://github.com/sakan811/honkai-star-rail-a-few-fun-insights-with-data-analysis

The project gives insights into the stats of all Honkai Star Rail characters available as of the given date.

data data-analysis data-science data-visualization game honkai honkai-star-rail honkai-starrail webscraping webscraping-data webscraping-selenium

Last synced: 05 Jan 2025

https://github.com/sakan811/stress-pattern-occurrence-in-english-words

This project is intended to give English learners data that lets them make a data-driven guess when they encounter words whose stress placement they aren't sure of.

data-analysis data-visualization english english-language english-learning language powerbi powerbi-report powerbi-visuals

Last synced: 05 Jan 2025