An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/olob0/badwords-pt-br

💬 Wordlist com palavrões em pt-BR para análise de dados, filtros, ou texto considerado "evitável"

badword-filter badwords brasil data-analysis filter filter-lists filterlist portugues portuguese text-analysis wordlist

Last synced: 06 Jan 2026

https://github.com/john-science/data_science_by_example

Examples of Data Science Tools & Libraries

data-analysis data-science ipython pandas

Last synced: 12 May 2025

https://github.com/dwidevelopes/database-input-pelanggran-mahasiswa

Menginput data Mahasiswa Yang Melakukan Pelanggran yang siap di data dan di hukum Dan juga siap Terkena Sanksi

aplikasi aplikasi-sekolah data data-analysis database input-method mahasiswa sekolah siswa siswi website

Last synced: 02 May 2026

https://github.com/thomascenni/anfavea-data-analysis

Data analysis with Pandas and Datapane.

data-analysis datapane pandas

Last synced: 06 May 2026

https://github.com/rani-sikdar/python-data-structures

A repository for sharing Python implementations of common data structures and algorithms, including arrays, linked lists, stacks, queues, trees, graphs, and sorting/searching techniques. Perfect for learning, revisiting fundamentals, and collaborating on computer science concepts. Contributions welcome! 🚀

data-analysis data-structure data-visualization numpy-library pandas-dataframe python python-3

Last synced: 30 Mar 2025

https://github.com/rajnish93/jpandas

A lightweight JavaScript library for working with tabular data, inspired by Pandas in Python. Built with TypeScript, it provides an intuitive API for data manipulation and analysis.

data-analysis data-analytics data-manipulation data-science dataframe javascript pandas stream-processing table typescript

Last synced: 11 Jun 2025

https://github.com/gholamrezadar/most-profitable-actors

Finds the list of actors with the most boxoffice profit using TMDB API.

crawling data-analysis tmdb

Last synced: 16 Jan 2026

https://github.com/vi/rendercsv

Tool to convert CSV table to a picture.

animation csv csv2pic csv2png data-analysis picture png table table-renderer visualization

Last synced: 01 Apr 2025

https://github.com/lmuffato/analise-de-diarias-prefeituas-do-es

Esse código faz parte de um projeto de descoberta e combate a esquemas de corrupção, através do tratamento e cruzamento de dados abertos disponíveis em diversas prefeituras do Espirito Santo através do portal da transparência. Junção e análise de várias tabelas importadas em csv.

data-analysis personal-project r rstudio

Last synced: 12 Jun 2025

https://github.com/mr-vozhyk/test-tasks

Выполненные тестовые задания (не запрещенные к публикации)

analysis data-analysis digital-analysis e-commerce excel google-colab google-sheets marketing marketplace power-bi power-query python sql

Last synced: 07 May 2026

https://github.com/gholamrezadar/favourite-youtube-channels

this program goes through your youtube watch history and sorts channels based how many of their videos you have watched!

data-analysis data-visualization python

Last synced: 16 Jan 2026

https://github.com/kentlouisetonino/sw-project-data-analysis

My project for AMA MATH 6200 course.

data-analysis python school-project

Last synced: 28 Feb 2025

https://github.com/zen204/airbnb-availability

A machine learning model that predicts Airbnb listing availability, utilizing feature engineering and supervised learning techniques to improve guest experience and optimize host management.

binary-classification data-analysis data-preprocessing data-visualization feature-engineering machine-learning matplotlib model-evaluation nlp pandas predictive-modeling python scikit-learn seaborn supervised-learning

Last synced: 21 Jan 2026

https://github.com/shakhthi/deep-learning

All Materials, Practice codes and Projects related ML & DL

data-analysis deep-learning machine-learning

Last synced: 09 Apr 2025

https://github.com/rahul-404/full_stack_data_science_with_generative_ai

Welcome to the repository for the course "Full Stack Data Science with Generative AI". This repository is designed to accompany the course and provide resources, exercises, and projects related to the study of data science and generative AI techniques.

data-analysis data-science data-visualization database deep-learning exploratory-data-analysis feature-engineering generative-ai machine-learning nlp python statistics

Last synced: 12 Apr 2026

https://github.com/asifdotexe/flipkart-electric-scooter-data-analysis

In this project, I have web scraped Electric Scooter data from Flipkart and turn it into a csv file for further analysis

beautifulsoup4 data-analysis data-science flipkart webscraping

Last synced: 29 May 2026

https://github.com/jpotter80/notebook-examples

This repository demonstrates a systematic approach to cleaning and standardizing e-commerce product data using DuckDB. The notebook serves as a detailed walkthrough of our data cleaning methodology, showcasing how we handle common data quality challenges in e-commerce datasets.

data-analysis data-cleaning jupyter-notebook

Last synced: 12 Jun 2025

https://github.com/cworld1/novel-analysis

A simple project for analyzing Chinese novels

data-analysis novel

Last synced: 17 Mar 2025

https://github.com/saidsef/ff18

A complete catalog of all the players in Fifa 2018 and their complete statistics

data-analysis data-visualization fifa18 machine-learning machine-learning-algorithms world-cup-2018 world-cup-ranking

Last synced: 29 May 2026

https://github.com/webuccinoco/mysql-pivot-tables

Build complex MySQL pivot tables without touching a single line of code. This free PHP tool lets you visually connect your database and map out your data sources with a few simple clicks.

business-analytics business-intelligence crosstab data-analysis data-analytics data-visualization mysql mysql-database mysql-pivot-table mysql-reports mysql-virtualization php php-pivot-table php-reports pivot-tables reporting-tools

Last synced: 04 Feb 2026

https://github.com/gappeah/nike_web_crawler

This project involves web scraping Nike's product pages to extract product names, prices and links. The project showcases three different implementations of the web crawler using Selenium and BeautifulSoup. It also includes visualisation of the scraped data using Matplotlib and Seaborn.

beautifulsoup data-analysis data-visualization python selenium web-crawler web-scraper webcrawler webscraper webscraping webscraping-beautifulsoup

Last synced: 04 Jul 2025

https://github.com/anushadatta/airbnb-in-seattle

🏨 Understanding the Airbnb rental landscape in Seattle using data science.

airbnb data-analysis data-exploration data-visualization datascience sentiment-analysis

Last synced: 13 Jun 2025

https://github.com/harmanveer-2546/supply-chain

Supply chain analytics is a valuable part of data-driven decision-making in various industries such as manufacturing, retail, healthcare, and logistics. It is the process of collecting, analyzing and interpreting data related to the movement of products and services from suppliers to customers.

customer-segmentation-analysis data data-analysis data-cleaning data-insights ggplot2 numpy pandas performance-evaluation predictive-analytics-for-business python risk-assessment sales-analysis statistical-analysis supply-chain tidyverse trend-analysis

Last synced: 10 Apr 2026

https://github.com/jiachengwang-punch/predictive-analytics-skill

A reusable, multi-model, language-adaptive methodology for end-to-end machine learning analysis of tabular data.

claude-skill codex-skill data-analysis data-science deepseek feature-engineering lightgbm llm machine-learning methodology prompt-engineering tabular-data

Last synced: 30 May 2026

https://github.com/whis99/userfunnelanalysis

An ecommerce user funnel conversion data analysis with matplotlib & python.

data-analysis data-analysis-python data-analyst data-visualization google-colab jupyter-notebook matplotlib python

Last synced: 13 Apr 2026

https://github.com/smahala02/calorimtery

A calorimetry lab project involving Python and Excel for computing heat transfer from experimental data.

calorimetry chemistry data-analysis excel jupyter-notebook python thermodynamics

Last synced: 05 Feb 2026

https://github.com/0xpr03/clantool

CF Management & Data Analysis Tool, crawler backend in rust

backend-server crawler data-analysis rust

Last synced: 05 Feb 2026

https://github.com/jakubkorytko/data-graphs

Transform raw data into captivating visual stories with this app, effortlessly craft stunning data charts that unveil insights and trends

charts data-analysis mit-license open-source

Last synced: 14 May 2026

https://github.com/luminati-io/Airbnb-dataset-samples

A sample dataset of over 1000 Airbnb listings, extracted using the Bright Data API, ideal for competitor tracking, brand reputation, and market analysis.

airbnb airbnb-listings api data-analysis datasets web-scraper web-scraper-api web-scraping

Last synced: 09 Apr 2025

https://github.com/shubham5027/kisanai--the-ultimate-ai-ml-powered-platform-smart-farming-platform

KisanAI – The Ultimate AI/ML-Powered Smart Farming Platform KisanAI leverages AI/ML to optimize farming practices, enhance crop yields, and empower small-scale farmers with data-driven insights.

ai api aws chatbot crm data-analysis deep-learning deplyment farming llm mapping ml nodejs predictive-modeling reactjs supabase sustainability

Last synced: 30 May 2026

https://github.com/cintia0528/data_cleaning_and_analytics-python

Evaluate if aggressive discounting benefits Eniac long-term, considering differing views on customer acquisition and brand positioning. Focus on data cleaning for informed decision-making.

colab-notebook data data-analysis datacleaning dataquality jupyter-notebook matplotlib pandas python seaborn

Last synced: 08 Jan 2026

https://github.com/prernarohra/mental-health-prediction

This project focuses on predicting mental health outcomes using machine learning algorithms. By analyzing various psychological, social, and lifestyle factors, the model aims to identify individuals at risk, enabling early intervention and support.

data-analysis data-science data-visualization machine-learning mental-health python

Last synced: 20 May 2026

https://github.com/prernarohra/quakeguard

QuakeGuard is an innovative project for reducing earthquake intensity and structural damage. It takes a proactive approach to seismic activity, by using complex algorithms and real-time data to improve safety and resilience for people in earthquake-prone areas.

artificial-intelligence backend data-analysis data-science earthquake-intensity final-year-project front-end geology machine-learning open-source python visualization

Last synced: 21 May 2026

https://github.com/gustavo-zamai/product_return_data_analysis

Analysis returns products of differents stores

data-analysis excel pandas plotly-express python3 pywin32

Last synced: 13 May 2026

https://github.com/gustavo-zamai/shop_data_analisys

Analysis diferents shopping mall sells

data-analysis openpyxl pandas python3 pywin32

Last synced: 01 Mar 2025

https://github.com/jatin-mehra119/bike-rentals-dataset

This repository focuses on optimizing bike rental availability during peak hours and days using machine learning techniques. Leveraging publicly available data from the UCI Machine Learning Repository, it includes scripts for data preprocessing, model training, and visualization, along with detailed observations and results.

data-analysis data-science ensemble-model pandas scikitlearn-machine-learning

Last synced: 15 Apr 2026

https://github.com/garcane/global-shipping-analytics-dashboard

This Tableau project provides a comprehensive visual analysis of global sales, shipping costs, and quality metrics across different regions and countries.

data data-analysis data-analyst data-visualization metrics tableau

Last synced: 01 Mar 2026

https://github.com/devexpress-examples/aspnet-pivot-grid-custom-aggregates

This example shows how to aggregate data by the field's first value.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 06 Jul 2025

https://github.com/nafisalawalidris/building-a-clustering-model-for-customer-segmentation

Customer Segmentation Using Clustering: This repo applies clustering algorithms to a customer transaction dataset, grouping similar customers together based on their purchasing behavior. Targeted marketing strategies can be developed by analyzing distinct customer segments.

clustering customer-segmentation data-analysis data-visualization k-means machine-learning marketing-analytics unsupervised-learning

Last synced: 16 Mar 2025

https://github.com/ssreeramj/youtube_channels_analysis

This web app gives a detailed analysis of the videos uploaded in a particular youtube channel.

data-analysis heroku pandas python streamlit youtube

Last synced: 29 Apr 2026

https://github.com/jubinjacob03/heartdiseaseclassify-ml

Heart Disease Dataset Analysis & Classification using ML models such as linear, support vector machine, k-means, k-nearest neighbors and logistic regression.

data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine

Last synced: 18 Jan 2026

https://github.com/robinmillford/analytics_for_fashion_supply_management

This Streamlit dashboard provides a comprehensive analysis of supply chain data, focusing on key metrics such as production volumes, stock levels, order quantities, revenue, manufacturing costs, lead times, shipping costs, transportation routes, risk factors, and sustainability factors

dashboard data-analysis data-visualization streamlit supply-chain-management

Last synced: 07 Sep 2025

https://github.com/albertomorini/policesviolence

Repository for the project of the course Data Science (Fondamenti di Scienza dei Dati) at UniUD.

data-analysis data-science data-visualization r

Last synced: 31 May 2026

https://github.com/programmer-rd-ai/dimensionality-reduction

DimRed is a comprehensive Python toolkit for advanced dimensionality reduction, integrating with major machine learning libraries and featuring real-time performance monitoring to enhance data analysis and model efficiency.

analytics data-analysis data-science lightgbm machine-learning matplotlib numpy pandas programming python python3 sklearn university xgboost

Last synced: 01 Mar 2025

https://github.com/as16082023/music-store-analysis

This project involves analyzing music store data using SQL queries in MySQL workbench to enhance decision-making, identify trends, and understand customer behavior

data-analysis music-store-analysis mysql sql

Last synced: 06 Jul 2025

https://github.com/sarathchandranpm/restaurant_order_analysis

This project entails an in-depth analysis of a restaurant's order and menu data. The focus is on exploring customer ordering behaviors, menu item attributes, and order specifics. By investigating the connections between order details, menu items, and order dates, the project seeks to generate valuable insights into the restaurant's operations.

data-analysis mysql sql

Last synced: 10 Apr 2025

https://github.com/sijuswamy/data-analytics-using-r

Course Repository for Data Analysis using R- Add-on course

data-analysis

Last synced: 12 Apr 2025

https://github.com/jerela/mola

A Python library for matrix algebra

data-analysis linear-algebra-library matrix-algebra python

Last synced: 14 Jan 2026

https://github.com/jcaperella29/stock_evaluation_python

A Python script to classify companies based on financial metrics like Piotroski F-Score and Stock Valuation, using CSV financial data for analysis and output.

ai-in-finance artificial-intelligence classification csv-processing data-analysis expert-system finance financial-analysis financial-analysis-tools piotroski-f-score python quantitative-analysis rule-based-classifier stock-analysis stock-valuation

Last synced: 07 Sep 2025

https://github.com/jeniljani-4444/end-to-end-world-cup-analysis-web-app

Our streamlined Streamlit web app fetches and processes ESPN CricInfo data delivering dynamic graphs for a quick and engaging cricket experience. Deployed on AWS EC2 with CI/CD pipelines.

aws-ec2 data-analysis plotly preprocessing streamlit-webapp

Last synced: 02 Apr 2025

https://github.com/tks18/pyquery

PyQuery is a local-first data operating system built on lazy execution that processes 100GB+ files while you doomscroll. No cap. 🧢

data-analysis data-science etl hdfs parquet pipeline polars python

Last synced: 14 Jan 2026

https://github.com/erickkhosasi/thelook-data_analysis

Final project for my SQL mini bootcamp. This project explores an e-commerce dataset to uncover key business insights. Data insights were queried in Google BigQuery and visualized with Google Sheets.

bigquery data-analysis e-commerce sql

Last synced: 05 Oct 2025

https://github.com/victoriapm/analyze_a-b_test_results

Understand the results of an A/B test run by an e-commerce website.

ab-testing data-analysis ecommerce-website

Last synced: 06 Oct 2025

https://github.com/abhinavsharma07/fraud_analytics-credit_card_fraud_detection

The aim of this project is to predict fraudulent credit card transactions with the help of different machine learning models.

banking data-analysis decision-trees hyperparameter-optimization machine-learning-algorithms pipelines random-forest-classifier svm-classifier xgboost-classifier

Last synced: 06 Oct 2025

https://github.com/siddharthbadal/sql-case-studies-data-analysis

Data Analysis case studies on various databases using SQL . Demonstrating proficiency in solving diverse business problems. Projects cover sales, orders, products, finance, healthcare and other sectors, and highlight my ability to analyze complex datasets through SQL queries, data manipulation, and visualization techniques.

data-analysis sql sql-query sql-server sqlserver

Last synced: 08 Jan 2026

https://github.com/aritrakar/statpy

A simple package containing some functions for analysing Gaussian and Binomial distributions. Created for the Udacity AWS MLE Foundations 2021 course.

data-analysis python statistics

Last synced: 24 Oct 2025

https://github.com/ahammadshawki8/playing-with-pandas

🐼 Pandas is one of my favourite library in python. It is well-known for "Analyzing" data. Learn basics and beyond the basics of Pandas from this repository. 🤍🖤

beginner-friendly data-analysis favourite-library pandas python

Last synced: 17 Apr 2026

https://github.com/sunnybibyan/exploratory-data-analysis-eda

Welcome to the Titanic Dataset - Exploratory Data Analysis (EDA) project repository! This project aims to uncover insights from the Titanic dataset using Python and Jupyter Notebook. By analyzing key variables such as age, gender, and class, we aim to visualize relationships between passenger characteristics and survival rates.

data-analysis data-visualization jupyter-notebook python titanic-dataset

Last synced: 18 Jan 2026

https://github.com/jabhij/eda_experiments

In this repo I'll use different types of datasets to explore and implement various Exploratory Data Analysis (EDA) approaches.

ames-housing analysis battery-life blackfriday-analysis data-analysis data-science data-visualization eda matplotlib-pyplot numpy pandas python seaborn visualization zomato-data-analysis

Last synced: 14 Apr 2026

https://github.com/muneeb1030/webscrapper_mastodon

The Mastodon Social Platform Scraper is a Python-based web scraping tool designed to explore and extract valuable data from the Mastodon social platform.

data-analysis data-collection mastodon python3 scrapy scrapy-spider selenium-python webscraping

Last synced: 09 Oct 2025

https://github.com/shuklayash02/data_analysis_using_r

Covid19 analysis and cleaning of data where the death age and deaths of specific gender is cleaned and analysed

analysis cleaning-data data-analysis data-visualization rprogramming

Last synced: 09 Oct 2025

https://github.com/smusab9152/pokemon_data_analysis

This repo that explores and analyzes a dataset of Pokémon attributes. The analysis includes data cleaning, exploratory data analysis (EDA), and visualizations .

analytics data-analysis data-visualization exploratory-data-analysis jupyter-notebook matplotlib numpy pandas pokemon python seaborn statistical-analysis

Last synced: 02 May 2026

https://github.com/PatriLoto/Intro_R_para_reinventarTEC_2021

Material para el taller de Primeros pasos en R para el análisis de datos

data-analysis rstats

Last synced: 10 Oct 2025

https://github.com/kamanhang/sqldatawarehousedataengineeringproject

This project delivers a modern data warehouse which focuses on building clean, organized data pipeline which covers important aspects such as ETL Pipeline Development, Data Cleaning, Data Modelling and Data Analytics

customer-analytics data-analysis data-cleaning data-engineering data-modeling data-pipeline data-visualization datascience etl-pipeline postgresql powerbi powerbidashboard sales-analysis sql

Last synced: 10 Oct 2025

https://github.com/cosmoduende/r-twitter

Explore your Twitter activity with R: Sentiment Analysis and Data Visualization. How to analyze your Twitter account (or any account), discover your habits and sentiments with the "rtweet" package and NLP.

data-analysis data-visualization lemmatization nlp nlp-library nlp-resources nltk nltk-library r-package r-programming r-studio rtweet stemming twitter twitter-api twitter-data twitter-data-analysis twitter-data-extraction twitter-sentiment-analysis udpipe

Last synced: 10 Oct 2025

https://github.com/cyberoctane29/unicorn-companies-analysis

This project explores unicorn companies, private startups valued at over $1 billion, using Python for data analysis. It covers industry trends, geographic distribution, and investment patterns through EDA, including data cleaning, handling missing values, datetime transformations, and visualizations to uncover key insights.

data-analysis eda numpy pandas python

Last synced: 02 May 2026

https://github.com/selcuk05/forbes_top_100_celebrities_data_analysis

Forbes Top 100 Celebrities since 2005 Data Analysis and Visualization

data-analysis data-science

Last synced: 11 Oct 2025

https://github.com/rupav/fifa17-detailed-analysis

⚽ FIFA 17 data analysis using various Machine Learning Algorithms. ⚽

data-analysis data-visualization fifa17 machine-learning-algorithms radar-chart

Last synced: 16 May 2026

https://github.com/famarks/grafarg

Grafarg is an interactive data analytics and graphical data visualization application. Grafarg being a progressive fork of Grafana 7.5.17 continues to be available under open source Apache 2.0 License

analytics charts data data-analysis data-science data-visualization grafana grafarg graph

Last synced: 19 Jan 2026

https://github.com/shreeparab1890/flipkart-laptops-analysis-eda

This ipython notebook is the Exploratory data analysis (EDA) of the Laptops listed on Flipkart.

data-analysis eda exploratory-data-analysis matplotlib-pyplot numpy pandas plotly

Last synced: 14 Apr 2026

https://github.com/titanscouting/tra-superscript

The Red Alliance data analysis package

data-analysis frc-scouting hacktoberfest python

Last synced: 11 Oct 2025

https://github.com/abhi-lab2/ipl-data-analysis

IPL data analysis for future predictions

data-analysis data-science python

Last synced: 14 Apr 2026

https://github.com/phomint/udacity_dataanalysis

All projects and activities

data-analysis python udacity-nanodegree

Last synced: 11 Oct 2025

https://github.com/strampelligiovanni/straklip

An HST pipeline for reducing wide-field imaging observations not specifically designed for High Contrast Imaging analysis. Published in Strampelli et al. 2022.

binaries data-analysis data-reduction direct-imaging exoplanets high-contrast-imaging hst wide-field-surveys

Last synced: 12 Oct 2025

https://github.com/listiangr/ecommerce_sales_data_analysis

Proyek ini menganalisis data penjualan e-commerce untuk membantu bisnis memahami tren penjualan, performa produk, dan segmen pelanggan. Tujuan utamanya adalah memberikan wawasan yang dapat meningkatkan strategi pemasaran dan pengelolaan produk.

dashboard data-analysis data-cleaning data-collection data-penjualan data-visualization exploratory-data-analysis microsoft-excel

Last synced: 19 Jan 2026

https://github.com/ayobami6/tweet-data-analysis

WeRateDogs Tweets Scrape using twitter Api

data-analysis data-science twitter webscraping

Last synced: 31 May 2026

https://github.com/bristolmyerssquibb/blockr.workshop

R in Pharma 2024 blockr workshop

data-analysis nocode r

Last synced: 18 Apr 2026

https://github.com/raad07/sql_project-world_layoffs_dataset

This is a SQL project which comprises the Data Cleaning in the first part and Exploratory Data Analysis (EDA) in the second part.

data-analysis database mysql sql

Last synced: 27 Jan 2026

https://github.com/agustinmusanti/delitosencaba-proyectofinal-dataanalytics-coderhouse

En este repositorio muestro mi proyecto final en el curso "Data Analytics" de Coderhouse.

data-analysis excel powerbi

Last synced: 22 Jan 2026

https://github.com/rkolehov/retail-sales-analysis-project

End-to-end e-commerce analysis showcasing SQL and data visualization skills. Tracks sales, customer behavior, product performance, and delivery efficiency. Interactive dashboards provide actionable insights for business decision-making

analytics dashboard data-analysis ecommerce jupyter-notebook postgresql python sql tableau vscode

Last synced: 19 Apr 2026