An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/betkh/datascieneinpython

Jupyter Notebook files

data-analysis data-visualization

Last synced: 16 Jun 2025

https://github.com/abidshafee/google.colaboratory_projects

This repository contains a collection of interactive Python notebooks (ipynb) from my projects on Data Science, Machine Learning (ML), and Natural Language Processing (NLP).

colaboratory data-analysis data-science lstm machine-learning nlp statistics time-series

Last synced: 09 Jul 2025

https://github.com/felipe-veas/visor-sueldos-publicos

Interactive tool for visualizing and analyzing public-sector salaries in Chile, built with Streamlit.

audit chile data-analysis python streamlit transparency

Last synced: 16 May 2026

https://github.com/czesctuklap/sustainable-fashion-database-analysis

This project analyzes a dataset of sustainable fashion trends for 2024. It includes data preprocessing, exploration, visualization, and insights on environmental impact factors such as carbon footprint, water usage, waste production, and sustainability practices.

data-analysis data-visualization database dataset keggle sustainable-fashion

Last synced: 30 Apr 2026

https://github.com/fmind/malpop

Rank the popularity of malware applications by their occurrence on VirusTotal

data-analysis malware popularity ranking virustotal

Last synced: 11 Apr 2025

https://github.com/estevan-ulian/py-agent-voice

A project for handling voice interactions between a human and an AI agent, enabling the reading and analysis of data from a CSV file.

agent-based-modeling data-analysis python3 whisper-ai

Last synced: 11 Apr 2025

https://github.com/badranalyst/startup-expansion-analysis-with-pandas-matplotlib-and-power-bi

Analyzes startup growth and expansion factors using Pandas for data analysis and Matplotlib for visualizations. Complements findings with data visualizations in Power BI, providing actionable insights into funding and market trends.

dashboard data-analysis data-visualization dataset matplotlib matplotlib-pyplot pandas power-bi powerbi

Last synced: 16 May 2026

https://github.com/josedanielchg/nyc-schools-test-scores-exploration

DataCamp project analyzing NYC public school test scores to identify top math-performing schools, the best overall SAT scores, and borough-level variability using Python and pandas

data-analysis jupyter-notebook python

Last synced: 19 Mar 2025

https://github.com/swat1563/recommendation-system

This repository features a recommendation system and analytics engine using datasets on users, organizations, contents, contacts, events, and recommendations. It includes data preprocessing, building a recommendation system, and creating visual reports with Power BI.

analytics data-analysis data-visualization engine kaggle numpy pandas powerbi powerbi-dashboards powerbi-desktop powerbi-reports python recommendation-engine recommendation-system recommender-systems scikit-learn scipy

Last synced: 07 Jan 2026

https://github.com/damianmarti/big-mac-index

Data analysis of the Big Mac Index

data-analysis data-science

Last synced: 03 Apr 2025

https://github.com/rijul007/diamonds-analysis-using-r

Diamonds data analysis using R, exploring relationships between diamond attributes (such as carat, cut, color, and clarity) and price, with a focus on providing insights for engagement ring selection through various statistical techniques and data visualizations including histograms, boxplots, scatter plots, and bar charts.

data-analysis data-science

Last synced: 25 Jan 2026

https://github.com/rahil-p/nba-hackathon

2018 NBA Hackathon application

data-analysis data-wrangling

Last synced: 16 May 2026

https://github.com/istinnew/enaic-s-discount-strategy-analysis

**(Open to Collaboration):** This project evaluates the impact of discounts on sales and customer retention for Eniac. It includes data cleaning, visualization, storytelling, and strategic insights to optimize discount strategies while maintaining brand reputation. 📊🛍️✨

cleaning-data cleaning-data-in-python cost-optimization data-analysis data-science data-visualization library presentation python visualization

Last synced: 03 Apr 2025

https://github.com/elliotone/nl-semantic-kernel-sales-analyzer

A console project showing Microsoft Semantic Kernel examples for sales data analysis using local AI models via LM Studio.

ai csharp data-analysis dotnet lm-studio local-ai machine-learning semantic-kernel

Last synced: 16 May 2026

https://github.com/sadia-khan13/modern_arts_data_cleaning

Welcome to the Data Cleaning project! This repository is dedicated to showcasing best practices and techniques for cleaning data using Pandas within Jupyter Notebook

data-analysis data-analysis-python data-cleaning data-science jupyter-notebook pandas-python

Last synced: 10 May 2026

https://github.com/as16082023/goodcabs-performance-analysis

Codebasics Resume Challenge 13: analysing Goodcabs' performance in transportation across India from January to June 2024

codebasicsresumeprojectchallenge data-analysis goodcabs mysql sql

Last synced: 03 Apr 2025

https://github.com/alejandrolara11/desafio_latam_introduccion_analisis_de_datos

Repository for the Desafío Latam course "Introducción al Análisis de Datos" (Introduction to Data Analysis). Practical exercises completed during the course, focused on data analysis with Python, Pandas, and basic visualization.

data-analysis data-science data-visualization matplotlib numpy pandas python seaborn statsmodels

Last synced: 29 Apr 2026

https://github.com/jwt218/isonq

MATLAB package for Qtegra-generated data file processing.

data-analysis geochemistry isotopes matlab

Last synced: 03 Apr 2025

https://github.com/yasir-arafah/nyc-trip-fare-prediction-using-tcn

"NYC Trip Fare Prediction Using Temporal Convolutional Networks (TCN)" is a data analytics project in which NYC taxi trip and fare data are combined, analyzed with PySpark, and visualized with Matplotlib. The project predicts fares using a Temporal Convolutional Network.

colab data-analysis matplotlib nyc-taxi-dataset pyspark python

Last synced: 29 Apr 2026

https://github.com/ggarciajavier/udacity-dalf-project3-test-perceptual-phenomenom

Work performed for the 3rd project of the Udacity Data Analyst Nanodegree: statistical testing of a perceptual phenomenon (Stroop task).

data-analysis python statistical-inference udacity-data-analyst-nanodegree

Last synced: 18 May 2026

https://github.com/pdiegel/currencytracker

A Python application that fetches real-time currency exchange rates from an API, securely stores the data in an SQLite database, and includes error handling, logging, and good programming practices for reliable and periodic data capturing.

analysis api currency data-analysis data-capture logging python python3 sqlite3 tracker

Last synced: 09 Sep 2025

https://github.com/michael-angelo-mootoo/quanta-app

Quanta is an open source statistical package app / toolkit for neuroscience and general computational descriptive and inferential statistics.

computational-statistics customtkinter data-analysis descriptive-statistics gui-application inferential-statistics neuroscience python r statistical-analysis statistics tkinter-python

Last synced: 16 May 2026

https://github.com/grindelfp/two-data-manipulative-tasks

Two simple tasks on data analysis and processing.

data-analysis ipynb mlda

Last synced: 17 Feb 2026

https://github.com/mahmoudwal27/brazilian_ecommerce

This project explores and cleans the Olist Brazilian E-Commerce dataset using Python (Pandas) to prepare it for Power BI visualization. The process includes loading data, performing exploratory analysis, handling missing values and duplicates, formatting key columns, and exporting clean datasets.

analytics data-analysis data-analysis-python google-cloud python

Last synced: 16 May 2026

https://github.com/ishansurdi/data-visualisation-empowering-business-with-effective-insights

The following tasks are completed for Data Visualization: Empowering Business with Effective Insights on Forage in October 2024. It is important to note that this should not be interpreted as an endorsement.

chart communicating-insights-and-analysis dashboard data data-analysis forage powerbi powerbi-visuals tableau tata tata-group virtual-internship visual visualization

Last synced: 17 Feb 2026

https://github.com/karishmagupta05/e-commerce-sales-dashboard

This project is an interactive E-Commerce Sales Dashboard built using Power BI. It provides key insights into sales, profit, and customer behavior through visually engaging charts and graphs.

data-analysis data-visualization powerbi

Last synced: 09 Feb 2026

https://github.com/nick-peter-marcus/chocolate-bar-analysis

Analyzing Chocolate Bar Features and Ratings - Data Visualization, Decision Trees, Random Forest

data-analysis data-visualization decision-trees python random-forest seaborn sklearn

Last synced: 10 May 2026

https://github.com/satyacoder29/smartfinance-dynamic-financial-dashboard

SmartFinance: Dynamic Financial Dashboard is an interactive tool designed to visualize key financial metrics like revenue, expenses, and profit. It features real-time data updates, charts, slicers, and navigation for easy analysis. This dashboard helps businesses make data-driven decisions and optimize financial performance.

data-analysis data-cleaning data-modeling data-visualization powerbi powerbi-desktop powerbi-visuals powerquerym

Last synced: 13 Feb 2026

https://github.com/chrisrobertsjr/chrisrobertsjr

Welcome to my Github Profile!

data data-analysis java r sql statistics

Last synced: 03 May 2026

https://github.com/hassanislam463/sentiment_analysis_of_financial_news_headlines_and_affect_on_stock_price_prediction

This project analyzes financial news sentiment using a fine-tuned RoBERTa model and integrates it with stock data to predict price movements using LSTM and GRU. It highlights the role of sentiment in enhancing stock market forecasting.

data-analysis data-science data-visualization deep-learning lstm-neural-networks nlp-machine-learning

Last synced: 28 Mar 2025

https://github.com/hassanislam463/british-airways-data-science

Analyze Skytrax reviews to uncover customer sentiments and key themes while predicting booking behavior using machine learning. This repository includes data collection, analysis, and modeling scripts alongside concise, visualized insights to improve customer experience and operational efficiency.

data-analysis data-science data-visualization

Last synced: 28 Mar 2025

https://github.com/martachesnova/sql

Data modeling (ERD) and data engineering, followed by a series of SQL queries to analyze a company's employee database.

data-analysis data-engineering data-modeling erd postgresql sql

Last synced: 16 May 2026

https://github.com/datalopes1/fifa21_datacleaning

This project carries out the cleaning and manipulation of the dataset "FIFA 21 messy, raw dataset for cleaning/ exploring", which can be found on Kaggle under the CC0: Public Domain license and was uploaded by Rachit Toshniwal.

data-analysis data-cleaning python

Last synced: 30 Apr 2026

https://github.com/engraulleite/local-data-warehousing-with-docker

Creating a data warehouse (DW) from zero to hero, starting with logical and physical modeling and ending with valuable reports.

airbyte data-analysis datawarehouse docker etl-pipeline metabase pgadmin4 postgresql

Last synced: 01 May 2026

https://github.com/gaurav-van/data-analysis-projects

A collection of projects that involve data analysis and informed decision-making

data-analysis database powerbi sql

Last synced: 06 Sep 2025

https://github.com/colindean/allegheny_voter_reg_analysis

Allegheny County Voter Registration Analysis Tools

data-analysis data-science elections pandas polars python voting

Last synced: 16 May 2026

https://github.com/katarinatmb/serbia-protest-analysis

This project analyzes the frequency, regional distribution, and group characteristics of protests that emerged across Serbia following the fatal collapse of the Novi Sad train station roof in November 2024. The analysis explores how different communities responded in the aftermath of the disaster, using data visualization in RStudio

data-analysis data-visualization r r-mark rstudio

Last synced: 10 Jul 2025

https://github.com/mboula/mboula.github.io

GitHub portfolio + interactive resume | Showcasing data projects in civil rights (housing), cannabis, and analytics

cannabis case-study civil-rights compliance dashboards data-analysis data-cleaning data-vizualization excel google-data-analytics housing open-data pattern-analysis portfolio pro-se public-data r sql tableau

Last synced: 10 Jul 2025

https://github.com/carlosvinimsouza/jupyter-notebook-basic

Stores all work related to Data Science.

data-analysis data-science programas-jupyter-notebook python

Last synced: 11 May 2026

https://github.com/stkisengese/numpy-data-fundamentals

A comprehensive collection of NumPy exercises covering array manipulation, slicing, broadcasting, random data generation, and real-world data analysis applications.

data data-analysis numpy pre-processing

Last synced: 16 May 2026

https://github.com/mfakhriazhar/housing-price-analysis

The price of a house depends on various factors such as building area, exterior quality, and amenities. This dataset provides information on properties for sale, and through Exploratory Data Analysis (EDA), patterns and key factors affecting house prices can be identified.

data-analysis data-science data-visualization eda exploratory-data-analysis python

Last synced: 16 May 2026

https://github.com/ashwin331133/sql-healthcare-data

This repository contains SQL queries designed to analyze health care data. The queries focus on patient demographics, encounter costs, and flu shot statistics, aiming to provide insights into patient behavior and financial impacts. The datasets include information on patient encounters, flu shots, and hospital admissions.

data-analysis mysql sql

Last synced: 29 Oct 2025

https://github.com/jelhamm/internode-hellinger-distance-based-decision-tree

Simulations for the paper "Inter node Hellinger Distance based Decision Tree" by Pritom Saha Akash, Md. Eusha Kadir, Amin Ahsan Ali, and Mohammad Shoyaib

articles data-analysis data-mining decision-tree decision-tree-classifier hddt hellinger-distance-criterion machine-learning numpy-library paper-implementations python scipy-library simulation tree-node

Last synced: 04 Apr 2025

https://github.com/panoschatzi/erythrocyte_study_statistical_analyses

R code for data transformation, analysis and visualization of experimental data, as well as for statistical analyses and quantitative simulations.

afex data-analysis emmeans ggplot2 lme4 purrr r rprogramming rstats rstudio statistics tidyverse visualization

Last synced: 04 Apr 2025

https://github.com/alfioma/ada-xtq

🔗 Simplify data transfer with ada-xtq, a lightweight tool for seamless integration and efficient handling of data between platforms.

ada algorithms api-development artificial-intelligence automation data-analysis data-visualization docker machine-learning neural-networks open-source programming python software-development xtq

Last synced: 01 May 2026

https://github.com/RLAlpha49/AniSearch-Model

AniSearchModel leverages Sentence-BERT (SBERT) models to generate embeddings for synopses, enabling the calculation of semantic similarities between descriptions. This allows users to find the most similar anime or manga based on a given description.

anime api data-analysis data-merging embeddings flask hugging-face-datasets kaggle-datasets machine-learning manga natural-language-processing nlp python sentence-bert similarity-search

Last synced: 06 May 2025

https://github.com/ifigeneiatsiflidou/popular-items-sales-analysis

Two data tasks in Python: popular items by ZIP & store sales breakdown with plots.

data-analysis matplotlib pandas

Last synced: 16 May 2026

https://github.com/jatin-mehra119/sales-analysis

Sales analysis of a supermarket

data-analysis salesanalysis visualization

Last synced: 29 Oct 2025

https://github.com/arkww/chinesenewspaperwordcount

Analyzes the word count of Chinese characters in Simplified and Traditional Chinese and compares the results

chinese-language data-analysis data-science python

Last synced: 16 May 2026

https://github.com/htsandaruvan/attrition-analytics-suite-by-hello-green

I have created a comprehensive data analytics dashboard to identify factors contributing to attrition,

data-analysis data-analytics data-visualization powerbi

Last synced: 20 Jan 2026

https://github.com/arkww/matmap

Generates maps from a database and has the user guess which map is displayed

data-analysis data-science javascript python

Last synced: 24 Apr 2026

https://github.com/coditheck/data_analysis

Data analysis is the process of inspecting, cleaning, transforming, and modeling data in order to discover useful information, draw conclusions, and support decision making.

data-analysis python

Last synced: 17 Jun 2025

https://github.com/rohitha-tata/churn-predict

Churn Predict uses Machine Learning to analyze customer behavior and identify those likely to leave. It involves data preprocessing, feature selection, model training (Logistic Regression, Random Forest, XGBoost), and evaluation using accuracy and ROC-AUC. The model provides actionable insights to help businesses reduce churn and improve retention

data-analysis logistic-regression machine-learning python

Last synced: 16 May 2026

https://github.com/abhishekyadav915/data-analytics-projects

This project focuses on performing comprehensive data analysis to extract valuable insights from a given dataset. By leveraging various data manipulation, cleaning, and visualization techniques, the project aims to uncover patterns, trends, and correlations that can inform decision-making and strategy.

data-analysis data-visualization dataset

Last synced: 05 Apr 2025

https://github.com/riborings/uranouchi42microdiversity

In this repository live the bash, R and Julia scripts used to explore the microdiversity of the prokaryotic community at Uranouchi Inlet (42-sample time-series) by means of metagenomic shotgun sequencing under the supervision of the Ogata Lab.

big-data data-analysis data-visualisation diversity-analysis marine-ecology marine-ecosystem metagenomics microbiome-analysis prokaryotic-genomes

Last synced: 29 Oct 2025

https://github.com/ygalvao/uow_ai_final_project

This was my Final Project for the Artificial Intelligence Diploma program of The University of Winnipeg - Professional, Applied and Continuing Education (PACE).

data-analysis data-analytics dbscan elections k-means k-means-clustering machine-learning som som-clustering

Last synced: 10 Jul 2025

https://github.com/rorrell/spotifyhistory

A Jupyter Notebook where I wrangle some data and plot a chart to draw some conclusions about a user's Spotify history

data-analysis data-visualisation data-wrangling jupyter-notebook python3

Last synced: 19 May 2026

https://github.com/rita94105/smart_contract_vulnerability_detector

Smart contracts are pivotal in blockchain applications but are prone to vulnerabilities that can lead to significant losses. SmartGuard: Multi-Stage Smart Contract Vulnerability Detection tackles this issue by developing a machine learning framework to identify eight vulnerability types using datasets from Kaggle and Hugging Face.

data-analysis machine-learning smart-contracts streamlit vulnerability-detection

Last synced: 01 Aug 2025

https://github.com/halyusa16/sql-employee-insights

This project dives into employee data to uncover actionable insights using SQL. It mimics real-world HR and business analysis tasks, from salary comparisons to workforce demographics and potential cost-cutting strategies.

data-analysis mysql sql

Last synced: 11 Apr 2025

https://github.com/sukitsubaki/screen-time-tracker

A minimalist Python tracker that records the usage time of various applications and provides insights into your computer usage habits.

application-usage data-analysis monitoring productivity python python-cli screen-time time-tracking

Last synced: 12 Apr 2025

https://github.com/mindlessmuse666/missing-data-processing

A project on handling missing values in Titanic passenger data using the Python libraries Matplotlib and Seaborn.

data-analysis data-visualization matplotlib missing-values-analysis missing-values-handling pandas python seaborn titanic

Last synced: 16 May 2026

https://github.com/beatrice-b-m/bea-tools

🐝 tools made by, and for, bea 🐝 A Python package of random functions and tools that I use regularly. Data science / analysis focused since, ya know, I'm a data scientist c:

data-analysis data-science data-visualization

Last synced: 15 Jan 2026

https://github.com/georgiifirsov/educational-research-work

Educational research project from the 3rd year (6th semester). Topic: ARMA models in time series analysis

arma data-analysis jupyter-notebook python time-series time-series-analysis tsa

Last synced: 27 Apr 2026

https://github.com/datalopes1/desafio_delivery

Challenge from the Clube de Assinaturas (subscription club) of Universidade dos Dados, simulating the real demands placed on a data analyst

data-analysis jupyter python

Last synced: 06 Mar 2026

https://github.com/prathmesh2507/ctc-hackthon

A data-driven system designed to reduce overcrowding and optimize urban public transport using real-world geospatial data and intelligent simulation.

dashboard data-analysis data-visualization python streamlit

Last synced: 16 May 2026

https://github.com/kakri787/alcoholism-and-grade-analysis

A mini project for a university data science module in which we analyzed the relationship between students' alcohol consumption and their academic performance, using exploratory data analysis and machine learning techniques to see whether we can predict students' grades.

data-analysis data-science data-vizualisation lasso-regression machine-learning neural-network

Last synced: 12 Apr 2025

https://github.com/danitilahun/exploratory-data-analysis-projects

This repository contains a collection of my personal Exploratory Data Analysis (EDA) projects. Each project involves exploring various datasets to gain insights, uncover patterns, and visualize trends.

data-analysis data-science data-visualization exploratory-data-analysis python

Last synced: 16 May 2026

https://github.com/j-faria/bicerin

Working on the RV challenge in Torino

data-analysis gp radial-velocity rv-challenge

Last synced: 07 Apr 2026

https://github.com/nafisrayan/crypto-trading-platform

This React Crypto Exchange Template is designed to provide a solid foundation for building a comprehensive cryptocurrency exchange platform. With its sleek and modern design, this template is perfect for anyone looking to create a user-friendly and intuitive trading experience.

crypto dashboard data-analysis data-visualization react template

Last synced: 16 May 2026

https://github.com/athari22/multivariable_regression_and_valuation_model_

Multivariable regression model using Python to analyze and predict Boston housing prices based on various socioeconomic and environmental features.

data-analysis data-analysis-python housing-prices housing-prices-competition machine-learning pandas pandas-python plotly python regression-models seaborn seaborn-python sklearn

Last synced: 17 Jun 2025

https://github.com/tabibyte/aoty-highest-rated-albums-data-analysis

Data Analysis of AOTY Highest Rated Albums

albums aoty data-analysis music

Last synced: 10 Sep 2025

https://github.com/nferno55/mock-data-governance

Working with messy data and using data quality practices to clean it up and practice SQL/Python automation. YAML will be used for Metadata validation soon.

data-analysis database-management metadata python sql sqlite3 yaml

Last synced: 16 May 2026