An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/iamantimpal/iamantimpal

👋 Hi, I'm Antim Pal, the founder of Optimism Educator, an online platform dedicated to empowering students with skills in Computer Science, Web Design, Graphic

data-analysis data-science data-visualization database database-design database-management datascience graphical-user-interface graphics grapic-design reading-list readme readme-badges readme-generator readme-md readme-profile readme-stats readme-template

Last synced: 10 Apr 2025

https://github.com/pseudomanifold/enchiridion-tda

An enchiridion for instructing mortals in the hidden arts of topological data analysis

data-analysis graph-kernels persistent-homology topology

Last synced: 10 Apr 2025

https://github.com/coalio/Assistant

A data science library providing flexible dataframes for Lua 5.1+

data-analysis data-science data-structures dataframe lua

Last synced: 11 Apr 2025

https://github.com/anotherkamila/stalky

Self-hosted app to keep track of anything you want. Don't let others stalk you, stalk yourself!

csv data-analysis data-storage geeks privacy-by-design self-hosted timeseries tracking

Last synced: 12 Apr 2025

https://github.com/thecoderpinar/gen-expression

Gene expression analysis is a fundamental component of genomics research, providing valuable insights into how genes are regulated and their impact on various biological processes. This project delves into the realm of gene expression data, aiming to uncover hidden patterns and relationships within complex datasets. 🚀

bioinformatics biotechnology data-analysis data-science data-visualization genomics kaggle machine-learning pca python

Last synced: 30 Apr 2025

https://github.com/misaghmomenib/data-analysis-projects

A Repository Featuring a Collection of Data Analysis Projects, Showcasing Various Techniques and Tools for Extracting Insights From Data. Explore, Learn, and Utilize These Projects to Enhance Your Data Analysis Skills and Workflows.

data-analysis data-analysis-python data-visualization jupyter-notebook open-source python

Last synced: 13 Apr 2025

https://github.com/flexmonster/pivot-jupyter-notebook

Jupyter Notebook pivot table example with Flexmonster

data-analysis data-science interactive jupyter-notebook pivot-tables python

Last synced: 16 Jun 2025

https://github.com/abmenzel/messenger-wrapped

Explore your Facebook Messenger chat history, in a manner inspired by Spotify Wrapped

data-analysis data-visualization facebook meta nextjs reactjs spotify

Last synced: 18 Jul 2025

https://github.com/acs/ghtorrent

Experimental GitHub project analysis based on GHTorrent

data-analysis data-visualization ghtorrent github visualization

Last synced: 20 Mar 2025

https://github.com/stappit/blog

I often post solutions to textbook exercises, including: Bayesian Data Analysis (BDA) by Gelman et al; Causal Inference in Statistics Primer (CISP) by Pearl et al; Purely Functional Data Structures (PFDS) by Okasaki.

bayesian-data-analysis blog data-analysis data-science gelman hakyll haskell pearl purely-functional-data-structures solutions stan static-site statistical-inference statistics

Last synced: 14 Mar 2025

https://github.com/mmfava/personal-r-script-2015-2017

This repository contains a collection of R scripts that I developed over several courses I attended or taught. They cover a wide range of topics in statistics, data analysis, and machine learning techniques, with a special focus on applications in ecology and environmental sciences.

data-analysis r

Last synced: 12 Apr 2025

https://github.com/rahul-jha98/restauranttrends.stats

Visualise the trends in food and restaurant choices of customers in a city by scraping data from Zomato.

data-analysis data-science visualization vuejs zomato zomato-api zomato-scraper

Last synced: 08 Jul 2025

https://github.com/ascender1729/iris-flower-classification-2024

An exploratory data analysis and machine learning project using the Iris dataset to classify flower species with a K-Nearest Neighbors classifier. It includes data visualization, feature scaling, model training, and evaluation with 100% accuracy on the test set.

classification data-analysis iris-dataset k-nearest-neighbors machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 20 Jul 2025

https://github.com/yusufcinarci/web-scraping-projects

These project files host the web scraping examples that I add day by day.

data-analysis data-science jupyter-notebook python web-scraping

Last synced: 01 May 2025

https://github.com/oghene-ella/airbnb_market

Meeting the high demand for temporary lodging for anywhere between a few nights to many months

data-analysis

Last synced: 30 Apr 2025

https://github.com/anaagg/data-installations-windows

Are you thinking about using the Linux subsystem on Windows? This is your repo. In this repository you will find the steps needed to install what you need to work with Python, SQL, and miniconda on Windows using the Linux subsystem.

conda conda-environment data-analysis hyper jupyter jupyter-notebook jupyter-notebooks linux miniconda pycharm pycharm-ide python sql vscode windows workbench

Last synced: 23 Oct 2025

https://github.com/yaricom/english-article-correction

An experiment in applying NLP to the correction of definite/indefinite articles in an English text corpus

data-analysis glove-vectors nlp nlp-machine-learning numpy pandas scikit-learn umbc-webbase-corpus

Last synced: 05 Apr 2025

https://github.com/zvtvz/zvdata

an extendable library for recording and analyzing data

analyzing-data dash data-analysis pandas plotly-dash sqlalchemy

Last synced: 03 May 2025

https://github.com/aim-harvard/faceage

Decoding biological age from face photographs using deep learning.

age-estimation biological-age cnn data-analysis deep-learning survival-analysis

Last synced: 13 Apr 2025

https://github.com/prajwalchapke055/accenture-data-analytics-and-visualization-forage

NAVIGATING NUMBERS - Apply your data analytics & visualization skills to advise a social media client on their content creation strategy as a Data Analyst at Accenture

accenture communication data-analysis data-modeling data-understanding data-visualization forage internship internship-task job-simulation presentation project-planning public-speaking storytelling strategy teamwork virtual-internship

Last synced: 28 Oct 2025

https://github.com/danieldacosta/airbnb-analysis

Data analysis of AirBnb website history in the city of Rio de Janeiro

airbnb-analysis airbnb-website-history data-analysis

Last synced: 30 Apr 2025

https://github.com/coumbacoulibaly/adventureworkscycles

Repository for Adventure Works Sample Database Analysis

adventureworks data-analysis data-analytics mssql-database mssqlserver sql ssms

Last synced: 21 Jul 2025

https://github.com/hevalhazalkurt/exploring_the_data_of_lego_history

A data exploration project on LEGO history in Python with pandas, matplotlib etc. (WIP)

data data-analysis data-science data-visualization datascience datasets lego lego-history matplotlib pandas python python3

Last synced: 13 Apr 2025

https://github.com/pjagielski/worldcup

2018 World Cup match data analysis

clojure data-analysis repl world-cup-2018

Last synced: 30 Oct 2025

https://github.com/dermatologist/goscar-export

:fire: CSV to FHIR (For OSCAR EMR EForm Export)

data-analysis data-warehouse fhir fhir-r4 fhir-server hacktoberfest oscar-emr

Last synced: 10 Sep 2025

https://github.com/misaghmomenib/airport-flight-analysis

Flight Data Analysis Project Aimed at Exploring and Visualizing Airport Operations, Flight Patterns, and Delay Trends Using Python. This Project Involves Data Cleaning, Preprocessing, and Statistical Analysis With Tools Like Pandas, Matplotlib, and Scikit-learn to Uncover Insights and Improve Operational Efficiency.

analysis data-analysis data-visualization git open-source python python3

Last synced: 30 Apr 2025

https://github.com/sukanyabag/statistical-analysis-of-my-medium-articles

This repository contains an exploratory data analysis of my writer data on Medium. I use it to carry out data analysis once every 4 months to track audience and fan growth, and the topics readers love! You can check out the articles here👇

data-analysis data-storytelling matplotlib pandas seaborn statistical-analysis sweetviz web-scraping

Last synced: 18 Jul 2025

https://github.com/vedadiyan/genql

GenQL is a generic querying language fully written in Go

data-analysis data-mapping data-processing data-science data-translation json json-data sql

Last synced: 22 Jun 2025

https://github.com/eikevons/pandas-paddles

Access the parent Pandas data frame in loc[], iloc[], assign(), and other Pandas helpers

data-analysis data-exploration data-science pandas pandas-dataframe pandas-library pandas-loc

Last synced: 16 Jun 2025

https://github.com/hrolive/from-data-to-insights-with-google-cloud-platform

A four-course accelerated online specialization that teaches participants how to derive insights through data analysis and visualization using the Google Cloud Platform

data-analysis data-cleaning data-preparation data-visualization sql

Last synced: 12 May 2025

https://github.com/alandefreitas/scistats

High-Performance Descriptive Statistics and Hypothesis Tests in C++20

bayesian-statistics data-analysis descriptive-statistics hypothesis-testing performance-statistics statistics

Last synced: 11 Apr 2025

https://github.com/palewire/baseball-notebooks

Python notebooks exploring Major League Baseball data

baseball baseball-statistics data-analysis jupyter-notebook pandas python

Last synced: 02 Mar 2025

https://github.com/ankman007/cricket-statsguru

Streamlit-based Nepali cricket visualization dashboard that utilizes various python tools & libraries to display interactive and engaging charts & statistics.

data-analysis data-visualization machine-learning python streamlit

Last synced: 09 Jul 2025

https://github.com/csiro/miriad

The CSIRO ATNF version of MIRIAD

data-analysis data-reduction radio-astronomy

Last synced: 16 Jan 2026

https://github.com/duhaime/douglasduhaime.com

Data analysis and visualization

data-analysis data-visualization website

Last synced: 12 Apr 2025

https://github.com/negativenagesh/whatsapp_chat_analyzer

An end-to-end WhatsApp chat analysis project, using my hostel WhatsApp group's chat data.

data-analysis data-science frontend modeling python streamlit

Last synced: 12 Apr 2025

https://github.com/volkansah/python-modules-overview

This repository provides an overview of some common and useful Python modules, categorized by their functionality. This list is not exhaustive but serves as a starting point for exploring various Python libraries.

data-analysis modules overview overview-page phyton phyton3 python python-3

Last synced: 19 Oct 2025

https://github.com/lpsm-dev/twitter-sentimental-analysis-covid

✔️ Twitter Sentimental Analysis Covid-19 (using Textblob - Naive bayes) + Python Backend Flask + Docker + Docker Compose + MongoDB

adminmongo alpine backend corona coronavirus-analysis covid covid-19 data-analysis docker docker-compose docker-compose-wait flask flask-restful mongodb mongoku python python-api sentiment-analysis twitter

Last synced: 30 Apr 2025

https://github.com/nafisalawalidris/911-call-analysis

The 911 Call Analysis project explores and visualizes emergency call data to uncover patterns and trends. It includes data preparation, exploratory analysis, visualization of call volume and reasons, and heatmap generation. Users can customize the code for their dataset. The project relies on libraries like Pandas, NumPy, Matplotlib, Seaborn, and SciPy.

cluster-analysis data-analysis data-visualization decision-making emergency-calls emergency-services exploratory-data-analysis heatmaps matplotlib numpy pandas patterns-and-trends resource-allocation scipy seaborn

Last synced: 26 Jun 2025

https://github.com/vishnu-t-r/sql_functions_reference

This repository contains intermediate to complex SQL queries that explain SQL concepts. It can serve as a reference when writing queries that involve complex concepts. (Most queries include DDL and DML commands for practice.)

complex-sql data-analysis data-mining sql sql-query

Last synced: 26 Jun 2025

https://github.com/haroldeustaquio/sql-coding-challenges

Repository dedicated to solving SQL problems from HackerRank, DataLemur and other challenges. Contains solutions to improve skills in database querying, optimization, and data manipulation.

challenge data-analysis database hackerrank-solutions mysql query sql sqlite t-sql-exercises

Last synced: 12 Jul 2025

https://github.com/femtotrader/dukascopyticksreader.jl

A Julia library to download tick data from Dukascopy https://www.dukascopy.com/swiss/english/marketwatch/historical/

data data-analysis dataset dukascopy html julia stock-data

Last synced: 10 Apr 2025

https://github.com/amrrs/iq18_workshop

Data Science in Python - Workshop Data and Notebook

data-analysis ipl python

Last synced: 22 Jul 2025

https://github.com/pythondeveloper6/matplotlib-for-beginners

How to visualize your data using Matplotlib

data-analysis matplotlib numpy pandas python visualization

Last synced: 11 Apr 2025

https://github.com/labrijisaad/data-scientist-tools-pandas

In this hands-on training notebook, we'll see all the basics of the Pandas library used for data science/data analysis and machine learning tasks.

data-analysis data-science dataframe-objects machine-learning pandas python series-objects

Last synced: 08 Apr 2025

https://github.com/jcm-ai/TATA-Data-Visualisation-Empowering-Business-with-Effective-Insights

This repository holds all of the assignments I needed to complete for the TATA Data Visualization Empowering Business with Effective Insights Virtual Experience Program. 📊 📈 📉

analysis-and-reporting analytics analytics-and-decision-science charts communications dashboards data-analysis data-cleanup data-interpretation data-storytelling data-visualizations graph insights power-bi tableau visual-basic visualizations

Last synced: 19 Aug 2025

https://github.com/jesussantana/ibm-data-analysis-with-python-da0101en

This course will take you from the basics of Python to exploring many different types of data.

anova correlation data-analysis model-evaluation numpy pandas prepare-data python regression-models statistics

Last synced: 17 Jul 2025

https://github.com/scieloorg/scielo20gt6

Analyses presented at the data analysis workshop held during the SciELO 20 Years week

data-analysis

Last synced: 05 Jan 2026

https://github.com/chaitanyac22/lending-club-project---data-analysis-for-a-consumer-finance-company

Lending Club is a consumer finance company that specializes in lending various types of loans to urban customers. When the company receives a loan application, the company has to make a decision for loan approval based on the applicant’s profile. The project work aims to help the company in understanding the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default. The company can utilize this knowledge for its portfolio and risk assessment.

banking business-intelligence data-analysis data-cleaning data-manipulation data-visualization exploratory-data-analysis feature-engineering finance portfolio-management python3 risk-assessment statistics

Last synced: 23 Aug 2025

https://github.com/koldlight/r4ds

R for data science course

course data-analysis data-science data-viz r

Last synced: 30 Apr 2025

https://github.com/theakashshukla/r-project

🎓 A Collection of Programming Assignments for the R Language

algorithms data-analysis data-science data-science-projects ml r

Last synced: 24 Jul 2025

https://github.com/stefen-taime/nifi-etl-data-pipeline

This post will demonstrate the creation of a containerized data engineering environment using Docker Stacks.

apache api big-data cloud data-analysis data-engineering docker-compose etl-pipeline machine-learning nifi postgresql-database slack zookeeper

Last synced: 15 Apr 2025

https://github.com/arsalanjabbari/imdb-top-250-movies-analysis

Academic research on IMDb's Top 250 Movies entails scraping and cleaning data, followed by analyzing genres, directors, release years, IMDb ratings, and actors' influence. This analysis offers insights into evolving cinematic preferences and demonstrates the value of data-driven research in understanding cultural phenomena.

data-analysis imdb

Last synced: 31 Aug 2025

https://github.com/tushar2704/store-demand-forecasting

This project predicts the sales demand for various items in different stores based on historical sales data. The objective is to develop a machine learning model that can provide accurate forecasts for future sales of each store-item combination.

artifi data-analysis data-science python sales-analysis sales-forecasting tushar2704

Last synced: 04 Nov 2025

https://github.com/memoryfraction/LLSDA-Lightning-Location-System-Data-Analyzer

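LLSDA is a public benefit project that helps lightning engineers and lightning scientists analyze lightning distribution. It is a cross-platform base class library for lightning location system (LLS) data analysis tools, used for spatiotemporal feature analysis of LLS data, helping lightning analysts (researchers and students) improve development efficiency and avoid reinventing the wheel.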

data-analysis data-visualization lightning-location-system public-benefits

Last synced: 04 May 2025

https://github.com/mribeirodantas/vidente

R package to parse and preprocess the Surveillance, Epidemiology, and End Results (SEER) Program data from NIH/NCI

cancer-patients cancer-research data-analysis data-science data-structures r-package seer

Last synced: 14 Apr 2025

https://github.com/apache/incubator-devlake-playground

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

dashboard-friendly data data-analysis data-engineering data-integration data-transfers devops domain-layer dora etl hacktoberfest integration jira open-source python user-friendly

Last synced: 19 Oct 2025

https://github.com/sayakpaul/analysis-of-college-database-of-2017-passouts

Contains my analysis of a database containing information about the students of an engineering college.

data-analysis data-visualization matplotlib python-3

Last synced: 12 Jun 2025

https://github.com/dbeaudoinfortin/NAPSDataAnalysis

Canadian National Air Pollution Surveillance Program (NAPS) data downloader, importer, extractor, analysis, and visualization toolbox.

air-pollution air-quality air-quality-data aqhi aqhi-canada canada cesi data-analysis eccc geoscience heatmap naps pollution

Last synced: 01 Mar 2025

https://github.com/m-barker/fibs-reporter

Automatically generate a pdf report containing feature importance, baseline modelling, spurious correlation detection, and more, from a single command line input for any given ML CSV file

audio-analysis audio-processing automation automl baseline-model data-analysis data-visualization feature-extraction feature-visualization machine-learning pdf-generation

Last synced: 14 Jan 2026

https://github.com/mjunaidca/sql-auditor-pro-gpt

Get quick insights from your data sources in tables, charts, and graphs with AI-driven audits, and optimize your SQL databases.

ai ai-agent auditing-data cloudflared compound-ai-systems custom-gpt data-analysis fastapi-template gpt sql sqlmodel

Last synced: 02 Apr 2025

https://github.com/iondv/report

IONDV. Framework: the Report module generates analytical reports.

analytics businessintelligence css data data-analysis data-visualization iondv iondv-module reporting

Last synced: 19 Oct 2025

https://github.com/aglove2189/appias

Machine learning workflow toolkit ✨🦋✨

appias data-analysis data-science machine-learning pandas python sklearn workflow

Last synced: 19 Oct 2025

https://github.com/pawelgoj/envelope-for-qe-ph-calculations

Create envelopes for IR and Raman spectra calculated by PHonon from Quantum Espresso.

appium data-analysis matplotlib numpy pytest python quantum-chemistry scipy spetroscopy tkinter-gui tkinter-python

Last synced: 22 Apr 2025

https://github.com/patilni3/project_sql

Data Analysis using SQL

census-data data-analysis sql

Last synced: 08 Oct 2025

https://github.com/cafali/pathscan

PathScan exports information about the contents of directories and hard drives. With a single click, you can create a complete list of all files and paths within a specific folder or across an entire hard drive.

backup command-line data data-analysis data-migration data-mining data-recovery directory folder-management folders forensics hard-drive keyword-extraction logging pathfinding recovery string-search tools utility windows

Last synced: 10 Oct 2025

https://github.com/spacetelescope/stistools

Tools for HST/STIS.

astronomy data-analysis hst stis

Last synced: 10 Oct 2025

https://github.com/jthomperoo/holtwinters

Holt-Winters exponential smoothing implemented in Go.

data-analysis exponential-smoothing go golang holt-winters prediction prediction-algorithm

Last synced: 17 Jul 2025

https://github.com/mahshaaban/intro_data_r

A gentle introduction to data analysis in R

data-analysis image-analysis qpcr-analysis r

Last synced: 28 Apr 2025