An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/dermatologist/goscar-export

:fire: CSV to FHIR (For OSCAR EMR EForm Export)

data-analysis data-warehouse fhir fhir-r4 fhir-server hacktoberfest oscar-emr

Last synced: 10 Sep 2025

https://github.com/sukanyabag/statistical-analysis-of-my-medium-articles

This repository contains an exploratory data analysis of my writer data at Medium. I use it to carry out data analysis once every 4 months to see audience and fan growth, and the topics they love!

data-analysis data-storytelling matplotlib pandas seaborn statistical-analysis sweetviz web-scraping

Last synced: 18 Jul 2025

https://github.com/hevalhazalkurt/exploring_the_data_of_lego_history

A data exploration project on LEGO history in Python with pandas, matplotlib etc. (WIP)

data data-analysis data-science data-visualization datascience datasets lego lego-history matplotlib pandas python python3

Last synced: 13 Apr 2025

https://github.com/iguptashubham/online-retail-sales

This Power BI dashboard, designed for marketing strategists, analyzes sales trends and customer behavior. It provides key insights empowering them to identify sales opportunities and optimize marketing campaigns, ultimately boosting business sales.

dashboard data data-analysis data-analysis-project data-analysis-project-powerbi data-analysis-python data-project data-science powerbi project

Last synced: 19 Mar 2026

https://github.com/haroldeustaquio/sql-coding-challenges

Repository dedicated to solving SQL problems from HackerRank, DataLemur and other challenges. Contains solutions to improve skills in database querying, optimization, and data manipulation.

challenge data-analysis database hackerrank-solutions mysql query sql sqlite t-sql-exercises

Last synced: 12 Jul 2025

https://github.com/yaricom/english-article-correction

An experiment in applying NLP to the correction of definite and indefinite articles in an English text corpus

data-analysis glove-vectors nlp nlp-machine-learning numpy pandas scikit-learn umbc-webbase-corpus

Last synced: 05 Apr 2025

https://github.com/virajbhutada/tableau-data-vizzes

Engage with a growing collection of Tableau dashboards covering financial trends, HR analytics, streaming service insights, real estate dynamics, and more. Meticulously crafted for valuable insights, this repository continues to expand with new and compelling visualizations.

business-analytics data-analysis data-visualization hr-analytics industry-trends netflix performance-metrics stock-market-analysis strategic-analytics tableau visual-insights

Last synced: 02 Mar 2026

https://github.com/sodascience/map-explorer

Map Explorer is a Vue.js web application for rendering GeoJSON maps with dynamic region coloring based on external data.

choropleth data-analysis data-visualization geojson

Last synced: 10 Feb 2026

https://github.com/tuliosg/cdp

Repository for the course "Ciência de Dados para Pesquisa" (Data Science for Research).

data-analysis data-manipulation data-science data-visualization google-colab jupyter-notebook python

Last synced: 03 Mar 2026

https://github.com/fburic/pandance

Advanced relational operations for pandas DataFrames

data-analysis data-science data-wrangling pandas

Last synced: 27 Jun 2026

https://github.com/patilni3/project_sql

Data Analysis using SQL

census-data data-analysis sql

Last synced: 16 Feb 2026

https://github.com/depressioncenter/mden

Mobile technologies code from the University of Michigan's Mobile Data Experts Network (MDEN), featuring data cleaning automations, REDCap project templates, and links to useful external modules. [DOI: 10.6084/m9.figshare.25438714]

automation data-analysis data-cleaning fitness-tracker heart-rate-data mobile-data mobile-development mquery powerautomate powerbi powerquery python r sleep-data smartwatch-data tableau

Last synced: 25 Feb 2026

https://github.com/chaitanyac22/house-price-prediction-project-for-a-us-based-housing-company

The goal of this project is to garner data insights using data analytics to purchase houses at a price below their actual value and flip them at a higher price. This project aims at building an effective regression model using regularization (i.e. advanced linear regression: Ridge and Lasso regression) in order to predict the actual values of prospective housing properties and decide whether to invest in them or not.

advanced-linear-regression business-analytics data-analysis data-cleaning data-manipulation data-visualization exploratory-data-analysis feature-engineering lasso-regression linear-regression machine-learning model-building model-evaluation prediction-model python3 regularization rfe ridge-regression statistics

Last synced: 30 Apr 2026

https://github.com/dcs-training/digital-method-of-the-month

In this repository you will find the documents we produced to support the discussion in our Digital Methods of the Month. These documents will help you orient yourself if you want to pick up the method in your research. Go to the readme file

3d-data data-analysis data-visualisation data-wrangling geographical-data gis good-practices-digital-research machine-learning network-analysis open-research preregistration statistics text-analysis

Last synced: 25 Feb 2026

https://github.com/quantumudit/regional-sales-analysis

This project focuses on analyzing and visualizing the United States regional sales for a fictitious company between 2018 and 2020 using Python & Power BI.

data-analysis data-visualization databases jupyter-notebook power-bi python sqlite

Last synced: 02 May 2026

https://github.com/happybono/avocadosmoothie

VB.NET project for running-median filtering. Users set kernel radius, border count, and pick MiddleMedian or AllMedian. Processing runs in parallel with a progress bar and smooth UI.

algorithms calibration correction data-analysis median outliers quicksort running-median runningmedian smoothing smoothing-methods statistics visual-basic

Last synced: 10 Feb 2026

https://github.com/rudra496/science

🔬 Interactive science experiments and research simulations — physics, chemistry, biology with 3D visualizations and real-time data analysis

data-analysis education experiments hacktoberfest javascript python research science simulation threejs

Last synced: 09 Jun 2026

https://github.com/jggautier/dataverse-curation-assistant

A small software application that provides a UI for automating things in repositories that use the Dataverse software

data-analysis dataverse hacktoberfest python

Last synced: 01 Mar 2026

https://github.com/zmyzheng/signature-authentication-pen

Signature Authentication Pen, a cloud-based IoT project that performs identity authentication by exploiting the biometric features of users' signatures.

android aws data-analysis identity-authentication iot neural-network signature-authentication-pen

Last synced: 03 May 2026

https://github.com/campos20/wca-statistics

This repository is meant to provide statistics for the WCA Statistics group on Facebook. It's also my repo to study data science with Python.

data-analysis data-science statistics world-cube-association

Last synced: 29 Jun 2026

https://github.com/edaaydinea/dataquest-projects

This repository includes guided data analyst and data science projects from Dataquest.

data-analysis data-science

Last synced: 07 Feb 2026

https://github.com/quantumudit/consumer-goods-sales-analysis

This project focuses on analyzing and visualizing consumer goods sales in the United States between 2015 and 2016 using Python & Power BI.

data-analysis data-visualization database jupyter-notebook python sqlite

Last synced: 29 Apr 2026

https://github.com/cyyeh/duckdb-data-agent

An AI-powered data analysis agent with a built-in SQL playground. Upload data files (CSV, JSON, Parquet, Excel) and ask questions in plain English — the agent delegates to a specialized subagent for SQL queries and renders charts inline — or switch to the SQL editor for direct queries.

agent claude-code csv data-analysis duckdb excel json langfuse llm parquet python react sql typescript

Last synced: 04 Jun 2026

https://github.com/niamoto/niamoto

Niamoto is a command-line application and library focused on processing and publishing botanical data

botany cli-application data-analysis data-processing data-publication python-library

Last synced: 23 Apr 2026

https://github.com/mertcandav/julenum

A high-performance library for numerical methods and scientific computing in Jule

data-analysis jule julelang math matrix scientific-computing statistics

Last synced: 09 Feb 2026

https://github.com/quantumudit/analyzing-whiskyexchange-whisky

This project focuses on scraping data related to Japanese whisky from The Whisky Exchange website, performing the necessary transformations on the scraped data, and then analyzing and visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 02 May 2026

https://github.com/mscbuild/mscbuild

🏆 Creating digital experiences that not only meet user expectations, but also drive engagement, loyalty and, ultimately, business success. Passionate developer from Latvia.

analysis best-practices coding config data-analysis data-science design developer freelance fullstack github-config latvia mscbuild profile readme seo site software-engineering web webapp

Last synced: 31 Jan 2026

https://github.com/banisterious/obsidian-oneirometrics

OneiroMetrics (Turning Dreams Into Data). A plugin for Obsidian to track and analyze dream journal metrics.

data-analysis dream-analysis dream-diary dream-journal dreams journaling metrics obsidian obsidian-plugin self-improvement tracking

Last synced: 22 Apr 2026

https://github.com/sushantdhumak/traffic-forecasting-using-iot-sensor-data

Demonstrates how to utilize XGBoost for traffic forecasting using data gathered from IoT sensors, highlighting its efficiency in processing complex datasets and delivering accurate predictions.

data-analysis data-visualization exploratory-data-analysis feature-engineering feature-importance feature-selection gridsearchcv hyperparameter-optimization hyperparameter-tuning iot random-search xgboost-regression

Last synced: 26 Mar 2025

https://github.com/itzmeanjan/indian-railway

Exploring Indian Railways time table dataset, with :heart:

data-analysis data-visualization indian-railways matplotlib python python3 railway

Last synced: 17 Oct 2025

https://github.com/maksimekin/umd_data_challange_2020

Ocean Clean up data analysis project for the UMD Data Challenge 2020. Data Exploration for a Sustainable Planet.

cleanup competition data-analysis data-science folium geolocation machine-learning ocean planet pollution sklearn sustainability time-series trash umd

Last synced: 05 Jul 2025

https://github.com/sonigarima/donation-management-system

A donation management system for NGOs and Donors. The project is designed for Cognizance IITR 2021 - Salesforce Codathon.

data-analysis donation-management reactjs

Last synced: 07 Sep 2025

https://github.com/dsnchz/solid-g6

A SolidJS component library for graph visualization, powered by @antv/g6

analysis data-analysis data-visualization graph graph-visualization node-ui solidjs visualization

Last synced: 13 Oct 2025

https://github.com/trainingbypackt/splunk-7-essentials-elearning

Build an elaborate Splunk enterprise environment that will extract powerful insights from your machine-generated big data

data-analysis eventgen indexing machine-learning splunk sub-search visualization

Last synced: 01 Mar 2026

https://github.com/jackfiszr/pl2xl

Nodejs-polars wrapper with `readExcel` and `writeExcel` methods.

data-analysis data-science deno excel excel-reader excel-writer nodejs polars

Last synced: 21 Jan 2026

https://github.com/afsalashyana/whatsapp-chat-analyzer

Analyze WhatsApp chats with beautiful graphs. Written in JavaFX

data-analysis data-visualization javafx javafx-14 javafx-application whatsapp

Last synced: 04 Sep 2025

https://github.com/darsan-in/rumour-monger-spotter

Rumour Monger Spotter is a prototype developed during a national-level cyber hackathon to identify false information on Twitter. Using the Google Fact Check API and a Multinomial Naive Bayes classifier, the tool analyzes tweet content to assess the likelihood of misinformation. Despite a development window of less than 24 hours, the project won a t

ai data-analysis fact-checking hackathon india naive-bayes national-competition natural-language-processing prototype real-time-analysis social-media text-classification tweet-content twitter

Last synced: 12 Oct 2025

https://github.com/jonzeolla/lab-securitydataanalysis

An introductory lab to Security Data Analysis (using Apache Metron (incubating)).

apache-metron data-analysis lab metron security

Last synced: 03 Jul 2025

https://github.com/globeandmail/startr-cli

A command-line scaffolder for the startr R project template

data-analysis data-journalism data-visualization journalism r

Last synced: 23 Apr 2025

https://github.com/fatihilhan42/data-science-projects

This repo contains beginner to upper-level projects in the field of data science. I will add the projects I complete in this field to this repo every day, in the hope that it will be useful to those who, like me, are interested in data science and are just starting out...

data-analysis data-engineering data-mining data-science data-structures data-visualization database datascience fatihilhan fortytwo fortytwofficial jupyter-notebook python

Last synced: 11 Oct 2025

https://github.com/cparmet/pandas-checks

🐼🩺 Pandas Checks: Non-invasive health checks for Pandas method chains

data-analysis data-engineering data-science method-chaining pandas

Last synced: 27 May 2026

https://github.com/nafisalawalidris/predicting-credit-card-approvals

Explore credit card approval prediction through data analysis and machine learning. Preprocess data, train logistic regression models, and optimize hyperparameters. Learn data preprocessing, feature engineering, model training, and evaluation. Dive into the world of machine learning with Python and popular libraries.

approval-prediction credit-card data-analysis data-preprocessing feature-engineering hyperparameter-optimization libraries logistic-regression machine-learning model-evaluation model-training python python3

Last synced: 19 Apr 2025

https://github.com/cheminfo/compass

Strategy for improved characterisation of human metabolic phenotypes using a COmbined Multiblock Principal components Analysis with Statistical Spectroscopy (COMPASS)

data-analysis metabolomics metabonomics multiblock nmr-spectroscopy pca population-analysis population-model

Last synced: 23 Mar 2025

https://github.com/efharkin/ez-ephys

Easy IO, inspection, and manipulation of electrophysiological data.

data-analysis electrophysiology neurophysiology neuroscience patch-clamp python

Last synced: 14 Jan 2026

https://github.com/abhash-rai/traffic-image-classifier

A web-based solution that uses a robust TensorFlow model for precise traffic condition classification, built with ReactJS and a FastAPI backend.

cnn cnn-classification cnn-keras cnn-model data-analysis data-science data-visualization fastapi keras keras-tensorflow python python-3 python3 react reactjs tensorflow traffic traffic-classification transfer-learning

Last synced: 23 Feb 2026

https://github.com/waveform80/structa

A small utility for analyzing data structures (e.g. JSON files)

csv data-analysis data-visualization datajournalism datawrangling json yaml

Last synced: 06 Sep 2025

https://github.com/lucasbotang/real_estate_management_data_analysis

Data analysis for real estate management

data-analysis excel mysql tableau

Last synced: 06 Oct 2025

https://github.com/robinmillford/cortex-ai-multi-model-insights-hub

Cortex AI: Multi-Model Insights Hub is an advanced platform that leverages cutting-edge AI to empower your research, analysis, and data exploration. By integrating multiple Large Language Models (LLMs) with a sophisticated Retrieval-Augmented Generation (RAG) system

article-extractor chatbot data-analysis data-visualization deepseek-chat deepseek-r1 llama3 llm pdf-document-processor rag streamlit-webapp summarizer vector-database

Last synced: 28 Oct 2025

https://github.com/paezha/edashop

An open educational resource to teach a workshop on Exploratory Data Analysis in R

data-analysis exploratory-data-analysis open-educational-resources package r rstats workshop-materials

Last synced: 18 Mar 2025

https://github.com/yuukidach/twitchanal

Using AI for eGaming analytics to discover community interactions and behaviors on Twitch.

data-analysis data-analytics twitch

Last synced: 05 Jan 2026

https://github.com/bjpop/gurita

A convenient and expressive tool for data analytics and plotting on the command line

command-line data-analysis data-science pandas plotting python

Last synced: 04 Feb 2026

https://github.com/shervinnd/bazar_app_store_eda

Bazar App Data analysis code to find the most downloaded category and most popular installed apps

data data-analysis data-science dataanalysis eda python

Last synced: 15 Apr 2025

https://github.com/1ayanabil1/healthcare-machine-learning

Explore our open-source repository focused on healthcare machine learning. We've developed predictive models for cardiovascular disease, diabetes, breast cancer, and more. Our projects employ diverse machine learning algorithms and data science techniques, enhancing early detection, diagnosis, and patient outcomes.

data-analysis data-science deep-learning disease disease-detection disease-modeling disease-prediction eda healthcare-application heathcare jupyter-notebook machine-learning machine-learning-algorithms machinelearning-python python

Last synced: 28 Apr 2025

https://github.com/theengineeringworld/numpy-data-science

NumPy Data Science Essential Training Course. Part of a YouTube course offered by TheEngineeringWorld.

data-analysis data-science numpy numpy-exercises numpy-library numpy-tutorial python python-3-6 python3 scipy2018

Last synced: 09 Oct 2025

https://github.com/ethan-wickstrom/rrrs

Welcome to RRRS, a rapid, hyper-optimized CSV random sampling tool designed with performance and efficiency at its core. Crafted meticulously in Rust, RRRS offers an unparalleled solution for extracting random data samples from CSV files swiftly and effortlessly.

analytics cli command-line command-line-tool data data-analysis data-science dataset rust rust-lang sample samples

Last synced: 16 May 2025

https://github.com/neutrinoceros/gpgi

A lightweight Python library for efficient in-RAM particle deposition on rectilinear, unrefined grids.

data-analysis grid particles performance

Last synced: 22 Apr 2025

https://github.com/dcs-training/datavisualisationwithr

Data Visualisation with R Workshop (delivered by the Centre in December 2020). This workshop focuses on visualising your data. Go to the readme file

data-analysis data-visualisation data-wrangling r

Last synced: 25 Apr 2025

https://github.com/sondosaabed/introduction-to-data-analysis-with-pandas-and-numpy

Learning the data analysis process of questioning, wrangling, exploring, analyzing, and communicating data. Working with data in Python using libraries like NumPy and pandas.

data-analysis data-analyst-nanodegree data-wrangling numpy pandas python

Last synced: 09 Apr 2025

https://github.com/sondosaabed/data-visualization-with-matplotlib-and-seaborn

Learning to apply sound design and data visualization principles to the data analysis process. Also learning how to use analysis and visualizations to tell a story with data.

data-analysis data-analyst-nanodegree data-visualization matplotlib python seaborn seaborn-plots

Last synced: 09 Apr 2025

https://github.com/martinthoma/bad-stats

Examples of how not to do statistics / visualizations

data-analysis statistics visualizations

Last synced: 07 Jan 2026

https://github.com/lit26/trump_tweet_analysis

Analysis of Trump's original tweets.

data-analysis lda-model topic-modeling

Last synced: 12 Apr 2025

https://github.com/cosmoduende/r-ufo-sightings

Are we alone in the universe? - Data Analysis and Data Visualization of UFO sightings with R. How to analyze and visualize data on UFO sightings over the last century in the USA and the rest of the world with the R language.

data-analysis data-analytics data-science data-visualisation data-visualization data-visualizations dataviz ovni ovni-dataset r-code r-language r-programming r-stats ufo ufo-analysis ufo-dataset ufo-sighting ufo-sightings

Last synced: 13 May 2025

https://github.com/astrodynamic/retailanalitycs-in-postgresql

Develop a SQL script to create a database with tables, views, roles, and functions. Form personalized offers to increase average check, frequency of visits, and cross-selling.

bd csv data-analysis data-export data-input data-manipulation data-validation database-management functions git margin offers postgresql retail role-permission-management selling sql transaction tsv views

Last synced: 06 Apr 2026

https://github.com/johnsell620/sentiment-analysis-goodreads-reviews

Document-level sentiment analysis of book reviews scraped from the Goodreads website. Technologies used include TensorFlow, Spark, HDFS, Sqoop, Scrapy, and D3.js.

data-analysis data-visualization recurrent-neural-networks web-scraping

Last synced: 30 Apr 2025

https://github.com/avinashkranjan/basic-data-analysis-and-visualization-in-python

📊 Some of the most important python tools in data science for Data Analysis and Data Visualization.

data-analysis data-science matplotlib matplotlib-pyplot numpy pandas plotly seabourne

Last synced: 30 Oct 2025

https://github.com/orkunaktas/sofascore-webscraping

⚽️I scraped the shot data of the Fenerbahçe - Adana Demirspor match from Sofascore⚽️

beautifulsoup data-analysis football-analytics football-data selenium webscraping

Last synced: 28 Oct 2025

https://github.com/kellyjadams/spotify-data-analyze

A serverless data pipeline that logs my Spotify listening history to BigQuery using Cloud Run, then visualizes trends with Looker Studio. Built with Python, Flask, Docker, and GCP.

data-analysis data-engineering

Last synced: 07 May 2025

https://github.com/stimulsoft/stimulsoft.dashboards.php

Dashboards.PHP is a complete software package for designing and viewing dashboards. It includes the JS data analysis engine, dashboard designer, and viewer. Supports PHP 5, PHP 7, and PHP 8.

charts dashboard-builder dashboards data-analysis data-grid data-visualization datatable dynamic-dashboard interactive-dashboards live-data mysql-data php php-bi-tools php-dashboard php-kpi php7 php8 pivot-tables sql-datasources statistics

Last synced: 14 Oct 2025

https://github.com/jimbrig/lossrunAnalyzer

R Package and Shiny App to Analyze Insurance Lossruns

actuarial data-analysis data-mining data-science insurance r record-linkage risk-management shiny

Last synced: 30 Jul 2025

https://github.com/sushant1827/traffic-forecasting-using-iot-sensor-data

Demonstrates how to utilize XGBoost for traffic forecasting using data gathered from IoT sensors, highlighting its efficiency in processing complex datasets and delivering accurate predictions.

data-analysis data-visualization exploratory-data-analysis feature-engineering feature-importance feature-selection gridsearchcv hyperparameter-optimization hyperparameter-tuning iot random-search xgboost-regression

Last synced: 08 Mar 2026

https://github.com/quantumudit/basketball-players-analysis

The project focuses on analyzing salaries and various other in-game metrics of top NBA basketball players from 2005-14 by performing exploratory data analysis with Python and Jupyter Notebook and by visualizing the data in an insightful dashboard made with Power BI

data-analysis jupyter-notebook power-bi python

Last synced: 17 May 2026