An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/stiles/pyfr24

A Python client for the Flightradar24 API with CLI support. Fetch, plot and analyze flight data with ease.

aviation cli data-science flights python

Last synced: 17 Jan 2026

https://github.com/aniketwdubey/rainfall-prediction-end-to-end-ml-project

The main aim of the project is to predict the amount of rainfall in the Vidarbha region or state well in advance. Average rainfall is predicted from past data.

css data-science data-visualization flask html machine-learning python rainfall-prediction

Last synced: 23 Apr 2025

https://github.com/njlyon0/dndr

Dungeons & Dragons Functions for Players and Dungeon Masters

data-science dungeons-and-dragons r-package ttrpg

Last synced: 16 Mar 2025

https://github.com/aiwithqasim/piaic-artificial-intelligence

Welcome to this open-source repository. Here I'll add all the material that I have read in the PIAIC course. It will benefit PIAIC students who are enthusiastic about artificial intelligence.

cnn data-science deep-learning docker linux machine-learning matplotlib numpy pandas python3

Last synced: 17 Mar 2025

https://github.com/forieux/qmm

Python Quadratic Majorization-Minimization (MM) optimization algorithms for half-quadratic criteria. Inverse problems, image restoration, denoising, ...

data-processing data-science denoising image-processing inverse-problems non-linear-optimization nonlinear-optimization optimization optimization-algorithms optimization-methods python

Last synced: 23 Jan 2026

https://github.com/jonasaacampos/guia-sql

In addition to exercises based on real business questions, you can use the provided code to expand your horizons about databases.

data-science database dataquery sql

Last synced: 27 Feb 2025

https://github.com/jason2brownlee/datasciencediagnosticchecklist

Data Science Diagnostic Checklist: Helpful checks for data scientists with urgent problems

checklist data-science diagnostics machine-learning statistics

Last synced: 23 Jan 2026

https://github.com/azure/azuredsvm

AzureDSVM is an R package that offers a convenient harness for Azure DSVM, remote execution of scalable and elastic data science work, and monitoring of on-demand resource consumption.

azure data-science data-science-virtual-machine r

Last synced: 07 Oct 2025

https://github.com/aifred-health/vulcanai

A high-level deep learning framework for quickly prototyping networks, with added tools for data visualisation, model interpretability and performance metrics

data-analysis data-cleaning data-science data-visualization deep-learning deep-neural-networks feature-engineering mental-health python3 pytorch scikit-learn

Last synced: 01 Aug 2025

https://github.com/vida-nyu/alpha-automl

Alpha-AutoML is a Python library for automatically generating end-to-end machine learning pipelines.

automl data-science machine-learning python

Last synced: 19 Feb 2026

https://github.com/safe-ds/dsl

Statically checked Data Science programs.

data-science dsl learnability machine-learning safety static-analysis usability

Last synced: 26 Jun 2025

https://github.com/dr-lego/gag-network

Network Visualizer for the 'Geschichten aus der Geschichte' Podcast

data-science data-visualization database javascript network-analysis podcast python sqlite3 wikipedia wikipedia-dump

Last synced: 14 Jul 2025

https://github.com/nhatsmrt/nn-toolbox

A toolbox of commonly used deep learning components, procedures and applications

data-science deep-learning machine-learning neural-networks python pytorch

Last synced: 10 Apr 2025

https://github.com/cea-list/rpcdataloader

A variant of the PyTorch Dataloader using remote workers.

data-science dataloader distributed-computing hpc machine-learning preprocessing pytorch slurm

Last synced: 21 Jun 2025

https://github.com/smups/rustronomy

rustronomy - an astronomy data analysis toolkit written in rust

astronomy data-science physics rust rust-lang rust-library science

Last synced: 13 Apr 2025

https://github.com/astrojuanlu/workshop-jupyter-kedro

Hands-on workshop "Refactor your Jupyter notebooks into maintainable data science code with Kedro"

data-science jupyter-notebooks kedro python

Last synced: 11 Apr 2025

https://github.com/ds4v/30vnfoods

An end-to-end implementation process for building, labeling & deploying a dataset with 25136 images of 30 Vietnamese foods & their URLs: https://www.kaggle.com/quandang/vietnamese-foods

data-science dataset google-sheets kaggle vietnamese-foods

Last synced: 08 Sep 2025

https://github.com/joaocarabetta/osm-road-length

Calculate OpenStreetMap road length for any polygon

data-science osm python urban-analytics urban-data-science

Last synced: 02 May 2025

https://github.com/vizlylabs/waraqa

AI-powered tool for local, lightweight data analysis 🍃

ai data-science data-visualization local plotly pyodide react rust tauri

Last synced: 30 Apr 2025

https://github.com/esoxjem/algorithms

Algos and Data Structures

algorithms data-science dsa java kotlin

Last synced: 06 Mar 2026

https://github.com/wlandau/targetsketch

Sketch a pipeline of targets in an interactive web app

data-science high-performance-computing pipeline r reproducibility rstats shiny targets workflow

Last synced: 20 Mar 2025

https://github.com/techforuk/my_eu

Code and data for myeu.uk - find out what the EU has done for your area

brexit data-science google-maps-api ipython-notebook javascript python static-site webpack

Last synced: 28 Oct 2025

https://github.com/alvertogit/deeplearning_flask

Data Science AI Artificial Intelligence Deep Learning Python Keras TensorFlow TensorFlow2 Flask Flask3 Docker NGINX Gunicorn microservices REST API Jupyter Lab Notebook GitHub Actions Ruff

artificial-intelligence data-science deep-learning docker flask github-actions jupyter-lab jupyter-notebook keras python ruff tensorflow tensorflow2

Last synced: 15 Mar 2026

https://github.com/nelsonmestevao/uminho

📚 University projects, exercises & notes

c cpp data-science distributed-systems haskell java software-engineering

Last synced: 27 Oct 2025

https://github.com/chuongmep/aps-bot

Explore Data By CLI With Autodesk Platform Services

aps autodesk-forge autodesk-platform-services cli data-analysis data-science forge

Last synced: 12 Apr 2025

https://github.com/iterative/studio-support

❓ DVC Studio Issues, Questions, and Discussions

data-science dvc machine-learning mlops support

Last synced: 04 Feb 2026

https://github.com/amitkaps/datascience

Build and Deploy Machine Learning Models on the Cloud

cloud data-science machine-learning python

Last synced: 10 Nov 2025

https://github.com/njanakiev/wikidata-mayors

Exploration of the Mayors in Europe with Wikidata and Python

data-science data-visualization deckgl python sparql wikidata

Last synced: 21 Aug 2025

https://github.com/nicovandenhooff/top-repo-analysis

This repository contains my work that supports my article on Towards Data Science: "Exploring the Most Popular Machine Learning and Deep Learning GitHub Repositories."

altair automation data-analysis data-science data-visualization pygithub python

Last synced: 21 Aug 2025

https://github.com/rurlus/modelmetricuncertainty

Python package for Model Metric Uncertainty estimation

data-science python science uncertainty

Last synced: 28 May 2026

https://github.com/tushar2704/everyday_python

Welcome to Everyday Python Sheets – your go-to resource for everyday Python cheat sheets, pro tips, interview questions, Python one-liners, and Python data structures. Whether you're a beginner looking to learn Python or an experienced developer seeking quick reference materials, this Streamlit application has got you covered.

artificial-intelligence cheatsheet data data-analysis data-science data-structures data-visualization database protips python streamlit streamlit-tushar2704 tushar2704

Last synced: 09 May 2026

https://github.com/lvalnegri/workshops-setup_cloud_analytics_machine

Tips and Tricks to set up a cloud machine for Analytics and Data Science with R, RStudio and Shiny Servers, Python and JupyterLab

analytics cloud dashboard data-science docker dockerfile jupyterlab linux machine-learning python r raspberry-pi rmarkdown rstats rstudio-server scipy shiny shiny-apps shiny-server ubuntu

Last synced: 30 Jul 2025

https://github.com/sayakpaul/floydhub-anomaly-detection-blog

Contains the thorough experiments made for a FloydHub article on Anomaly Detection

anomaly-detection data-science faker jupyter-notebook pyod python

Last synced: 06 May 2025

https://github.com/bbva/mercury-dataschema

Utility package that, given a Pandas DataFrame, uses the DataSchema class to auto-infer feature types and automatically calculate different statistics depending on those types.

analytics data data-cleaning data-processing data-science feature-engineering

Last synced: 21 Jun 2025

https://github.com/jonnor/datascience-master

Journal/notes/log of my Masters in Data Science degree

data-science homework machine-learning

Last synced: 06 May 2025

https://github.com/phantominsights/covid-19

Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.

covid-19 data-science etl matplotlib numpy pandas python3 requests seaborn

Last synced: 06 Mar 2026

https://github.com/mukeshmithrakumar/hackerranksolutions

My HackerRank Solutions for Python, Java, C, C++, Shell, SQL, JavaScript and Interview Preparation Kit

bash c cpp data-science hackerrank hackerrank-solutions interview-preparation interview-questions java javascript linux-shell machine-learning python python3 shell software sql

Last synced: 09 Jul 2025

https://github.com/kongruksiamza/python-datascience

Teaching materials covering Python - Data Science and Machine Learning work

data-analysis data-science numpy pandas python

Last synced: 05 May 2025

https://github.com/yashksaini-coder/binary-classification-churn-prediction

Bank Churn Binary Classification Project | Predicting bank churn using the ensemble model building technique.

classification data-science ensemble-learning machine-learning machine-learning-algorithms supervised-learning

Last synced: 11 Apr 2025

https://github.com/getyourguide/db-rocket

Keep your local python scripts installed and in sync with a databricks notebook. Shortens the feedback loop to develop projects using a hybrid environment.

data-science databricks productivity python

Last synced: 11 Apr 2025

https://github.com/yashksaini-coder/regression-with-an-abalone-dataset

Building a Linear regression model to predict the Age of Abalone from various physical measurements and implementing it in a web app based on Python 🐍

data-science kaggle linear-regression machine-learning streamlit

Last synced: 11 Apr 2025

https://github.com/boschresearch/exekglib

Python library for Executable Machine Learning Knowledge Graphs

data-science knowledge-graph-construction machine-learning machine-learning-pipelines python

Last synced: 07 Oct 2025

https://github.com/Jeniffen/projectr

Set up 📂-structure for data science projects

data-science package r rstats setup

Last synced: 30 Jul 2025

https://github.com/aliktk/python_chilla

This repository contains practice materials on Python, used to deliver an online training course. The course was sponsored by codenics and Scholership Network, Pakistan.

course data-science eda machine-learning-algorithms pandas-python python scikit-learn training

Last synced: 10 Apr 2025

https://github.com/stratosphereips/netflowlabeler

A configurable rule-based labeling tool for network flow files.

data-science dataset-generation datasets labeler netflow network-traffic tool zeek zeek-analysis

Last synced: 22 Jan 2026

https://github.com/ngohungphuc/data-science-and-analytics

My Data Science and Analytics learning journey

data data-science

Last synced: 07 Oct 2025

https://github.com/sigbla/sigbla-app

Sigbla is a framework for working with data in tables, using the Kotlin programming language. It supports various data types, reactive programming and events, user input, charts, and more.

dashboard dashboard-application data-analysis data-science data-visualization kotlin kotlin-dsl kotlin-library sigbla spreadsheet table

Last synced: 17 Jan 2026

https://github.com/srgrace/generative-ai-compass

A comprehensive guide to navigating the world of generative artificial intelligence!

ai cv data-science deep-learning genai llms machine-learning nlp vlms

Last synced: 23 Jan 2026

https://github.com/michaelakridge-noaa/open-science-codespaces

Zero Setup Open Science Codespaces. Quick and Easy Github Codespaces for RStudio, Tidyverse, Shiny, Python and more.

codespaces data-science devcontainer docker docker-compose geospatial jupyter-notebook python r rocker rstudio shiny streamlit tidyverse

Last synced: 10 Oct 2025

https://github.com/alessandrocorradini/mit-6.00.2x-introduction-to-computational-thinking-and-data-science

6.00.2x - Introduction to Computational Thinking and Data Science from MIT on edX

computer-science data-science edx mitx mitx600

Last synced: 13 Jul 2025

https://github.com/mine-cetinkaya-rundel/errormoji

®️ errors, in emoji

data-science education r rstats

Last synced: 20 Mar 2025

https://github.com/habedi/feature-factory

A high-performance feature engineering library for Rust powered by Apache DataFusion 🦀

data-preprocessing data-science feature-engineering feature-selection machine-learning rust-lang rust-library

Last synced: 01 Aug 2025

https://github.com/firefly-cpp/niaarm

A minimalistic framework for Numerical Association Rule Mining

association-rule-mining association-rules data-mining data-science evolutionary-algorithms swarm-intelligence

Last synced: 16 Jan 2026

https://github.com/kennethleungty/fifa-football-world-rankings

Analyzing FIFA World Football Rankings with Python and R

data-analysis data-analytics data-science football python r soccer sports

Last synced: 12 Jul 2025

https://github.com/amirmardan/ml_course

This repository belongs to a course on machine learning with Python that is being prepared for AUT

data-analysis-python data-science deep-learning keras machine-learning python pytorch scikit-learn tensorflow

Last synced: 28 Jul 2025

https://github.com/vojay-dev/sc2-data-pipeline

StarCraft 2 Data Pipeline with Airflow, DuckDB and Streamlit

airflow data data-engineering data-science duckdb starcraft2 streamlit

Last synced: 20 Sep 2025

https://github.com/lapets/course-data-science

Materials for a computer science course on tools for data science.

data-science

Last synced: 14 Jul 2025

https://github.com/alipsa/ride

A nice R development and analytics environment, for the Renjin JVM implementation of R

analytics data-analysis data-science data-visualization integrated-development-environment r sql

Last synced: 14 Oct 2025

https://github.com/autonomio/astetik

Astetik takes away the pain from telling visual stories with data on Python

data-science descriptive-statistics jupyter matplotlib pandas seaborn visualization

Last synced: 29 Jun 2025

https://github.com/trafficgcn/st-gcn

Repository for advanced traffic forecasting models integrating GCN, LSTM/Bi-LSTM, and attention mechanisms for improved accuracy, including weather data processing.

attention-mechanism data-science gcn graph-neural-network graph-neural-networks gru lstm metr-la neural-network neural-networks pems-bay python traffic traffic-analysis traffic-forecasting traffic-prediction weather

Last synced: 30 Jul 2025

https://github.com/flofriday/youtube-data

Jupyter Notebook to analyze your YouTube data.

data-science jupyter jupyter-lab personal-data python3 youtube

Last synced: 17 Mar 2026

https://github.com/azure99/blossomdata

A fluent, scalable, and easy-to-use LLM data processing framework.

data-engineering data-science fine-tuning gpt llama llm nlp supervised-learning

Last synced: 31 Jan 2026

https://github.com/som-research/hfcommunity

HFCommunity offers an offline, up-to-date relational database built from the data available at the Hugging Face Hub, providing queryable data about the repositories hosted in the Hub

data-science database dataset huggingface

Last synced: 05 Apr 2026

https://github.com/tushar2704/streamlit-magic-cheat-sheets

Streamlit Magic Cheat Sheets - All of Streamlit in one Streamlit App! (Available in English, Français & Deutsch.)

data-science machine-learning python snowflake streamlit streamlit-tushar2704 tushar2704 webapp

Last synced: 27 Apr 2026

https://github.com/milos-agathon/map-rivers-with-sf-and-ggplot2-in-r

Let's make a pretty map of European rivers using the Global River Classification dataset 🧑🏼‍💻 Check the full tutorial at https://milospopovic.net/map-rivers-with-sf-and-ggplot2-in-r/

data-science data-visualization gis r rivers

Last synced: 04 Apr 2026

https://github.com/sondosaabed/programming-for-data-science-with-python-nanodegree

I acquired a full scholarship from Google Launchpad. Programming concepts, systems, languages (Python and SQL), and techniques for data-led problem solving and decision making

data-science git nanodegree python sql udacity-data-analyst-nanodegree udacity-nanodegree

Last synced: 09 Apr 2025

https://github.com/nicbet/infozilla

The infoZilla unstructured software engineering data mining tool. It can find and extract source code regions, patches, stack traces, enumerations and itemizations from discussion threads.

bugreport bugzilla data-mining data-science tools unstructured-data

Last synced: 13 Oct 2025

https://github.com/martin-sicho/genui-gui

GenUI frontend application. It provides a GUI to the GenUI REST API web services.

cheminformatics data-science gui molecular-generation qsar react visualization webapp

Last synced: 19 Jan 2026

https://github.com/flipkart/foxtrot

A store abstraction and analytics system for real-time event data.

alerting analytics data-engineering data-science data-visualization elasticsearch hbase java monitoring

Last synced: 12 Dec 2025