An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/minusxai/minusx

MinusX is an Agentic Business Intelligence platform. It's Claude Code for data.

artificial-intelligence data-analytics data-science jupyter metabase

Last synced: 18 Feb 2026

https://github.com/giswqs/leafmaptools

A Python package for building tool-widget infrastructure with ipyleaflet and ipywidgets

data-science data-visualization geopython geospatial ipyleaflet ipywidgets jupyter jupyter-notebook mapping python

Last synced: 12 May 2025

https://github.com/tyson-swetnam/awesome-fire-science

List of resources for working with wildfire science data

awesome-list data-science fire wildland

Last synced: 24 Apr 2025

https://github.com/dusenberrymw/systemml-nn

A deep learning library for Apache SystemML.

data-science deep-learning machine-learning neural-networks systemml

Last synced: 02 Sep 2025

https://github.com/tushar2704/data-portfolio

This repository showcases my skills and experience in the field of data analysis. Here, you will find a collection of projects and analyses that demonstrate my ability to extract insights and make data-driven decisions.

artificial-intelligence data-science dataanalysis postgresql python r sql streamlit-tushar2704 tushar2704

Last synced: 14 Mar 2026

https://github.com/alexioannides/bodywork-mlops-demo

Demonstrating how Bodywork can be used to deploy a simulation of the lifecycle of a train-and-serve ML pipeline, responding to new data undergoing concept drift.

aws data-science docker kubernetes machine-learning mlops numpy python scikit-learn

Last synced: 29 Oct 2025

https://github.com/davidgasquez/datalab

⚗️ A local and devcontainer friendly alternative to Google Colab

data-science docker jupyter-notebook

Last synced: 08 May 2025

https://github.com/time-series-machine-learning/tsml-py

A toolkit for time series machine learning algorithms that don't fit in aeon. Use aeon instead if you can!

data-science machine-learning python scikit-learn time-series time-series-classification time-series-clustering time-series-regression

Last synced: 02 Apr 2026

https://github.com/lemniscate-world/neural

Neural is a domain-specific language (DSL) designed for defining, training, debugging, and deploying neural networks. With declarative syntax, cross-framework support, and built-in execution tracing (NeuralDbg), it simplifies deep learning development.

automation data-science data-visualization diagrams dsl hyperparameter-optimization lark llms machine-learning neural-architecture-search neural-networks neural-networks-and-deep-learning neural-networks-from-scratch nocode onnx pytorch tensorflow visual-programming-language visualization

Last synced: 11 Apr 2025

https://github.com/camara94/convolutional-neural-networks-tensorflow

In Course 2 of the deeplearning.ai TensorFlow Specialization, you will learn advanced techniques to improve the computer vision model you built in Course 1. You will explore how to work with real-world images in different shapes and sizes, visualize the journey of an image through convolutions to understand how a computer “sees” information, plot loss and accuracy, and explore strategies to prevent overfitting, including augmentation and dropout. Finally, Course 2 will introduce you to transfer learning and how learned features can be extracted from models.

computer-vision convolutional-neural-networks data-science deep-learning deep-neural-networks tensorflow

Last synced: 23 Jun 2025

https://github.com/epicestudar/curso_python

Repository for the Python + Power BI course

analise-de-dados ciencia-de-dados data-science powerbi python

Last synced: 09 Aug 2025

https://github.com/shohan4556/machine-learning-course-notes

A collection of notes and code from my MSc course on Machine Learning

ai data-science machine-learning python3

Last synced: 06 Sep 2025

https://github.com/wahyudesu/intro-to-machine-learning

Recommended ML learning resources for beginners: https://www.kaggle.com/learn/intro-to-machine-learning https://developers.google.com/machine-learning/crash-course/ml-intro?hl=id

data-science learning machine-learning machine-learning-algorithms python

Last synced: 22 Sep 2025

https://github.com/arose13/gmem

Generalised Mixed Effects Model. Now any machine learning model can have random effects.

data-science machine-learning mixed-effects-models

Last synced: 07 May 2025

https://github.com/arv-anshul/pw-impact-batch

PW Impact Batch 1.0 - Assignments, Quizzes, learning and a solution website.

assignment data-science ineuron-ai ml nlp physics-wallah project pw-skills python3 streamlit

Last synced: 22 Apr 2025

https://github.com/gagolews/analiza_danych_w_jezyku_python

M. Gągolewski, M. Bartoszuk, A. Cena, Przetwarzanie i analiza danych w języku Python (Data Processing and Analysis in Python), PWN, 2016

data-science matplotlib numpy pandas polski python scikit-learn

Last synced: 14 Jul 2025

https://github.com/sandptel/captcha_iitkgp

This repository aims to provide a deployable model that solves the CAPTCHAs presented on the ERP website.

captcha captcha-solver captcha-solving data-science deep-learning hacktoberfest hacktoberfest-accepted neural-network python

Last synced: 16 Aug 2025

https://github.com/owalid/adaptaviz

👩‍🌾 Decision support tool that estimates which crops, vegetables, cereals, and legumes will be best adapted to the future climate on a given plot of land.

climate-change climate-data data-science datascience hackathon nuxt python

Last synced: 17 Jan 2026

https://github.com/piesposito/tand

TanD - Train and Deploy is a no-code framework to automate the Machine Learning workflow.

data-science fastapi machine-learning mlflow pytorch sklearn workflow-automation

Last synced: 24 Oct 2025

https://github.com/holgern/pyrdatasets

2293 datasets from various R packages packed as DataFrames through compressed pickle files

data-science datasets python rdatasets

Last synced: 13 Jul 2025

https://github.com/adrien-legros/rhods-mnist

Data science pipelines and model serving using Red Hat OpenShift Data Science

data-science model-serving openshift-ai pipelines redhat rhoai rhods

Last synced: 17 Jan 2026

https://github.com/gholamrezadar/ghd-snippets-next

GHD Snippets - A Data Science Snippet Library

data-science python pytorch snippets typescript

Last synced: 02 Sep 2025

https://github.com/mkearney/data-scribers

{data scribers} is a collection of posts about data science. And unlike other content aggregating sites, this one encourages people to visit the blog's actual site.

blog-posts blogs data-science keras machine-learning modeling posts python pytorch r rstats statistics tensorflow

Last synced: 12 Apr 2025

https://github.com/emi420/ollama-batch

This simple utility runs LLM prompts over a list of texts or images to classify them, printing the results as a JSON response.

ai data-science llm ollama

Last synced: 30 Aug 2025

https://github.com/dominictarro/prefecto

Library of Prefect tasks and utilities.

data-engineering data-science prefect python

Last synced: 14 Jan 2026

https://github.com/letsdeepchat/techgig-30-day-challenge

Hello there! Here is a 30-day coding challenge using Python for Techgig; I also have a HackerRank 30 Days coding challenge repository, which is really useful for those who have just started coding. I have tried to explain basic coding for new coders.

30-day-leetcoding-challenge 30-days-of-code 30dayscodechallenge basic-programming competitive-programming data-science data-structures day-01-hello-techgig deepak deepak-chaudhari deepak14ri machine-learning oops-in-python python techgig-solutions

Last synced: 21 Aug 2025

https://github.com/abhay557/fakedata

The fakedata package generates realistic synthetic user profiles for machine learning, deep learning, data analysis, and data science workflows.

abhay557 anime data data-analysis data-science deep-learning fake fake-data generator joke machine-learning mock mock-data

Last synced: 30 May 2026

https://github.com/tseemann/kounta

🧮 🔢 Generate multi-sample k-mer count matrix from WGS

data-science genomics-data gwas kmer-counting machine-learning

Last synced: 12 Apr 2025

https://github.com/mohammadreza-mohammadi94/data-analysis-projects-with-pandas

A repository featuring practical data analysis projects using Pandas, demonstrating data manipulation, visualization, and real-world problem-solving techniques. Ideal for learning and applying Pandas for data analysis.

data data-science jupyter-notebook pandas

Last synced: 05 May 2025

https://github.com/zohaibkhandev/weather_app

Experience WeatherWise, your go-to app for accurate forecasts. Get real-time updates on current conditions and detailed forecasts for the week ahead. Plan your day with confidence using hourly forecasts tailored to your location. Stay informed with customizable weather alerts for your favorite locations. With intuitive navigation and beautiful vis

data-science rest-api weather weather-api weather-app world

Last synced: 11 Apr 2025

https://github.com/nceas/scicomptools

Tools Developed by NCEAS Scientific Computing Support Team

data-science r r-package rstats

Last synced: 07 Mar 2026

https://github.com/agnostiqhq/tutorials_covalent_pydata_2023

Covalent tutorial notebooks and slides for PyData 2023, NYC

ai aws covalent data-science gena hpc llm ml pydata pydata-nyc

Last synced: 11 May 2025

https://github.com/ivanrs297/machine-learning-projects

A repository for Machine Learning, Deep Learning and Data Science projects.

data-science deep-learning machine-learning python

Last synced: 20 Jun 2025

https://github.com/maestre3d/alexandria

The Alexandria Project is an open-source platform where people can share their knowledge through books, podcasts, docs and videos.

alexandria data-science donation ebooks go golang grpc http kafka knowledge knowledge-sharing library microservice podcasts python societies streaming videos webservice

Last synced: 18 Mar 2025

https://github.com/gsarti/cancer-detection

Team Capybara final project "Histopathologic Cancer Detection" for the Statistical Machine Learning course @ University of Trieste

cancer cancer-detection capsule-network capsule-networks convolutional-neural-networks data-science dssc healthcare image-segmentation random-forest university-of-trieste university-project unsupervised-clustering

Last synced: 08 May 2025

https://github.com/sjcobb/music360js

Music Visualization YouTube Channel https://www.youtube.com/channel/UCo_IXLTK8dtF2qOUCt4l47Q

3d-game cannonjs data-science data-visualization javascript midi music music-theory music-visualization music-visualizer physics threejs tonaljs tonejs youtube-channel

Last synced: 28 Oct 2025

https://github.com/bradleyboehmke/dw-r

Code and text for the "Data Wrangling with R" book.

book data-science data-wrangling r

Last synced: 13 Apr 2025

https://github.com/katrienantonio/workshop-loss-reserv-fraud

Course material for a workshop on loss modelling, reserving and insurance fraud analytics

actuarial-science data-science insurance-claims

Last synced: 06 May 2025

https://github.com/praneeth-katuri/house-worth

HouseWorth is an open-source project used for predicting house prices using machine learning techniques

data-science exploratory-data-analysis house-price-prediction machine-learning python real-estate regression

Last synced: 23 Jun 2025

https://github.com/dayyass/ml-interviews

My solutions for Home Assignments for Machine Learning Job Interviews.

bert data-science deep-learning elmo interview machine-learning natural-language-processing word-sense-induction

Last synced: 13 Apr 2025

https://github.com/szczyglis-dev/python-lottery-dataset-analyze

[Python] A Jupyter notebook illustrating methods for analyzing a historical lottery results dataset. The example demonstrates assessing linear relationships between variables, incorporating astronomical data, and visualizing number distributions.

analyze-data astronomy csv data-science datasets jupyter linear-regression lottery-draw notebook-jupyter plot predictive-modeling probability-distribution python random relationship skyfield

Last synced: 29 Jun 2025

https://github.com/reycn/data-analytics-in-julia

Notebooks for data analysis in social science using Julia, replicating frequent analytical steps in Python & R.

data data-analysis data-science data-visualization julia

Last synced: 07 May 2025

https://github.com/calvinmccarter/kditransform

Kernel density integral transformation: feature preprocessing and univariate clustering (TMLR, 2023)

data-science discretization kernel-density-estimation preprocessing python quantiles

Last synced: 07 Apr 2026

https://github.com/javorraca/quarto-blog

Personal data science website rebuilt as a Quarto blog

analytics blog data-science machine-learning python quarto rstats

Last synced: 19 Jul 2025

https://github.com/datalorax/sds-r

Repo for a draft book on social data science methods with R

data-science r rstats social-data-science

Last synced: 11 Apr 2025

https://github.com/gtzinos/bigdata-graph-analysis

Probably the first scalable, open-source per-edge triangle count, implemented in Scala and Spark for any big dataset (Louvain).

big-data bigdata community-detection data-mining data-science datamining intellij louvain louvain-algorithm louvain-community-detection louvain-method sbt scala spark triangle-counting

Last synced: 04 Jul 2025

https://github.com/psyplot/psy-maps

The psyplot plugin for visualizations on a map

cartopy cf-conventions data-science icon-esm matplotlib netcdf psyplot ugrid visualization xarray

Last synced: 03 Aug 2025

https://github.com/jalajthanaki/get_jobs_in_data_science

Introduction and Career Guide for Data Science enthusiasts

data-science data-science-learning jobsearch

Last synced: 10 Aug 2025

https://github.com/manikantasanjay/financial_analysis_using_python_and_ml_libraries

This repository was created as part of my work through the Udemy course "Python & Machine Learning for Financial Analysis" by Dr. Ryan Ahmed.

data-science deep-learning financial-analysis portfolio-management predictive-modeling time-series-analysis

Last synced: 28 Jul 2025

https://github.com/devscast/cd-data

Important background data for creating a solution for the DRC

congo congo-kinshasa data data-science json rdata rdc rdc-data

Last synced: 06 Apr 2025

https://github.com/AurelienAubry/Spotlight

Spotlight is a Spotify dashboard that lets users visualize their listening habits.

backend bootstrap chartjs data data-analysis data-science data-visualization flask frontend javascript js pandas python python3 react react-bootstrap spotify

Last synced: 15 Apr 2025

https://github.com/wahyudesu/predicting-hotel-booking-cancellations

This project will help hotel managers optimize their booking policies, reduce cancellations, and improve revenue.

data data-analysis data-science python

Last synced: 07 Jul 2025

https://github.com/jasdumas/depaul

Coursework from DePaul MS in Predictive Analytics

coursework data-science grad-school predictive-analytics

Last synced: 05 Mar 2025

https://github.com/pirate/experiments

✨ Random (sometimes xkcd-inspired) Python, Haskell, and JS experiments involving data science and algorithm fun.

algorithms computer-science data-science data-structures game-theory haskell javascript machine-learning math python random snippets statistics test-bed

Last synced: 24 Mar 2025

https://github.com/jieguangzhou/fifa-world-cup-2022

DolphinScheduler machine learning "FIFA World Cup 2022 Predictions" betting workflow

data-science dolphinscheduler machine-learning world-cup-2022

Last synced: 15 Apr 2025

https://github.com/cbueth/infomeasure

Python package for calculating various information measures, including entropy, mutual information, transfer entropy, and more, with support for both discrete and continuous variables.

complex-networks conditional-probability data-science entropy entropy-measures information-theory machine-learning mathematical-modelling mutual-information numpy physics research statistical-analysis transfer-entropy

Last synced: 25 Aug 2025

https://github.com/derlin/dev.to-is-for-web-devs-and-beginners

Analysis of the tags that work on dev.to. Full article: https://dev.to/derlin/devto-is-for-webdevs-and-beginners-i-have-data-to-prove-it-54c4

data-science devto jupyter-notebook python

Last synced: 15 May 2025

https://github.com/jmaces/statstream

Statistics for Streaming Data

data-science numpy statistics streaming-data

Last synced: 23 Apr 2025

https://github.com/jphall663/jsm_2018_paper

Paper for 2018 Joint Statistical Meetings: https://ww2.amstat.org/meetings/jsm/2018/onlineprogram/AbstractDetails.cfm?abstractid=329539

data-mining data-science explainable-ml fatml iml interpretability interpretable-ai interpretable-machine-learning interpretable-ml machine-learning machine-learning-interpretability python transparency xai

Last synced: 04 Aug 2025

https://github.com/kennethleungty/aws-rds-mysql-python

Integrating Amazon RDS, MySQL Workbench, and PyMySQL to build and deploy a database on the cloud

aws aws-rds aws-rds-mysql data-science database database-management mysql mysql-database pymysql python rdbms sql

Last synced: 12 Jul 2025

https://github.com/markdouthwaite/serverless-scikit-learn-demo

A repository providing demo code for deploying a lightweight Scikit-Learn based ML pipeline modelling heart disease data as a Google Cloud Function.

data-science google-cloud google-cloud-function machine-learning machine-learning-api machine-learning-projects scikit-learn serverless tutorial

Last synced: 25 Jul 2025

https://github.com/chandru-21/mlops_project

An end-to-end MLOps pipeline (CI/CD/CT/CM) project for training, versioning, deploying, and monitoring machine learning models using FastAPI, Kubernetes, MLflow, DVC, Prometheus, and Grafana.

aws cicd data-science docker dvc evidentlyai fastapi github-actions grafana grafana-dashboard kubernetes machine-learning mlops mlops-community mlops-project mlops-template prometheus

Last synced: 24 Sep 2025