An open API service indexing awesome lists of open source software.

Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/repetere/modelscript

REPO MOVED TO https://github.com/repetere/jsonstack-data - Data Science and Machine learning in JavaScript

data-mining data-preprocessing data-science javascript machine-learning

Last synced: 01 Oct 2025

https://github.com/ag-ds-bubble/swtloc

Python package for Stroke Width Transform - Localizing the Text (Letters & Words) in a Natural Image

bounding-boxes bubble-bounding-box computer-vision computer-vision-algorithms data-science image-processing ocr preprocessing python stroke-width-transform text-localization

Last synced: 18 Feb 2026

https://github.com/tdeboissiere/cookiecutter-deeplearning

Project folder structure for doing and sharing deep learning work.

data-science project-template

Last synced: 06 Apr 2025

https://islinxu.github.io/awesome-road-map/

🗺️ awesome-road-map 🗺️

artificial-intelligence data-science deep-learning kaggle machine-learning python

Last synced: 17 Apr 2025

https://github.com/jefkine/zeta-learn

zeta-learn: minimalistic Python machine learning library built on top of NumPy and Matplotlib

autoencoder cnn data-science datasets deep-learning gan gru k-means-clustering lstm machine-learning matplotlib neural-networks numpy pca perceptron python regression rnn

Last synced: 14 Jan 2026

https://github.com/runprism/alto

Serverless for data practitioners. The fastest ⚡️ way to run your code in the cloud. Effortlessly run scripts, functions, and Jupyter notebooks in virtual machines.

aws cli cloud data data-analysis data-science deployment ec2 entrypoint function gcp infrastructure jupyter python serverless

Last synced: 31 May 2026

https://github.com/openintrostat/ims-tutorials

Interactive tutorials developed with the learnr package supporting the textbook OpenIntro::Introduction to Modern Statistics.

data-science educational learnr open-educational-resources open-source rstats statistics

Last synced: 26 Jan 2026

https://github.com/m-dadej/marswitching.jl

MarSwitching.jl: Julia package for Markov switching dynamic models 📈

data-science econometrics julia machine-learning markov-chain statistics time-series

Last synced: 10 Apr 2025

https://github.com/datawithbaraa/tableau-ultimate-course

All Tableau workbooks, datasets, and resources for students from Data with Baraa's YouTube Tableau series. The most comprehensive Tableau & SQL guide from a real-world expert! Learn everything from basics to advanced analytics, dashboards, optimizations, and real-world business use cases.

bi-tools business-intelligence data-analysis data-analytics data-science data-visualization tableau tableau-course tableau-dashboard tableau-dashboards tableau-desktop tableau-project tableau-projects tableau-public tableau-repository tableau-server tableau-story tableau-training tableau-tutorial tableau-workbooks

Last synced: 02 Sep 2025

https://github.com/jules32/rmarkdown-website-tutorial

Tutorial for creating websites w/ R Markdown

data-science rmarkdown rstats teaching tutorial

Last synced: 22 Apr 2025

https://github.com/darribas/gds17

Geographic Data Science'17

data-science gis pysal python

Last synced: 17 Jun 2025

https://github.com/ralsei/graphite

A data visualization library for Racket.

data-science ggplot2 racket

Last synced: 18 Feb 2026

https://github.com/apreshill/data-vis-labs-2018

Principles & Practice of Data Visualization, CS631 Spring 2018

data-science data-visualization education rstats teaching

Last synced: 01 Dec 2025

https://github.com/dineshpinto/market-analytics

A random set of notebooks to analyze the crypto markets

analytics blockchain cryptocurrency data-science ethereum quant

Last synced: 12 Apr 2025

https://github.com/jhwohlgemuth/pwsh-prelude

PowerShell “standard” library for supercharging your productivity. Provides a powerful cross-platform scripting environment enabling efficient analysis and sustainable science in myriad contexts.

applied-mathematics cli cli-app data-science hacktoberfest library mathematics powershell powershell-module statistics text-processing text-to-speech user-interface

Last synced: 15 Oct 2025

https://github.com/jmari/ipharo

Pharo Smalltalk kernel for Jupyter

data-science jupyter-notebook pharo pharo-smalltalk smalltalk

Last synced: 23 Oct 2025

https://github.com/jmari/iPharo

Pharo Smalltalk kernel for Jupyter

data-science jupyter-notebook pharo pharo-smalltalk smalltalk

Last synced: 11 May 2025

https://github.com/meiyulee/MathAI

Free, number-driven mathematical-modeling AI | Build mathematical models for the patterns in your numbers | Portable (no-install) software written in C

ai artifical-intelligence bigdata chatgpt data-science dataanalytics datadriven math-ai mathai mathematical-modelling mathematics mathgpt numerical-computation numerical-methods portable regression regression-analysis regression-models science statistics-modeling

Last synced: 21 Apr 2025

https://github.com/dMLTquant/openbb_sdk_exporation

Explore OpenBB SDK without having to install anything on your local machine. You just need a GitHub and a GitPod account.

algorithmic-trading data-science financial-data jupyter notebook openbb python

Last synced: 30 Mar 2025

https://github.com/skagr/footballdata

A collection of wrappers over football data from various websites / APIs.

data-science dataset football pandas python soccer

Last synced: 27 Aug 2025

https://github.com/creditas/galeritas

Galeritas is an open library for data visualization.

data-science data-visualization python

Last synced: 13 Apr 2025

https://github.com/ammsa/dtcleaner

DTCleaner: data cleaning using multi-target decision trees.

data-cleaning data-mining data-preprocessing data-quality data-science data-wrangling

Last synced: 21 Mar 2025

https://github.com/leemengtw/gist-evernote

A Python application that syncs GitHub Gists and saves them to an Evernote notebook as screenshots.

data-science evernote gists github github-graphql jupyter-notebook pet-project python selenium sync

Last synced: 13 Jul 2025

https://github.com/florents-tselai/greek-wines-analysis

Scraper, Data and Analysis for "Analyzing 1000+ Greek Wines with Python"

beautifulsoup data-science pandas python seaborn web-scraping

Last synced: 11 Apr 2025

https://github.com/tejzpr/ordered-concurrently

Ordered-concurrently is a library for concurrent processing with ordered output in Go. It processes work concurrently and returns output on a channel in the order of input. It is useful for concurrently processing items in a queue and getting output in the order provided by the queue.

concurrent concurrent-data-structure data-pipeline data-science golang golang-library ordered parallel parallel-computing

Last synced: 16 Jun 2025

https://github.com/dwhitena/julia-workshop

"Integrating Julia in real-world, distributed pipelines" for JuliaCon 2017

data-science docker julia julia-language kubernetes machine-learning pipelines

Last synced: 13 Sep 2025

https://github.com/rafzamb/sknifedatar

sknifedatar is a package that serves primarily as an extension to the modeltime 📦 ecosystem. It also provides some functionality for spatial data and visualization.

data data-analysis data-science data-visualization forecasting r statistics time-series

Last synced: 07 Mar 2026

https://github.com/scicloj/scicloj-data-science-handbook

Clojure data science handbook - journal style examples of data science

clojure clojurescript data-science notebook scicloj

Last synced: 14 Apr 2025

https://github.com/cloudguruab/modsysML

Model management toolkit for continuous model improvement. Evaluate and compare LLM outputs, test quality, catch regressions and automate.

ai automation-framework data-science machinelearning mlops natural-language-processing nlp-machine-learning open-source prompt-toolkit prompts python security-tools

Last synced: 02 Apr 2025

https://github.com/davidnx/baby-kusto-csharp

A self-contained execution engine for the Kusto Query Language (KQL) written in C#

azure-data-explorer data-science kql kusto

Last synced: 14 Jan 2026

https://github.com/sztal/pybdm

Python implementation of block decomposition method for approximating algorithmic complexity

algorithmic-complexity algorithmic-information-dynamics algorithmic-information-theory complexity data-science kolmogorov-complexity

Last synced: 14 Jan 2026

https://github.com/root-11/tablite

Multiprocessing-enabled, out-of-memory data analysis library for tabular data.

data-analysis data-science datatype disk etl excel filereader pandas pivot-tables python table tabular-data

Last synced: 28 Oct 2025

https://github.com/tstreamdoth/instacart-market-basket-analysis

Use Instacart public dataset to report which products are often shopped together. 🍋🍉🥑🥦

data-analysis data-science instacart market-basket-analysis

Last synced: 21 Mar 2025

https://github.com/paulosalem/gpt3-poc-tutorial-with-braindump

A demo application to support my tutorial on building applications with GPT-3.

data-science gpt gpt-3 natural-language-understanding openai proof-of-concept

Last synced: 12 Jul 2025

https://github.com/aachartmodel/aachartkit-swift-pro

📈📊👑👑👑 AAChartKit-Swift-Pro is a professional version of AAChartKit-Swift, an elegant and friendly chart framework for iOS, iPadOS, and macOS. It is a more powerful data visualization framework that supports more chart types, such as bellcurve, bullet, columnpyramid, cylinder, dependencywheel, heatmap, histogram, networkgraph, organization, packedbubble, pareto, sankey, series, solidgauge, streamgraph, sunburst, tilemap, timeline, treemap, variablepie, variwide, vector, venn, windbarb, wordcloud, xrange charts, and so on.

aacharts chart charting-library data-science data-visualization framework highcharts hybrid ios ipados macos plot swift webview

Last synced: 12 Apr 2025

https://github.com/chandrikadeb7/coursera_ibm_data_science_professional_certificate

This repo consists of the lecture PDFs and quiz solutions of all the courses under the IBM Data Science Professional Certificate specialization course of Coursera.

coursera coursera-assignment coursera-data-science coursera-solutions coursera-specialization data-science ibm ibm-data-science jupyter-notebook lecture-pdfs professional-certificates quiz-solutions solutions specialization

Last synced: 24 Aug 2025

https://github.com/ak-coram/cl-duckdb

Common Lisp CFFI wrapper around the DuckDB C API

c-bindings common-lisp data-science duckdb lisp olap parquet sql

Last synced: 26 Aug 2025

https://github.com/open-source-labs/mlflow-js

A JavaScript client library for MLflow that streamlines machine learning lifecycle management in web environments.

ai data-science javascript machine-learning mlflow mlops node-js typescript

Last synced: 16 Aug 2025

https://github.com/openghg/openghg

A platform for greenhouse gas (GHG) data analysis and collaboration.

analysis cloud collaboration data-science greenhouse-gas

Last synced: 16 Jan 2026

https://github.com/openbiox/py4ds-cn

📚 Chinese-language GitBook of Data Processing with Python, second edition, for personal study

chinese-translation data-science gitbook python3

Last synced: 08 Jan 2026

https://github.com/megagonlabs/ruler

Data Programming by Demonstration (DPBD) for Document Classification

data-labeling data-programming data-science machine-learning training-data weak-supervision

Last synced: 07 Jul 2025

https://github.com/deepsense-ai/ds-template

Template for professional data science and python applications made by deepsense.ai

cookiecutter-template data-science project-template python

Last synced: 17 Jan 2026

https://github.com/scrapinghub/page_clustering

A simple algorithm for clustering web pages, suitable for crawlers

data-science

Last synced: 25 Feb 2025

https://github.com/benjaminmbrown/real-time-data-viz-d3-crossfilter-websocket-tutorial

Tutorial on real-time data visualization. Python websocket server & d3.js + crossfilter.js frontend

crossfilter d3 d3js data-science data-visualization dcjs tutorial websockets

Last synced: 20 Jan 2026

https://github.com/lamres/capm_shiny

Demo project for creating an interactive analytical tool for the stock market using CAPM.

capm data-science r shiny shinyapps stock-market stocks time-series

Last synced: 30 Jul 2025

https://github.com/google-marketing-solutions/ml_toast

Cluster multilingual search terms captured from different time windows into semantically relevant topics.

data-science machine-learning marketing-science nlp tensorflow topic-clustering

Last synced: 31 Jul 2025

https://github.com/plotly/dash-brain-surface-viewer

Dash app for viewing brain surfaces saved as MNI files. Data from https://github.com/aces/brainbrowser

brain-imaging dash data-science mcgill

Last synced: 01 Jul 2026

https://github.com/arvkevi/disarray

Confusion matrix metrics directly from your pandas DataFrame

confusion-matrix data-analysis data-science dataframe machine-learning pandas performance-metrics

Last synced: 07 Apr 2026

https://github.com/sayakpaul/benchmarking-and-mli-experiments-on-the-adult-dataset

Contains benchmarking and interpretability experiments on the Adult dataset using several libraries

data-science fastai h2oai interpretable-machine-learning machine-learning microsoft-interpret tensorflow

Last synced: 06 Mar 2026

https://github.com/ivan-bilan/nlp-and-data-science-spotlights

Regular spotlights of underrated NLP and Data Science GitHub repositories

data-science deep-learning natural-language-processing nlp spotlight

Last synced: 12 Feb 2026

https://github.com/girder/girder_worker

Distributed task execution engine with Girder integration, developed by Kitware

data-analytics data-science kitware

Last synced: 28 Jan 2026

https://github.com/mlsanigeria/ai-hacktober-mlsa

Contributing to cutting-edge open-source projects in Machine Learning hosted by MLSA Nigeria

artificial-intelligence data-science hacktoberfest machine-learning microsoft-azure mlsa open-source python

Last synced: 11 Jun 2025

https://github.com/openpolicedata/openpolicedata

The OpenPoliceData (OPD) Python library is the most comprehensive centralized public access point for incident-level police data in the United States. OPD provides easy access to 550+ incident-level datasets from 236 police agencies and 11 entire states. Types of data include traffic stops, use of force, officer-involved shootings, and complaints.

accountability arcgis-api data-science officer-involved-shootings open-data pandas police-complaints police-data python socrata-api traffic-stops transparency use-of-force

Last synced: 10 Jan 2026

https://github.com/vincentauriau/tennis-prediction

Predicts the winner of a tennis match with machine learning

atp data data-science machine-learning tennis

Last synced: 22 Apr 2025

https://github.com/IMSoley/cs-study-plan

📚👨‍🎓 Resources I'm using every day to develop my skills to become a good self-taught programmer ...

artificial-intelligence computer-science data-science data-structures-and-algorithms higher-education machine-learning web-development

Last synced: 08 Jul 2025

https://github.com/center-for-threat-informed-defense/sightings_ecosystem

Sightings Ecosystem gives cyber defenders visibility into what adversaries actually do in the wild. With your help, we are tracking MITRE ATT&CK® techniques observed to give defenders real data on technique prevalence.

ctid cyber-threat-intelligence cybersecurity data-science data-visualization mitre-attack

Last synced: 12 Apr 2025

https://github.com/rubixml/iris

The original lightweight introduction to machine learning in Rubix ML using the famous Iris dataset and the K Nearest Neighbors classifier.

classification cross-validation data-science example-project introduction-to-machine-learning iris-dataset k-nearest-neighbors knn machine-learning machine-learning-tutorial nearest-neighbors php php-machine-learning php-ml rubix-ml tutorial

Last synced: 26 Jun 2025

https://github.com/martinfleis/sdsc21-workshop

Materials for SDSC 2021 Workshop

data-science python workshop

Last synced: 31 Jul 2025

https://github.com/virgesmith/ukcensusapi

UK Census Data queries and downloads from Python or R

data-science python r

Last synced: 20 Mar 2025