An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/hoangsonww/standard-deviation-calculator

📊 This repository contains a Standard Deviation Calculator implemented in C++. It provides an efficient algorithm for calculating the statistical standard deviation of a dataset, making it a valuable tool for students, researchers, and analysts seeking a reliable method for data analysis.

algorithms cplusplus cpp data data-analysis data-analytics data-science standard-deviation standard-deviation-calculator standard-deviations

Last synced: 22 Sep 2025
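
As an illustration of what an efficient standard deviation calculation can look like, here is a minimal Python sketch using Welford's one-pass update. This is an assumption chosen for illustration, not the repository's actual C++ code; the function name `sample_std` is hypothetical.

```python
import math


def sample_std(values):
    """Sample standard deviation (n - 1 denominator) in a single pass.

    Uses Welford's running update, which avoids storing the data twice
    and is numerically steadier than the naive sum-of-squares formula.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count < 2:
        raise ValueError("need at least two values")
    return math.sqrt(m2 / (count - 1))


if __name__ == "__main__":
    print(sample_std([2, 4, 4, 4, 5, 5, 7, 9]))
```

The sketch accepts any iterable, so it also works on data streamed from a file without loading everything into memory.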

https://github.com/shwetajoshi601/world-bank-data-analysis

An Exploratory Data Analysis on the World Bank Dataset.

analysis data-science eda python3 world-bank-api worldbank

Last synced: 02 Aug 2025

https://github.com/qpwedev/blockchain-network-visualizer

Blockchain Network Visualizer for TON.

blockchain data-science network ton toncoin

Last synced: 14 Mar 2025

https://github.com/elliotwutingfeng/twitter200m

Simple analysis of the Twitter 200M Data Dump of January 2023.

200m data-science haveibeenpwned leak osint twitter

Last synced: 16 Mar 2026

https://github.com/devopscorner/nifi

Production Grade Nifi & Nifi Registry. Deploy for VM (Virtual Machine) with Terraform + Ansible, Helm & Helmfile for Kubernetes (EKS)

ansible data-science data-structures docker docker-compose dockerhub ecr eks eks-cluster etl kubernetes machine-learning ml mlops nifi nifi-registry terraform vpn vpn-client

Last synced: 08 Sep 2025

https://github.com/mathewroy/ynabr

Analyze and visualize your You Need A Budget (YNAB) data. YNAB meets R programming language.

api data-analysis data-science data-visualization r ynab ynab-api

Last synced: 30 Jul 2025

https://github.com/oceannetworkscanada/api-python-client

Provides easy access to ONC data in Python

api data-science ocean-sciences onc python

Last synced: 20 Jul 2025

https://github.com/anshchoudhary/xgmodel

This repository contains code to predict the Expected Goals (xG) from shots in football using various machine learning models.

data-science football-analytics football-data machine-learning machine-learning-algorithms

Last synced: 10 Apr 2025

https://github.com/sdpython/mlstatpy

Mathematics, Algorithmic, Data-Science, Teaching Materials

algorithms data-science mathematics python3 teaching-materials

Last synced: 23 Jun 2025

https://github.com/zenml-io/template-starter

A template for a starter project for ZenML

cookiecutter copier-template data-science machine-learning mlops zenml

Last synced: 14 Apr 2025

https://github.com/matteocargnelutti/maguire-lab-seizure-detection-webapp

🧠 Maguire Lab's Deep Learning Seizure Detection WebApp.

data-science eeg-signals-processing neuroscience

Last synced: 21 Apr 2025

https://github.com/takuti/anompy

A Python library for anomaly detection

anomaly-detection data-science forecasting machine-learning python

Last synced: 15 Apr 2025

https://github.com/bsomps/OpenGeoPlotter

A PyQt5 app catered to the exploration industry for visualizing geologic drill hole data with features like cross-sections, simple 3D views, strip logs, scatter plots, and downhole line plots. Includes data transformation techniques like factor analysis, desurveying, and alpha-beta conversion.

cross-sections data-science drilling exploration geology geoscience pyqt5 python strip-logs

Last synced: 05 Mar 2025

https://github.com/clojurecivitas/clojurecivitas.github.io

An open effort to structure learning resources with meaningful connections.

blog clay clojure data-science literate markdown notebooks

Last synced: 24 Jun 2025

https://github.com/eliasdabbas/dash-aggrid-scales

Color scales (continuous and categorical) and bar charts for Dash-Ag-Grid

aggrid color-scales color-scheme data-science data-visualization html plotly-dash table

Last synced: 16 Mar 2026

https://github.com/bdist/bdist-workspace

This repository provides containerized applications and microservices for the Information Systems and Databases Course @ Instituto Superior Técnico

data-engineering data-science docker jupyter jupyterlab notebook postgres postgresql python sql sqlite

Last synced: 09 Apr 2026

https://github.com/correia-jpv/fucking-awesome-datascience

📝 An awesome Data Science repository to learn and apply for real-world problems. With repository stars ⭐ and forks 🍴

analytics awesome awesome-list data-mining data-science data-scientists data-visualization deep-learning hacktoberfest machine-learning science

Last synced: 27 Apr 2025

https://github.com/amirhosseinhonardoust/underwriting-decision-safety-lab

A decision-safety lab for loan approval: trains a baseline classifier, calibrates probabilities (ECE/Brier), sweeps confidence thresholds to build a coverage-quality frontier, and outputs a defensible abstention policy (auto-decide vs. review). Includes a Streamlit dashboard for report cards, triage UI, and data quality checks.

abstention calibration classification credit-risk data-quality data-science decision-policy loan-approval machine-learning mlops model-evaluation monitoring pandas reliability responsible-ai scikit-learn selective-classification streamlit uncertainty underwriting

Last synced: 10 Jun 2026
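
The calibration metrics and threshold sweep named in the description above can be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not the repository's code: the function names are hypothetical, ECE is computed in one common binary-classification form (equal-width bins over the predicted positive-class probability), and "quality" is taken to mean accuracy on the auto-decided cases.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between mean predicted probability and the
    observed positive rate, over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / len(probs) * abs(mean_p - pos_rate)
    return ece


def coverage_accuracy_frontier(probs, labels, thresholds):
    """For each threshold, auto-decide only cases whose confidence
    max(p, 1 - p) reaches it; the rest would go to manual review."""
    rows = []
    for t in thresholds:
        decided = [(p, y) for p, y in zip(probs, labels) if max(p, 1 - p) >= t]
        coverage = len(decided) / len(probs)
        accuracy = None  # undefined when nothing is auto-decided
        if decided:
            correct = sum((p >= 0.5) == bool(y) for p, y in decided)
            accuracy = correct / len(decided)
        rows.append((t, coverage, accuracy))
    return rows


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]  # made-up example scores
    labels = [1, 1, 0, 1, 0, 0]
    print(brier_score(probs, labels))
    print(expected_calibration_error(probs, labels))
    print(coverage_accuracy_frontier(probs, labels, [0.5, 0.75]))
```

Raising the threshold typically lowers coverage and raises accuracy on the decided cases; an abstention policy picks a point on that trade-off.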

https://github.com/sjcobb/webxr-threejs-midi-visualizer

WebXR, augmented reality MIDI data visualization, built with Three.js and Tone.js. See video: https://youtu.be/lIecCGtbqSM

3d aframe cannonjs data-science data-visualization depth-estimation game-development hit-detection javascript midi music-theory physics three threejs tone tonejs webvr webxr

Last synced: 12 Jul 2025

https://github.com/faridrashidi/cnsplots

🎨 Toolkit for generating publication-quality plots for Cell, Nature and Science journals

data-science data-visualization plotting publication-quality python scientific-publications

Last synced: 06 Apr 2026

https://github.com/eyadsibai/machine-learning-docker-image

Data Science/Machine Learning Docker Image for CPU

data-science docker docker-image google-cloud machine-learning

Last synced: 30 Apr 2025

https://github.com/martincastroalvarez/html2vec

Algorithm that converts an HTML document to a vectorized object suitable for neural networks.

data-science html2vec natural-language-processing python web-scraping word2vec

Last synced: 11 Apr 2025

https://github.com/krypty/trefle

Trefle is a scikit-learn compatible estimator implementing the FuzzyCoCo algorithm that uses a cooperative coevolution algorithm to find and build interpretable fuzzy systems.

data-science deap evolutionary-algorithm fuzzy-logic interpretability machine-learning python scikit-learn

Last synced: 29 Oct 2025

https://github.com/zen-reportz/zen_dash

Simple, fast, scalable, production-grade dashboard application. The right solution for teams.

dashboard data-analytics data-science fastapi flask python3 shiny streamlit

Last synced: 13 Apr 2025

https://github.com/jimbrig/lossrx

An R package, plumber API, database, and Shiny App for Actuarial Loss Development and Reserving Workflows.

actuarial-science claims-data claims-reserving data-science insurance modelling property-casualty reserving rpackage rshiny rstats workflow

Last synced: 01 Jul 2025

https://github.com/fabsta/interesting_notebooks

A collection of Data Science Jupyter notebooks (reference material)

data-science eda jupyter-notebook kaggle machine-learning python

Last synced: 03 Jul 2025

https://github.com/chaganti-reddy/evmarket-india

Electric Vehicle Market Segmentation Analysis in India

data-analysis data-science machine-learning market-segmentation pandas python

Last synced: 12 Apr 2025

https://github.com/alan-turing-institute/hds-discussiongroup

Repo of the Turing's Humanities & Data Science Discussion Group

data-science digital-humanities discussion-group

Last synced: 03 Mar 2026

https://github.com/doarakko/kagoole

Search Kaggle competitions and solutions based on data and predict type, evaluation metric, etc.

artificial-intelligence data-science heroku kaggle kaggle-competition kaggle-solution machine-learning webapp

Last synced: 17 Oct 2025

https://github.com/xuri/excelize-py

Excelize is a Python port of the Go Excelize library that allows you to write to and read from XLAM / XLSM / XLSX / XLTM / XLTX files.

calculation chart data-analysis data-science data-visualization ecma-376 excel excelize golang microsoft office ooxml pipy python spreadsheet visualization xlsm xlsx xlsxreader xlsxwriter

Last synced: 07 May 2025

https://github.com/openbridge/ob_pysh-db

pysh-db - The Data Science Toolkit (DSK)

bash data-science mysql postgres python redshift sql

Last synced: 10 Apr 2025

https://github.com/tristanbilot/airflow-rbac-roles-cli

A tool to create Airflow RBAC roles with DAG-level permissions from the CLI.

airflow cloud-composer data-engineering data-science gcp permissions pipeline rbac-roles

Last synced: 25 Oct 2025

https://github.com/canagnos/mcp

Tools for Measuring Classification Performance for R, Python and Spark

artificial-intelligence classification data-mining data-science machine-learning machine-learning-algorithms

Last synced: 28 Apr 2025

https://github.com/emptymalei/audiorepr

A python package to represent data using musical notes.

audiolization data data-audiolization data-science

Last synced: 12 Oct 2025

https://github.com/mindful-ai-assistants/hackapucsp-2024

๐Ÿ† HackaPUCSP 2024 - - Data Science and AI Hackathon - Pontifical Catholic University of Sรฃo Paulo

automation data-science design github-actions hackathon-project oneness-consciousness package-manager programming pucsp pytest python3 unittest

Last synced: 11 Jul 2025

https://github.com/dovolopor-research/data-science-research-toolbox

🧰 Data science research toolbox

data-science data-science-research data-science-resourses research-resources research-tool visualization

Last synced: 05 Jan 2026

https://github.com/lungben/tableio.jl

A glue package for reading and writing tabular data. It aims to provide a uniform api for reading and writing tabular data from and to multiple sources.

arrow csv data data-science database dataframe dataframes excel jdf json-format parquet postgresql sqlite zip

Last synced: 12 Oct 2025

https://github.com/urbanclimatefr/coursera-learn-sql-basics-for-data-science

This repository contains the materials to "Learn SQL Basics for Data Science", a specialization provided by University of California, Davis through Coursera.

coursera data-science sql

Last synced: 19 Feb 2026

https://github.com/rbhatia46/python-for-data-science

This repository contains IPython notebooks covering enough Python to get you started on your Data Science journey.

data-science python-basics python3

Last synced: 03 Sep 2025

https://github.com/networks-learning/discussion-complexity

Code for "On the Complexity of Opinions and Online Discussions", WSDM 2019

complexity data-science discussion online-discussions opinion-mining paper wsdm

Last synced: 10 Aug 2025

https://github.com/gabrieltempass/abtester

A web application to design and evaluate the results of A/B tests.

ab-testing data-science hypothesis-testing python sample-size statistical-significance statistics streamlit web-app

Last synced: 06 Oct 2025

https://github.com/nas5w/imdb-data

A JSON file of 50,000 IMDB movie reviews to be used in machine learning applications.

data data-science imdb javascript machine-learning

Last synced: 19 Apr 2025

https://github.com/lucadibello/it-salary-analysis

💰 Analysis of Salaries in IT Roles: DevOps, Cyber Security, and AI

ai cybersecurity data-science devops jupyter-notebook salary-analysis

Last synced: 03 Jul 2025

https://github.com/chandraprakash-bathula/apparel-recommendations

This project implements a personalized apparel recommendation engine using content-based search with the Amazon API, NLTK, and Keras libraries.

boxplot cnn-keras data-analysis data-science deep-learning linear-regression machine-learning numpy pandas scatter-plot scikit-learn svm tensorflow xgboost

Last synced: 23 Mar 2025

https://github.com/anshumansinha3301/occupational-hazard-analysis

The Occupational Hazard Analysis Using Industry Data project aims to analyze safety metrics across various industries to identify trends in reported incidents, injuries, and fatalities.

consulting-services data-science industrialisation jupyter-notebook python

Last synced: 09 Oct 2025

https://github.com/dina-hosny/chaincare

ChainCare is a health information system that uses smart contracts to handle medical procedures and stores medical histories on blockchains.

api-rest bigchain blockchain blockchain-technology data-science data-storage data-visualization ethereum golang health-informatics-systems healthcare insomnia metamask postgresql postman reactjs solidity truffle web3

Last synced: 13 Apr 2026

https://github.com/koalaverse/analyticssummit19

Material for 2019 Analytics Summit Machine Learning with R Training

data-science educational-materials machine-learning r workshop-materials

Last synced: 15 May 2025

https://github.com/blurred-machine/data-science

This repository contains all the minor projects I built during the learning phase of Machine Learning and Data Science. Feel free to create a PR for modifications.

algorithms-python data-science jupyter-notebook learning-by-doing machine-learning-algorithms minor-project python

Last synced: 27 Apr 2025

https://github.com/aruizeac/alexandria

The Alexandria Project is an open-source platform where people can share their knowledge through books, podcasts, docs and videos.

alexandria data-science donation ebooks go golang grpc http kafka knowledge knowledge-sharing library microservice podcasts python societies streaming videos webservice

Last synced: 11 Mar 2026

https://github.com/lambdaclass/data_etudes

LambdaClass statistics, machine learning and data science etudes

data-science notebook probability statistics

Last synced: 09 Apr 2025

https://github.com/dhhruv/stock-price-prediction

A deep learning project in which a model was trained using LSTM layers, and Tata stock prices were predicted and compared with their actual values.

algorithm cli college-project data data-science dataset deep-learning jupyter jupyter-notebook lstm machine-learning prediction science shell stock-price-prediction tata-beverages terminal

Last synced: 03 May 2025

https://github.com/joaocarabetta/project-templates

Fast Project Templates

data-science python template

Last synced: 19 Sep 2025

https://github.com/fabriziomusacchio/python_neuro_practical

This is the course material for the advanced course on Python for Data Scientists.

data-analysis data-science jupyter jupyter-notebook jupyter-notebooks open-source python teaching teaching-materials

Last synced: 22 Jul 2025

https://github.com/alvarobartt/ea-associate-ds

Electronic Arts (EA) NLP Assignment for: Associate Data Scientist

data-science electronic-arts nlp recruitment-task

Last synced: 12 Apr 2025

https://github.com/klarna-incubator/mleko

Simplify and accelerate your machine learning development with mleko. Designed with modularity and customization in mind, it seamlessly integrates into your existing workflows. Its robust caching system optimizes performance, taking you from data ingestion to finalized models with unparalleled efficiency.

artificial-intelligence data-science machine-learning pipeline python vaex

Last synced: 11 Apr 2025

https://github.com/thecoderpinar/spotify_trends_2023_analysis

Exploring Spotify's latest trends, top songs, genres, and artists using Python, Pandas, NumPy, Matplotlib, CNNs for image-based analysis, and advanced algorithms for music recommendation. Dive into the world of music data and discover what's trending on Spotify! 🎵📊

cnn cnn-keras data-analysis data-science data-visualization machine-learning matplotlib music-trend numpy pandas python spotify

Last synced: 30 Apr 2025

https://github.com/l480/rewe-price-data

๐Ÿช Daily updated prices of all items from the German supermarket chain REWE as CSV (including EAN, grammage, product image etc.)

csv data-science ean inflation prices rewe shrinkflation supermarket

Last synced: 11 Jan 2026

https://github.com/bcgov/canwqdata

R 📦 to download 🇨🇦 open water quality data

data-science env r r-package rlang rstats

Last synced: 20 Jul 2025

https://github.com/luminousmen/python_for_ds

Python for Data Analysis workshop

data-analysis data-science python tutorial

Last synced: 01 May 2025