An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/girder/girder

A data management platform for the web, developed by Kitware

data-analytics data-management data-science javascript kitware python resonant

Last synced: 03 Apr 2025

https://github.com/wilsonrljr/sysidentpy

A Python Package For System Identification Using NARMAX Models

data-science dynamical-systems machine-learning narmax narx system-identification time-series

Last synced: 01 May 2025

https://github.com/BlackHC/toma

Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory

data-science gpu machine-learning python pytorch

Last synced: 08 May 2025

https://github.com/blackhc/toma

Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory

data-science gpu machine-learning python pytorch

Last synced: 12 Apr 2025

https://github.com/tlkh/ai-lab

All-in-one AI container for rapid prototyping

cuda data-science deep-learning docker jupyter nvidia pytorch tensorflow

Last synced: 05 Apr 2025

https://github.com/ptyadana/data-science-and-machine-learning-projects-dojo

Collections of data science, machine learning, and data visualization projects using pandas, scikit-learn, Matplotlib, TensorFlow 2, Keras, and various ML algorithms such as random forest classifiers and boosting.

boosting-algorithms data-analysis data-science data-visualization deep-learning keras machine-learning machine-learning-algorithms natural-language-processing pandas probability-statistics scikit-learn seaborn tensorflow

Last synced: 05 Apr 2025

https://github.com/kevintpeng/Learn-Something-Every-Day

📝 A compilation of everything that I learn: computer science, software development, engineering, math, and coding in general. Read the rendered results here ->

algorithm aws blog computer-science course-materials data-engineering data-science education educational engineering learning math mathematics research software-engineering university unix waterloo

Last synced: 20 Mar 2025

https://github.com/jobream/List-of-Learning-Resources

This collection provides a list of educational resources for Software Engineers. Feel free to add your favorite resources as well and help others in their journey of learning.

competitive-programming computer-science data-science resources software-engineering web-development

Last synced: 02 May 2025

https://github.com/DataScienceUB/introduction-datascience-python-book

Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications

analytics data data-science datascience machine-learning python sentiment-analysis

Last synced: 26 Nov 2024

https://github.com/plotly/dash-table

OBSOLETE: now part of https://github.com/plotly/dash

dash data-science data-visualization plotly plotly-dash python react table

Last synced: 04 Apr 2025

https://github.com/firmai/pandasvault

Advanced Pandas Vault — Utilities, Functions and Snippets (by @firmai).

data-science data-structures dataframe functions pandas python snippets table tips

Last synced: 06 May 2025

https://github.com/ashishpatel26/resourcebank_cv_nlp_mlops_2022

This repository offers a goldmine of materials for students of computer vision, natural language processing, and machine learning operations.

computer-vision data-science deep-learning mlops natural-language-processing

Last synced: 05 Apr 2025

https://github.com/epistasislab/scikit-rebate

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

data-science feature-selection python

Last synced: 16 May 2025

https://github.com/5agado/data-science-learning

Repository of code and resources related to different data science and machine learning topics. For learning, practice and teaching purposes.

data-science deep-learning jupyter-notebook learning-by-doing machine-learning statistics

Last synced: 17 Apr 2025

https://github.com/publicdomaincompany/scroll

Scroll is a language for scientists of all ages. Scroll includes a command line app that builds static blogs, websites, CSVs, text files, and more.

blog cms csv data-science knowledge-base knowledge-graph markdown markup markup-language note-taking scroll static-site-generator tree-notation

Last synced: 23 Nov 2024

https://github.com/rebecca-vickery/data-science-learning-resources

A comprehensive list of free resources for learning data science

artificial-intelligence data data-science machine-learning python

Last synced: 26 Apr 2025

https://github.com/Niketkumardheeryan/ML-CaPsule

ML-capsule is a project for beginners and experienced data science enthusiasts who lack a mentor or guidance and want to learn machine learning. Using our repo, they can learn ML, DL, and many related technologies through real-world projects and become interview-ready.

analytics data-analysis data-science data-visualization datascience deep-learning deep-neural-networks deployment flask heroku-deployment machine-learning python r statistics streamlit-webapp

Last synced: 05 May 2025

https://github.com/okfn-brasil/rosie

🤖 Python application responsible for Serenata de Amor's intelligence

artificial-intelligence data-science machine-learning

Last synced: 28 Mar 2025

https://github.com/ClimbsRocks/machineJS

[UNMAINTAINED] Automated machine learning: just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml

auto-ml automated-machine-learning automl data-science data-scientists javascript javascript-library kaggle machine-learning machine-learning-algorithms machine-learning-library ml numerai scikit-learn

Last synced: 27 Nov 2024

https://github.com/sforaidl/genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL

algorithm-implementations benchmarking data-science deep-learning gym hacktoberfest machine-learning neural-network openai python pytorch reinforcement-learning reinforcement-learning-algorithms

Last synced: 04 Apr 2025

https://github.com/tobgu/qframe

Immutable data frame for Go

data-frame data-science dataframe go golang immutable

Last synced: 04 Apr 2025

https://github.com/xoolive/traffic

A toolbox for processing and analysing air traffic data

adsb air-traffic-data data-analytics data-science data-visualisation declarative-pipeline mode-s trajectory

Last synced: 14 May 2025

https://github.com/Chicago/food-inspections-evaluation

This repository contains the code to generate predictions of critical violations at food establishments in Chicago. It also contains the results of an evaluation of the effectiveness of those predictions.

cdph chicago data-science food-poisoning open-data open-science public-health

Last synced: 27 Mar 2025

https://github.com/ShawhinT/YouTube-Blog

Code to complement YouTube videos and blog posts on Medium.

data-science example-code machine-learning medium-articles youtube

Last synced: 25 Nov 2024

https://github.com/kunalj101/Data-Science-Hacks

Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 05 May 2025

https://github.com/SforAiDl/genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL

algorithm-implementations benchmarking data-science deep-learning gym hacktoberfest machine-learning neural-network openai python pytorch reinforcement-learning reinforcement-learning-algorithms

Last synced: 01 May 2025

https://github.com/kunalj101/data-science-hacks

Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 13 Feb 2025

https://github.com/climbsrocks/machinejs

[UNMAINTAINED] Automated machine learning: just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml

auto-ml automated-machine-learning automl data-science data-scientists javascript javascript-library kaggle machine-learning machine-learning-algorithms machine-learning-library ml numerai scikit-learn

Last synced: 05 Apr 2025

https://github.com/aiguofer/gspread-pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

data data-analytics data-engineering data-science dataframes google google-sheets google-spreadsheets gspread pandas python sheets

Last synced: 15 May 2025

https://github.com/basedosdados/sdk

⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/sdk/

bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia

Last synced: 14 May 2025

https://github.com/ledell/user-machine-learning-tutorial

useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html

data-science deep-learning ensemble-learning gradient-boosting-machine machine-learning r random-forest tutorial

Last synced: 05 Apr 2025

https://github.com/ledell/useR-machine-learning-tutorial

useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html

data-science deep-learning ensemble-learning gradient-boosting-machine machine-learning r random-forest tutorial

Last synced: 27 Nov 2024

https://github.com/rio-labs/rio

Web apps in pure Python. No JavaScript, HTML, or CSS needed

data-analysis data-science data-visualization deep-learning machine-learning python ui webapp

Last synced: 09 Apr 2025

https://github.com/kevinliao159/mydatascienceportfolio

Applying Data Science and Machine Learning to Solve Real World Business Problems

api data-science data-visualization machine-learning neural-networks nlp recommendation-system spark

Last synced: 05 Apr 2025

https://github.com/airalcorn2/Michael-s-Data-Science-Curriculum

This is the companion curriculum to my guide to becoming a data scientist.

curriculum data-science machine-learning statistics

Last synced: 22 Nov 2024

https://github.com/olavolav/uniplot

Lightweight plotting to the terminal. 4x resolution via Unicode.

data-analysis data-science plot python

Last synced: 29 Mar 2025

https://github.com/weijie-chen/econometrics-with-python

Tutorials on econometrics featuring Python programming. This is a crash course for reviewing the most important concepts and techniques of basic econometrics; the theory is presented lightly, without the hassle of derivations, and the Python code is straightforward.

data-analysis data-science econometrics economics python statistics time-series

Last synced: 05 Apr 2025

https://github.com/datalayer/jupyter-ui

🪐 ⚛️ React.js components 💯% compatible with 🪐 Jupyter https://jupyter-ui-storybook.datalayer.tech

data data-product data-science data-visualisation datalayer ipywidgets jupyter jupyterlab lumino notebook reactjs ui

Last synced: 15 May 2025

https://github.com/EpistasisLab/scikit-rebate

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

data-science feature-selection python

Last synced: 27 Mar 2025

https://github.com/thoughtworks/mlops-platforms

Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...

azureml data-science databricks dataiku datarobot google-ai-platform h2oai iguazio knime kubeflow machine-learning mlflow mlops pachyderm sagemaker seldon

Last synced: 07 May 2025

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It also lets you run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 03 Apr 2025

https://github.com/solegalli/feature-engineering-for-machine-learning

Code repository for the online course Feature Engineering for Machine Learning

data-science feature-engineering feature-extraction machine-learning python

Last synced: 04 Apr 2025

https://github.com/dagshub/fds

Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc

data-science dvc git

Last synced: 12 Apr 2025

https://github.com/operatorai/modelstore

🏬 modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud storage provider.

data-science keras machine-learning mlops modelstore python-library pytorch s3-storage scikit-learn tensorflow transformer

Last synced: 14 May 2025

https://github.com/DagsHub/fds

Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc

data-science dvc git

Last synced: 08 May 2025

https://github.com/plotly/dashR

Create data science and AI web apps in R

dash data-science data-visualization plotly plotly-dash python r react web-application

Last synced: 15 Mar 2025

https://github.com/matrix-profile-foundation/matrixprofile

A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.

algorithms anomaly-detection clustering data-mining data-science hacktoberfest matrixprofile motif-discovery python python2 python3 segmentation time-series time-series-analysis

Last synced: 16 May 2025

https://github.com/finos/jupyterlab_templates

Support for jupyter notebook templates in jupyterlab

data-science dataviz jupyter jupyterlab jupyterlab-extension machine-learning notebook

Last synced: 12 Apr 2025

https://github.com/jkrumbiegel/chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 12 Apr 2025

https://github.com/jkrumbiegel/Chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 14 May 2025

https://github.com/adicherlavenkatasai/ml-workspace

Machine Learning (Beginners Hub): information (courses, books, cheat sheets, live sessions) related to machine learning, data science, and Python.

cheat-sheets convolutional-networks data-science deep-learning deep-neural-networks gans harvard-edx interview-questions machine-learning python

Last synced: 11 Apr 2025

https://github.com/anothersamwilson/miceforest

Multiple Imputation with LightGBM in Python

data-science imputed-values mice-algorithm python random-forest

Last synced: 08 Apr 2025

https://github.com/aunum/goro

A High-level Machine Learning Library for Go

data-science go golang machine-learning machinelearning

Last synced: 20 Mar 2025

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 13 Apr 2025

https://github.com/aaronpenne/data_visualization

A collection of my data visualizations, mostly in Python.

data-science data-visualization python3 visualization

Last synced: 14 Mar 2025

https://github.com/MaxHalford/xam

🎯 Personal data science and machine learning toolbox

data-science machine-learning preprocessing python stacking

Last synced: 08 May 2025

https://github.com/maxhalford/xam

🎯 Personal data science and machine learning toolbox

data-science machine-learning preprocessing python stacking

Last synced: 07 Apr 2025

https://github.com/paddymul/buckaroo

Buckaroo - the data wrangling assistant for pandas. Quickly explore dataframes, and run pandas commands via a GUI. Works inside the jupyter notebook.

buckaroo data-science jupyter paddy pandas

Last synced: 15 May 2025

https://github.com/souzatharsis/open-quant-live-book

An open source, hands-on and fully reproducible book in quantitative finance, data science and econophysics. Join us and help Make Wall Street Great Again!

algo-trading altdata data-science econophysics financial-analysis financial-markets machine-learning open-source quantitative-finance

Last synced: 23 Feb 2025

https://github.com/joaquinamatrodrigo/estadistica-con-r

Personal notes on statistics, machine learning, and the R programming language

bioestadistica data-mining data-science estadistica machine-learning mineria-de-datos r

Last synced: 05 Apr 2025

https://github.com/tellery/tellery

Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.

analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql

Last synced: 16 May 2025

https://github.com/AnotherSamWilson/miceforest

Multiple Imputation with LightGBM in Python

data-science imputed-values mice-algorithm python random-forest

Last synced: 22 Nov 2024

https://github.com/meteostat/meteostat-python

Access and analyze historical weather and climate data with Python.

climate climate-change climate-data data-science meteostat open-data statistics weather weather-data weather-station

Last synced: 27 Nov 2024

https://github.com/InseeFrLab/onyxia

🔬 Data science environment for k8s

bluehats data-science datalab helm insee kubernetes onyxia

Last synced: 27 Dec 2024