An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/hassanalgoz/python

The Pythonic Book: a practical introduction to learning programming in Python. A book aimed at programming beginners from technical or non-technical backgrounds. It presents the fundamental concepts in clear language and in a logical sequence, with realistic problems and useful applications, avoiding randomness and superficiality. Suitable for self-study as well as for teaching.

ai arabic curriculum data-science learn-to-code learning-by-doing programming project-based-learning python

Last synced: 03 May 2025

https://github.com/badr-moufad/cookiecutter-simple-ds-project

A simple cookiecutter template to structure your Data Science projects.

cookiecutter data-science project-structure python simple-ds-project

Last synced: 23 Apr 2025

https://github.com/kleinhenz/wiki-network-extractor

Python module for extracting link networks from Wikimedia XML dumps

data-science network-graph python

Last synced: 07 May 2025

https://github.com/prem07a/credit-score-classification

An ML project for credit score classification

data-science fastapi feature-extraction machine-learning python3 sklearn-classify website

Last synced: 13 Apr 2025

https://github.com/giswqs/learning-scipy

Learning SciPy for Numerical and Scientific Computing

data-science jupyter-notebook python scipy

Last synced: 12 May 2025

https://github.com/synthesized-io/synthesized-notebooks

Discover the art of enhancing your data using generative modelling in these notebooks.

data-privacy data-science generative-modelling ml notebooks synthetic-data

Last synced: 14 Jul 2025

https://github.com/edaaydinea/python-ml-dl-ds-projects

This repository includes artificial intelligence, machine learning, data science, and computer vision projects written in Python.

computer-vision data-science deep-learning machine-learning projects python

Last synced: 02 Jul 2025

https://github.com/tushar2704/ml-portfolio

This repository showcases a collection of machine learning projects in various domains, demonstrating my skills and expertise as a data scientist and machine learning engineer. Each project provides step-by-step instructions, code, and visualizations to showcase the data analysis and modeling techniques employed.

artificial-intelligence data-science machine-learning portfolio python streamlit-tushar2704 tushar2704

Last synced: 07 May 2025

https://github.com/sachinl0har/data-analytics

Data Analytics in Python. Numpy, Pandas, Matplotlib, Seaborn. Still Learning...

data-analytics data-science data-visualization matplotlib numpy pandas python seaborn

Last synced: 07 Jul 2025

https://github.com/spidy20/kaggle_kernels

Contains data science, machine learning, and data visualization code and datasets

clustering data-science data-visualization eda kaggle-competition kaggle-dataset kaggle-scripts kmeans-clustering

Last synced: 12 Apr 2025

https://github.com/ivanrs297/endoscopycorruptions

The endoscopycorruptions Python package provides utilities to simulate common image corruptions that might occur during endoscopic procedures. This tool is designed to assist in the development and testing of image processing algorithms intended for endoscopic imagery by introducing realistic corruptions into clean images.

computer-vision data-science machine-learning medical-imaging python

Last synced: 25 Apr 2026

https://github.com/rolv-io/rolvapp

Rolv is your AI-powered research assistant for life sciences!

ai biology data-analysis data-science genomics life-sciences medicine

Last synced: 02 Mar 2026

https://github.com/pathwiselabs/pixel-pipeline

A Python application with Gradio UI for batch processing and captioning of images, allowing for easy integration with AI image training workflows.

data-cleaning data-science flux generative-ai stable-diffusion stable-diffusion-webui

Last synced: 04 Mar 2026

https://github.com/blacksuan19/redash-python

A more complete Redash API Python client

dashboards data-science data-visualization python

Last synced: 24 Apr 2025

https://github.com/moindalvs/forecasting_airline_passengers_traffic

Forecast airline passenger traffic. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model. Finally, state which model you will use for forecasting.

additive arima-forecasting data-science double-exponential-smoothing forecasting holt-winters holt-winters-forecasting multiplicative sarima-model seasonality-analysis simple-exponential-smoothing stationarity stationarity-test time-series-forecasting timeseries-analysis trend-analysis triple-exponential-smoothing

Last synced: 23 Apr 2025

https://github.com/fearlesssolutions/engineering-practice-domains

A mono-repo for the Engineering Practice Domains of Development, Data, Infrastructure, Testing, and Platforms

data data-engineering data-science database-design devops drupal end-to-end-testing engineering infrastructure machine-learning salesforce security testing web-development

Last synced: 26 Oct 2025

https://github.com/zackakil/friendlier-data-labelling

Code resources for generating a Google Form for labelling data.

data-science google google-apps-script google-forms google-sheets machine-learning

Last synced: 04 Oct 2025

https://github.com/hissain/jscipy

Java Scientific Computing Library for Signal Processing, Filters, and Transformations. A NumPy/SciPy port for JVM & Android, used in Machine Learning and Data Science.

android chebyshev-filter cubic-splines data-science dsp fft findpeaks hilbert-transform interpolation java machine-learning numerical-computing python resample savitzky-golay scientific-computing scipy scipy-signal signal-processing

Last synced: 23 Jan 2026

https://github.com/abhaysingh71/ai-powered-healthcare-intelligence-network

The AI-Powered Healthcare Intelligence Network is an AI-driven system offering disease prediction, drug recommendations, heart disease risk assessment, and an AI medical chatbot. Using ML, NLP, and LLMs, it provides accurate diagnoses, insights, and recommendations, enhancing healthcare accessibility, efficiency, and decision-making.

airtificialintelligence chatbot data-analysis data-science datawrangling disease-prediction healthcare-ai heart-disease huggingface langchain large-language-models lightgbm machine-learning mistral-7b recommendation-system retrieval-augmented-generation sentence-transformers shap vector-database

Last synced: 01 Feb 2026

https://github.com/kalyan4636/python-eering

Python project with source code. The best Python project name is one that is descriptive, memorable, and fun for you to say. Don't be afraid to get creative and use emojis to make your project stand out! 📈

artificial-intelligence artificial-intelligence-algorithms data-science deep-learning django framework machine-learning machine-learning-algorithms numpy opencv opencv-python opensource pandas pil-tinker pillow python python-3 python-library python3

Last synced: 23 Apr 2025

https://github.com/epiverse-trace/epi-training-kit

An e-learning strategy for training on analysis, modelling and response to outbreaks and epidemics in Latin America and the Caribbean

data-science e-learning epidemics training

Last synced: 07 Jul 2025

https://github.com/sirius248/introduction-to-data-science-in-python

Introduction to Data Science in Python (Coursera)

data-science python

Last synced: 11 Nov 2025

https://github.com/adrtod/rchallenge

A simple data science challenge system using R Markdown and Dropbox.

challenge data-science r

Last synced: 21 Feb 2026

https://github.com/sondosaabed/introduction-to-sql

A Udacity course covering SQL for data scientists. These are my solutions for the lessons and the project.

aggregations data-science dvd-rental-database joins nanodegree sql subqueries udacity-nanodegree

Last synced: 21 Jan 2026

https://github.com/recodehive/recode-website

recodehive helps you learn and master data skills, and encourages you to contribute code to open source.

data data-science dataengineering opensource python sql tutorials website

Last synced: 15 Mar 2026

https://github.com/phazerooman/dcai-ocr-krooki

OCR model trained to extract coordinates from Omani title deeds ("krooki") using a Data Centric AI (DCAI) approach.

ai data-science ocr python

Last synced: 12 Apr 2025

https://github.com/the-data-dilemma/parquettohuggingface

ParquetToHuggingFace processes raw audio data, converts it into Parquet files, and uploads them to Hugging Face. The README explains how to set up the environment, configure paths, and run the scripts to generate and upload the data.

audio-dataset audio-processing automatic-speech-recognition data-analysis data-science dataset healthcare-application huggingface huggingface-datasets pandas parquet parquet-generator python3 speech-data speech-recognition speech-to-text speech-translation

Last synced: 21 Aug 2025

https://github.com/habedi/myr-languagecodes

My efforts to improve my knowledge of the R language

data-mining data-science data-visualization dataset graph r

Last synced: 27 Apr 2025

https://cufctl.github.io/mlbd/

Repository for the machine learning / big data creative inquiry

data-science high-performance-computing machine-learning python tensorflow

Last synced: 16 Mar 2025

https://github.com/ksdkamesh99/medium-blogs

A collection of all my Medium and Analytics Vidhya articles on different technologies in computer science, such as AI, ML, deep learning, and many more

analytics-vidya-articles computer-science data-science deep-learning machine-learning medium medium-blogs python

Last synced: 12 May 2025

https://github.com/urbanclimatefr/coursera-applied-data-science-with-python

This repository contains the materials for "Applied Data Science with Python", a specialization provided by the University of Michigan through Coursera.

coursera data-science machine-learning python3

Last synced: 22 Apr 2025

https://github.com/bitliner/d3-bipartite-graph

Hello world for bipartite graph in D3.js

charts data-science data-visualization graph

Last synced: 11 Jun 2025

https://github.com/gesiscss/ptm

Introduction to Natural Language Processing with a special emphasis on the analysis of Job Advertisements

binder data-science information-retrieval labour-market nlp r text-mining topic-modeling

Last synced: 07 May 2025

https://github.com/noorkhokhar99/plagiarsim-checker

Plagiarism checker using cosine similarity #Plagiarsimchecker

ai api checker data-science database nlp nlptk plagiarsim python

Last synced: 16 Oct 2025

https://github.com/dse-capstone-sharknado/advancedbpr

Amazon recommendation system built on a BPR TensorFlow implementation

data-prep data-science exploratory-analysis ipynb machine-learning recommender-system

Last synced: 15 Oct 2025

https://github.com/raphaelsenn/playervectors

Implementation of the paper "Player Vectors: Characterizing Soccer Players Playing Style from Match Event Streams".

data-science

Last synced: 04 Mar 2026

https://github.com/akashkobal/data-science

I'm excited to share my data science project 🚀, where I've applied various techniques and insights to solve a specific problem. The project follows best practices for maintainability and reproducibility, using the Data Science Project Template. Dive into the project to explore the code, datasets, documentation, and resources that showcase MyJourney.

akash akash-kobal akashkobal applied-data-science artificial-intelligence classification data-science dataanalysis dataanalytics datascienceproject datascientist deep-learning kobal machine-learning prediction regression

Last synced: 17 Mar 2026

https://github.com/x-tabdeveloping/rvfln

A Python implementation of random vector functional link networks and broad learning systems using scikit-learn's regressor and classifier APIs

broad-learning data-science deep-learning machine-learning scikit-learn sklearn sklearn-compatible

Last synced: 22 Mar 2025

https://github.com/zehracakir/verimadenciliginotlarim

My notes and my own studies in the Data Mining course in the computer engineering department of Süleyman Demirel University

classifying clustering data data-mining data-science linear-regression machine-learning pandas python

Last synced: 18 Jun 2025

https://github.com/ashwinpn/advanced-python

Python for Machine Learning/AI/DS, Game Theory and Convex Optimization using Python, Managing Docker in Python, Web Scraping / Development in Python using Django and Flask, Functional Programming in Python.

convex-optimization data-science docker flask functional-programming game-theory machine-learning machine-learning-algorithms python web-development web-scraping

Last synced: 13 Apr 2025

https://github.com/philipperemy/github-full-data-set

Generating GitHub data (~1M repositories May 2017).

data-science dataset github github-api kaggle machine-learning

Last synced: 07 May 2025

https://github.com/wlandau/targets-debug

Slides and example code for debugging {targets} pipelines

data-science make pipeline r reproducibility rstats targets targets-pipeline

Last synced: 20 Mar 2025

https://github.com/cjdoris/chevrons.jl

Your friendly >> chevron >> based syntax for piping data through multiple transformations.

data data-science data-transformation julia julia-lang julia-language macros piping repl

Last synced: 07 Mar 2026

https://github.com/smac-group/ds

:notebook: This book is currently under development and has been designed as a support for students who are following (or are interested in) courses that provide the basic knowledge to master "statistical programming" with R. Compiled textbook:

data-science github programming r rstudio statistics

Last synced: 22 Jul 2025

https://github.com/jobar8/subsurface_hackathon_2017

Three notebooks to jump start a data science project

data-science geophysics groundwater ipywidgets

Last synced: 28 Jan 2026

https://github.com/fxstein/code-server-python

VSCode Code Server for Python Developers and Data Scientists

code-server data-science developer docker home-automation iot python synology vscode

Last synced: 25 Jul 2025

https://github.com/tindzk/nix-ds

Nix for Data Science

data-science nix python

Last synced: 26 Jul 2025

https://github.com/fredhutch/tfcb_2022

Course website for MCB 536 Tools for Computational Biology

data-science

Last synced: 16 Aug 2025

https://github.com/ahammadnafiz/predicta

Predicta: Simplify your workflow with our powerful data analysis and machine learning tool.

analytics data-science data-visualization dataanalysis machine-learning pandas project python streamlit streamlit-webapp webapp

Last synced: 28 Jul 2025

https://github.com/tushar2704/common_datasets

Common-datasets is a GitHub repository dedicated to providing a wide collection of common datasets for practicing and learning data science and machine learning.

aritificial-intelligence data-analytics data-engineering data-science data-visualization database dataset-generation datasets machine-learning

Last synced: 09 Aug 2025

https://github.com/thecoderpinar/diabetes_health_prediction_and_analysis

A comprehensive project to predict and analyze diabetes health data using advanced machine learning models, including Logistic Regression, Random Forest, and XGBoost. 📊

analytics artificial-intelligence classification data-science data-visualization deep-learning diabetes-prediction health healthcare logistic-regression machine-learning medical-analysis mlops prediction python random-forest xgboost

Last synced: 12 Aug 2025

https://github.com/chrislemke/autoembedder

PyTorch autoencoder with additional embeddings layer for categorical data 🚘

anomaly-detection autoencoder data-science embedding machine-learning neural-network python pytorch pytorch-ignite

Last synced: 15 Apr 2025

https://github.com/akkefa/ml-notes

Notes for Mathematics for Machine learning and Data Science.

book computer-science data-science linear-algebra mathematics notes probability statistics topics

Last synced: 04 Feb 2026

https://github.com/csfelix/datascience-exercises

๐Ÿ Just some DataScience exercises, nothing more... ๐Ÿ (๐Ÿ”‘ KeyWords: python, data science, data analysis, pandas ๐Ÿ”‘)

data-analysis data-science datascience pandas python python3

Last synced: 05 Jul 2025

https://github.com/mituskillologies/ds-diploma-internship-jun24

Programs conducted at MITU Skillologies, Pune office in internship training on Data Science during June-July 2024 for Diploma Engineering Students.

data-analytics data-science data-visualization machine-learning project python python3

Last synced: 09 Apr 2025

https://github.com/blurred-machine/computer-vision-image-classification

In this repository I have implemented computer vision on MNIST datasets for image classification of digits 0-9, fashion clothing, and sign language hand signals. The models are implemented using TensorFlow. Feel free to send a PR for any optimization or modification.

computer-vision data-science deeplearning images-classification machine-learning mnist-dataset python

Last synced: 10 Sep 2025

https://github.com/mrgeislinger/udacitydand_proj_wrangleandanalyzedata

Wrangling and analyzing data project for Udacity's Data Analyst Nanodegree. Wrangles WeRateDogs™ (@dog_rates) Twitter data from local, online, and Twitter API sources.

data-analysis data-analyst data-science datascience jupyter-notebook python3 twitter udacity-data-analyst-nanodegree udacity-nanodegree

Last synced: 09 Oct 2025

https://github.com/just-krivi/real-estate-market-analysis

Streamlit web app using custom ML models (multiple linear regression and one-to-many multiclass kernel SVM) for predicting real estate prices; Scraping and analyzing real estate listings in Serbia

data-science docker gradient-descent machine-learning multiclass-support-vector-machine multiple-linear-regression postgresql python scrapy stramlit svm webscraping

Last synced: 04 Oct 2025

https://github.com/ruivieira/nim-mentat

A Nim library for data science and machine learning

data-science library machine-learning nim scientific-computing

Last synced: 10 Aug 2025

https://github.com/yoshoku/numo-openblas

Numo::OpenBLAS builds and uses OpenBLAS as a background library for Numo::Linalg

data-science machine-learning numo openblas ruby

Last synced: 25 Apr 2025

https://github.com/mdh266/crimetime

Python web application for exploring and forecasting crime rates in NYC

data-science docker flask-application forecasting-crime-rates geospatial-analysis pandas python statsmodels time-series-analysis

Last synced: 30 Jul 2025