An open API service indexing awesome lists of open source software.

Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/dhhruv/kisaani

"Kisaani" is an application that takes the required parameters, either intelligently or from the location's database (in the cloud), and provides a list of the crops best suited to that land. The application should also be able to collect the outcome after cultivation and apply corrections as appropriate for further advisories. The details of the crops for the region and conditions are provided. The application should be interactive and user-friendly for farmers (with local language support) and should provide support in real time.

crop crop-recommendation data-science ieee ieee-hackathon machine-learning

Last synced: 07 Mar 2026

https://github.com/virajbhutada/spotify-track-analysis-and-recommendation

Experience a comprehensive exploration of Spotify's musical landscape seamlessly transitioned from Tableau visualizations to SQL analysis. Dive into track inventory, streaming metrics, and sonic trends via interactive dashboards, while leveraging SQL queries for deeper insights into KPIs and cross-platform rankings.

audio-analysis data-analysis data-analytics data-science data-visualization eda machine-learning-library ml-models mysql recommendation-system spotify spotify-data spotify-dataset sql-database sql-server streaming-metrics tableau tableau-public trends-analysis

Last synced: 28 Apr 2025

https://github.com/valohai/minihai

An open-source application for running notebooks server-side

data-science jupyter jupyter-notebook machine-learning notebook

Last synced: 02 Aug 2025

https://github.com/invia-flights/blitzly

Lightning-fast way to get plots with Plotly ⚡

data-analysis data-science plotly plotting-in-python python visualization

Last synced: 14 Jan 2026

https://github.com/ahammadmejbah/glossary-of-artificial-intelligence

A "Glossary of Artificial Intelligence" is a concise reference resource defining key terms, concepts, and terminology related to AI. It provides explanations and definitions to help individuals understand and navigate the field of artificial intelligence, making it a valuable tool for both beginners and experts in the AI domain.

artificial-intelligence data data-science deep-learning deep-learning-algorithms detection image-processing machine-learning python

Last synced: 25 Jun 2025

https://github.com/teddyoweh/disease-data-webscrape-analysis

Scraped a table of disease and symptom data from a website, turned it into a dataframe, then exported it to a CSV file

data-analytics data-science data-visualization webscraping

Last synced: 09 Apr 2025

https://github.com/lintangwisesa/javascript-on-jupyterlab

JavaScript (Node.js) kernel inside Jupyter Lab (Jupyter Notebook)

data-science javascript jupyter-lab jupyter-notebook

Last synced: 26 Apr 2025

https://github.com/just-krivi/real-estate-market-analysis

Streamlit web app using custom ML models (multiple linear regression and one-to-many multiclass kernel SVM) for predicting real estate prices; scraping and analyzing real estate listings in Serbia

data-science docker gradient-descent machine-learning multiclass-support-vector-machine multiple-linear-regression postgresql python scrapy stramlit svm webscraping

Last synced: 04 Oct 2025

https://github.com/yoshoku/numo-openblas

Numo::OpenBLAS builds and uses OpenBLAS as a background library for Numo::Linalg

data-science machine-learning numo openblas ruby

Last synced: 25 Apr 2025

https://github.com/montanaz0r/mit-6.0002-course

My solutions to the assignments of MIT 6.0002 course.

data-science mit python python3

Last synced: 12 Aug 2025

https://github.com/inseefrlab/grandedim

Code accompanying the working paper "L'économétrie en grande dimension"

data-science econometrics high-dimensional-data publication r statistics

Last synced: 13 Jun 2025

https://github.com/thetallprogrammer/stock-contender-app

Welcome to Stock Contender, an AI-powered tool designed to assist your market analysis. This tool is not an investment advisor and does not guarantee profits. Invest at your own risk. Stay updated with my latest developments.

artificial-intelligence chat-gpt data-science financial-data-analysis financial-technology fintech investment-analysis machine-learning openai openai-api python stock-market stock-prediction stock-trading

Last synced: 05 Sep 2025

https://github.com/eftekin/data-science-adventures

📊 This repository contains my data science journey, showcasing projects, code snippets, and resources as I learn and explore the world of data science using Python.

data-science education python

Last synced: 28 Feb 2025

https://github.com/zachbateman/evogression

Python Machine Learning using an evolutionary regression algorithm. More intuitive with higher transparency than a neural network while providing much greater power and high-dimensionality capabilities than more simplistic regression techniques.

artificial-intelligence data-science machine-learning neural-network python regression

Last synced: 12 Jun 2025

https://github.com/coatless-textbooks/statistical-concepts-with-shiny-apps

Quarto book illustrating various statistical concepts using Shinylive.

data-science quarto quarto-book r-shiny r-shinylive statistics webr

Last synced: 12 Jun 2025

https://github.com/sanvishal/Exoplanet-Explore

An interactive data visualization of exoplanets

animation d3js data-analysis data-science exoplanet python space visualization

Last synced: 14 Apr 2025

https://github.com/alexeatscake/gigaanalysis

A toolbox for processing data that can be expressed as a dependent and independent variable.

condensed-matter-physics data-science matplotlib numpy physics scipy

Last synced: 03 Jul 2025

https://github.com/giswqs/timelapse

An interactive Streamlit web app for creating satellite timelapses

data-science dataviz earthengine geopython python satellite streamlit

Last synced: 12 May 2025

https://github.com/ryanrudes/wikimedia

A dataset comprising over 40 million images sourced from Wikimedia Commons

computer-vision data-science data-scraping dataset datasets deep-learning gans image images machine-learning wikimedia wikimedia-commons

Last synced: 13 Sep 2025

https://github.com/waylonwalker/kedro-auto-catalog

Kedro catalog create with default configuration

data data-science kedro kedro-catalog kedro-hook kedro-plugin

Last synced: 12 Jun 2025

https://github.com/m-taghizadeh/python-webinar

Our goal in this webinar is to provide quick, practical training as your first step toward becoming a professional Python programmer. After watching it, you should have a solid grounding in programming with Python and in using Python for artificial intelligence, machine learning, deep learning, data mining, and backend programming with Flask and Django.

artificial-intelligence convolutional-neural-networks data-science deep-learning django flask machine-learning python

Last synced: 08 Sep 2025

https://github.com/public-health-scotland/technical-docs

Technical documentation, including guidance and best practice for Public Health Scotland (PHS)

data-science documentation git github python r

Last synced: 14 Apr 2025

https://github.com/mine-cetinkaya-rundel/feedback-at-scale

Slides and sample learnr tutorial for rstudio::global(2021) talk

data-science gradethis learnr rstats tutorial

Last synced: 11 Feb 2026

https://github.com/ramonhpr/knot-lib-python

API to get data from the cloud and run some data analytics

data-science iot iot-framework web

Last synced: 26 Jun 2026

https://github.com/dataship/python-dataship

Lightweight Python tools for reading, writing, and storing data, locally and over the internet

column-store data-science machine-learning numpy pandas

Last synced: 23 Apr 2025

https://github.com/zmoooooritz/stapy

An easy to use SensorThings API Client written in Python

api cli data-science database ogc python sensor sensor-data sensorthings sensorthings-api

Last synced: 17 Jan 2026

https://github.com/nicodupont/mooc

All my finished MOOCs, mainly on the subject of data science

data-analysis data-science data-visualization datacamp jupyter-notebook machine-learning mooc pandas python sas sql

Last synced: 28 Apr 2025

https://github.com/faical-allou/clustering_od

K-means clustering algorithm in pure Python 3.5 (solved with Lloyd's algorithm)

cluster clustering-algorithm data-science k-means k-means-clustering kmeans-clustering python

Last synced: 26 Jul 2025

https://github.com/jeafreezy/rsgis

A Python package for basic to advanced GIS operations.

analysis data-science gis python

Last synced: 12 Apr 2025

https://github.com/nsembleai/nsvision

nsvision is an image data pre- and post-processing and data augmentation library. It provides utilities for working with image data.

data-science docker image image-classification image-manipulation image-processing jupyter library normalization numpy object-detection opencv opencv-python pillow python python-3 python-library python3 reduce-image-dimensions split-data

Last synced: 22 Feb 2026

https://github.com/surajv311/udemy_course_resources

List of course resources from my Udemy course "Numpy for Data Science" (2020)

arrays data-science numpy numpy-tutorial python3 udemy udemy-course

Last synced: 16 May 2025

https://github.com/ul-mds/gecko

Python library for the generation and mutation of realistic personal identification data at scale

data-science numpy pandas python record-linkage

Last synced: 24 Apr 2025

https://github.com/memgonzales/pisa-2018-analysis

Jupyter notebook presenting the process of data preparation, research question formulation, data analysis, and data modeling with the goal of extracting insights from the 2018 PISA Dataset

data-cleaning data-modeling data-science data-visualization exploratory-data-analysis jupyter-notebook matplotlib numpy oecd-data pandas pisa scipy statistical-inference

Last synced: 13 Jun 2025

https://github.com/rubentea16/cheat-sheet-road-map

Data Science Cheatsheet and NLP Road Map

data-science machine-learning python

Last synced: 04 Feb 2026

https://github.com/lfrench03/ganaderia-en-cuba

Based on the data provided by the National Office of Statistics and Information ONEI and other alternative trusted sources mentioned in the references, our main objective is to present a detailed vision of how livestock farming has evolved in Cuba during the period until 2022.

cuba data-science dataproduct ganaderia streamlit streamlit-application timeline

Last synced: 26 Jul 2025

https://github.com/stink-po/boxoffice_api

Unofficial Python API for Box Office Mojo

data-science dataset movies-and-cinemas scraper

Last synced: 07 Sep 2025

https://github.com/tchlux/util

My machine learning, optimization, and data science utilities package.

data-science machine-learning numerical-optimization python-utilities splines statistics visualization

Last synced: 02 May 2026

https://github.com/divyanshugit/66daysofdata

This repo contains the source code for a static webpage where you can find out answers to Machine Learning Interview questions.

data-science interview-questions machine-learning

Last synced: 31 Jan 2026

https://github.com/canbula/datascience

Repository for Data Science course given by Assoc. Prof. Dr. Bora Canbula at Computer Engineering Department of Manisa Celal Bayar University.

data-science machine-learning matplotlib numpy pandas python python3 scikit-learn seaborn

Last synced: 04 Apr 2026

https://github.com/nemeslaszlo/social-media-analysis-based-on-covid-19-with-sentiment-analysis-ner-and-information-extraction

This repository contains the social media data scraper and the notebooks for this analysis. We analyse social media posts (tweets) with Sentiment Analysis, then analyse the results with Named Entity Recognition (NER) and Information Extraction methods to get a more accurate and detailed picture of the sentiment results.

bert data-science data-visualization information-extraction keras named-entity-recognition nltk reccurent-neural-network tensorflow textblob

Last synced: 25 Jul 2025

https://github.com/alexcj10/diwali-sales-analysis

This repository contains an analysis of Diwali sales data to uncover trends and patterns in customer behavior. The project aims to provide insights into customer demographics, purchasing habits, and product preferences during the Diwali season.

analysis data-science diwali jupyter-notebook matplotlib numpy pandas python sales seaborn

Last synced: 15 Apr 2025

https://github.com/mdh266/crimetime

Python web application for exploring and forecasting crime rates in NYC

data-science docker flask-application forecasting-crime-rates geospatial-analysis pandas python statsmodels time-series-analysis

Last synced: 30 Jul 2025

https://github.com/stappit/blog

I often post solutions to textbook exercises, including: Bayesian Data Analysis (BDA) by Gelman et al; Causal Inference in Statistics Primer (CISP) by Pearl et al; Purely Functional Data Structures (PFDS) by Okasaki.

bayesian-data-analysis blog data-analysis data-science gelman hakyll haskell pearl purely-functional-data-structures solutions stan static-site statistical-inference statistics

Last synced: 14 Mar 2025

https://github.com/jamesquinlan/intro-python

Introduction to Programming and Data Science with Python

data-science nlp python python-3

Last synced: 18 Aug 2025

https://github.com/sondosaabed/data-analyst-nanodegree

I acquired a full scholarship from Google Launchpad. Advanced data wrangling skills to work with messy, complex real-world datasets. Highly customized visualizations using the Matplotlib Python library

data-science dataanalysis datawrangling nanodegree python udacity-nanodegree

Last synced: 09 Apr 2025

https://github.com/tiagoantao/virtual-core

A data science core based on Docker containers

data-science docker

Last synced: 14 Mar 2026

https://github.com/archie-cm/churn-analysis-ecommerce-customer

The objective of this project is to predict customer churn and lost opportunity, and to provide recommendations to the business team so the company can apply customer personas in its retention strategy and monitor it through an interactive dashboard.

data-science feature-engineering machine-learning python scikit-learn

Last synced: 23 Apr 2025

https://github.com/techn0man1ac/toxiccommentclassification

This project aims to develop a model capable of identifying and classifying different levels of toxicity in comments, using the power of BERT (Bidirectional Encoder Representations from Transformers) for text analysis.

analysis bert-model classifying data-science docker machine-learning python streamlit text-classification transformers-models

Last synced: 18 Aug 2025

https://github.com/tsg405/sql-for-data-science----coursera

This repo contains starter files, coursework, and programming assignments for the course SQL for Data Science from the University of California, Davis (Coursera)

california chinook-database coursera data-science query-language quiz sql sqlite ucdavis-datalab yelp-dataset

Last synced: 14 Apr 2025

https://github.com/rafaelpermec/live-broker-api

A study of back-end data scraping, simulating a brokerage that handles asset buy and sell operations and client cash flow in real time.

authentication authorization backend-api cheerio data-science express helmet jwt-authentication mysql nodejs typescript web-scraping

Last synced: 19 Apr 2025

https://github.com/arbox/learning-scala-for-data-science

Data Science: Scala for brave and impatient

big-data bigdata data-science datascience scala spark

Last synced: 10 Mar 2026

https://github.com/iamyajat/whatsapp-chat-analyzer-api

An API to analyse WhatsApp chats and generate insights

data-analysis data-science fastapi python whatsapp

Last synced: 17 Oct 2025

https://github.com/westlake-ai/dmt-learn

An Explainable Deep Network for Dimension Reduction

data-science dimension-reduction python

Last synced: 07 Aug 2025

https://github.com/ruivieira/nim-mentat

A Nim library for data science and machine learning

data-science library machine-learning nim scientific-computing

Last synced: 10 Aug 2025

https://github.com/frauddi/dataspot

Find data concentration patterns and hotspots. Built for fraud detection and risk analysis.

anomalies anomalies-detection data-analysis data-science fraud-detection hotspots pattern-mining python

Last synced: 09 Apr 2026

https://github.com/WaylonWalker/kedro-auto-catalog

Kedro catalog create with default configuration

data data-science kedro kedro-catalog kedro-hook kedro-plugin

Last synced: 24 Mar 2025

https://github.com/bhattbhavesh91/auto-sklearn-tutorial

Small tutorial on auto-sklearn which is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.

auto-ml auto-sklearn automl data-science machine-learning python tutorial

Last synced: 27 Oct 2025

https://github.com/npow/awesome-metaflow

Every Metaflow extension worth knowing, curated and organized

awesome-list data-science extensions machine-learning metaflow mlops python workflow-orchestration

Last synced: 23 May 2026

https://github.com/timkong21/polyp-segmentation

Polyp segmentation tool utilizing U-Net for accurate medical image analysis, designed to enhance early detection and diagnosis of colorectal cancer. Features a user-friendly Streamlit web app for easy image processing and analysis, leveraging the Kvasir-SEG dataset for improved healthcare outcomes.

aws-s3 cancer-detection colonoscopy computer-vision data-augmentation data-science deep-learning diagnostics healthcare machine-learning medical-application medical-image-analysis medical-image-processing medical-image-segmentation opencv polyp-segmentation python streamlit tensorflow u-net

Last synced: 14 Apr 2025

https://github.com/mohidex/data-pipeline-on-gcp

The Real-time Ecommerce Data Collection and Processing project empowers businesses with real-time insights by efficiently extracting, processing, and storing ecommerce data from multiple sources. Combining Golang and Python, this cutting-edge solution streamlines data handling from diverse ecommerce websites.

beautifulsoup data-engineer data-pipeline data-science database datastore dependency-injection firebase firestore gcp go golang google google-cloud pubsub python solid-principles storage web-scraping

Last synced: 14 Apr 2025

https://github.com/navdeep-g/sdss-2019

Interpretable Machine Learning with rsparkling

data-science h2o-3 machine-learning r rsparkling spark sparklyr xai

Last synced: 07 Apr 2025

https://github.com/miguelgfierro/miguelgfierro

I help people understand and apply AI

ai data-science machine-learning

Last synced: 03 Jan 2026

https://github.com/zMoooooritz/stapy

An easy to use SensorThings API Client written in Python

api cli data-science database ogc python sensor sensor-data sensorthings sensorthings-api

Last synced: 15 May 2025

https://github.com/vbyan/deeva

🚀 Deeva - your smart analytics companion for Object Detection datasets

data data-science data-visualization datasets deeva machine-learning object-detection plotly python statistics streamlit visualization

Last synced: 26 Jun 2025

https://github.com/ruban2205/data-science-introduction

Welcome to the Data Science Introduction repository! This repository is designed to provide an introduction to the field of data science, covering various topics and techniques commonly used in the industry.

classification-algorithm data-science data-visualization decision-tree-classifier exploratory-data-analysis knn knn-classification python simple-linear-regression

Last synced: 11 Jul 2025

https://github.com/ZackAkil/friendlier-data-labelling

Code resources for generating a Google Form for labelling data.

data-science google google-apps-script google-forms google-sheets machine-learning

Last synced: 04 Apr 2025

https://github.com/yisaienkov/tinysets

The project aims to collect various datasets for tasks such as classification, clustering, object detection... The purpose of these datasets is to quickly check the performance of models and algorithms.

algorithms classification data data-science dataset datasets kaggle kaggle-dataset lego lego-minifigures lego-sets object-detection pypi python regression text-classification tinysets

Last synced: 14 Apr 2025

https://github.com/macropin/random-name-generator

Generate random male and female names with real-world probability.

data-science python random-generation test-data-generator

Last synced: 17 Jul 2025

https://github.com/pchtsp/pytups

Powerful dictionaries and tuple lists for data wrangling

data data-science dictionaries optimization tuples

Last synced: 14 Apr 2025