An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/oneoffcoder/zava

Parallel coordinates with grand tour for exploratory data visualization of massive and high-dimensional data

angular d3 data-science exploratory-data-visualization grand-tour parallel-coordinates python typescript

Last synced: 06 Apr 2025

https://github.com/kylegrealis/nascar.data

R package of NASCAR race results & other information

data-science data-visualization package r racing

Last synced: 25 Oct 2025

https://github.com/sukanyabag/text-summarization-using-bert-gpt2-xlnet

This notebook leverages Transfer Learning Algorithms and standard NLP procedures to summarize a given paragraph meaningfully.

bert-model data-science gpt-2 huggingface-transformers machine-learning natural-language-processing textsummarization transfer-learning xlnet

Last synced: 24 Apr 2025

https://github.com/giswqs/geog-312-2021

First Steps in GIS Programming (GEOG 312) at the University of Tennessee, Knoxville

data-science geopython geospatial gis jupyter mapping python

Last synced: 12 Jul 2025

https://github.com/lenguyenthedat/dextra-mindef-2015

My solution for Dextra Data Science Challenge #44 (Singapore Ministry of Defense) https://challenges.dextra.sg/challenge/44

classification data-science machine-learning xgboost

Last synced: 02 Jul 2025

https://github.com/TrilemmaFoundation/Trilemma-Beta

Official repo for the Trilemma Beta Tournament

bitcoin data-science forecasting tournament

Last synced: 11 May 2025

https://github.com/pytask-dev/cookiecutter-pytask-project

A minimal cookiecutter template for a project with pytask.

cookiecutter data-science pytask research

Last synced: 26 Jul 2025

https://github.com/tnwei/nbread

Snappy previews of Jupyter notebooks from the command line, with ranger integration

data-science jupyter python ranger

Last synced: 22 Apr 2025

https://github.com/akbaritabar/dask-duckdb-dbeaver

Parallelised, out-of-memory data analysis using Dask in Python and DuckDB and DBeaver in SQL, with publicly accessible ORCID 2019 XML files as the example

data-analysis data-science pandas parallel-computing python

Last synced: 08 Aug 2025

https://github.com/robertvazan/sourceafis-visualization-java

Visualizations of biometric features in fingerprint templates produced by SourceAFIS and in algorithm transparency data captured during feature extraction and matching in SourceAFIS.

biometrics data-science feature-extraction fingerprint fingerprint-authentication minutia sourceafis visualization-library

Last synced: 14 Oct 2025

https://github.com/andrewhinh/captafied

Multimodal Table Understanding

data-science python

Last synced: 31 Jan 2026

https://github.com/mkesslerct/data_science_Python

An introductory course on data science with Python, taught at the Escuela Técnica Superior de Ingeniería de Telecomunicaciones of the Universidad Politécnica de Cartagena

data-science python python3

Last synced: 24 Jun 2026

https://github.com/crdietrich/meerkat

Data acquisition for Raspberry Pi and Micropython

data-science drivers micropython raspberrypi

Last synced: 13 May 2025

https://github.com/touppercase78/tiobe-index-ratings

Index Ratings for Popular Programming Languages from TIOBE

analysis data-science datasets index jupyter-notebook programming-languages python tiobe

Last synced: 01 Apr 2025

https://github.com/zoltan-nz/ci-cd-pipeline-template-for-data-projects

CI/CD pipeline template for data science projects using GitLab CI and Kubernetes

cd ci ci-cd data-science docker gitlab gitlab-runner kubernetes python

Last synced: 07 Mar 2026

https://github.com/ahammadmejbah/artificial-intelligence-research-and-development-projects

The field of Artificial Intelligence (AI) is a frontier of computer science that focuses on creating systems capable of performing tasks that would typically require human intelligence. This encompasses a wide range of capabilities such as visual perception, speech recognition, decision-making, and language translation.

data-engineering data-science data-visualization database datascience deep-learning deep-learning-algorithms deep-neural-networks deep-reinforcement-learning machine-learning machine-learning-algorithms machine-vision machinelearning

Last synced: 27 Apr 2025

https://github.com/vicotrbb/data_science

Repository created to store all my studies about data science, machine learning and artificial intelligence.

data-science machine-learning python roadmap studies

Last synced: 14 Apr 2025

https://github.com/worldbank/rissk

Identify at-risk interviews directly from your Survey Solutions export files.

data-science fraud-detection quality-assurance survey survey-analysis survey-data survey-solutions survey-statistics

Last synced: 24 Apr 2025

https://github.com/albarsil/geneticml

A simple and lightweight genetic algorithm for optimization of any machine learning model

automl data-science genetic-algorithm machine-learning

Last synced: 13 Apr 2025

https://github.com/datumorphism/datumorphism.github.io

My knowledgebase on machine learning, data visualization, and some fun stuff.

artificial-intelligence data-science data-visualization giscus machine-learning statistics

Last synced: 24 Oct 2025

https://github.com/dlopezyse/drug-repurposing-using-kge

💊 Drug repurposing using knowledge graph embeddings with a focus on vector-borne diseases

biotechnology data-science drug-repurposing health knowledge-graph machine-learning

Last synced: 28 Feb 2025

https://github.com/orico/flexeegile

Extending Agile For AI & Data Teams

agile ai data data-science flexeegile methodology

Last synced: 08 Jan 2026

https://github.com/adrienc21/vulpes

Vulpes: Test many classification, regression models and clustering algorithms to see which one is most suitable for your dataset

automl data-analysis data-science machine-learning models package python scikit-learn statistics

Last synced: 25 Oct 2025

https://github.com/teddyoweh/cheat-model

NLP Text Binary Probabilistic Classification Model for predicting cheat statements

data-science machine-learning nlp tokenizer

Last synced: 23 Aug 2025

https://github.com/phanatagama/data-science

🚀 This repository contains data science notes in Jupyter notebooks, written in Python 3 while studying data science material.

big-data data-science image-processing matplotlib-pyplot numpy opencv pandas python3 scatter scipy

Last synced: 19 Apr 2025

https://github.com/pottekkat/heart-disease-classifier

Given clinical parameters of a patient, can we predict whether or not they have heart disease?

data-science data-visualization heart-disease-analysis heart-disease-predictor jupyter-notebook machine-learning

Last synced: 25 Oct 2025

https://github.com/samedwardes/pydatafaker

A python package to create fake data with relationships between tables.

data data-science fake-data python

Last synced: 23 Apr 2025

https://github.com/baslia/quant_analysis

I created some notebooks about different concepts of financial engineering

analytics data-science ds quantitative-finance trading trading-strategies

Last synced: 20 Oct 2025

https://github.com/ugurcanerdogan/effects-of-moon-cycles-on-cryptocurrencies

BBM469*DSCP - Data Science Capstone Project - Do lunar phases affect cryptocurrencies? Social media has lately claimed that moon phases affect cryptocurrencies, but there is no research on the question; so far it is only an informal observation. The goal of this project is a statistical investigation of whether the different phases of the moon have an effect on cryptocurrencies.

bbm469 cryptocurrency data-science dscp lunar-phases moon-cycles moon-phase statistical-analysis technical-analysis

Last synced: 18 Mar 2025

https://github.com/sondosaabed/nics-firearm-background-checks-investigation

🔫 The data comes from the FBI's National Instant Criminal Background Check System. The NICS is used to determine whether a prospective buyer is eligible to buy firearms or explosives. 🔫

census-data criminal-background data-analyst-nanodegree data-science data-wrangling data-wrangling-data-vis data-wrangling-data-visualisation fbi matplotlib nanodegree numpy pandas python storytelling-with-data usa

Last synced: 01 Jul 2025

https://github.com/tulip-lab/modern-data-science

Modern Data Science Course

big-data data-science python

Last synced: 21 Feb 2026

https://github.com/omegaml/dashserve

develop and serve Plotly Dash apps in Jupyter Notebook or JupyterLab

data-science plotly plotly-dash scikit-learn

Last synced: 17 Mar 2026

https://github.com/bradleyboehmke/r-training-text-mining

Resources for my Text Mining with R course (Mar 8-9, 2018)

data-science education r teaching teaching-materials text-analysis text-mining

Last synced: 13 Apr 2025

https://github.com/moindalvs/forecasting_airline_passengers_traffic

Forecast airline passenger traffic. Prepare a document for each model explaining how many dummy variables you created and the RMSE value for each model. Finally, state which model you will use for forecasting.

additive arima-forecasting data-science double-exponential-smoothing forecasting holt-winters holt-winters-forecasting multiplicative sarima-model seasonality-analysis simple-exponential-smoothing stationarity stationarity-test time-series-forecasting timeseries-analysis trend-analysis triple-exponential-smoothing

Last synced: 23 Apr 2025

https://github.com/smac-group/ds

:notebook: This book is currently under development and has been designed as a support for students who are following (or are interested in) courses that provide the basic knowledge to master "statistical programming" with R. Compiled textbook:

data-science github programming r rstudio statistics

Last synced: 22 Jul 2025

https://github.com/habedi/myr-languagecodes

My efforts to improve my knowledge of the R language

data-mining data-science data-visualization dataset graph r

Last synced: 27 Apr 2025

https://github.com/eurobios-mews-labs/acrocord

This package provides some useful tools for interacting with a PostgreSQL server using pandas DataFrames

data data-science database pandas-dataframe postgresql psycopg2 python python3 sqlalchemy table-factory

Last synced: 15 Apr 2025

https://github.com/mituskillologies/ds-diploma-internship-jun24

Programs conducted at the MITU Skillologies Pune office during the June-July 2024 Data Science internship training for Diploma Engineering students.

data-analytics data-science data-visualization machine-learning project python python3

Last synced: 09 Apr 2025

https://github.com/csfelix/datascience-exercises

🐍 Just some DataScience exercises, nothing more... 🐍 (🔑 KeyWords: python, data science, data analysis, pandas 🔑)

data-analysis data-science datascience pandas python python3

Last synced: 05 Jul 2025

https://github.com/blurred-machine/computer-vision-image-classification

In this repository I have implemented computer vision on the MNIST dataset for image classification of digits 0-9, fashion clothing, and sign language hand signals. The models are implemented using TensorFlow. Feel free to send a PR for any optimization or modification.

computer-vision data-science deeplearning images-classification machine-learning mnist-dataset python

Last synced: 10 Sep 2025

https://github.com/blacksuan19/redash-python

A more complete Redash API Python client

dashboards data-science data-visualization python

Last synced: 24 Apr 2025

https://github.com/mrgeislinger/udacitydand_proj_wrangleandanalyzedata

Wrangling and analyzing data project for Udacity's Data Analyst Nanodegree. Wrangles WeRateDogs™ (@dog_rates) Twitter data from local, online, and Twitter API sources.

data-analysis data-analyst data-science datascience jupyter-notebook python3 twitter udacity-data-analyst-nanodegree udacity-nanodegree

Last synced: 09 Oct 2025

https://github.com/kalyan4636/python-eering

Python project with source code. The best Python project name is one that is descriptive, memorable, and fun for you to say. Don't be afraid to get creative and use emojis to make your project stand out! 📈

artificial-intelligence artificial-intelligence-algorithms data-science deep-learning django framework machine-learning machine-learning-algorithms numpy opencv opencv-python opensource pandas pil-tinker pillow python python-3 python-library python3

Last synced: 23 Apr 2025

https://github.com/nemeslaszlo/sentiment-analysis-and-stock-values

Sentiment analysis of economic news headlines and examining their effects on stock market changes without the full article or analysis. Awareness and click generation are important roles for business news headlines as well. The effect can be demonstrated.

bert data-science data-visualization nltk recurrent-neural-network tensorflow textblob vader-sentiment-analysis

Last synced: 25 Jul 2025

https://github.com/urbanclimatefr/coursera-applied-data-science-with-python

This repository contains the materials for "Applied Data Science with Python", a specialization provided by the University of Michigan through Coursera.

coursera data-science machine-learning python3

Last synced: 22 Apr 2025

https://github.com/dse-capstone-sharknado/advancedbpr

Amazon recommendation system built on a BPR TensorFlow implementation

data-prep data-science exploratory-analysis ipynb machine-learning recommender-system

Last synced: 15 Oct 2025

https://github.com/philipperemy/github-full-data-set

Generating GitHub data (~1M repositories May 2017).

data-science dataset github github-api kaggle machine-learning

Last synced: 07 May 2025

https://github.com/noorkhokhar99/plagiarsim-checker

Plagiarism checker using the cosine similarity algorithm #Plagiarsimchecker

ai api checker data-science database nlp nlptk plagiarsim python

Last synced: 16 Oct 2025

https://github.com/jobar8/subsurface_hackathon_2017

Three notebooks to jump start a data science project

data-science geophysics groundwater ipywidgets

Last synced: 28 Jan 2026

https://github.com/kennethleungty/pymysql-demo

PyMySQL - Connecting Python and SQL for Data Science

data-analysis data-science mysql pandas python sql

Last synced: 12 Jul 2025

https://github.com/sondosaabed/introduction-to-sql

A Udacity course covering SQL for data scientists; these are my solutions for the lessons and the project

aggregations data-science dvd-rental-database joins nanodegree sql subqueries udacity-nanodegree

Last synced: 21 Jan 2026

https://github.com/epiverse-trace/epi-training-kit

An e-learning strategy for training on analysis, modelling and response to outbreaks and epidemics in Latin America and the Caribbean

data-science e-learning epidemics training

Last synced: 07 Jul 2025

https://github.com/tushar2704/common_datasets

Common-datasets is a GitHub repository dedicated to providing a wide collection of common datasets for practicing and learning data science and machine learning.

aritificial-intelligence data-analytics data-engineering data-science data-visualization database dataset-generation datasets machine-learning

Last synced: 09 Aug 2025

https://github.com/akkefa/ml-notes

Notes for Mathematics for Machine learning and Data Science.

book computer-science data-science linear-algebra mathematics notes probability statistics topics

Last synced: 04 Feb 2026

https://github.com/zasper-io/zasper-benchmark

Benchmarking Zasper vs. JupyterLab (Jupyter Server)

ai data-science ipython jupyter jupyter-notebook jupyterlab machine-learning zasper

Last synced: 17 May 2026

https://github.com/recodehive/recode-website

recodehive helps you learn and master data skills and encourages you to contribute code to open source.

data data-science dataengineering opensource python sql tutorials website

Last synced: 15 Mar 2026

https://github.com/the-pew-inc/the-pew

ThePew is an advanced system of records that enables enterprises to detect trends and patterns from questions to drive marketing and business decisions toward their goals.

data data-science docker javascript machine-learning postgresql rails ruby

Last synced: 06 Oct 2025

https://github.com/akashkobal/data-science

I'm excited to share my data science project 🚀, where I've applied various techniques and insights to solve a specific problem. The project follows best practices for maintainability and reproducibility, using the Data Science Project Template. Dive into the project to explore the code, datasets, documentation, and resources that showcase my journey.

akash akash-kobal akashkobal applied-data-science artificial-intelligence classification data-science dataanalysis dataanalytics datascienceproject datascientist deep-learning kobal machine-learning prediction regression

Last synced: 17 Mar 2026

https://github.com/raphaelsenn/playervectors

Implementation of the paper "Player Vectors: Characterizing Soccer Players' Playing Style from Match Event Streams".

data-science

Last synced: 04 Mar 2026

https://github.com/pathwiselabs/pixel-pipeline

A Python application with Gradio UI for batch processing and captioning of images, allowing for easy integration with AI image training workflows.

data-cleaning data-science flux generative-ai stable-diffusion stable-diffusion-webui

Last synced: 04 Mar 2026

https://github.com/thecoderpinar/hms-brainactivity-analysiss

Welcome to the GitHub repo for "HMS - EEG Exploration & Neurocritical Care Journey"! Explore EEG data, understand wave patterns, and delve into conditions like LPDs, GPDs, LRDA, and GRDA.

critical-care data-analysis data-science data-visualization deep-neural-networks eeg eeg-signals exploratory-data-analysis healthcare medical-research neuroscience signal-processing

Last synced: 30 Apr 2025

https://github.com/lars-quaedvlieg/swizz

Modular Python package for simple visualization and ML pipelines.

data-science latex machine-learning open-source plotting python research tables utilities

Last synced: 22 Jun 2025

https://github.com/tushar2704/ml-portfolio

This repository showcases a collection of machine learning projects in various domains, demonstrating my skills and expertise as a data scientist and machine learning engineer. Each project provides step-by-step instructions, code, and visualizations to showcase the data analysis and modeling techniques employed.

artificial-intelligence data-science machine-learning portfolio python streamlit-tushar2704 tushar2704

Last synced: 07 May 2025

https://github.com/alexcj10/analyzing-amazon-sales-data

This repository is dedicated to analyzing Amazon sales data to identify trends and insights that can help improve sales strategies and performance.

amazon beautifulsoup data-analysis data-science data-visualization ecommerce machine-learning matlpotlib numpy pandas python sales skilearn

Last synced: 10 Jul 2025

https://github.com/edaaydinea/estimating-the-probability-of-confirmed-covid-19-cases-taking-into-the-intensive-care-unit-icu-

This repository includes the slides and code for the project "Estimating the Probability of Confirmed COVID-19 Cases Taking into the Intensive Care Unit (ICU)".

covid-19 data-analysis data-science data-visualization machine-learning

Last synced: 11 Apr 2025

https://github.com/edaaydinea/python-ml-dl-ds-projects

This repository includes artificial intelligence, machine learning, data science, and computer vision projects written in Python.

computer-vision data-science deep-learning machine-learning projects python

Last synced: 02 Jul 2025

https://github.com/hritik5102/fundamentals_of_ds_ml_dl

The repository encompasses the core concepts of Python, Statistics, Machine Learning, Deep Learning, Computer Vision, and Natural Language Processing.

computer-vision data-science deep-learning machine-learning neural-network statistics

Last synced: 13 May 2025

https://github.com/synthesized-io/synthesized-notebooks

Discover the art of enhancing your data using generative modelling in these notebooks.

data-privacy data-science generative-modelling ml notebooks synthetic-data

Last synced: 14 Jul 2025

https://github.com/sabyasachi-seal/stockmarketprediction

Stock Market Prediction using Numerical and Textual Analysis

aiml analysis data-science data-visualization machine-learning notebook prediction python

Last synced: 08 May 2025

https://github.com/egenn/pdsr

Code for the online PDSR book

data-science learning r rstats

Last synced: 17 Jun 2025

https://github.com/zehracakir/verimadenciliginotlarim

My notes and my own studies from the Data Mining course in the Computer Engineering Department at Süleyman Demirel University

classifying clustering data data-mining data-science linear-regression machine-learning pandas python

Last synced: 18 Jun 2025

https://github.com/broadinstitute/pooled-cell-painting-profiling-recipe

:woman_cook: Recipe repository for image-based profiling of Pooled Cell Painting experiments

carpenter-lab cell-painting data-science in-situ-sequencing pooled-cell-painting pooled-screen recipe

Last synced: 01 Mar 2026

https://github.com/prem07a/credit-score-classification

An ML project for credit score classification

data-science fastapi feature-extraction machine-learning python3 sklearn-classify website

Last synced: 13 Apr 2025