An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/csfelix/how-to-do-instagram-portfolio

Public code and projects shared on my Instagram account (C0dePlus) and portfolio (🔑 Keywords: html, css, js, python, java, c#, cypher, sql 🔑)

c-sharp css cypher data-science html java javascript js jupyter-notebook mongodb mysql neo4j python sql

Last synced: 28 Apr 2025

https://github.com/gi0na/r-ghypernet

R package for Generalised Hypergeometric Ensembles of Random Graphs (gHypEG)

data-mining data-science graphs network network-analysis random-graph-generation random-graphs

Last synced: 05 Feb 2026

https://github.com/chicolucio/ifood-case-data-analyst

Teaching project for the Data Science course I teach at Hashtag

classification-model clustering data-science python segmentation sklearn sklearn-pipeline teaching

Last synced: 07 Oct 2025

https://github.com/robertvazan/sourceafis-visualization-java

Visualizations of biometric features in fingerprint templates produced by SourceAFIS and in algorithm transparency data captured during feature extraction and matching in SourceAFIS.

biometrics data-science feature-extraction fingerprint fingerprint-authentication minutia sourceafis visualization-library

Last synced: 14 Oct 2025

https://github.com/curiousily/ml-in-the-browser-for-hackers-with-tensorflow-js

Machine Learning examples for beginners showing how to use TensorFlow.js in the browser

data-science linear-regression machine-learning tensorflow-js tensorflow-tutorials tensorflowjs

Last synced: 26 Apr 2025

https://github.com/kylegrealis/nascar.data

R package of NASCAR race results & other information

data-science data-visualization package r racing

Last synced: 25 Oct 2025

https://github.com/baslia/quant_analysis

I created some notebooks about different concepts of financial engineering

analytics data-science ds quantitative-finance trading trading-strategies

Last synced: 20 Oct 2025

https://github.com/pabannier/sparseglm

Fast and modular solver for sparse generalized linear models

data-science machine-learning optimization

Last synced: 10 Apr 2025

https://github.com/anselmoo/bashplot

Instant data plotting from the terminal into the terminal

bash cloud data-science hpc instant-data-plotting plot python3 terminal-based trends zsh

Last synced: 17 Mar 2025

https://github.com/arjunan-k/machine-learning

Machine Learning Specialization by Andrew Ng, a collaboration between DeepLearning.AI and Stanford Online on Coursera.

data-science deep-learning neural-networks tensorflow

Last synced: 05 Oct 2025

https://github.com/hoxo-m/deltatest

R Package for Statistical Hypothesis Testing Using the Delta Method for Online A/B Testing

ab-testing data-science statistics

Last synced: 22 Oct 2025

https://github.com/pytask-dev/cookiecutter-pytask-project

A minimal cookiecutter template for a project with pytask.

cookiecutter data-science pytask research

Last synced: 26 Jul 2025

https://github.com/teddyoweh/cheat-model

NLP Text Binary Probabilistic Classification Model for predicting cheat statements

data-science machine-learning nlp tokenizer

Last synced: 23 Aug 2025

https://github.com/ahmed-maher77/wind-turbine-power-prediction-app-using-machine-learning

"Wind Power Predictor" is a machine learning project that forecasts turbine output using real-time data from Turkish wind farms. Its web app interface offers convenient access to predictions, enabling informed decisions for maximizing energy production and advancing renewable energy usage.

ai catboost data-analysis data-science flask html-css-javascript javascript machine-learning matplotlib numpy pandas predictive-modeling pwa python sklearn web web-development wind wind-turbine wind-turbine-operational-optimization

Last synced: 10 Apr 2025

https://github.com/dkundih/vandal

Data science, Data manipulation and Machine learning library.

data-science data-visualization digital-transformation logistics logistics-4-0 machine-learning python statistics

Last synced: 16 Jan 2026

https://github.com/anshumansinha3301/matplotlib_visualizations

Some Graphs using Matplotlib in Python

data-science matplotlib python

Last synced: 07 Oct 2025

https://github.com/dlopezyse/drug-repurposing-using-kge

💊 Drug repurposing using knowledge graph embeddings with a focus on vector-borne diseases

biotechnology data-science drug-repurposing health knowledge-graph machine-learning

Last synced: 28 Feb 2025

https://github.com/santiagxf/mlproject-sample

Sample repository about how to structure an ML project using software engineering practices

data-science data-science-projects git machine-learning mlops

Last synced: 24 Apr 2025

https://github.com/pottekkat/heart-disease-classifier

Given clinical parameters of a patient, can we predict whether or not they have heart disease?

data-science data-visualization heart-disease-analysis heart-disease-predictor jupyter-notebook machine-learning

Last synced: 25 Oct 2025

https://github.com/mayer79/statistical_computing_material

Material for the lecture Statistical Computing

data-science machine-learning r statistics

Last synced: 01 May 2025

https://github.com/hoshibatista/base-of-ds

This repository serves as a foundation for projects in Data Science and Machine Learning.

clustering-algorithm data-science data-visualization machine-learning

Last synced: 05 Sep 2025

https://github.com/ccao-data/model-condo-avm

Automated valuation model for all class 299 and 399 residential condominiums in Cook County

assessment condo data-science machine-learning model property-taxes r tidymodels

Last synced: 11 Apr 2025

https://github.com/sondosaabed/nics-firearm-background-checks-investigation

🔫 The data comes from the FBI's National Instant Criminal Background Check System. The NICS is used to determine whether a prospective buyer is eligible to buy firearms or explosives. 🔫

census-data criminal-background data-analyst-nanodegree data-science data-wrangling data-wrangling-data-vis data-wrangling-data-visualisation fbi matplotlib nanodegree numpy pandas python storytelling-with-data usa

Last synced: 01 Jul 2025

https://github.com/wesslen/iviz-rstudio-workshop

Interactive Visualizations with RStudio Workshop for UNCC DSI

data-science htmlwidgets interactive-visualizations rstudio shiny shinyapps tidyverse

Last synced: 19 Jan 2026

https://github.com/worldbank/rissk

Identify at-risk interviews directly from your Survey Solutions export files.

data-science fraud-detection quality-assurance survey survey-analysis survey-data survey-solutions survey-statistics

Last synced: 24 Apr 2025

https://github.com/shotahorii/bareml

Machine learning & deep learning implementation from scratch, depending only on numpy.

data-science deep-learning deep-neural-networks machine-learning machine-learning-algorithms machine-learning-from-scratch statistical-models

Last synced: 14 Jan 2026

https://github.com/phanatagama/data-science

🚀 This repository contains Data Science documents as Jupyter notebooks, written in Python 3 while learning DS material.

big-data data-science image-processing matplotlib-pyplot numpy opencv pandas python3 scatter scipy

Last synced: 19 Apr 2025

https://github.com/orico/flexeegile

Extending Agile For AI & Data Teams

agile ai data data-science flexeegile methodology

Last synced: 08 Jan 2026

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as a part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Apr 2025

https://github.com/andrewhinh/captafied

Multimodal Table Understanding

data-science python

Last synced: 31 Jan 2026

https://github.com/savannahostrowski/jupyter-mercury-aca

📈 A web application used for hosting, sharing and interacting with Jupyter Notebooks via Mercury, hosted on Azure Container Apps.

azd-templates azure data-science jupyter-notebook mercury python python3 template

Last synced: 10 Apr 2025

https://github.com/rueedlinger/ml-resources

A curated list of statistics, data visualization and machine learning resources which I find useful, have read or want to read.

curated-list data-science data-visualization deep-learning machine-learning statistics

Last synced: 01 Apr 2025

https://github.com/josechirif/2018-house-price-estimation---melbourne-australia

The project proposes to calculate the price of a Melbourne house according to its characteristics.

data data-science python

Last synced: 14 Apr 2025

https://github.com/rolv-io/rolvapp

Rolv is your AI-powered research assistant for life sciences!

ai biology data-analysis data-science genomics life-sciences medicine

Last synced: 02 Mar 2026

https://github.com/kalyan4636/python-eering

Python project with source code. The best Python project name is one that is descriptive, memorable, and fun for you to say. Don't be afraid to get creative and use emojis to make your project stand out! 📈

artificial-intelligence artificial-intelligence-algorithms data-science deep-learning django framework machine-learning machine-learning-algorithms numpy opencv opencv-python opensource pandas pil-tinker pillow python python-3 python-library python3

Last synced: 23 Apr 2025

https://github.com/sirius248/introduction-to-data-science-in-python

Introduction to Data Science in Python (Coursera)

data-science python

Last synced: 11 Nov 2025

https://github.com/moindalvs/forecasting_airline_passengers_traffic

Forecast airline passenger traffic. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model, then state which model you would use for forecasting.

additive arima-forecasting data-science double-exponential-smoothing forecasting holt-winters holt-winters-forecasting multiplicative sarima-model seasonality-analysis simple-exponential-smoothing stationarity stationarity-test time-series-forecasting timeseries-analysis trend-analysis triple-exponential-smoothing

Last synced: 23 Apr 2025

https://github.com/moindalvs/resume_screening_and_parser

Business objective: The document classification solution should significantly reduce manual human effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention. Sample data set details: resumes and financial documents.

data-science doc2txt doc2vec docx-converter docx-to-pdf docx2txt pdf-document-processor pdf2txt streamlit text text-analysis text-classification text-mining text-processing unstructured-data

Last synced: 23 Apr 2025

https://github.com/kingabzpro/annual-recycled-energy-saved-in-singapore

Learn how much energy Singapore saves per year by recycling plastics, paper, glass, and ferrous and non-ferrous metals

cleaning-data data-analysis data-science deepnote energy environment

Last synced: 19 Jun 2025

https://github.com/gesiscss/ptm

Introduction to Natural Language Processing with a special emphasis on the analysis of Job Advertisements

binder data-science information-retrieval labour-market nlp r text-mining topic-modeling

Last synced: 07 May 2025

https://github.com/zackakil/friendlier-data-labelling

Code resources for generating a google form for labelling data.

data-science google google-apps-script google-forms google-sheets machine-learning

Last synced: 04 Oct 2025

https://github.com/blacksuan19/redash-python

A more complete Redash API Python client

dashboards data-science data-visualization python

Last synced: 24 Apr 2025

https://github.com/thecoderpinar/diabetes_health_prediction_and_analysis

A comprehensive project to predict and analyze diabetes health data using advanced machine learning models, including Logistic Regression, Random Forest, and XGBoost. 📊

analytics artificial-intelligence classification data-science data-visualization deep-learning diabetes-prediction health healthcare logistic-regression machine-learning medical-analysis mlops prediction python random-forest xgboost

Last synced: 12 Aug 2025

https://github.com/mettekou/matrixprofile

The matrix profile data structure and associated algorithms for mining time series data

algorithms anomaly-detection clustering data-mining data-science dotnet fsharp matrixprofile motif-discovery segmentation time-series time-series-analysis

Last synced: 14 Jan 2026

https://github.com/abhaysingh71/ai-powered-healthcare-intelligence-network

The AI-Powered Healthcare Intelligence Network is an AI-driven system offering disease prediction, drug recommendations, heart disease risk assessment, and an AI medical chatbot. Using ML, NLP, and LLMs, it provides accurate diagnoses, insights, and recommendations, enhancing healthcare accessibility, efficiency, and decision-making.

airtificialintelligence chatbot data-analysis data-science datawrangling disease-prediction healthcare-ai heart-disease huggingface langchain large-language-models lightgbm machine-learning mistral-7b recommendation-system retrieval-augmented-generation sentence-transformers shap vector-database

Last synced: 01 Feb 2026

https://github.com/fearlesssolutions/engineering-practice-domains

A mono-repo for the Engineering Practice Domains of Development, Data, Infrastructure, Testing, and Platforms

data data-engineering data-science database-design devops drupal end-to-end-testing engineering infrastructure machine-learning salesforce security testing web-development

Last synced: 26 Oct 2025

https://github.com/carpentries-incubator/open-science-with-r

Carpentry-style lesson on how to use R and RStudio together with Git & GitHub to promote Open Science practices.

alpha carpentries data-science dplyr ggplot2 git github lesson open-science r rstudio scripting tidyr

Last synced: 02 Sep 2025

https://github.com/mrgeislinger/udacitydand_proj_wrangleandanalyzedata

Wrangling and analyzing data project for Udacity's Data Analyst Nanodegree. Wrangles WeRateDogs™ (@dog_rates) Twitter data from local, online, and Twitter API sources.

data-analysis data-analyst data-science datascience jupyter-notebook python3 twitter udacity-data-analyst-nanodegree udacity-nanodegree

Last synced: 09 Oct 2025

https://github.com/phazerooman/dcai-ocr-krooki

OCR model trained to extract coordinates from Omani title deeds ("krooki") using a Data Centric AI (DCAI) approach.

ai data-science ocr python

Last synced: 12 Apr 2025

https://github.com/urbanclimatefr/coursera-applied-data-science-with-python

This repository contains the materials for "Applied Data Science with Python", a specialization provided by the University of Michigan through Coursera.

coursera data-science machine-learning python3

Last synced: 22 Apr 2025

https://github.com/ishijo/Taylor-Swift-Lyrics

Database (.txt and .csv) of all Taylor Swift song lyrics up to April '23

data-science dataset datasets nlp-machine-learning taylor-swift text-mining

Last synced: 27 Jul 2025

https://github.com/epiverse-trace/epi-training-kit

An e-learning strategy for training on analysis, modelling and response to outbreaks and epidemics in Latin-America and the Caribbean

data-science e-learning epidemics training

Last synced: 07 Jul 2025

https://github.com/bradleyboehmke/r-training-text-mining

Resources for my Text Mining with R course (Mar 8-9, 2018)

data-science education r teaching teaching-materials text-analysis text-mining

Last synced: 13 Apr 2025

https://github.com/tushar2704/common_datasets

Common-datasets is a GitHub repository dedicated to providing a wide collection of common datasets for practicing and learning data science and machine learning.

aritificial-intelligence data-analytics data-engineering data-science data-visualization database dataset-generation datasets machine-learning

Last synced: 09 Aug 2025

https://github.com/ksdkamesh99/medium-blogs

A collection of all my Medium and Analytics Vidhya articles on different technologies in computer science, such as AI, ML, deep learning and many more

analytics-vidya-articles computer-science data-science deep-learning machine-learning medium medium-blogs python

Last synced: 12 May 2025

https://github.com/nemeslaszlo/sentiment-analysis-and-stock-values

Sentiment analysis of economic news headlines and examining their effects on stock market changes without the full article or analysis. Awareness and click generation are important roles for business news headlines as well. The effect can be demonstrated.

bert data-science data-visualization nltk recurrent-neural-network tensorflow textblob vader-sentiment-analysis

Last synced: 25 Jul 2025

https://github.com/broadinstitute/pooled-cell-painting-profiling-recipe

👩‍🍳 Recipe repository for image-based profiling of Pooled Cell Painting experiments

carpenter-lab cell-painting data-science in-situ-sequencing pooled-cell-painting pooled-screen recipe

Last synced: 01 Mar 2026

https://github.com/akbaritabar/bibliodemography_imprs_phds_2022_idem187

Materials for day 4 of the course "Topics in Digital and Computational Demography": using large-scale bibliometric data for demographic research, and the advantages and pitfalls of using Scopus data to trace internal and international scholarly migration worldwide. Instructor: Aliakbar Akbaritabar

computational-social-science data-science demographic-research migration-research python python3 rstats sql

Last synced: 07 May 2025

https://github.com/tindzk/nix-ds

Nix for Data Science

data-science nix python

Last synced: 26 Jul 2025

https://github.com/ashwinpn/advanced-python

Python for Machine Learning/AI/DS, Game Theory and Convex Optimization using Python, Managing Docker in Python, Web Scraping / Development in Python using Django and Flask, Functional Programming in Python.

convex-optimization data-science docker flask functional-programming game-theory machine-learning machine-learning-algorithms python web-development web-scraping

Last synced: 13 Apr 2025

https://github.com/jobar8/subsurface_hackathon_2017

Three notebooks to jump start a data science project

data-science geophysics groundwater ipywidgets

Last synced: 28 Jan 2026

https://github.com/x-tabdeveloping/rvfln

A Python implementation of random vector functional link networks and broad learning systems using scikit-learn's regressor and classifier APIs

broad-learning data-science deep-learning machine-learning scikit-learn sklearn sklearn-compatible

Last synced: 22 Mar 2025

https://github.com/cjdoris/chevrons.jl

Your friendly >> chevron >> based syntax for piping data through multiple transformations.

data data-science data-transformation julia julia-lang julia-language macros piping repl

Last synced: 07 Mar 2026

https://github.com/nok/weka-porter

Transpile trained decision trees from Weka to C, Java or JavaScript.

data-science machine-learning weka

Last synced: 09 May 2025

https://github.com/chrislemke/autoembedder

PyTorch autoencoder with additional embeddings layer for categorical data 🚘

anomaly-detection autoencoder data-science embedding machine-learning neural-network python pytorch pytorch-ignite

Last synced: 15 Apr 2025

https://github.com/ahammadnafiz/predicta

Predicta: Simplify your workflow with our powerful data analysis and machine learning tool.

analytics data-science data-visualization dataanalysis machine-learning pandas project python streamlit streamlit-webapp webapp

Last synced: 28 Jul 2025

https://github.com/giswqs/learning-scipy

Learning SciPy for Numerical and Scientific Computing

data-science jupyter-notebook python scipy

Last synced: 12 May 2025