An open API service indexing awesome lists of open source software.

Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/dayyass/ml-interviews

My solutions for Home Assignments for Machine Learning Job Interviews.

bert data-science deep-learning elmo interview machine-learning natural-language-processing word-sense-induction

Last synced: 13 Apr 2025

https://github.com/zohaibkhandev/weather_app

Experience WeatherWise, your go-to app for accurate forecasts. Get real-time updates on current conditions and detailed forecasts for the week ahead. Plan your day with confidence using hourly forecasts tailored to your location. Stay informed with customizable weather alerts for your favorite locations. With intuitive navigation and beautiful vis

data-science rest-api weather weather-api weather-app world

Last synced: 11 Apr 2025

https://github.com/katrienantonio/workshop-loss-reserv-fraud

Course material for a workshop on loss modelling, reserving and insurance fraud analytics

actuarial-science data-science insurance-claims

Last synced: 06 May 2025

https://github.com/praneeth-katuri/house-worth

HouseWorth is an open-source project used for predicting house prices using machine learning techniques

data-science exploratory-data-analysis house-price-prediction machine-learning python real-estate regression

Last synced: 23 Jun 2025

https://github.com/datalorax/sds-r

Repo for a draft book on social data science methods with R

data-science r rstats social-data-science

Last synced: 11 Apr 2025

https://github.com/szczyglis-dev/python-lottery-dataset-analyze

[Python] A Jupyter notebook illustrating methods for analyzing a historical lottery results dataset. The example demonstrates assessing linear relationships between variables, incorporating astronomical data, and visualizing number distributions.

analyze-data astronomy csv data-science datasets jupyter linear-regression lottery-draw notebook-jupyter plot predictive-modeling probability-distribution python random relationship skyfield

Last synced: 29 Jun 2025

https://github.com/mohammadreza-mohammadi94/data-analysis-projects-with-pandas

A repository featuring practical data analysis projects using Pandas, demonstrating data manipulation, visualization, and real-world problem-solving techniques. Ideal for learning and applying Pandas for data analysis.

data data-science jupyter-notebook pandas

Last synced: 05 May 2025

https://github.com/bradleyboehmke/dw-r

Code and text for the "Data Wrangling with R" book.

book data-science data-wrangling r

Last synced: 13 Apr 2025

https://github.com/sdcastillo/PA-R-Study-Manual

An online study guide for the SOA's predictive analytics exam.

data-science data-visualization machine-learning predictive-modeling r-programming

Last synced: 06 May 2025

https://github.com/mmore500/outset

add zoom indicators, insets, and magnified panels to matplotlib/seaborn visualizations with ease!

data-science data-visualization matplotlib pypi-package python seaborn

Last synced: 30 Apr 2025

https://github.com/olekscode/examples-pca-tsne

Some examples of using PCA and t-SNE for dimensionality reduction in Python and R

data-science dimensionality-reduction examples pca t-sne

Last synced: 18 Mar 2025

https://github.com/n1ghtf1re/map-of-emergency-incidents

Emergency Map lets you effectively visualize multi-dimensional information and has an intuitive interface. The code is easily modified for use in a variety of areas. Color mixing enhances the perception and analysis of information.

big-data big-data-analytics big-data-visualization bigdata color-mixing colors data data-analytics data-science data-visualization data-visualization-challenges data-visualization-simpler mysql open-source-project php student-project

Last synced: 18 Mar 2025

https://github.com/ryanlucas3/macrorandomforest

A modification of traditional random forest for time-series forecasting

data-science machine-learning random-forest time-series

Last synced: 10 Apr 2025

https://github.com/cpcloud/dpyr

Python dplyr operations for SQL databases and pandas DataFrames

data-science dplyr postgres python python-3 python-library python3 sql sqlalchemy sqlite3

Last synced: 09 Sep 2025

https://github.com/syamkakarla98/datascience_head_start

This repository focuses on building a path into data science.

data-analysis data-science data-visualization machine-learning machinelearning-python python3

Last synced: 03 May 2025

https://github.com/poopoothegorilla/fastframe

DataFrame project that utilizes Apache Arrow

apache-arrow data-science dataframe golang

Last synced: 12 Jun 2025

https://github.com/shuyib/chronic-kidney-disease-kaggle

Using machine learning models to predict if patients have chronic kidney disease based on a few features. The results of the models are also interpreted to make it more understandable to health practitioners.

data-cleaning-pipeline data-science data-transformation data-visualization diagnostics dimensionality-reduction feature-engineering feature-selection health-data-analysis health-data-science machine-learning machine-learning-algorithm machine-learning-algorithms model-interpretability preventative-medicine

Last synced: 19 Apr 2025

https://github.com/julienmalka/neuralnetwork

Small implementation of a neural network in Python

data-science machine-learning neural-network python

Last synced: 11 Apr 2025

https://github.com/tsdataclinic/TREC

Transit Resilience for Essential Commuting (TREC)

climate-change data-science transit-data

Last synced: 20 Jul 2025

https://github.com/open-risk/dataqualitytoolkit

Python toolkit for evaluating and visualizing the data quality of excel spreadsheets

data-quality data-quality-measurement data-science excel spreadsheet

Last synced: 23 Oct 2025

https://github.com/polyaxon/polyaxon-lib

Deep Learning and Reinforcement learning library for TensorFlow for building end to end models and experiments.

data-science deep-learning machine-learning reinforcement-learning tensorflow tensorflow-experiments

Last synced: 30 Sep 2025

https://github.com/SamEdwardes/pydatafaker

A python package to create fake data with relationships between tables.

data data-science fake-data python

Last synced: 09 Jul 2025

https://github.com/dariodip/rfd-discovery

This project, written in Python and Cython, deals with the discovery of Relaxed Functional Dependencies (RFDs) using a bottom-up approach.

artificial-intelligence cython data-science python python-3 university-project

Last synced: 08 Sep 2025

https://github.com/twipped/spiral

A bio-cycles tracker for all humans

biology data-science health mobile react-native transgender womens-health

Last synced: 10 Jul 2025

https://github.com/imsanjoykb/uber-rides-prediction-flask-deploy

This repository consists of files required to deploy a **Machine Learning** Web App created with **Flask**

data-science data-visualization deployment flask machine-learning ml-algorithms predictive-modeling uber-rides-prediction

Last synced: 30 Oct 2025

https://github.com/ritvik19/vizard

Intuitive, Interactive, Easy and Quick Visualizations for Data Science Projects

data-analysis data-science data-visualization

Last synced: 10 Apr 2025

https://github.com/bradleyboehmke/cinday-rug-iml-2018

Slides and other material for Cincinnati-Dayton useR presentation on interpretable machine learning with R

data-science interpretable-machine-learning machine-learning r shortcourse-material tutorial tutorial-code

Last synced: 13 Apr 2025

https://github.com/bcgov/ghg-emissions-indicator

R scripts for a GHG emissions indicator published on Environmental Reporting BC

data-science env r rstats

Last synced: 07 May 2025

https://github.com/shawn-shan/eru

High Level Framework for PyTorch

data-science deep-learning eru neural-network python pytorch

Last synced: 30 Apr 2025

https://github.com/oscarsaharoy/functionfit

generate functions by placing points on a graph

data-science regression

Last synced: 29 Oct 2025

https://github.com/psyplot/psyplot-gui

Graphical User Interface for the psyplot package

data-science gui interactive ipython psyplot qtconsole sphinx

Last synced: 02 May 2025

https://github.com/tugot17/data-science-blog

Data science blog, https://tugot17.github.io/data-science-blog/

blog data-science xai

Last synced: 11 Jul 2025

https://github.com/trilemmafoundation/trilemma-beta

Official repo for the Trilemma Beta Tournament

bitcoin data-science forecasting tournament

Last synced: 30 Apr 2025

https://github.com/thecoderpinar/big-tech-financial-insights

🚀 A comprehensive project analyzing Big Tech stock prices using time series analysis, volatility modeling, and macroeconomic indicators. Featuring interactive dashboards and automated reporting! 📈💼

data-analysis data-science finance machine-learning macroeconomics stock-analysis time-series-analysis volatility-modeling

Last synced: 03 Apr 2025

https://github.com/tushar2704/pyverse-exploring-python-frameworks

This repository is the ultimate guide to exploring and mastering Python libraries & frameworks, a collection of code and guides by me, Tushar!

artificial-intelligence data-analysis data-engineering data-science data-visualization machine-learning python streamlit-tushar2704 tushar2704 web-application

Last synced: 30 Oct 2025

https://github.com/tatevkaren/deep-learning-for-data-science

Deep Learning Case Studies with Tensorflow and Keras for Beginners-Advanced: ANN, CNN, RNN, Self-Organizing Maps, Boltzmann Machines, Stacked Autoencoders

ann artificial-intelligence artificial-neural-networks data-preprocessing data-science deep-learning ds keras modelling modelling-framework neural-networks numpy pandas python scikit-learn sklearn tensorflow

Last synced: 10 Apr 2025

https://github.com/anshumansinha3301/some-python-stuffs

Some projects made using pandas, numpy and matplotlib

data-science matplotlib numpy pandas

Last synced: 07 Oct 2025

https://github.com/robertvazan/sourceafis-visualization-java

Visualizations of biometric features in fingerprint templates produced by SourceAFIS and in algorithm transparency data captured during feature extraction and matching in SourceAFIS.

biometrics data-science feature-extraction fingerprint fingerprint-authentication minutia sourceafis visualization-library

Last synced: 14 Oct 2025

https://github.com/anselmoo/bashplot

Instant data plotting from the terminal into the terminal

bash cloud data-science hpc instant-data-plotting plot python3 terminal-based trends zsh

Last synced: 17 Mar 2025

https://github.com/datumorphism/datumorphism.github.io

My knowledgebase on machine learning, data visualization, and some fun stuff.

artificial-intelligence data-science data-visualization giscus machine-learning statistics

Last synced: 24 Oct 2025

https://github.com/pabannier/sparseglm

Fast and modular solver for sparse generalized linear models

data-science machine-learning optimization

Last synced: 10 Apr 2025

https://github.com/chicolucio/ifood-case-data-analyst

Teaching project for the Data Science course taught by me at Hashtag

classification-model clustering data-science python segmentation sklearn sklearn-pipeline teaching

Last synced: 07 Oct 2025

https://github.com/baslia/quant_analysis

I created some notebooks about different concepts of financial engineering

analytics data-science ds quantitative-finance trading trading-strategies

Last synced: 20 Oct 2025

https://github.com/arjunan-k/machine-learning

Machine Learning Specialization by Andrew Ng, a collaboration between DeepLearning.AI and Stanford Online on Coursera.

data-science deep-learning neural-networks tensorflow

Last synced: 05 Oct 2025

https://github.com/kylegrealis/nascar.data

R package of NASCAR race results & other information

data-science data-visualization package r racing

Last synced: 25 Oct 2025

https://github.com/anshumansinha3301/matplotlib_visualizations

Some Graphs using Matplotlib in Python

data-science matplotlib python

Last synced: 07 Oct 2025

https://github.com/ccao-data/model-condo-avm

Automated valuation model for all class 299 and 399 residential condominiums in Cook County

assessment condo data-science machine-learning model property-taxes r tidymodels

Last synced: 11 Apr 2025

https://github.com/tnwei/nbread

Snappy previews of Jupyter notebooks from the command line, with ranger integration

data-science jupyter python ranger

Last synced: 22 Apr 2025

https://github.com/niaid/r_intro

A Gentle Introduction to R, RStudio, and visualization

bcbb-training data-science machine-learning programming r visualization

Last synced: 28 Aug 2025

https://github.com/zoltan-nz/ci-cd-pipeline-template-for-data-projects

CI/CD pipeline template for data science projects using GitLab CI and Kubernetes

cd ci ci-cd data-science docker gitlab gitlab-runner kubernetes python

Last synced: 07 Mar 2026

https://github.com/dkundih/vandal

Data science, Data manipulation and Machine learning library.

data-science data-visualization digital-transformation logistics logistics-4-0 machine-learning python statistics

Last synced: 16 Jan 2026

https://github.com/phanatagama/data-science

🚀 This repository contains Data Science docs in Jupyter notebooks, written in Python 3 while learning DS material.

big-data data-science image-processing matplotlib-pyplot numpy opencv pandas python3 scatter scipy

Last synced: 19 Apr 2025

https://github.com/rueedlinger/ml-resources

A curated list of statistics, data visualization and machine learning resources which I find useful, have read or want to read.

curated-list data-science data-visualization deep-learning machine-learning statistics

Last synced: 01 Apr 2025

https://github.com/giswqs/geog-312-2021

First Steps in GIS Programming (GEOG 312) at the University of Tennessee, Knoxville

data-science geopython geospatial gis jupyter mapping python

Last synced: 12 Jul 2025

https://github.com/app-generator/devtool-data-converter

Open-Source Data Converter - CSV, XLS, DF | AppSeed

appseed-sample data-converter data-science

Last synced: 01 Aug 2025

https://github.com/oneoffcoder/zava

Parallel coordinates with grand tour for exploratory data visualization of massive and high-dimensional data

angular d3 data-science exploratory-data-visualization grand-tour parallel-coordinates python typescript

Last synced: 06 Apr 2025

https://github.com/vicotrbb/data_science

Repository created to store all my studies about data science, machine learning and artificial intelligence.

data-science machine-learning python roadmap studies

Last synced: 14 Apr 2025

https://github.com/srohit0/ml-misc

Miscellaneous Machine Learning and Data Analysis Projects

colaboratory data-analysis data-science data-visualization google-colab machine-learning-algorithms

Last synced: 15 Apr 2025

https://github.com/nagasaki45/dbdapy

Following "Doing Bayesian Data Analysis", in python

bayesian-data-analysis data-science pymc3

Last synced: 29 Jul 2025

https://github.com/splines/deutsche-bahn-analysis

🚆 Analysis of delays of the Deutsche Bahn (DB)

data-science delay deutsche-bahn public-transport railway

Last synced: 15 Apr 2025

https://github.com/crdietrich/meerkat

Data acquisition for Raspberry Pi and Micropython

data-science drivers micropython raspberrypi

Last synced: 13 May 2025

https://github.com/dlopezyse/drug-repurposing-using-kge

💊 Drug repurposing using knowledge graph embeddings with a focus on vector-borne diseases

biotechnology data-science drug-repurposing health knowledge-graph machine-learning

Last synced: 28 Feb 2025

https://github.com/samedwardes/pydatafaker

A python package to create fake data with relationships between tables.

data data-science fake-data python

Last synced: 23 Apr 2025

https://github.com/mayer79/statistical_computing_material

Material for the lecture Statistical Computing

data-science machine-learning r statistics

Last synced: 01 May 2025

https://github.com/ahmed-maher77/wind-turbine-power-prediction-app-using-machine-learning

"Wind Power Predictor" is a machine learning project that forecasts turbine output using real-time data from Turkish wind farms. Its web app interface offers convenient access to predictions, enabling informed decisions for maximizing energy production and advancing renewable energy usage.

ai catboost data-analysis data-science flask html-css-javascript javascript machine-learning matplotlib numpy pandas predictive-modeling pwa python sklearn web web-development wind wind-turbine wind-turbine-operational-optimization

Last synced: 10 Apr 2025

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as a part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Apr 2025

https://github.com/gvwilson/sys-tutorial

Systems Administration for Weary Data Scientists

data-science system-administration tutorial

Last synced: 24 Mar 2025

https://github.com/chaitanyak77/predictive-maintenance-of-gearbox-using-vibration-sensors-data-

This project focuses on the critical task of predictive maintenance in industrial settings, specifically targeting gearbox machinery. By harnessing the power of vibration sensor data, I have developed a predictive maintenance solution that enables early detection of potential faults and failures in gearboxes

data-science internship-task machine-learning

Last synced: 25 Sep 2025

https://github.com/zincware/znnl

A Python package for studying neural learning

data-science data-selection machinelearning mathematics physics

Last synced: 09 Aug 2025

https://github.com/pottekkat/heart-disease-classifier

Given clinical parameters of a patient, can we predict whether or not they have heart disease?

data-science data-visualization heart-disease-analysis heart-disease-predictor jupyter-notebook machine-learning

Last synced: 25 Oct 2025

https://github.com/sondosaabed/nics-firearm-background-checks-investigation

🔫 The data comes from the FBI's National Instant Criminal Background Check System. The NICS is used to determine whether a prospective buyer is eligible to buy firearms or explosives. 🔫

census-data criminal-background data-analyst-nanodegree data-science data-wrangling data-wrangling-data-vis data-wrangling-data-visualisation fbi matplotlib nanodegree numpy pandas python storytelling-with-data usa

Last synced: 01 Jul 2025

https://github.com/dr-montasir/mnjs

MATH NODE JS (MNJS): A tiny math library for node.js & JavaScript on browser

data-analysis data-science javascript js jsdelivr library math nextjs npm react svelte sveltekit ts typescript yarn

Last synced: 26 Apr 2025

https://github.com/orico/flexeegile

Extending Agile For AI & Data Teams

agile ai data data-science flexeegile methodology

Last synced: 08 Jan 2026

https://github.com/ahammadmejbah/artificial-intelligence-research-and-development-projects

The field of Artificial Intelligence (AI) is a frontier of computer science that focuses on creating systems capable of performing tasks that would typically require human intelligence. This encompasses a wide range of capabilities such as visual perception, speech recognition, decision-making, and language translation.

data-engineering data-science data-visualization database datascience deep-learning deep-learning-algorithms deep-neural-networks deep-reinforcement-learning machine-learning machine-learning-algorithms machine-vision machinelearning

Last synced: 27 Apr 2025

https://github.com/touppercase78/tiobe-index-ratings

Index Ratings for Popular Programming Languages from TIOBE

analysis data-science datasets index jupyter-notebook programming-languages python tiobe

Last synced: 01 Apr 2025

https://github.com/pytask-dev/cookiecutter-pytask-project

A minimal cookiecutter template for a project with pytask.

cookiecutter data-science pytask research

Last synced: 26 Jul 2025

https://github.com/dsacms/deduplifhir

Prototype for basic deduplication and aggregation of eCQM data

ai cmsoss-tier3 data-science deduplication electron government healthcare poetry python

Last synced: 13 Apr 2025

https://github.com/adrienc21/vulpes

Vulpes: Test many classification, regression models and clustering algorithms to see which one is most suitable for your dataset

automl data-analysis data-science machine-learning models package python scikit-learn statistics

Last synced: 25 Oct 2025