An open API service indexing awesome lists of open source software.

Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/manikantasanjay/financial_analysis_using_python_and_ml_libraries

This repository was created as part of my work through the Udemy course "Python & Machine Learning for Financial Analysis" by Dr. Ryan Ahmed.

data-science deep-learning financial-analysis portfolio-management predictive-modeling time-series-analysis

Last synced: 28 Jul 2025

https://github.com/gholamrezadar/ghd-snippets-next

GHD Snippets - A Data Science Snippet Library

data-science python pytorch snippets typescript

Last synced: 02 Sep 2025

https://github.com/devscast/cd-data

important background data for the creation of a solution for the DRC

congo congo-kinshasa data data-science json rdata rdc rdc-data

Last synced: 06 Apr 2025

https://github.com/pmgraham/datagrunt

Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.

csv csv-parser data-analysis data-engineering data-science data-wrangling dataframe duckdb open-source polars python python3

Last synced: 26 Aug 2025

https://github.com/dusenberrymw/systemml-nn

A deep learning library for Apache SystemML.

data-science deep-learning machine-learning neural-networks systemml

Last synced: 02 Sep 2025

https://github.com/jasdumas/depaul

Coursework from DePaul MS in Predictive Analytics

coursework data-science grad-school predictive-analytics

Last synced: 05 Mar 2025

https://github.com/ccao-data/model-condo-avm

Automated valuation model for all class 299 and 399 residential condominiums in Cook County

assessment condo data-science machine-learning model property-taxes r tidymodels

Last synced: 11 Apr 2025

https://github.com/crdietrich/meerkat

Data acquisition for Raspberry Pi and Micropython

data-science drivers micropython raspberrypi

Last synced: 13 May 2025

https://github.com/rueedlinger/ml-resources

A curated list of statistics, data visualization and machine learning resources which I find useful, have read or want to read.

curated-list data-science data-visualization deep-learning machine-learning statistics

Last synced: 01 Apr 2025

https://github.com/zoltan-nz/ci-cd-pipeline-template-for-data-projects

CI/CD pipeline template for data science projects using GitLab CI and Kubernetes

cd ci ci-cd data-science docker gitlab gitlab-runner kubernetes python

Last synced: 07 Mar 2026

https://github.com/teddyoweh/cheat-model

NLP Text Binary Probabilistic Classification Model for predicting cheat statements

data-science machine-learning nlp tokenizer

Last synced: 23 Aug 2025

https://github.com/pottekkat/heart-disease-classifier

Given clinical parameters of a patient, can we predict whether or not they have heart disease?

data-science data-visualization heart-disease-analysis heart-disease-predictor jupyter-notebook machine-learning

Last synced: 25 Oct 2025

https://github.com/ugurcanerdogan/effects-of-moon-cycles-on-cryptocurrencies

BBM469*DSCP - Data Science Capstone Project - Do lunar phases affect cryptocurrencies or not? Social media has lately claimed that moon phases have some effect on cryptocurrencies, but there is no research on it; so far it is only an observation. Our goal in this project is a statistical investigation of whether the different phases of the moon have an effect on cryptocurrencies.

bbm469 cryptocurrency data-science dscp lunar-phases moon-cycles moon-phase statistical-analysis technical-analysis

Last synced: 18 Mar 2025

https://github.com/pytask-dev/cookiecutter-pytask-project

A minimal cookiecutter template for a project with pytask.

cookiecutter data-science pytask research

Last synced: 26 Jul 2025

https://github.com/albarsil/geneticml

A simple and lightweight genetic algorithm for optimization of any machine learning model

automl data-science genetic-algorithm machine-learning

Last synced: 13 Apr 2025

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Apr 2025

https://github.com/tnwei/nbread

Snappy previews of Jupyter notebooks from the command line, with ranger integration

data-science jupyter python ranger

Last synced: 22 Apr 2025

https://github.com/dsacms/deduplifhir

Prototype for basic deduplication and aggregation of eCQM data

ai cmsoss-tier3 data-science deduplication electron government healthcare poetry python

Last synced: 13 Apr 2025

https://github.com/dr-montasir/mnjs

MATH NODE JS (MNJS): A tiny math library for node.js & JavaScript on browser

data-analysis data-science javascript js jsdelivr library math nextjs npm react svelte sveltekit ts typescript yarn

Last synced: 26 Apr 2025

https://github.com/vicotrbb/data_science

Repository created to store all my studies about data science, machine learning and artificial intelligence.

data-science machine-learning python roadmap studies

Last synced: 14 Apr 2025

https://github.com/samedwardes/pydatafaker

A python package to create fake data with relationships between tables.

data data-science fake-data python

Last synced: 23 Apr 2025

https://github.com/curiousily/ml-in-the-browser-for-hackers-with-tensorflow-js

Machine Learning examples for beginners showing how to use TensorFlow.js in the browser

data-science linear-regression machine-learning tensorflow-js tensorflow-tutorials tensorflowjs

Last synced: 26 Apr 2025

https://github.com/dkundih/vandal

Data science, Data manipulation and Machine learning library.

data-science data-visualization digital-transformation logistics logistics-4-0 machine-learning python statistics

Last synced: 16 Jan 2026

https://github.com/splines/deutsche-bahn-analysis

🚆 Analysis of delays of the Deutsche Bahn (DB)

data-science delay deutsche-bahn public-transport railway

Last synced: 15 Apr 2025

https://github.com/santiagxf/mlproject-sample

Sample repository about how to structure an ML project using software engineering practices

data-science data-science-projects git machine-learning mlops

Last synced: 24 Apr 2025

https://polis-community.github.io/red-dwarf/

A DIMensional REDuction library for stellarpunk democracy into the long haul. (Inspired by Pol.is)

civic-tech collective-intelligence data-science deliberative-democracy democracy dimensionality-reduction participatory-democracy polis

Last synced: 17 Apr 2025

https://github.com/poopoothegorilla/fastframe

DataFrame project that utilizes Apache Arrow

apache-arrow data-science dataframe golang

Last synced: 12 Jun 2025

https://github.com/twipped/spiral

A bio-cycles tracker for all humans

biology data-science health mobile react-native transgender womens-health

Last synced: 10 Jul 2025

https://github.com/polyaxon/polyaxon-lib

Deep Learning and Reinforcement learning library for TensorFlow for building end to end models and experiments.

data-science deep-learning machine-learning reinforcement-learning tensorflow tensorflow-experiments

Last synced: 30 Sep 2025

https://github.com/shawn-shan/eru

High Level Framework for PyTorch

data-science deep-learning eru neural-network python pytorch

Last synced: 30 Apr 2025

https://github.com/ritvik19/vizard

Intuitive, Interactive, Easy and Quick Visualizations for Data Science Projects

data-analysis data-science data-visualization

Last synced: 10 Apr 2025

https://github.com/tushar2704/pyverse-exploring-python-frameworks

This repository is the ultimate guide to exploring and mastering Python libraries and frameworks: a collection of code and guides by me, Tushar!

artificial-intelligence data-analysis data-engineering data-science data-visualization machine-learning python streamlit-tushar2704 tushar2704 web-application

Last synced: 30 Oct 2025

https://github.com/syamkakarla98/datascience_head_start

This repository focuses on building a learning path for data science.

data-analysis data-science data-visualization machine-learning machinelearning-python python3

Last synced: 03 May 2025

https://github.com/dariodip/rfd-discovery

This project, written in Python and Cython, deals with the discovery of Relaxed Functional Dependencies (RFDs) using a bottom-up approach.

artificial-intelligence cython data-science python python-3 university-project

Last synced: 08 Sep 2025

https://github.com/ryanlucas3/macrorandomforest

A modification of traditional random forest for time-series forecasting

data-science machine-learning random-forest time-series

Last synced: 10 Apr 2025

https://github.com/shuyib/chronic-kidney-disease-kaggle

Using machine learning models to predict whether patients have chronic kidney disease based on a few features. The results of the models are also interpreted to make them more understandable to health practitioners.

data-cleaning-pipeline data-science data-transformation data-visualization diagnostics dimensionality-reduction feature-engineering feature-selection health-data-analysis health-data-science machine-learning machine-learning-algorithm machine-learning-algorithms model-interpretability preventative-medicine

Last synced: 19 Apr 2025

https://github.com/imsanjoykb/uber-rides-prediction-flask-deploy

This repository consists of files required to deploy a Machine Learning Web App created with Flask

data-science data-visualization deployment flask machine-learning ml-algorithms predictive-modeling uber-rides-prediction

Last synced: 30 Oct 2025

https://github.com/tsdataclinic/TREC

Transit Resilience for Essential Commuting (TREC)

climate-change data-science transit-data

Last synced: 20 Jul 2025

https://github.com/SamEdwardes/pydatafaker

A python package to create fake data with relationships between tables.

data data-science fake-data python

Last synced: 09 Jul 2025

https://github.com/cpcloud/dpyr

Python dplyr operations for SQL databases and pandas DataFrames

data-science dplyr postgres python python-3 python-library python3 sql sqlalchemy sqlite3

Last synced: 09 Sep 2025

https://github.com/psyplot/psyplot-gui

Graphical User Interface for the psyplot package

data-science gui interactive ipython psyplot qtconsole sphinx

Last synced: 02 May 2025

https://github.com/sdcastillo/PA-R-Study-Manual

An online study guide for the SOA's predictive analytics exam.

data-science data-visualization machine-learning predictive-modeling r-programming

Last synced: 06 May 2025

https://github.com/bcgov/ghg-emissions-indicator

R scripts for a GHG emissions indicator published on Environmental Reporting BC

data-science env r rstats

Last synced: 07 May 2025

https://github.com/open-risk/dataqualitytoolkit

Python toolkit for evaluating and visualizing the data quality of excel spreadsheets

data-quality data-quality-measurement data-science excel spreadsheet

Last synced: 23 Oct 2025

https://github.com/trilemmafoundation/trilemma-beta

Official repo for the Trilemma Beta Tournament

bitcoin data-science forecasting tournament

Last synced: 30 Apr 2025

https://github.com/tugot17/data-science-blog

Data science blog, https://tugot17.github.io/data-science-blog/

blog data-science xai

Last synced: 11 Jul 2025

https://github.com/oscarsaharoy/functionfit

generate functions by placing points on a graph

data-science regression

Last synced: 29 Oct 2025

https://github.com/olekscode/examples-pca-tsne

Some examples of using PCA and t-SNE for dimensionality reduction in Python and R

data-science dimensionality-reduction examples pca t-sne

Last synced: 18 Mar 2025

https://github.com/n1ghtf1re/map-of-emergency-incidents

Emergency Map lets you effectively visualize multi-dimensional information through an intuitive interface. The code is easily modified for use in a variety of areas. Color mixing enhances the perception and analysis of information.

big-data big-data-analytics big-data-visualization bigdata color-mixing colors data data-analytics data-science data-visualization data-visualization-challenges data-visualization-simpler mysql open-source-project php student-project

Last synced: 18 Mar 2025

https://github.com/bradleyboehmke/cinday-rug-iml-2018

Slides and other material for Cincinnati-Dayton useR presentation on interpretable machine learning with R

data-science interpretable-machine-learning machine-learning r shortcourse-material tutorial tutorial-code

Last synced: 13 Apr 2025

https://github.com/mmore500/outset

add zoom indicators, insets, and magnified panels to matplotlib/seaborn visualizations with ease!

data-science data-visualization matplotlib pypi-package python seaborn

Last synced: 30 Apr 2025

https://github.com/tatevkaren/deep-learning-for-data-science

Deep Learning Case Studies with Tensorflow and Keras for Beginners-Advanced: ANN, CNN, RNN, Self-Organizing Maps, Boltzmann Machines, Stacked Autoencoders

ann artificial-intelligence artificial-neural-networks data-preprocessing data-science deep-learning ds keras modelling modelling-framework neural-networks numpy pandas python scikit-learn sklearn tensorflow

Last synced: 10 Apr 2025

https://github.com/thecoderpinar/big-tech-financial-insights

🚀 A comprehensive project analyzing Big Tech stock prices using time series analysis, volatility modeling, and macroeconomic indicators. Featuring interactive dashboards and automated reporting! 📈💼

data-analysis data-science finance machine-learning macroeconomics stock-analysis time-series-analysis volatility-modeling

Last synced: 03 Apr 2025

https://github.com/julienmalka/neuralnetwork

Small implementation of a neural network in Python

data-science machine-learning neural-network python

Last synced: 11 Apr 2025

https://github.com/touppercase78/tiobe-index-ratings

Index Ratings for Popular Programming Languages from TIOBE

analysis data-science datasets index jupyter-notebook programming-languages python tiobe

Last synced: 01 Apr 2025

https://github.com/phanatagama/data-science

🚀 This repository holds data science notes in Jupyter notebooks, written in Python 3 while learning data science material.

big-data data-science image-processing matplotlib-pyplot numpy opencv pandas python3 scatter scipy

Last synced: 19 Apr 2025

https://github.com/sondosaabed/nics-firearm-background-checks-investigation

🔫 The data comes from the FBI's National Instant Criminal Background Check System (NICS). The NICS is used to determine whether a prospective buyer is eligible to buy firearms or explosives. 🔫

census-data criminal-background data-analyst-nanodegree data-science data-wrangling data-wrangling-data-vis data-wrangling-data-visualisation fbi matplotlib nanodegree numpy pandas python storytelling-with-data usa

Last synced: 01 Jul 2025

https://github.com/lenguyenthedat/dextra-mindef-2015

My solution for Dextra Data Science Challenge #44 (Singapore Ministry of Defense) https://challenges.dextra.sg/challenge/44

classification data-science machine-learning xgboost

Last synced: 02 Jul 2025

https://github.com/srohit0/ml-misc

Miscellaneous Machine Learning and Data Analysis Projects

colaboratory data-analysis data-science data-visualization google-colab machine-learning-algorithms

Last synced: 15 Apr 2025

https://github.com/pirocheto/phishing-url-detection

Train a machine learning model for phishing URL detection with MLOps practices.

ai anti-phishing cybersecurity data-science machine-learning mlops phishing-detection

Last synced: 06 Apr 2025

https://github.com/zincware/znnl

A Python package for studying neural learning

data-science data-selection machinelearning mathematics physics

Last synced: 09 Aug 2025

https://github.com/ahmed-maher77/wind-turbine-power-prediction-app-using-machine-learning

"Wind Power Predictor" is a machine learning project that forecasts turbine output using real-time data from Turkish wind farms. Its web app interface offers convenient access to predictions, enabling informed decisions for maximizing energy production and advancing renewable energy usage.

ai catboost data-analysis data-science flask html-css-javascript javascript machine-learning matplotlib numpy pandas predictive-modeling pwa python sklearn web web-development wind wind-turbine wind-turbine-operational-optimization

Last synced: 10 Apr 2025

https://github.com/aditeyabaral/kepler-exoplanet-analysis

Analysis of Kepler Objects of Interest using Machine Learning for Exoplanet Identification.

data-analytics data-science exoplanet-analysis exoplanets kepler machine-learning nasa space

Last synced: 16 Apr 2025

https://github.com/mrankitgupta/python-libraries-roadmap

I am sharing lessons on various Python libraries, from scratch to intermediate level, including practice sets that were useful in my data science journey.

66daysofdata ai analytics ankitgupta artificial-intelligence data-science data-visualization libraries library machine-learning matplotlib mrankitgupta numpy pandas python python-libraries python-library pythonlib scikit-learn tensorflow

Last synced: 22 Apr 2025

https://github.com/akbaritabar/dask-duckdb-dbeaver

Parallelised and out-of-memory data analysis using Dask in Python and DuckDB and DBeaver in SQL, using the example of publicly accessible ORCID 2019 XML files

data-analysis data-science pandas parallel-computing python

Last synced: 08 Aug 2025

https://github.com/rajveersinghcse/excelr-assignments

🙇 This GitHub repository hosts my internship assignment projects, tasks, and reports, showcasing my skills and contributions during the internship period.

assignments data-science excelr excelr-assignments excelr-assignments-github

Last synced: 04 May 2025

https://github.com/hoshibatista/base-of-ds

This repository serves as a foundation for projects in Data Science and Machine Learning.

clustering-algorithm data-science data-visualization machine-learning

Last synced: 05 Sep 2025

https://github.com/giswqs/geog-312-2021

First Steps in GIS Programming (GEOG 312) at the University of Tennessee, Knoxville

data-science geopython geospatial gis jupyter mapping python

Last synced: 12 Jul 2025

https://github.com/mayer79/statistical_computing_material

Material for the lecture Statistical Computing

data-science machine-learning r statistics

Last synced: 01 May 2025

https://github.com/sukanyabag/text-summarization-using-bert-gpt2-xlnet

This notebook leverages Transfer Learning Algorithms and standard NLP procedures to summarize a given paragraph meaningfully.

bert-model data-science gpt-2 huggingface-transformers machine-learning natural-language-processing textsummarization transfer-learning xlnet

Last synced: 24 Apr 2025

https://github.com/niaid/r_intro

A Gentle Introduction to R, RStudio, and visualization

bcbb-training data-science machine-learning programming r visualization

Last synced: 28 Aug 2025

https://github.com/yash22222/ibm-csrbox-internship-project

The objective of the Data Analytics internship at CSRBOX is to provide interns with hands-on experience in applying data analytics techniques to real-world projects in the field of corporate social responsibility (CSR). Interns will gain practical skills in data collection, cleaning, analysis, visualization, and reporting, while working on projects

data-mining data-preprocessing data-science exploratory-data-analysis feature-engineering lemmatization machine-learning pandas pos-tagging random-forest random-forest-classifier scikit-learn sentiment-analysis web-scraping wordcloud

Last synced: 22 Apr 2025

https://github.com/dlopezyse/drug-repurposing-using-kge

💊 Drug repurposing using knowledge graph embeddings with a focus on vector-borne diseases

biotechnology data-science drug-repurposing health knowledge-graph machine-learning

Last synced: 28 Feb 2025

https://github.com/nagasaki45/dbdapy

Following "Doing Bayesian Data Analysis", in python

bayesian-data-analysis data-science pymc3

Last synced: 29 Jul 2025

https://github.com/TrilemmaFoundation/Trilemma-Beta

Official repo for the Trilemma Beta Tournament

bitcoin data-science forecasting tournament

Last synced: 11 May 2025

https://github.com/upsonic/server

Self-Driven Autonomous Python Libraries

data data-science gpt-4o library-management ml mlops python

Last synced: 22 Aug 2025

https://github.com/gvwilson/sys-tutorial

Systems Administration for Weary Data Scientists

data-science system-administration tutorial

Last synced: 24 Mar 2025

https://github.com/worldbank/rissk

Identify at-risk interviews directly from your Survey Solutions export files.

data-science fraud-detection quality-assurance survey survey-analysis survey-data survey-solutions survey-statistics

Last synced: 24 Apr 2025