An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/app-generator/devtool-data-converter

Open-Source Data Converter - CSV, XLS, DF | AppSeed

appseed-sample data-converter data-science

Last synced: 01 Aug 2025

https://github.com/mrankitgupta/python-libraries-roadmap

I am sharing lessons on various Python libraries, from scratch to intermediate level, including practice sets that were useful in my Data Science journey.

66daysofdata ai analytics ankitgupta artificial-intelligence data-science data-visualization libraries library machine-learning matplotlib mrankitgupta numpy pandas python python-libraries python-library pythonlib scikit-learn tensorflow

Last synced: 22 Apr 2025

https://github.com/rajveersinghcse/excelr-assignments

🙇 This GitHub repository hosts my internship assignment projects, tasks, and reports, showcasing my skills and contributions during the internship period.

assignments data-science excelr excelr-assignments excelr-assignments-github

Last synced: 04 May 2025

https://github.com/ramanks19/aiml-projects

Projects which were completed as part of assignments of Great Learning's PGP in Artificial Intelligence and Machine Learning

computer-vision data-science ensemble-machine-learning greatlearning neural-networks nlp-machine-learning recommendation-system supervised-learning unsupervised-learning

Last synced: 03 Jan 2026

https://github.com/mayer79/statistical_computing_material

Material for the lecture Statistical Computing

data-science machine-learning r statistics

Last synced: 01 May 2025

https://github.com/niaid/r_intro

A Gentle Introduction to R, RStudio, and visualization

bcbb-training data-science machine-learning programming r visualization

Last synced: 28 Aug 2025

https://github.com/aditeyabaral/kepler-exoplanet-analysis

Analysis of Kepler Objects of Interest using Machine Learning for Exoplanet Identification.

data-analytics data-science exoplanet-analysis exoplanets kepler machine-learning nasa space

Last synced: 16 Apr 2025

https://github.com/nagasaki45/dbdapy

Following "Doing Bayesian Data Analysis", in python

bayesian-data-analysis data-science pymc3

Last synced: 29 Jul 2025

https://github.com/lenguyenthedat/dextra-mindef-2015

My solution for Dextra Data Science Challenge #44 (Singapore Ministry of Defense) https://challenges.dextra.sg/challenge/44

classification data-science machine-learning xgboost

Last synced: 02 Jul 2025

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Apr 2025

https://github.com/orico/flexeegile

Extending Agile For AI & Data Teams

agile ai data data-science flexeegile methodology

Last synced: 08 Jan 2026

https://github.com/wesslen/iviz-rstudio-workshop

Interactive Visualizations with RStudio Workshop for UNCC DSI

data-science htmlwidgets interactive-visualizations rstudio shiny shinyapps tidyverse

Last synced: 19 Jan 2026

https://github.com/tnwei/nbread

Snappy previews of Jupyter notebooks from the command line, with ranger integration

data-science jupyter python ranger

Last synced: 22 Apr 2025

https://github.com/pytask-dev/cookiecutter-pytask-project

A minimal cookiecutter template for a project with pytask.

cookiecutter data-science pytask research

Last synced: 26 Jul 2025

https://github.com/curiousily/ml-in-the-browser-for-hackers-with-tensorflow-js

Machine Learning examples for beginners showing how to use TensorFlow.js in the browser

data-science linear-regression machine-learning tensorflow-js tensorflow-tutorials tensorflowjs

Last synced: 26 Apr 2025

https://github.com/datumorphism/datumorphism.github.io

My knowledgebase on machine learning, data visualization, and some fun stuff.

artificial-intelligence data-science data-visualization giscus machine-learning statistics

Last synced: 24 Oct 2025

https://github.com/rueedlinger/ml-resources

A curated list of statistics, data visualization and machine learning resources which I find useful, have read or want to read.

curated-list data-science data-visualization deep-learning machine-learning statistics

Last synced: 01 Apr 2025

https://github.com/worldbank/rissk

Identify at-risk interviews directly from your Survey Solutions export files.

data-science fraud-detection quality-assurance survey survey-analysis survey-data survey-solutions survey-statistics

Last synced: 24 Apr 2025

https://github.com/dr-montasir/mnjs

MATH NODE JS (MNJS): A tiny math library for Node.js and JavaScript in the browser

data-analysis data-science javascript js jsdelivr library math nextjs npm react svelte sveltekit ts typescript yarn

Last synced: 26 Apr 2025

https://github.com/csfelix/how-to-do-instagram-portfolio

Public code and projects shared on my Instagram account (C0dePlus) and portfolio (🔑 Keywords: html, css, js, python, java, c#, cypher, sql 🔑)

c-sharp css cypher data-science html java javascript js jupyter-notebook mongodb mysql neo4j python sql

Last synced: 28 Apr 2025

https://github.com/santiagxf/mlproject-sample

Sample repository about how to structure an ML project using software engineering practices

data-science data-science-projects git machine-learning mlops

Last synced: 24 Apr 2025

https://github.com/shotahorii/bareml

Machine learning & deep learning implementation from scratch, depending only on numpy.

data-science deep-learning deep-neural-networks machine-learning machine-learning-algorithms machine-learning-from-scratch statistical-models

Last synced: 14 Jan 2026

https://github.com/adrienc21/vulpes

Vulpes: Test many classification, regression models and clustering algorithms to see which one is most suitable for your dataset

automl data-analysis data-science machine-learning models package python scikit-learn statistics

Last synced: 25 Oct 2025

https://github.com/dlopezyse/drug-repurposing-using-kge

💊 Drug repurposing using knowledge graph embeddings with a focus on vector-borne diseases

biotechnology data-science drug-repurposing health knowledge-graph machine-learning

Last synced: 28 Feb 2025

https://github.com/kylegrealis/nascar.data

R package of NASCAR race results & other information

data-science data-visualization package r racing

Last synced: 25 Oct 2025

https://github.com/samedwardes/pydatafaker

A python package to create fake data with relationships between tables.

data data-science fake-data python

Last synced: 23 Apr 2025

https://github.com/pottekkat/heart-disease-classifier

Given clinical parameters of a patient, can we predict whether or not they have heart disease?

data-science data-visualization heart-disease-analysis heart-disease-predictor jupyter-notebook machine-learning

Last synced: 25 Oct 2025

https://github.com/sayakpaul/talksgiven

Contains the deck of my talks given at different developer meet-ups and conferences.

data-science machine-learning

Last synced: 28 Apr 2025

https://github.com/robertvazan/sourceafis-visualization-java

Visualizations of biometric features in fingerprint templates produced by SourceAFIS and in algorithm transparency data captured during feature extraction and matching in SourceAFIS.

biometrics data-science feature-extraction fingerprint fingerprint-authentication minutia sourceafis visualization-library

Last synced: 14 Oct 2025

https://github.com/baslia/quant_analysis

I created some notebooks about different concepts of financial engineering

analytics data-science ds quantitative-finance trading trading-strategies

Last synced: 20 Oct 2025

https://github.com/pabannier/sparseglm

Fast and modular solver for sparse generalized linear models

data-science machine-learning optimization

Last synced: 10 Apr 2025

https://github.com/anshumansinha3301/matplotlib_visualizations

Some Graphs using Matplotlib in Python

data-science matplotlib python

Last synced: 07 Oct 2025

https://github.com/moindalvs/resume_screening_and_parser

Business objective: the document classification solution should significantly reduce manual effort in HRM and achieve a higher level of accuracy and automation with minimal human intervention. Sample data set: resumes and financial documents.

data-science doc2txt doc2vec docx-converter docx-to-pdf docx2txt pdf-document-processor pdf2txt streamlit text text-analysis text-classification text-mining text-processing unstructured-data

Last synced: 23 Apr 2025

https://github.com/hissain/jscipy

Java Scientific Computing Library for Signal Processing, Filters, and Transformations. A NumPy/SciPy port for JVM & Android, used in Machine Learning and Data Science.

android chebyshev-filter cubic-splines data-science dsp fft findpeaks hilbert-transform interpolation java machine-learning numerical-computing python resample savitzky-golay scientific-computing scipy scipy-signal signal-processing

Last synced: 23 Jan 2026

https://github.com/blacksuan19/redash-python

A more complete Redash API Python client

dashboards data-science data-visualization python

Last synced: 24 Apr 2025

https://github.com/mituskillologies/ds-diploma-internship-jun24

Programs conducted at MITU Skillologies, Pune office in internship training on Data Science during June-July 2024 for Diploma Engineering Students.

data-analytics data-science data-visualization machine-learning project python python3

Last synced: 09 Apr 2025

https://github.com/habedi/myr-languagecodes

My efforts to improve my knowledge of the R language

data-mining data-science data-visualization dataset graph r

Last synced: 27 Apr 2025

https://github.com/moindalvs/forecasting_airline_passengers_traffic

Forecast airline passenger traffic. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model, then state which model you will use for forecasting.

additive arima-forecasting data-science double-exponential-smoothing forecasting holt-winters holt-winters-forecasting multiplicative sarima-model seasonality-analysis simple-exponential-smoothing stationarity stationarity-test time-series-forecasting timeseries-analysis trend-analysis triple-exponential-smoothing

Last synced: 23 Apr 2025

https://github.com/kingabzpro/annual-recycled-energy-saved-in-singapore

Learn how much energy Singapore saves per year by recycling plastics, paper, glass, and ferrous and non-ferrous metals

cleaning-data data-analysis data-science deepnote energy environment

Last synced: 19 Jun 2025

https://github.com/pathwiselabs/pixel-pipeline

A Python application with Gradio UI for batch processing and captioning of images, allowing for easy integration with AI image training workflows.

data-cleaning data-science flux generative-ai stable-diffusion stable-diffusion-webui

Last synced: 04 Mar 2026

https://github.com/recodehive/recode-website

recodehive helps you learn and master data skills, and encourages you to contribute code to open source.

data data-science dataengineering opensource python sql tutorials website

Last synced: 15 Mar 2026

https://github.com/noorkhokhar99/plagiarsim-checker

Plagiarism checker using cosine similarity #Plagiarsimchecker

ai api checker data-science database nlp nlptk plagiarsim python

Last synced: 16 Oct 2025

https://github.com/kalyan4636/python-eering

Python project with source code. The best Python project name is one that is descriptive, memorable, and fun for you to say. Don't be afraid to get creative and use emojis to make your project stand out! 📈

artificial-intelligence artificial-intelligence-algorithms data-science deep-learning django framework machine-learning machine-learning-algorithms numpy opencv opencv-python opensource pandas pil-tinker pillow python python-3 python-library python3

Last synced: 23 Apr 2025

https://github.com/csfelix/datascience-exercises

Just some data science exercises, nothing more... (🔑 Keywords: python, data science, data analysis, pandas 🔑)

data-analysis data-science datascience pandas python python3

Last synced: 05 Jul 2025

https://github.com/akashkobal/data-science

I'm excited to share my data science project 🚀, where I've applied various techniques and insights to solve a specific problem. The project follows best practices for maintainability and reproducibility, using the Data Science Project Template. Dive into the project to explore the code, datasets, documentation, and resources that showcase my journey.

akash akash-kobal akashkobal applied-data-science artificial-intelligence classification data-science dataanalysis dataanalytics datascienceproject datascientist deep-learning kobal machine-learning prediction regression

Last synced: 17 Mar 2026

https://github.com/fearlesssolutions/engineering-practice-domains

A mono-repo for the Engineering Practice Domains of Development, Data, Infrastructure, Testing, and Platforms

data data-engineering data-science database-design devops drupal end-to-end-testing engineering infrastructure machine-learning salesforce security testing web-development

Last synced: 26 Oct 2025

https://github.com/the-pew-inc/the-pew

ThePew is an advanced system of records that enables enterprises to detect trends and patterns from questions to drive marketing and business decisions toward their goals.

data data-science docker javascript machine-learning postgresql rails ruby

Last synced: 06 Oct 2025

https://github.com/dse-capstone-sharknado/advancedbpr

Amazon recommendation system built on a BPR TensorFlow implementation

data-prep data-science exploratory-analysis ipynb machine-learning recommender-system

Last synced: 15 Oct 2025

https://github.com/linwin-cloud/linwin-db-server

In the vast ocean of modern big data, computing is deeply bound to information and data, and it is database software that carries these hundreds of millions of records. Linwin Data Server is a high-performance database developed in China and written in Java. It supports Chinese domestic operating systems and Linux, as well as multi-user operation. It uses a NoSQL structure and a self-developed database operation language, mys, which makes it simpler, more convenient, and more efficient. All create, delete, update, and query operations on user data happen in memory, while writes to and reads from disk are handled by dedicated threads so that the two do not interfere with each other.

data data-science database hashmap http java javascript key-value linux programming-language python server typescript webserver website

Last synced: 05 Mar 2026

https://github.com/epiverse-trace/epi-training-kit

An e-learning strategy for training on analysis, modelling and response to outbreaks and epidemics in Latin America and the Caribbean

data-science e-learning epidemics training

Last synced: 07 Jul 2025

https://github.com/mrgeislinger/udacitydand_proj_wrangleandanalyzedata

Wrangling and analyzing data project for Udacity's Data Analyst Nanodegree. Wrangles WeRateDogs™ (@dog_rates) Twitter data from local, online, and Twitter API sources.

data-analysis data-analyst data-science datascience jupyter-notebook python3 twitter udacity-data-analyst-nanodegree udacity-nanodegree

Last synced: 09 Oct 2025

https://github.com/abhaysingh71/ai-powered-healthcare-intelligence-network

The AI-Powered Healthcare Intelligence Network is an AI-driven system offering disease prediction, drug recommendations, heart disease risk assessment, and an AI medical chatbot. Using ML, NLP, and LLMs, it provides accurate diagnoses, insights, and recommendations, enhancing healthcare accessibility, efficiency, and decision-making.

airtificialintelligence chatbot data-analysis data-science datawrangling disease-prediction healthcare-ai heart-disease huggingface langchain large-language-models lightgbm machine-learning mistral-7b recommendation-system retrieval-augmented-generation sentence-transformers shap vector-database

Last synced: 01 Feb 2026

https://github.com/x-tabdeveloping/rvfln

A Python implementation of random vector functional link networks and broad learning systems using scikit-learn's regressor and classifier APIs

broad-learning data-science deep-learning machine-learning scikit-learn sklearn sklearn-compatible

Last synced: 22 Mar 2025

https://github.com/carpentries-incubator/open-science-with-r

Carpentry-style lesson on how to use R and RStudio together with Git and GitHub to promote Open Science practices.

alpha carpentries data-science dplyr ggplot2 git github lesson open-science r rstudio scripting tidyr

Last synced: 02 Sep 2025

https://github.com/zackakil/friendlier-data-labelling

Code resources for generating a Google Form for labelling data.

data-science google google-apps-script google-forms google-sheets machine-learning

Last synced: 04 Oct 2025

https://github.com/darenasc/auto-fes

Automated exploration of files in a folder structure to extract metadata and potential usage of information.

data-exploration data-profiling data-science eda plain-text python

Last synced: 16 Mar 2025

https://github.com/apreshill/ohsu-basic-stats

Introduction to Data Wrangling, Analysis, & Communication

data-science education r-stats statistics teaching

Last synced: 05 Mar 2025

https://github.com/smac-group/ds

:notebook: This book is currently under development and has been designed as a support for students who are following (or are interested in) courses that provide the basic knowledge to master "statistical programming" with R. Compiled textbook:

data-science github programming r rstudio statistics

Last synced: 22 Jul 2025

https://github.com/tushar2704/common_datasets

Common-datasets is a GitHub repository dedicated to providing a wide collection of common datasets for practicing and learning data science and machine learning.

aritificial-intelligence data-analytics data-engineering data-science data-visualization database dataset-generation datasets machine-learning

Last synced: 09 Aug 2025

https://github.com/ksdkamesh99/medium-blogs

A collection of all my Medium and Analytics Vidhya articles on computer science topics such as AI, ML, deep learning, and more

analytics-vidya-articles computer-science data-science deep-learning machine-learning medium medium-blogs python

Last synced: 12 May 2025

https://github.com/arm-university/smart-school-projects

A collection of accessible and engaging projects for teachers and learners that utilise the more advanced features of Arduino in real-world contexts.

arduino coding computerscience computing data-science education educationprojects pbl physical-computing projects stem

Last synced: 15 Jun 2025

https://github.com/adrtod/rchallenge

A simple datascience challenge system using R Markdown and Dropbox.

challenge data-science r

Last synced: 21 Feb 2026

https://github.com/thecoderpinar/diabetes_health_prediction_and_analysis

A comprehensive project to predict and analyze diabetes health data using advanced machine learning models, including Logistic Regression, Random Forest, and XGBoost. 📊

analytics artificial-intelligence classification data-science data-visualization deep-learning diabetes-prediction health healthcare logistic-regression machine-learning medical-analysis mlops prediction python random-forest xgboost

Last synced: 12 Aug 2025

https://github.com/eurobios-mews-labs/acrocord

This package provides useful tools for interacting with a PostgreSQL server using pandas DataFrames

data data-science database pandas-dataframe postgresql psycopg2 python python3 sqlalchemy table-factory

Last synced: 15 Apr 2025

https://github.com/badr-moufad/cookiecutter-simple-ds-project

A simple cookiecutter template to structure your Data Science projects.

cookiecutter data-science project-structure python simple-ds-project

Last synced: 23 Apr 2025

https://github.com/tulip-lab/modern-data-science

Modern Data Science Course

big-data data-science python

Last synced: 21 Feb 2026

https://github.com/ahammadnafiz/predicta

Predicta: Simplify your workflow with our powerful data analysis and machine learning tool.

analytics data-science data-visualization dataanalysis machine-learning pandas project python streamlit streamlit-webapp webapp

Last synced: 28 Jul 2025

https://github.com/fxstein/code-server-python

VSCode Code Server for Python Developers and Data Scientists

code-server data-science developer docker home-automation iot python synology vscode

Last synced: 25 Jul 2025

https://github.com/giswqs/learning-scipy

Learning SciPy for Numerical and Scientific Computing

data-science jupyter-notebook python scipy

Last synced: 12 May 2025

https://github.com/gesiscss/ptm

Introduction to Natural Language Processing with a special emphasis on the analysis of Job Advertisements

binder data-science information-retrieval labour-market nlp r text-mining topic-modeling

Last synced: 07 May 2025

https://github.com/phazerooman/dcai-ocr-krooki

OCR model trained to extract coordinates from Omani title deeds ("krooki") using a Data-Centric AI (DCAI) approach.

ai data-science ocr python

Last synced: 12 Apr 2025

https://github.com/hassanalgoz/python

The Python Book (Al-Pythoniyya): a practical introduction to learning programming in Python. A book aimed at programming beginners from technical or non-technical backgrounds. It presents the fundamental concepts in clear language and in a logical sequence, with realistic problems and useful applications, avoiding haphazardness and superficiality. It is suitable for self-study as well as for teaching.

ai arabic curriculum data-science learn-to-code learning-by-doing programming project-based-learning python

Last synced: 03 May 2025