Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2026-07-03 00:07:42 UTC
https://github.com/devinterview-io/optimization-interview-questions
🟣 Optimization interview questions and answers to help you prepare for your next machine learning and data science interview in 2025.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions optimization optimization-interview-questions optimization-questions optimization-tech-interview software-engineer-interview technical-interview-questions
Last synced: 30 Jan 2026
https://github.com/amirhosseinhonardoust/workout-efficiency-benchmark
Streamlit + Python pipeline that benchmarks gym workout efficiency (kcal/min) using present sessions only. Generates sortable workout-type benchmarks, distribution plots, fairness-aware gap analysis with uncertainty/low-sample flags, and a data-quality report to prevent misleading comparisons.
analytics benchmarking bias-audit dashboard data-analysis data-quality data-science eda fairness fitness health-data pandas plotly python reporting reproducible-research statistics streamlit visualization workout
Last synced: 10 Jun 2026
https://github.com/nikhilaravi/neuralnetflix
Movie Genre Prediction from movie posters using Deep Learning
Last synced: 18 Oct 2025
https://github.com/liamarguedas/uber-eats-delivery-time
Delivery time prediction system for Uber Eats
data-science machine-learning regression
Last synced: 10 Oct 2025
https://github.com/gianlucatruda/warfit-learn
A machine learning toolkit for reproducible research in anticoagulant dose estimation.
data-science iwpc pandas preprocessing python reproducible-research sklearn supervised-learning warfarin warfit-learn
Last synced: 24 Oct 2025
https://github.com/brunocampos01/porto-seguro-safe-driver-prediction
Predict if a driver will file an insurance claim next year. (Kaggle Competition)
challenge data-cleansing data-engineering data-science dataset insurance-claims kaggle kaggle-competition machine-learning porto-seguro python random-forest xgboost
Last synced: 05 Sep 2025
https://github.com/xability/py-maidr
Python binder for maidr library
accessibility binder braille data-science data-visualization python
Last synced: 03 Apr 2026
https://github.com/rishisankineni/capital-one-data-challenge
NYC Taxi Data Challenge - Data Scientist
capital-one data-science eda machine-learning python-3-6 xgboost
Last synced: 09 Apr 2025
https://github.com/teddyoweh/dimensionality-reduction-pca
Dimensionality reduction is the process of reducing the number of random features, attributes, or variables (here called dimensions) in a dataset while preserving as much of the dataset's variation as possible, keeping only the relevant features to increase the efficiency of a model.
data-science dataset dimensional-analysis dimensionality-reduction feature-extraction feature-selection machine-learning
Last synced: 09 Apr 2025
https://github.com/joshuaulrich/stl-rug
Content presented at the Saint Louis R User Group
Last synced: 26 Aug 2025
https://github.com/cadcad-org/snippets
Repo containing notebooks showcasing features and applications of cadCAD.
cadcad data-science education python simulation snippets
Last synced: 23 Apr 2025
https://github.com/dionhaefner/fowd
Processing framework for FOWD, a free ocean wave dataset, ready for your ML application :ocean:
data-science machine-learning ocean open-data waves
Last synced: 21 Aug 2025
https://github.com/tuanle618/AEDA
AEDA - Automated Data Exploratory Analysis in R
data-science eda eda-report exploratory-data-analysis r
Last synced: 29 Jul 2025
https://github.com/devinterview-io/naive-bayes-interview-questions
🟣 Naive Bayes interview questions and answers to help you prepare for your next machine learning and data science interview in 2025.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions naive-bayes naive-bayes-interview-questions naive-bayes-questions naive-bayes-tech-interview software-engineer-interview technical-interview-questions
Last synced: 23 Feb 2026
https://github.com/durgeshsamariya/100daysofdatascience
A 100-day data science (DS) challenge to learn and implement DS concepts, from beginner level to data scientist.
100days 100daysofcode 100daysofdscode 100daysofmlcode data data-science
Last synced: 15 Apr 2025
https://github.com/quantifyearth/yirgacheffe
A declarative geospatial library for Python to make data-science with maps easier
data-science geospatial python3
Last synced: 01 Apr 2026
https://github.com/ihmeuw/easylink
A tool that allows users to build and run highly configurable record linkage/entity resolution pipelines.
data-science entity-resolution record-linkage
Last synced: 01 Apr 2026
https://github.com/JRaviLab/compbio-gists
Computational Biology & Bioinformatics Resources
bioinformatics comparative-genomics computational-biology data-science gists molecular-evolution phylogeny r shell transcriptomics
Last synced: 07 Oct 2025
https://github.com/cdcgov/cdh-lava-react
CDC Data Hub Lifecycle, Analysis & Visualization Accelerator (LAVA) REACT Components based on machine readable requirements.
agile-development azure data-analysis data-catalog data-governance data-quality data-science data-visualization databricks datavisualization devops excel-export metadata operations powerautomate powerbi pyspark security sql test-automation
Last synced: 22 Apr 2025
https://github.com/tezansahu/dvc-pycaret-fastapi-demo
Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)
data-science demo deployment dvc fastapi machine-learning mlops-workflow pycaret
Last synced: 26 Dec 2025
https://github.com/mrdandelion6/learn-to-code
This repository is a collection of my notes and code snippets as I journey through learning different programming languages and coding concepts.
c data-analysis data-science javascript learn-to-code machine-learning matlab python r react shell-script
Last synced: 11 Apr 2025
https://github.com/clowdr/clowdr
Command-line utility for iteratively developing pipelines, deploying them at scale, and sharing data and derivatives
data-science docker hpc-applications pipelines python singularity
Last synced: 14 Jan 2026
https://github.com/ravaghi/kaggle-notebooks
Kaggle Notebooks
data-science kaggle machine-learning python
Last synced: 28 Jan 2026
https://github.com/storopoli/linguagem-r
R Language for Data Science course in the UNINOVE graduate program
data-science r-language r-programming r-stats
Last synced: 31 Oct 2025
https://github.com/cakecrusher/mimicbot
Mimicbot enables the effortless yet modular creation of an AI chat bot model that imitates another person's manner of speech.
ai bot data-science discord discord-bot huggingface natural-language-processing pypi python python-package
Last synced: 28 Oct 2025
https://github.com/nalomran/pyreqtl
A collection of Python modules equivalent to the R ReQTL Toolkit, which aims to identify the association between expressed SNVs and their gene expression using RNA-sequencing data.
bioinformatics bioinformatics-analysis bioinformatics-tool data-science gene-expression matrixeqtl numpy pandas python python3 r rna-seq rna-seq-analysis rpy2 scipy snvs
Last synced: 27 Oct 2025
https://github.com/giswqs/leafmaptools
A Python package for building a tool widgets infrastructure with ipyleaflet and ipywidgets
data-science data-visualization geopython geospatial ipyleaflet ipywidgets jupyter jupyter-notebook mapping python
Last synced: 12 May 2025
https://github.com/devinterview-io/logistic-regression-interview-questions
🟣 Logistic Regression interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation logistic-regression logistic-regression-interview-questions logistic-regression-questions logistic-regression-tech-interview machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 08 Jan 2026
https://github.com/benedekrozemberczki/hullcoverconditionedunitdiskgraph
A generator for unit disk graphs conditioned on concave hull cover.
data data-generator data-science data-visualization deep-learning fun funny graph graph-clustering graph-embedding graph-visualization hull-cover joke machine-learning network-visualization networkx node-embedding non-planar-graph synthetic unit-disk-graph
Last synced: 06 Jul 2025
https://github.com/mahdi-eth/linear-regression-from-scratch
This project implements a Python-based linear regression model from scratch, complete with custom functions for mean squared error and gradient descent algorithm. It is tested on data, using features to predict target variables. The project offers a practical introduction to linear regression.
algorithm data-science data-visualization linear-regression machine-learning machine-learning-algorithms python
Last synced: 15 Apr 2025
https://github.com/vanessaaleung/data-science-notes
Data Science Learning Notes
data-science data-visualization machine-learning marketing-analytics object-oriented-programming probability product-management python3 sql statistics
Last synced: 25 Apr 2025
https://github.com/serialbandicoot/great-assertions
This library is inspired by the Great Expectations library. The library has made the various expectations found in Great Expectations available when using the inbuilt python unittest assertions.
data-science data-testing databricks great-expectations jupyter-notebook python python3 quality-assurance testing
Last synced: 28 Oct 2025
https://github.com/kennethleungty/datawig-missing-data-imputation
Imputation of Missing Data in Tables
data-imputation data-science datawig deep-learning imputation machine-learning
Last synced: 12 Jul 2025
https://github.com/iterative/features
A collection of development container 'features' for machine learning and data science
data-science dvc features machine-learning
Last synced: 18 Jun 2025
https://github.com/bhattbhavesh91/catboost-tutorial
A small tutorial to demonstrate the power of CatBoost Algorithm
catboost catboost-algorithm catboost-tutorial categorical-features data-science decision-trees gpu gpu-computing gradient-boosting machine-learning tutorial
Last synced: 17 Apr 2025
https://github.com/waylonwalker/kedro-diff
quickly diff kedro history
data-science diff kedro kedro-hook kedro-plugin python
Last synced: 10 Mar 2026
https://github.com/egemenzeytinci/data-science-notes
My own notes about data science
course-materials data-science machine-learning neo4j pandas python scala spark
Last synced: 23 Apr 2025
https://github.com/minusxai/minusx
MinusX is an Agentic Business Intelligence platform. It's Claude Code for data.
artificial-intelligence data-analytics data-science jupyter metabase
Last synced: 18 Feb 2026
https://github.com/google-marketing-solutions/fractional_uplift
A flexible python package for cost-aware uplift modelling.
data-science marketing python uplift-modeling
Last synced: 31 Jul 2025
https://github.com/srlozano/tinder-big-data-analysis
Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech
big-data big-data-analytics data-science dating-app mongodb python
Last synced: 11 Oct 2025
https://github.com/mljar/variable-inspector
Explore variables in Jupyter notebooks
data-science jupyter jupyterlab jupyterlab-extension mljar python
Last synced: 01 Mar 2025
https://github.com/mikeroyal/apache-arrow-guide
Apache Arrow Guide
apache arrow data-science database etl-automation etl-pipeline
Last synced: 31 Mar 2025
https://github.com/celbridge-org/celbridge
Celbridge is an open source tool that provides a bridge between spreadsheets and Python scripting.
data-science data-visualization excel markdown python spreadjs spreadsheets webviewer
Last synced: 02 Feb 2026
https://github.com/giswqs/learning-python
Python notebooks
data-mining data-science jupyter-notebook python
Last synced: 27 Jul 2025
https://github.com/stainlessai/micronaut-jupyter
A Micronaut configuration that integrates your app with an existing Jupyter installation.
data-science jupyter jupyter-notebooks jupyterlab micronaut microservices
Last synced: 14 Jan 2026
https://github.com/devinterview-io/classification-algorithms-interview-questions
🟣 Classification Algorithms interview questions and answers to help you prepare for your next machine learning and data science interview in 2025.
ai-interview-questions classification-algorithms classification-algorithms-interview-questions classification-algorithms-questions classification-algorithms-tech-interview coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 08 Jan 2026
https://github.com/bfortuner/zoosearch
Search engine for machine learning models and datasets
data-science deep-learning fusejs machine-learning react
Last synced: 23 Oct 2025
https://github.com/akshaysharma096/classify-human-diseases-using-deeplearning
Automated methods to detect and classify human diseases from medical images, using Deep Neural Networks
convolutional-neural-networks data-science deep-learning keras keras-neural-networks machine-learning python3
Last synced: 12 Sep 2025
https://github.com/jl33-ai/dotplotlib
A basic extension library for creating tree dot plots, strip plots or dot charts w/ matplotlib or seaborn in Python
data-analysis data-science data-visualization dot-chart dotplot dotplots matplotlib-pyplot matplotlib-python python seaborn seaborn-plots strip-plots
Last synced: 07 Sep 2025
https://github.com/sjcobb/three-earthquake
Earthquake effect & music visualization using Three.js
3d animation cgi data-science data-visualization earthquake javascript js json midi music-visualization physics threejs vfx visual-effects
Last synced: 10 Apr 2025
https://github.com/zjunlp/datamind
Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study
agent artificial-intelligence data-analysis data-science language-model natural-language-processing
Last synced: 04 Oct 2025
https://github.com/omarsar/nlp_research
🔥 Summary of interesting NLP Papers and Research (Fast and easy reads!) 🔥
artificial-intelligence data-science deep-learning machine-learning nlp
Last synced: 13 Feb 2026
https://github.com/cool-japan/pandrs
DataFrame library for data analysis implemented in Rust. It has features and design inspired by Python's pandas library, combining fast data processing with type safety.
data-analysis data-science datafrane pandas rust rust-lang
Last synced: 04 Apr 2026
https://github.com/alteryx/featuretools_sql
Automated creation of EntitySets from relational data stored in SQL databases
automated-feature-engineering automated-machine-learning automl data-science feature-engineering featuretools machine-learning mysql postgres postgresql sql
Last synced: 12 Dec 2025
https://github.com/correia-jpv/fucking-awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness. With repository stars⭐ and forks🍴
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 27 Apr 2025
https://github.com/aiwithqasim/datascience-python
to keep myself motivated toward the daily habit of programming.
data-analysis data-science data-visualization machine-learning matplotlib-pyplot numpy pandas python python3 seaborn
Last synced: 17 Mar 2025
https://github.com/adityakamble49/loss-ratio-prediction
Predicting Loss Ratios for Auto Insurance Portfolios - ITCS 6100 Big Data Analytics for Competitive Advantage
big-data big-data-analytics data-science insurance jupyter-notebook politics python
Last synced: 04 Apr 2026
https://github.com/devinterview-io/decision-tree-interview-questions
🟣 Decision Tree interview questions and answers to help you prepare for your next machine learning and data science interview in 2025.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview decision-tree decision-tree-interview-questions decision-tree-questions decision-tree-tech-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 28 Jan 2026
https://github.com/chifisource/ipycells.jl
cells, pluto, ipython, and olive readers and writers
data-science ipython-notebook julia jupyter-notebook odd-data olive pluto pluto-notebooks
Last synced: 28 Oct 2025
https://github.com/madhurimarawat/developer-resources-hub
A comprehensive collection of valuable resources for developers, covering job preparation, programming, frontend, backend, IoT, databases, and more.
ai-art ai-ml app-links aptitude awesome-list blockchain coding-questions data-science databases developer-experience developer-tools free-books free-courses full-stack-development graphic-design iot-resources linux llm powerbi python
Last synced: 11 Oct 2025
https://github.com/scigolib/matlab
Pure Go library for reading and writing MATLAB .mat files (v5-v7.3+). No CGo, no external dependencies. Full support for numeric types, complex numbers, and multi-dimensional arrays. Cross-platform (Windows/Linux/macOS). Part of SciGoLib ecosystem.
cross-platform data-science go golang hdf5 mat-files matlab no-cgo octave pure-go scientific-computing scientific-data
Last synced: 05 Apr 2026
https://github.com/bgreenwell/statlingua
Explain Statistical Output with Large Language Models
data-science explainability large-language-models llm llms statistics teaching-tools
Last synced: 28 Feb 2026
https://github.com/manuparra/volleyball-performance-analysis
R package for volleyball performance analysis and visualization
analysis data-science datavisualization performance r sport volleyball
Last synced: 12 Apr 2025
https://github.com/mynameisvinn/docker-for-data-scientists
containerizing jupyter notebooks
data-science docker docker-container ipython-notebook
Last synced: 24 Jul 2025
https://github.com/neeru1207/gui-face-recognizer
A GUI based face recognizer coded in Python
cnn-keras computer-vision data-science deep-learning face-detection face-recognition gui-application haar-cascade-classifier haarcascade haarcascade-frontalface machine-learning neural-network neural-networks opencv-python opencv2 python3 tkinter-gui tkinter-python
Last synced: 17 Aug 2025
https://github.com/aspuru-guzik-group/molar
Molar is a database management tool that makes it easy to store experiments, whether computational or not
alembic chemistry chemistry-lab chemistry-laboratory data-science data-structures database database-management databases fastapi pandas postgresql pydata python python3 rest-api sql sqlalchemy
Last synced: 15 Apr 2025
https://github.com/kehaowu/dailypython
Python Daily: five hand-picked Python articles shared every day
data-science data-visualization machine-learning python
Last synced: 10 Mar 2026
https://github.com/gagolews/programowanie_w_jezyku_r
M. Gągolewski, Programowanie w języku R, PWN, 2016
data-science polski r statistics
Last synced: 14 Jul 2025
https://github.com/kennethleungty/simulated-annealing-feature-selection
Feature Selection using Simulated Annealing
annealing data-science feature-engineering feature-selection global-optima global-optimization global-optimization-algorithms global-optimizers global-search global-searching machine-learning ml optimisation optimisation-algorithms optimization optimization-algorithms python search simulated-annealing
Last synced: 12 Jul 2025
https://github.com/calkit/calkit-cloud
A platform for creating and sharing knowledge via Calkit projects.
analytics data-science open-science reproducibility reproducible-research research sharing sharing-data
Last synced: 11 Apr 2026
https://github.com/skblaz/autobot
An autoML for explainable text classification.
automl automl-algorithms automl-experiments classification data-mining data-science distributed-computing ensemble-learning evolutionary-algorithms machine-learning multimodal-learning natural-language-processing nlp python representation-learning sparse-matrices text-classification transfer-learning transformers-models
Last synced: 18 Aug 2025
https://github.com/autuanliu/mdeeplearn
:books: :peach: Machine Learning and Deep Learning with examples.
containers data-science deep-learning docker examples jupyter-notebook machine-learning neural-network python r statistics timeseries visualization
Last synced: 11 Jul 2025
https://github.com/6chaoran/data-story
data story tech-blog
data-science data-visualization
Last synced: 08 Apr 2026
https://github.com/edaaydinea/op2-prediction-of-the-different-progressive-levels-of-alzheimer-s-disease-with-mri-data
This is an optional model development project on a real dataset related to predicting the different progressive levels of Alzheimer’s disease (AD) with MRI data.
anova-analysis catboost-classifier chi-square-test data-science deep-learning keras-neural-networks lightgbm-classifier logistic-regression machine-learning multi-layer-perceptron-classifier neural-networks random-forest-classifier smote-oversampler tensorflow xgboost-classifier
Last synced: 11 Apr 2025
https://github.com/tsg405/applied-machine-learning-in-python
This repo contains starter files, coursework, and programming assignments for the Coursera course Applied Machine Learning in Python (University of Michigan).
applied-machine-learning assignment classification coursera data-science fruit-dataset machine-learning matplotlib-pyplot numpy pandas python quiz regression scikit-learn scipy seaborn supervised-machine-learning university-of-michigan unsupervised-machine-learning
Last synced: 14 Apr 2025
https://github.com/pegah-ardehkhani/customer-churn-prediction-and-analysis
Analysis and Prediction of the Customer Churn Using Machine Learning Models (Highest Accuracy) and Plotly Library
accuracy churn-prediction classification classification-algorithm cross-validation customer-churn customer-churn-analysis customer-churn-prediction data-science feature-engineering feature-importance gridsearchcv imbalanced-data machine-learning machine-learning-algorithms plotly python roc-auc sklearn telco
Last synced: 29 Jun 2025
https://github.com/sachinl0har/lgmvip-data-science
Lets Grow More Data Science Internship.
data-science letsgrowmore lgm lgmvip
Last synced: 28 Jul 2025
https://github.com/ahmednasef3/data-science-roadmap
A roadmap, divided into weeks and tasks, for beginners to learn and master data science
beginners data-science master roadmap
Last synced: 03 Jul 2025
https://github.com/selva221724/edasql
edaSQL is a Python library that bridges SQL and exploratory data analysis: you connect to a database and run queries, and the query results can be passed to the EDA tool to give the user greater insight.
correlation data-analysis data-science data-visualization dataprofiling eda missing-values outlier-detection pandas python sql
Last synced: 10 Jun 2025
https://github.com/ppatrzyk/merkury
Generate HTML reports from Python scripts
analytics data-analysis data-science data-visualization python reporting static-site
Last synced: 14 Dec 2025
https://github.com/georgesalkhouri/l3wtransformer
A word hashing method based on vectors of letter n-grams. Currently transforms text into sequences of numbers.
bag-of-words data-science feature-extraction letter-trigram-word-hashing python text-processing
Last synced: 10 Apr 2025
https://github.com/tushar2704/my_homebrewed_notebooks_archived-account-kaggle.com-tusharaggarwal27
My_homebrewed_NOTEBOOKS is a GitHub repository that houses a collection of personal notebooks derived from various sources, including Kaggle and Jupyter Notebooks. This repository serves as a curated collection of notebooks created and customized by the repository owner, providing a valuable resource for learning and exploring different topics.
data-analysis data-science kaggle kaggle-competition kaggle-competition-notebooks kaggle-competiton kaggle-scripts machine-learning python
Last synced: 07 May 2025
https://github.com/akbaritabar/bibliometric_data_for_demographic_research
Materials for workshop on "Using bibliometric data in demographic research". A report here: https://iussp.org/en/using-bibliometric-data-demographic-research-0
computational-social-science data-science demographic-research migration-research
Last synced: 07 May 2025
https://github.com/chrislemke/sk-transformers
A collection of pandas & scikit-learn compatible transformers for preprocessing and feature engineering 🛠
data-science feature-engineering feature-selection machine-learning pandas preprocessing python scikit-learn scikit-learn-pipelines scikit-learn-transformer
Last synced: 17 Jun 2025
https://github.com/code2k13/nlphose
Enables creation of complex NLP pipelines in seconds, for processing static files or streaming text, using a set of simple command-line tools. Perform multiple operations on text, such as NER, sentiment analysis, chunking, language identification, Q&A, zero-shot classification, and more, by executing a single command in the terminal. Can be used as a low-code or no-code natural language processing solution. Also works with Kubernetes and PySpark.
ai artifical-intelligense data-science language-detection low-code machine-learning named-entity-recognition natural-language-processing nlp no-code sentiment-analysis text-mining twitter-sentiment-analysis
Last synced: 06 May 2025
https://github.com/govau/galileo
Quantifying interactions with government services to support delivery teams to improve their own products and services
analytics data data-science government observatory pandas python r shiny website
Last synced: 10 Jul 2025
https://github.com/ibm-cloud/iot-device-phone-simulator
A web application which acts as an IoT device when loaded in a smart phone browser. The data from the sensors are then used for Anomaly detection.
anomaly-detection cloud data-science datascience gyroscope-data ibm-cloud-solutions internet-of-things iot iot-device machine-learning mobile-web
Last synced: 11 Jul 2025
https://github.com/rngil/datafort
Dataframes in Fortran
data-analysis data-science dataframe fortran fortran90
Last synced: 17 Feb 2026
https://github.com/codepawl/loclean
An AI Data Cleaning Library
automated-cleaning data data-cleaning data-engineering data-preprocessing data-science data-wrangling etl llm normalization open-source polars privacy-preserving python semantic-analysis slm structured-data
Last synced: 04 Apr 2026
https://github.com/elahe-dastan/data-scientist-interview
Data Science Interview Questions and Answers
data-science data-science-interview datascientist interview interview-questions machine-learning
Last synced: 11 Apr 2025
https://github.com/jincheng9/python-tutorial
Python tutorial and quantitative trading, covering basic, intermediate, and advanced tutorials
data data-analysis-python data-analyst data-science django flask numpy pandas python quant quant-dev tutorial
Last synced: 07 May 2025
https://github.com/anquetos/openclassrooms
Projects from the OpenClassrooms Data Analyst track
data-science etl jupyter-notebook knime knime-analytics-platform machine-learning matplotlib numpy pandas powerbi python scikit-learn scipy seaborn statsmodels
Last synced: 16 Jul 2025
https://github.com/elixirnote/elixirnote
Analyze data any time, anywhere
analytics bigdata collaboration data-science deepnote hex jupyter jupyterlab jupyterlab-notebooks notebook notebook-application notebook-jupyter notebook-publish visualization
Last synced: 23 Jun 2025
https://github.com/contextlab/data-wrangler
Wrangle messy numerical, image, and text data into consistent well-organized formats
data data-analysis data-science data-wrangling hugging-face image-data machine-learning nlp numpy pandas python scikit-learn
Last synced: 10 Apr 2025
https://github.com/nathaneastwood/brew-ds
Common Data Science set up for Mac and Linux 🍺🔬
data-science homebrew linuxbrew package-manager
Last synced: 08 Sep 2025
https://github.com/juliadatascience/juliadatascience-pt
Book on Julia for Data Science (Portuguese Edition)
book data data-manipulation data-science data-visualization julia julia-language
Last synced: 24 Jun 2025
https://github.com/carefree0910/carefree-toolkit
Some commonly used functions and modules
Last synced: 19 Jul 2025