Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2025-06-26 00:07:34 UTC
https://github.com/starpig1129/DATAGEN
DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.
agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python
Last synced: 19 Feb 2025
https://github.com/xorbitsai/xorbits
Scalable Python data science and machine learning, API-compatible and lightning fast.
data-science distributed-systems lightgbm machine-learning ml numpy pandas python scalable xgboost
Last synced: 14 May 2025
https://github.com/starpig1129/ai-data-analysis-multiagent
AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, data analysis, visualization, and report writing. Perfect for researchers and data scientists seeking to enhance their workflow and productivity.
agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python
Last synced: 15 Feb 2025
https://github.com/okfn-brasil/querido-diario
📰 Brazilian government gazettes, accessible to everyone.
civic-tech data-science digital-public-goods dpg governments-gazettes govtech hacktoberfest open-data politics scraping sdg-16 spider
Last synced: 12 Apr 2025
https://github.com/aeon-toolkit/aeon
A toolkit for machine learning from time series
data-mining data-science machine-learning scikit-learn time-series time-series-analysis time-series-anomaly-detection time-series-classification time-series-clustering time-series-regression time-series-segmentation
Last synced: 13 May 2025
https://github.com/zama-ai/concrete-ml
Concrete ML: Privacy Preserving ML framework using Fully Homomorphic Encryption (FHE), built on top of Concrete, with bindings to traditional ML frameworks.
data-science fhe fully-homomorphic-encryption homomorphic-encryption machine-learning ppml privacy python scikit-learn tfhe torch
Last synced: 14 May 2025
https://github.com/deepwisdom/autodl
Automated Deep Learning without ANY human intervention. 1st solution for the AutoDL challenge@NeurIPS.
ai artificial-intelligence autodl autodl-challenge automated-machine-learning automl big-data data-science deeplearning feature-engineering full-automl lightgbm machine-learning model-selection multi-label nas python pytorch resnet tensorflow
Last synced: 15 May 2025
https://github.com/teomewhy/teomerefs
Guide to technical references for a career in data
data data-science machine-learning python
Last synced: 14 May 2025
https://github.com/JuliaStats/Distributions.jl
A Julia package for probability distributions and associated functions.
data-science julia probability-distributions statistics
Last synced: 08 May 2025
https://github.com/sajal2692/data-science-portfolio
Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.
data-science keras machine-learning nlp pandas portfolio python scikit-learn
Last synced: 16 May 2025
https://github.com/nfstream/nfstream
NFStream: a Flexible Network Data Analysis Framework.
artificial-intelligence cybersecurity data-analysis data-mining data-science dataset-generation deep-packet-inspection machine-learning ndpi netflow network-analysis network-monitoring network-security packet-analyser packet-capture pcap python traffic-analysis traffic-classification
Last synced: 14 May 2025
https://github.com/deepfence/FlowMeter
⭐ ⭐ Use ML to classify flows and packets as benign or malicious. ⭐ ⭐
awesome data-science data-science-projects forensics-tools hacktoberfest infosectools machine-learning machine-learning-projects machinelearning machinelearningproject network-analysis network-security packet-analyser pcap security security-tools tcpdump-like
Last synced: 30 Mar 2025
https://github.com/pro1code1hack/your-journey-to-fluent-python
Your Journey To Fluent Python
advanced-programming asyncio beginner-programming coding data-science education exercises functions learning learning-python oop oop-principles projects python python-3 python3 roadmap senior software-engineering tutorials
Last synced: 16 May 2025
https://github.com/shujian2015/freeml
A List of Data Science/Machine Learning Resources (Mostly Free)
data-science deep-learning machine-learning natural-language-processing
Last synced: 25 Mar 2025
https://github.com/sintel-dev/orion
Unsupervised time series anomaly detection library
anomaly-detection benchmarking data-science deep-learning generative-adversarial-network machine-learning orion signals time-series unsupervised-learning
Last synced: 14 May 2025
https://github.com/elixir-nx/explorer
Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
data-science dataframes elixir rust
Last synced: 03 Mar 2025
https://github.com/predict-idlab/plotly-resampler
Visualize large time series data with plotly.py
data-analysis data-science data-visualization plotly plotly-dash python time-series visualization
Last synced: 13 May 2025
https://github.com/qri-io/qri
you're invited to a data party!
data-science dataset golang hacktoberfest hacktoberfest2021 ipfs opendata p2p qri service trust web3
Last synced: 08 Apr 2025
https://github.com/areed1192/sigma_coding_youtube
This is a collection of all the code that can be found on my YouTube channel Sigma Coding.
data-science google-maps-api m-language mlanguage office-applications outlook-vba power-bi power-query powerpoint-vba python python-tutorials python-windows vba vba-excel win32 win32com word-vba yelp-fusion-api
Last synced: 08 Apr 2025
https://github.com/novak-99/mlpp
A library created to revitalize C++ as a machine learning front end. Per aspera ad astra.
cpp data-science deep-learning machine-learning
Last synced: 16 May 2025
https://github.com/alishobeiri/thread
AI-powered Jupyter Notebook — use local AI to generate and edit code cells, automatically fix errors, and chat with your data
ai analysis analytics data-science jupyter jupyter-notebook jupyter-notebooks jupyterhub jupyterlab ollama python react reactjs
Last synced: 14 May 2025
https://github.com/red-data-tools/pycall.rb
Calling Python functions from the Ruby language
data-science pycall python ruby rubydatascience rubyml
Last synced: 14 May 2025
https://github.com/mrkn/pycall.rb
Calling Python functions from the Ruby language
data-science pycall python ruby rubydatascience rubyml
Last synced: 13 Apr 2025
https://github.com/datumbox/datumbox-framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
big-data data-science java machine-learning nlp statistics
Last synced: 15 May 2025
https://github.com/makcedward/nlp
📝 This repository records my NLP journey.
ai data-science deep-learning machine-learning nlp
Last synced: 12 Apr 2025
https://github.com/daochenzha/data-centric-ai
A curated, but incomplete, list of data-centric AI resources.
ai artificial-intelligence data-centric data-centric-ai data-centric-machine-learning data-curation data-engineering data-quality data-science machine-learning
Last synced: 24 Mar 2025
https://github.com/LongOnly/Quantitative-Notebooks
Educational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy
algorithmic-trading algotrading asset-allocation asset-management asset-pricing data-analysis data-science financial-analysis jupyter machine-learning notebook pairs-trading python quantitative-finance quantitative-trading stock-trading trading-algorithms trading-strategies
Last synced: 30 Mar 2025
https://github.com/squaredtechnologies/thread
AI-powered Jupyter Notebook — use local AI to generate and edit code cells, automatically fix errors, and chat with your data
ai analysis analytics data-science jupyter jupyter-notebook jupyter-notebooks jupyterhub jupyterlab ollama python react reactjs
Last synced: 06 Dec 2024
https://github.com/cleanlab/cleanvision
Automatically find issues in image datasets and practice data-centric computer vision.
computer-vision data-centric-ai data-exploration data-profiling data-quality data-science data-validation deep-learning exploratory-data-analysis image-analysis image-classification image-generation image-quality image-segmentation
Last synced: 09 Apr 2025
https://github.com/rhiever/datacleaner
A Python tool that automatically cleans data sets and readies them for analysis.
automation data-science machine-learning python
Last synced: 15 May 2025
https://github.com/egbertbouman/youtube-comment-downloader
Simple script for downloading YouTube comments without using the YouTube API
data-science data-scraper python youtube youtube-comments
Last synced: 15 May 2025
https://github.com/pixiedust/pixiedust
Python Helper library for Jupyter Notebooks
data-science jupyter-notebook pixiedust python python-notebook scala-notebooks spark visualization
Last synced: 15 May 2025
https://github.com/sintel-dev/Orion
A machine learning library for detecting anomalies in signals.
anomaly-detection benchmarking data-science deep-learning generative-adversarial-network machine-learning orion signals time-series unsupervised-learning
Last synced: 26 Mar 2025
https://ibm-cds-labs.github.io/pixiedust
Python Helper library for Jupyter Notebooks
data-science jupyter-notebook pixiedust python python-notebook scala-notebooks spark visualization
Last synced: 31 Jan 2025
https://github.com/run-house/runhouse
Distribute and run AI workloads magically in Python, like PyTorch for ML infra.
api artificial-intelligence aws azure collaboration data-science deployment distributed fastapi gcp infrastructure machine-learning middleware observability python pytorch ray sagemaker serverless
Last synced: 14 May 2025
https://github.com/dataquestio/project-walkthroughs
Data science, machine learning, and web development project code for https://www.youtube.com/c/Dataquestio.
data-science machine-learning pandas python
Last synced: 08 Apr 2025
https://github.com/maxpumperla/deep_learning_and_the_game_of_go
Code and other material for the book "Deep Learning and the Game of Go"
alphago alphago-zero data-science deep-learning game-of-go games machine-learning neural-networks python
Last synced: 15 May 2025
https://github.com/zinggai/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
analytics analytics-engineering data-science data-transformation data-transformations dataengineering datalake dataquality dedupe deduplication entity-resolution etl fuzzy-matching fuzzymatch identity identity-resolution masterdata ml modern-data-stack spark
Last synced: 14 May 2025
https://github.com/ropensci/targets
Function-oriented Make-like declarative workflows for R
data-science high-performance-computing make peer-reviewed pipeline r r-package r-targetopia reproducibility reproducible-research rstats targets workflow
Last synced: 13 May 2025
https://github.com/mlr-org/mlr3
mlr3: Machine Learning in R - next generation
classification data-science machine-learning mlr3 r r-package regression
Last synced: 14 May 2025
https://github.com/dssg/hitchhikers-guide
The Hitchhiker's Guide to Data Science for Social Good
data-science dssg machine-learning training tutorial-exercises
Last synced: 03 May 2025
https://github.com/sematic-ai/sematic
An open-source ML pipeline development platform
ai data-science machine-learning ml ml-ops ml-pipeline ml-pipelines mlops pipeline python python3
Last synced: 14 May 2025
https://github.com/towardsai/tutorials
AI-related tutorials. Access any of them for free → https://towardsai.net/editorial
collaborative-filtering data-science deep-learning google-colab linear-algebra machine-learning math mathematics monte-carlo-simulation neural-networks nlp programming python python-tutorial recommendation-system sentiment-analysis tutorial
Last synced: 06 May 2025
https://github.com/tidyverse/datascience-box
Data Science Course in a Box
data-science education r rstats teaching
Last synced: 15 May 2025
https://github.com/rstudio-education/datascience-box
Data Science Course in a Box
data-science education r rstats teaching
Last synced: 26 Mar 2025
https://github.com/mybridge/machine-learning-open-source
Monthly Series - Machine Learning Top 10 Open Source Projects
ai algorithm artificial-intelligence data-science machine-learning neural-network
Last synced: 19 Feb 2025
https://github.com/hurshd0/must-read-papers-for-ml
Collection of must-read papers for data science, machine learning, and deep learning engineers
convolutional-networks data-analysis data-science deep-learning exploratory-data-analysis generalized-additive-models machine-learning neural-networks papers recommender-system recurrent-neural-networks rnn-lstm
Last synced: 10 Apr 2025
https://github.com/WenjieDu/PyPOTS
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
classification clustering data-mining data-science deep-learning forecasting healthcare imputation incomplete industrial interpolation machine-learning missing-values missingness neural-network partially-observed-time-series pytorch science-research time-series time-series-analysis
Last synced: 01 Apr 2025
https://github.com/firmai/data-science-career
Career resources for data science, machine learning, big data, and business analytics
analytics big-data business-analytics business-intelligence career data-science machine-learning resources
Last synced: 06 May 2025
https://github.com/grailbio/reflow
A language and runtime for distributed, incremental data processing in the cloud
analysis-pipeline aws bioinformatics-pipeline cloud-computing data-science golang language runtime scientific-computing
Last synced: 15 Mar 2025
https://github.com/dataprofessor/code
Compilation of R and Python code from the Data Professor YouTube channel.
data-professor data-science data-science-python dataprofessor datascience exploratory-data-analysis machine-learning machinelearning pandas python python-data-science r scikit-learn scikit-learn-python shiny streamlit
Last synced: 14 May 2025
https://github.com/caserec/Datasets-for-Recommender-Systems
A repository of high-quality, topic-centric public data sources for Recommender Systems (RS)
data-science database datasets public-data recommender-systems
Last synced: 28 Nov 2024
https://github.com/ipython-books/cookbook-2nd
IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
computing data-analysis data-mining data-science data-visualization ipython jupyter jupyter-notebook machine-learning numerical-computation python visualization
Last synced: 16 May 2025
https://github.com/chiphuyen/just-pandas-things
An ongoing list of pandas quirks
data-science machine-learning pandas pandas-dataframe pandas-tutorial python
Last synced: 12 Apr 2025
https://github.com/webartifex/intro-to-python
An intro to Python & programming for wanna-be data scientists
data-science introduction-to-programming jupyter python tutorial
Last synced: 11 May 2025
https://github.com/iamaziz/pydataset
Instant access to many datasets in Python.
Last synced: 16 May 2025
https://github.com/nannyml/the-little-book-of-ml-metrics
The book every data scientist needs on their desk.
book classification-metrics clustering-metrics computer-vision-metrics data-science machine-learning machine-learning-evaluation machine-learning-metrics nlp-metrics python ranking-metrics regression-metrics
Last synced: 15 May 2025
https://github.com/probcomp/bayeslite
BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself.
automatic-data-modeling data-science databases machine-learning probabilistic-programming
Last synced: 16 May 2025
https://github.com/opengeos/streamlit-geospatial
A multi-page streamlit app for geospatial
data-science datascience dataviz geopython geospatial housing-data housing-market huggingface mapping open-source python real-estate streamlit streamlit-webapp
Last synced: 14 May 2025
https://github.com/fraunhoferportugal/tsfel
An intuitive library to extract features from time series.
classification colab-notebook data-science feature-engineering feature-extraction time-series
Last synced: 14 Mar 2025
https://github.com/youssefHosni/Practical-Machine-Learning
Practical machine learning notebooks and articles covering the end-to-end machine learning life cycle.
Last synced: 17 Mar 2025
https://github.com/RamiAwar/dataline
Chat with your data - AI data analysis and visualization on CSV, Postgres, MySQL, Snowflake, SQLite...
ai chart data-science data-visualization llm sql
Last synced: 30 Nov 2024
https://github.com/osgeo/grass
GRASS - free and open-source geospatial processing engine
arrays data-science earth-observation geospatial geospatial-analysis gis grass-gis hacktoberfest image-processing jupyter machine-learning open-science parallel-computing python raster remote-sensing science spatial timeseries-analysis vector
Last synced: 14 May 2025
https://github.com/oml-team/open-metric-learning
Metric learning and retrieval pipelines, models and zoo.
computer-vision data-science deep-learning hacktoberfest hacktoberfest-2023 hacktoberfest2023 metric-learning pytorch pytorch-lightning representation-learning similarity-learning
Last synced: 14 Apr 2025
https://github.com/norskregnesentral/skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
data-science distant-supervision natural-language-processing nlp-library nlp-machine-learning python spacy training-data weak-supervision
Last synced: 15 May 2025
https://github.com/tirthajyoti/Stats-Maths-with-Python
General statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python
analytics anova bayesian-statistics clustering data-science hypothesis-testing inferential-statistics machine-learning mathematical-programming mathematics matplotlib normal-distribution numerical-analysis numpy pandas probability python scipy statistics statsmodels
Last synced: 01 May 2025
https://github.com/bansalkanav/ultimate-data-science-toolkit---from-python-basics-to-generativeai
aws cnn data-analysis data-science deep-learning-algorithms flask machine-learning mlflow mlops mongodb pandas-python prefect python3 search-engine sklearn-library sql statistics streamlit tutorial-code visualization
Last synced: 15 May 2025
https://github.com/ahmetozlu/vehicle_counting_tensorflow
🚘 "MORE THAN VEHICLE COUNTING!" This project predicts the speed, color, and size of vehicles using the TensorFlow Object Counting API.
color-recognition computer-vision data-science deep-learning deep-neural-networks detection image-processing machine-learning object-detection object-detection-label opencv prediction python speed-prediction tensorflow tensorflow-object-detection-api vehicle-counting vehicle-detection vehicle-detection-and-tracking vehicle-tracking
Last synced: 16 May 2025
https://github.com/wecoai/aideml
AIDE: AI-Driven Exploration in the Space of Code. A state-of-the-art machine learning engineering agent that automates AI R&D.
ai data-science llm machine-learning
Last synced: 18 Jun 2025
https://github.com/mentatinnovations/datastream.io
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
alerts anomaly anomaly-detection anomalydetection anomalydiscovery bokeh-dashboard dashboard data-science data-stream datascience dataset dsio elasticsearch iot jupyter kibana machinelearning python sklearn timeseries
Last synced: 13 Apr 2025
https://github.com/epsilla-cloud/vectordb
Epsilla is a high performance Vector Database Management System. Try out hosted Epsilla at https://cloud.epsilla.com/
ai chatgpt data data-science database embeddings embeddings-similarity infrastructure llms machine-learning neural-network neural-search rag retrieval search-engine vector-database vector-search
Last synced: 15 May 2025
https://github.com/giswqs/streamlit-geospatial
A multi-page streamlit app for geospatial
data-science datascience dataviz geopython geospatial housing-data housing-market huggingface mapping open-source python real-estate streamlit streamlit-webapp
Last synced: 10 Feb 2025
https://github.com/sberbank-ai-lab/LightAutoML
LAMA - automatic model creation framework
automated-machine-learning automl blackbox classification data-science ensembling feature-engineering gradient-boosting kaggle lama linear-model model-selection multiclass nlp parameter-tuning pipeline pytorch regression stacking whitebox
Last synced: 27 Nov 2024
https://github.com/graspologic-org/graspologic
Python package for graph statistics
data-science graph graph-statistics machine-learning networks python
Last synced: 14 May 2025
https://github.com/chawlaavi/daily-dose-of-data-science
A collection of code snippets from the publication Daily Dose of Data Science on Substack: http://www.dailydoseofds.com/
data-analysis data-science data-science-tips data-visualization jupyter jupyter-notebook jupyter-tips matplotlib matplotlib-tips numpy pandas pandas-tips python python-tips sklearn
Last synced: 04 Apr 2025
https://github.com/google/lightweight_mmm
LightweightMMM 🦇 is a lightweight Bayesian Marketing Mix Modeling (MMM) library that allows users to easily train MMMs and obtain channel attribution information.
bayesian data-science econometrics marketing-science mmm
Last synced: 06 May 2025
https://github.com/empathy87/the-elements-of-statistical-learning-python-notebooks
A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book
data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials
Last synced: 13 Apr 2025
https://github.com/turicas/rows
A common, beautiful interface to tabular data, no matter the format
convert-data csv data data-science excel hacktoberfest python table tabular-data xls xlsx
Last synced: 14 May 2025
https://github.com/Kotlin/dataframe
Structured data processing in Kotlin
data-analysis data-science dataframe kotlin
Last synced: 11 Apr 2025
https://github.com/d0r1h/ML-University
Machine Learning Open Source University
artificial-intelligence awsome awsome-list computer-science course data-science deep-learning free learning machine-learning mathematics natural-language-processing neural-network open-source reinforcement-learning university
Last synced: 08 May 2025
https://github.com/shenweichen/coursera
Quiz & Assignment of Coursera
computer-vision coursera data-science data-structures deep-learning machine-learning natural-language-processing reinforcement-learning
Last synced: 12 Apr 2025
https://github.com/stitchfix/hamilton
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
dag data-engineering data-platform data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hamilton hamiltonian machine-learning numpy pandas python software-engineering stitch-fix
Last synced: 18 Jan 2025
https://github.com/GoogleCloudPlatform/DataflowJavaSDK
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
big-data data-analysis data-mining data-processing data-science google-cloud-dataflow
Last synced: 01 May 2025