Projects in Awesome Lists tagged with datascience
A curated list of projects in awesome lists tagged with datascience.
https://github.com/avaiga/taipy
Turns Data and AI algorithms into production-ready web applications in no time.
automation data-engineering data-integration data-ops data-visualization datascience developer-tools hacktoberfest hacktoberfest2023 job-scheduler mlops orchestration pipeline pipelines python scenario scenario-analysis taipy-core taipy-gui workflow
Last synced: 05 Feb 2026
https://github.com/Avaiga/taipy
Turns Data and AI algorithms into production-ready web applications in no time.
automation data-engineering data-integration data-ops data-visualization datascience developer-tools hacktoberfest hacktoberfest2023 job-scheduler mlops orchestration pipeline pipelines python scenario scenario-analysis taipy-core taipy-gui workflow
Last synced: 05 Apr 2025
https://github.com/faviovazquez/ds-cheatsheets
List of Data Science Cheatsheets to rule the world
cheatsheet datascience jupyter programming python r spark
Last synced: 18 Oct 2025
https://github.com/FavioVazquez/ds-cheatsheets
List of Data Science Cheatsheets to rule the world
cheatsheet datascience jupyter programming python r spark
Last synced: 26 Mar 2025
https://github.com/virgili0/virgilio
Your new Mentor for Data Science E-Learning.
business-intelligence computer-vision data-science datascience guide guidelines hacktoberfest learning learning-python machine-learning machine-vision nlp path python scikit-learn statistics study studypath tensorflow virgilio
Last synced: 13 May 2025
https://github.com/virgili0/Virgilio
Your new Mentor for Data Science E-Learning.
business-intelligence computer-vision data-science datascience guide guidelines hacktoberfest learning learning-python machine-learning machine-vision nlp path python scikit-learn statistics study studypath tensorflow virgilio
Last synced: 14 Mar 2025
https://github.com/modin-project/modin
Modin: Scale your Pandas workflows by changing a single line of code
analytics data-science dataframe datascience distributed modin pandas python sql
Last synced: 11 May 2025
https://github.com/netflix/metaflow
Build, Manage and Deploy AI/ML Systems
agents ai aws azure data-science datascience gcp generative-ai high-performance-computing kubernetes llm llmops machine-learning ml ml-infrastructure ml-platform mlops model-management python
Last synced: 09 Feb 2026
https://github.com/Netflix/metaflow
:rocket: Build and manage real-life ML, AI, and data science projects with ease!
ai aws azure data-science datascience gcp high-performance-computing kubernetes machine-learning ml ml-infrastructure ml-platform mlops model-management productivity python r r-package reproducible-research rstats
Last synced: 13 Mar 2025
https://github.com/firmai/industry-machine-learning
A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
data-science datascience example firmai jupyter-notebook machine-learning practical-machine-learning python
Last synced: 14 May 2025
https://github.com/traceloop/openllmetry
Open-source observability for your GenAI or LLM application, based on OpenTelemetry
artifical-intelligence datascience generative-ai good-first-issue good-first-issues help-wanted llm llmops metrics ml model-monitoring monitoring observability open-source open-telemetry opentelemetry opentelemetry-python python
Last synced: 23 Nov 2025
https://github.com/holoviz/panel
Panel: The powerful data exploration & web app framework for Python
bokeh control-panels dashboards dataapp datascience dataviz gui holoviews holoviz hvplot jupyter matplotlib panel plotly
Last synced: 13 May 2025
https://github.com/nyandwi/machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
computer-vision data-analysis data-science data-visualization datascience deep-learning keras machine-learning matplotlib neural-networks nlp numpy open-source pandas python scikit-learn seaborn tensorflow
Last synced: 12 Apr 2025
https://github.com/Nyandwi/machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
computer-vision data-analysis data-science data-visualization datascience deep-learning keras machine-learning matplotlib neural-networks nlp numpy open-source pandas python scikit-learn seaborn tensorflow
Last synced: 05 Apr 2025
https://github.com/lk-geimfari/mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
data dataframe datascience dummy factory factory-boy fake fixtures generator json-generator mimesis mock pandas polars pytest-plugin python schema syntetic synthetic-data testing
Last synced: 28 Dec 2025
https://github.com/whoiskatrin/sql-translator
SQL Translator is a tool for converting natural language queries into SQL code using artificial intelligence. This project is 100% free and open source.
data-analysis data-engineering dataquery datascience dataset openai postgresql query sql
Last synced: 14 May 2025
https://github.com/theoehrly/fast-f1
FastF1 is a Python package for accessing and analyzing Formula 1 results, schedules, timing data and telemetry
datascience formula1 motorsport
Last synced: 12 May 2025
https://github.com/theOehrly/Fast-F1
FastF1 is a Python package for accessing and analyzing Formula 1 results, schedules, timing data and telemetry
datascience formula1 motorsport
Last synced: 14 Mar 2025
https://github.com/entilzha/pyfunctional
Python library for creating data pipelines with chain functional programming
data datascience functional-programming pipeline python
Last synced: 14 May 2025
https://github.com/EntilZha/PyFunctional
Python library for creating data pipelines with chain functional programming
data datascience functional-programming pipeline python
Last synced: 26 Mar 2025
https://github.com/indrajeetpatil/ggstatsplot
Enhancing {ggplot2} plots with statistical analysis
bayes-factors datascience dataviz effect-size ggplot-extension hypothesis-testing non-parametric-statistics r regression-models statistical-analysis
Last synced: 06 Feb 2026
https://indrajeetpatil.github.io/ggstatsplot/
Enhancing {ggplot2} plots with statistical analysis
bayes-factors datascience dataviz effect-size ggplot-extension hypothesis-testing non-parametric-statistics r regression-models statistical-analysis
Last synced: 25 Nov 2025
https://github.com/IndrajeetPatil/ggstatsplot
Enhancing {ggplot2} plots with statistical analysis
bayes-factors datascience dataviz effect-size ggplot-extension hypothesis-testing non-parametric-statistics r regression-models statistical-analysis
Last synced: 07 Apr 2025
https://github.com/ujjwalkarn/datasciencer
A curated list of R tutorials for Data Science, NLP and Machine Learning
data-science datascience r text-mining
Last synced: 15 May 2025
https://github.com/ujjwalkarn/DataScienceR
A curated list of R tutorials for Data Science, NLP and Machine Learning
data-science datascience r text-mining
Last synced: 10 May 2025
https://github.com/chris1610/pbpython
Code, Notebooks and Examples from Practical Business Python
data-analysis data-visualization datascience pandas python scikit-learn
Last synced: 15 May 2025
https://github.com/microsoft/vscode-jupyter
VS Code Jupyter extension
datascience jupyter machine-learning vscode vscode-extension
Last synced: 13 May 2025
https://github.com/alan-turing-institute/clevercsv
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3
Last synced: 13 May 2025
https://github.com/alan-turing-institute/CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3
Last synced: 26 Mar 2025
https://github.com/abhishek-ch/around-dataengineering
A Data Engineering & Machine Learning Knowledge Hub
airflow data-engineering datascience devops infrastructure machine-learning mlops spark
Last synced: 08 Apr 2025
https://github.com/easystats/easystats
:milky_way: The R easystats-project
dataanalytics datascience easystats hacktoberfest models performance-metrics r regression-models rstats statistics
Last synced: 13 May 2025
https://github.com/dataprofessor/code
Compilation of R and Python programming codes on the Data Professor YouTube channel.
data-professor data-science data-science-python dataprofessor datascience exploratory-data-analysis machine-learning machinelearning pandas python python-data-science r scikit-learn scikit-learn-python shiny streamlit
Last synced: 14 May 2025
https://github.com/opengeos/streamlit-geospatial
A multi-page streamlit app for geospatial
data-science datascience dataviz geopython geospatial housing-data housing-market huggingface mapping open-source python real-estate streamlit streamlit-webapp
Last synced: 14 May 2025
https://github.com/mentatinnovations/datastream.io
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
alerts anomaly anomaly-detection anomalydetection anomalydiscovery bokeh-dashboard dashboard data-science data-stream datascience dataset dsio elasticsearch iot jupyter kibana machinelearning python sklearn timeseries
Last synced: 04 Oct 2025
https://github.com/MentatInnovations/datastream.io
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
alerts anomaly anomaly-detection anomalydetection anomalydiscovery bokeh-dashboard dashboard data-science data-stream datascience dataset dsio elasticsearch iot jupyter kibana machinelearning python sklearn timeseries
Last synced: 14 Mar 2025
https://github.com/amodinho/datacamp-python-data-science-track
All the slides, accompanying code, and exercises, stored in this repo.
bokeh data-science datacamp datacamp-course datacamp-exercises datacamp-machine-learning datacamp-projects datacamp-python datacamp-solutions-python datascience machinelearning natural-language-processing neural-network neural-networks nlp pandas python scikit-learn tokenization
Last synced: 24 Oct 2025
https://github.com/AmoDinho/datacamp-python-data-science-track
All the slides, accompanying code, and exercises, stored in this repo.
bokeh data-science datacamp datacamp-course datacamp-exercises datacamp-machine-learning datacamp-projects datacamp-python datacamp-solutions-python datascience machinelearning natural-language-processing neural-network neural-networks nlp pandas python scikit-learn tokenization
Last synced: 26 Mar 2025
https://github.com/firmai/business-machine-learning
A curated list of practical business machine learning (BML) and business data science (BDS) applications for Accounting, Customer, Employee, Legal, Management and Operations (by @firmai)
business-machine-learning datascience example jupyter jupyter-notebook machine-learning practical-machine-learning
Last synced: 06 May 2025
https://github.com/wx-chevalier/ai-notes
:books: [.md & .ipynb] Series of Artificial Intelligence & Deep Learning, including Mathematics Fundamentals, Python Practices, NLP Application, etc. AI and deep learning in practice: mathematical statistics | machine learning | deep learning | natural language processing | tools in practice (Scikit, TensorFlow, PyTorch) | industry applications and course notes
artificial-intelligence datascience deeplearning machinelearning natural-language-processing neural-network wx-doc
Last synced: 17 Jun 2025
https://github.com/eka-foundation/numerical-computing-is-fun
Learning numerical computing with notebooks for all ages.
algorithm datascience introduction-to-algorithms introduction-to-data-science introduction-to-python jupyter-notebook kids-learn learn-to-code learning-by-doing learning-python numerical-computation prime-numbers
Last synced: 17 Jan 2026
https://github.com/wx-chevalier/AI-Notes
:books: [.md & .ipynb] Series of Artificial Intelligence & Deep Learning, including Mathematics Fundamentals, Python Practices, NLP Application, etc. AI and deep learning in practice: mathematical statistics | machine learning | deep learning | natural language processing | tools in practice (Scikit, TensorFlow, PyTorch) | industry applications and course notes
artificial-intelligence datascience deeplearning machinelearning natural-language-processing neural-network wx-doc
Last synced: 09 May 2025
https://github.com/vegas-viz/Vegas
The missing MatPlotLib for Scala + Spark
Last synced: 07 May 2025
https://github.com/safe-graph/DGFraud
A Deep Graph-based Toolbox for Fraud Detection
anomaly-detection datamining datascience dblp-dataset financial-engineering fraud-detection fraud-prevention graph graph-algorithms graph-convolutional-networks graph-neural-networks graphneuralnetwork machine-learning opensource outlier-detection security security-tools spamdetection toolkit yelp-dataset
Last synced: 01 Apr 2025
https://github.com/safe-graph/dgfraud
A Deep Graph-based Toolbox for Fraud Detection
anomaly-detection datamining datascience dblp-dataset financial-engineering fraud-detection fraud-prevention graph graph-algorithms graph-convolutional-networks graph-neural-networks graphneuralnetwork machine-learning opensource outlier-detection security security-tools spamdetection toolkit yelp-dataset
Last synced: 04 Apr 2025
https://github.com/techascent/tech.ml.dataset
A Clojure high performance data processing system
clojure csv dataframe datascience dataset etl-pipeline java machine-learning xlsx
Last synced: 15 May 2025
https://github.com/kkulma/climate-change-data
:earth_africa: A curated list of APIs, open data and ML/AI projects on climate change
climate climate-analysis climate-change climate-data data data-science datascience hacktoberfest python r resources rstats
Last synced: 04 Apr 2025
https://github.com/turicas/socios-brasil
Captures data on the partners of Brazilian companies from the Receita Federal (Brazil's federal revenue service) and exports it to a human-readable format
brazil data-driven-journalism datascience economic-data empresas hacktoberfest opendata python socios
Last synced: 15 May 2025
https://github.com/ShopRunner/jupyter-notify
A Jupyter Notebook magic for browser notifications of cell completion
Last synced: 12 Apr 2025
https://github.com/shoprunner/jupyter-notify
A Jupyter Notebook magic for browser notifications of cell completion
Last synced: 16 May 2025
https://github.com/juliaearth/geostats.jl
An extensible framework for geospatial data science and geostatistical modeling fully written in Julia
datascience geo geospatial geostatistics gis spatial-statistics statistical-learning statistics
Last synced: 31 Jan 2026
https://github.com/holgerbrandl/krangl
krangl is a {K}otlin DSL for data w{rangl}ing
data-mining datascience java kotlin sql
Last synced: 11 Apr 2025
https://github.com/JuliaEarth/GeoStats.jl
An extensible framework for geospatial data science and geostatistical modeling fully written in Julia
datascience geo geospatial geostatistics gis spatial-statistics statistical-learning statistics
Last synced: 14 Mar 2025
https://github.com/Gmousse/dataframe-js
A JavaScript library providing a new data structure for data scientists and developers
data data-frame dataframe datascience datastructures functional groupby javascript manipulation matrix sql sql-syntax
Last synced: 15 Mar 2025
https://github.com/pirate/wikipedia-mirror
Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump
archiving datascience docker docker-compose html internet-archiving kiwix kiwix-offline-wikipedia mediawiki mwdumper nginx openzim wiki wikipedia wikipedia-dump wikipedia-mirror xowa zim
Last synced: 16 May 2025
https://github.com/Mohitkr95/Best-Data-Science-Resources
This repository contains the best free, hand-picked data science resources to equip you with industry-driven skills and an interview preparation kit.
ai artificial-intelligence artificial-intelligence-algorithms aws computer-vision data data-structures datascience deep-learning git github jupyter-notebook machine-learning mongodb natural-language-processing neural-network python sql statistics
Last synced: 07 May 2025
https://github.com/datavane/datavines
Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
dataobservability dataprofile dataquality datascience doris metadata spark
Last synced: 09 Apr 2025
https://github.com/DataScienceUB/introduction-datascience-python-book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
analytics data data-science datascience machine-learning python sentiment-analysis
Last synced: 19 Jul 2025
https://github.com/rohan-paul/machinelearning-deeplearning-code-for-my-youtube-channel
The full collection of code for my YouTube channel, organized by topic.
computer-vision data-science data-science-portfolio datascience deep-learning deep-neural-networks machine-learning machine-learning-algorithms math neural-network python pytorch pytorch-implementation pytorch-tutorial statistics tensorflow tensorflow-examples tensorflow-tutorials tensorflow2 youtube
Last synced: 04 Apr 2025
https://github.com/gacwr/openuba
A robust and flexible open-source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
analytics anomaly-detection cybersecurity datascience elasticsearch elk flask information-security machine-learning nodejs react security siem sklearn spark tensorflow threathunting uba ueba user-behaviour
Last synced: 04 Apr 2025
https://github.com/Niketkumardheeryan/ML-CaPsule
ML-capsule is a project for beginners and experienced data science enthusiasts who don't have a mentor or guidance and wish to learn machine learning. Using our repo, they can learn ML, DL, and many related technologies through real-world projects and become interview-ready.
analytics data-analysis data-science data-visualization datascience deep-learning deep-neural-networks deployment flask heroku-deployment machine-learning python r statistics streamlit-webapp
Last synced: 05 May 2025
https://github.com/marcogdepinto/emotion-classification-from-audio-files
Understanding emotions from audio files using neural networks and multiple datasets.
audio audio-processing classification-report datascience deep-learning deep-neural-networks emotion emotion-classification-ravdess keras keras-neural-networks librosa livingstone machine-learning python python3 ravdess-dataset song songs speech tensorflow
Last synced: 05 Apr 2025
https://github.com/marcogdepinto/Emotion-Classification-Ravdess
Understanding emotions from audio files using neural networks and multiple datasets.
audio audio-processing classification-report datascience deep-learning deep-neural-networks emotion emotion-classification-ravdess keras keras-neural-networks librosa livingstone machine-learning python python3 ravdess-dataset song songs speech tensorflow
Last synced: 12 Mar 2025
https://github.com/maif/melusine
Melusine: Use Python to automate your email processing workflow
courriels datascience emails natural-language-processing nlp nlp-machine-learning python python3
Last synced: 16 May 2025
https://github.com/afizs/be-theboss-in-python
This repo helps you to be the boss in Python.
beginner-friendly contributions-welcome datascience good-first-issue good-first-pr hacktoberfest machinelearning opensource python
Last synced: 08 Apr 2025
https://github.com/MigoXLab/dingo
Dingo: A Comprehensive AI Data Quality Evaluation Tool
common-crawl data-evaluation data-quality data-quality-assessment data-quality-report data-science data-validation dataquality datascience deepseek gpt hallucination hallucination-detection llm openai opencompass qwen spark vlm
Last synced: 29 Aug 2025
https://github.com/MAIF/melusine
Melusine: Use Python to automate your email processing workflow
courriels datascience emails natural-language-processing nlp nlp-machine-learning python python3
Last synced: 02 Apr 2025
https://github.com/leonvanbokhorst/notebooks-statistics-and-machinelearning
Jupyter Notebooks from the old UnsupervisedLearning.com (RIP) machine learning and statistics blog
data-science datascience ipynb ipynb-jupyter-notebook ipynb-notebook ipython-notebook jupiter-notebook jupyter-notebook machine-learning machine-learning-algorithms machinelearning python statistics
Last synced: 17 Dec 2025
https://github.com/leonvanbokhorst/NoteBooks-Statistics-and-MachineLearning
Jupyter Notebooks from the old UnsupervisedLearning.com (RIP) machine learning and statistics blog
data-science datascience ipynb ipynb-jupyter-notebook ipynb-notebook ipython-notebook jupiter-notebook jupyter-notebook machine-learning machine-learning-algorithms machinelearning python statistics
Last synced: 19 Jul 2025
https://github.com/traceloop/openllmetry-js
Sister project to OpenLLMetry, but in Typescript. Open-source observability for your LLM application, based on OpenTelemetry
datascience generative-ai javascript llmops metrics ml model-monitoring monitoring nextjs observability open-source opentelemetry opentelemetry-javascript typescript
Last synced: 14 May 2025
https://github.com/akabe/ocaml-jupyter
An OCaml kernel for Jupyter (IPython) notebook
dataanalysis datascience functional-programming jupyter jupyter-kernels jupyter-notebook machine-learning ocaml ocaml-kernel ocaml-repl
Last synced: 28 Dec 2025
https://github.com/openml/openml-python
OpenML's Python API for a World of Data and More
benchmarking data datascience machine-learning meta-learning openml python tabular-data
Last synced: 10 Apr 2025
https://github.com/amanovishnu/ineuron-full-stack-data-science-assignments
This repository features assignments and projects from the iNeuron full stack data science course, providing resources for learners to enhance their skills and apply their knowledge.
computer-vision data-science datascience deep-learning exploratory-data-analysis linear-regression machine-learning natural-language-processing python recommender-system sql statistics
Last synced: 08 Apr 2025
https://github.com/amanovishnu/ineuron-full-stack-data-science-assignment-collection
This repository features assignments and projects from the iNeuron full stack data science course, providing resources for learners to enhance their skills and apply their knowledge.
computer-vision data-science datascience deep-learning exploratory-data-analysis linear-regression machine-learning natural-language-processing python recommender-system sql statistics
Last synced: 28 Feb 2025
https://github.com/turicas/salarios-magistrados
Downloads the spreadsheets of magistrates' salaries, extracts the pay stubs, cleans them, and exports to CSV
brazil data-driven-journalism datascience justice opendata python
Last synced: 30 Apr 2025
https://github.com/MLWhiz/data_science_blogs
A repository to keep track of all the code that I end up writing for my blog posts.
blogging chatbot data datascience gan graphs machine-learning mcmc python spark streamlit time-series xgboost
Last synced: 05 May 2025
https://github.com/mlwhiz/data_science_blogs
A repository to keep track of all the code that I end up writing for my blog posts.
blogging chatbot data datascience gan graphs machine-learning mcmc python spark streamlit time-series xgboost
Last synced: 06 Apr 2025
https://github.com/zavtech/morpheus-core
The foundational library of the Morpheus data science framework
data-analysis data-analytics dataframe dataframe-library datascience finance principal-component-analysis quantitative-finance regression regression-models statistical-analysis statistics
Last synced: 02 Apr 2025
https://github.com/anaconda/anaconda-project
Tool for encapsulating, running, and reproducing data science projects
anaconda conda-environment data datascience encapsulation reproducibility running
Last synced: 11 Dec 2025
https://github.com/packtworkshops/the-data-science-workshop
A New, Interactive Approach to Learning Data Science
binaryclassification clusteranalysis data-preparation datascience dimensionality-reduction ensemble-learning- feature-engineering hyperparameter-tuning- machine-learning machine-learning-pipelines python random-forest regression
Last synced: 05 Apr 2025
https://github.com/eurostat/gridviz
A library for visualizing gridded data
cartography census csv d3 data data-analysis data-science data-visualization datascience geospatial gis gridded-statistics grids map map-making mapping mapping-tools maps visualization webgl
Last synced: 24 Dec 2025
https://github.com/chasedehan/BoostARoota
A fast xgboost feature selection algorithm
algorithm boruta data-science datascience datascientist dimension-reduction feature-selection machine-learning machine-learning-algorithms machinelearning xgboost xgboost-algorithm
Last synced: 16 Nov 2025
https://github.com/chasedehan/boostaroota
A fast xgboost feature selection algorithm
algorithm boruta data-science datascience datascientist dimension-reduction feature-selection machine-learning machine-learning-algorithms machinelearning xgboost xgboost-algorithm
Last synced: 05 Apr 2025
https://github.com/IngestAI/embedditor
GUI for editing LLM vector embeddings. No more blind chunking. Upload content in any file extension, join and split chunks, edit metadata and embedding tokens + remove stop-words and punctuation with one click, add images, and download in .veml to share it with your team.
datapreprocessing datascience embedding-vectors embeddings genai laravel llm markup-language ml nlp nltk php vector-database vector-search vectorization veml
Last synced: 28 Mar 2025
https://github.com/harunurrashid97/100-Days-Of-ML-Code
A day-to-day plan for this challenge. Covers both theoretical and practical aspects
100-days-of-code 100daysofmlcode article data-preprocessing data-science datascience decision-tree eda exploratory-data-analysis implementation infographics linear-regression machine-learning machine-learning-algorithms python regression-algorithms siraj-raval-challenge textsummarization tutorials vizualization
Last synced: 19 Jul 2025
https://github.com/brianruizy/covid19-dashboard
Django + Plotly Coronavirus dashboard. Powerful data-driven Python web app with an awesome UI. Contributions welcome! Featured on Awesome-list
coronavirus coronavirus-real-time coronavirus-tracker covid-19 covid-dashboard covid-data dashboard data-visualization datascience django django-application django-web-app heroku pandemic plot plotly python
Last synced: 01 Oct 2025
https://github.com/SkillCorner/opendata
SkillCorner Open Data with 9 matches of broadcast tracking data.
datascience soccer sportsanalytics
Last synced: 27 Apr 2025
https://github.com/storieswithsiva/Data-Science-Resources
You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?
artificial-intelligence artificial-neural-networks data data-analysis data-analytics data-mining data-science data-science-resource data-science-resources data-scientist data-scientists data-visualization data-world datascience dataset learning learning-kit machine-learning python repository
Last synced: 10 Apr 2025
https://github.com/jthomasmock/gtextras
A Collection of Helper Functions for the gt Package.
data-science data-visualization datascience ggplot2 gt plots r rstats sparkline sparkline-graphs sparklines tables
Last synced: 12 Apr 2025
https://github.com/jthomasmock/gtExtras
A Collection of Helper Functions for the gt Package.
data-science data-visualization datascience ggplot2 gt plots r rstats sparkline sparkline-graphs sparklines tables
Last synced: 29 Jul 2025
https://github.com/holgerbrandl/kravis
A {K}otlin g{ra}mmar for data {vis}ualization
data-visualization datascience ggplot2 kotlin krangl
Last synced: 04 Apr 2025
https://khuyentran1401.github.io/machine-learning-articles/
List of interesting articles on different topics of machine learning and deep learning
ai articles-summaries artificial-intelligence awesome-list datascience deep-learning machinelearning neural-network
Last synced: 06 May 2025
https://github.com/crmne/cookiecutter-modern-datascience
Start a data science project with modern tools
cookiecutter cookiecutter-data-science cookiecutter-template datascience python
Last synced: 11 Jul 2025
https://github.com/aaaastark/data-scientist-books
Data-Scientist-Books (Machine Learning, Deep Learning, Natural Language Processing, Computer Vision, Long Short Term Memory, Generative Adversarial Network, Time Series Forecasting, Probability and Statistics, and more.)
ai artificial-intelligence datascience deeplearning dl ds gans lstm machinelearning ml probability python r statistics tsf
Last synced: 03 Apr 2025
https://github.com/traceloop/hub
High-scale LLM gateway, written in Rust. OpenTelemetry-based observability included
artificial-intelligence datascience generative-ai llm llmops ml model-monitoring observability open-source opentelemetry rust
Last synced: 08 Feb 2026
https://github.com/nl4dv/nl4dv
A Python toolkit to create Visualizations (Vis) using natural language (NL) or add an NL interface to existing Vis.
conversational conversational-interaction conversational-interactions data-visualization datascience jupyter-notebook natural-language natural-language-interface natural-language-processing nl-interface opensource python toolkit vega-lite visualization
Last synced: 12 Apr 2025
https://github.com/jupyterhub/repo2docker-action
A GitHub action to build data science environment images with repo2docker and push them to registries.
actions binder data-science datascience docker jupyter jupyter-notebook repo2docker repo2docker-action
Last synced: 06 Apr 2025
https://github.com/cs-mohamedayman/data-science-case-studies
Data Science Case Studies for computer science students.
casestudies dashboard dataanalysis datacamp datascience datascienceindustries deeplearning excel googlesheets hackerrank kaggle leetcode machinelearning powerbi powerpoint sql tableau
Last synced: 23 Feb 2025
https://github.com/adijo/data-science-prep
Problems from https://datascienceprep.com/
data-science data-science-interview datascience interview-prep machine-learning machine-learning-interview machinelearning probability statistics
Last synced: 14 Apr 2025
https://github.com/safe-graph/dgfraud-tf2
A Deep Graph-based Toolbox for Fraud Detection in TensorFlow 2.X
anomaly-detection datamining datascience dblp-dataset financial-engineering fraud-detection fraud-prevention graph-algorithms graphneuralnetwork machine-learning opensource outlier-detection security security-tools spam-detection toolkit yelp-dataset
Last synced: 27 Apr 2025
https://github.com/juliaai/datasciencetutorials.jl
A set of tutorials to show how to use Julia for data science (DataFrames, MLJ, ...)
datascience julia-language mlj tutorials
Last synced: 12 Apr 2025