Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2025-05-19 00:07:12 UTC
https://github.com/girder/girder
A data management platform for the web, developed by Kitware
data-analytics data-management data-science javascript kitware python resonant
Last synced: 03 Apr 2025
https://github.com/wilsonrljr/sysidentpy
A Python Package For System Identification Using NARMAX Models
data-science dynamical-systems machine-learning narmax narx system-identification time-series
Last synced: 01 May 2025
https://github.com/probabl-ai/skore
the scikit-learn sidekick
data-analysis data-science data-visualization machine-learning python scikit-learn workflow
Last synced: 15 May 2025
https://github.com/BlackHC/toma
Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory
data-science gpu machine-learning python pytorch
Last synced: 08 May 2025
https://github.com/tlkh/ai-lab
All-in-one AI container for rapid prototyping
cuda data-science deep-learning docker jupyter nvidia pytorch tensorflow
Last synced: 05 Apr 2025
https://github.com/bodywork-ml/bodywork-core
ML pipeline orchestration and model deployments on Kubernetes.
batch cicd continuous-deployment data-science devops framework kubernetes machine-learning mlops orchestration pipeline python serving
Last synced: 19 Apr 2025
https://github.com/ptyadana/data-science-and-machine-learning-projects-dojo
A collection of data science, machine learning, and data visualization projects using pandas, sklearn, matplotlib, TensorFlow 2, Keras, and various ML algorithms such as random forest classifiers and boosting.
boosting-algorithms data-analysis data-science data-visualization deep-learning keras machine-learning machine-learning-algorithms natural-language-processing pandas probability-statistics scikit-learn seaborn tensorflow
Last synced: 05 Apr 2025
https://github.com/kevintpeng/Learn-Something-Every-Day
📝 A compilation of everything that I learn; Computer Science, Software Development, Engineering, Math, and Coding in General. Read the rendered results here ->
algorithm aws blog computer-science course-materials data-engineering data-science education educational engineering learning math mathematics research software-engineering university unix waterloo
Last synced: 20 Mar 2025
https://github.com/jobream/List-of-Learning-Resources
This collection provides a list of educational resources for Software Engineers. Feel free to add your favorite resources as well and help others in their journey of learning.
competitive-programming computer-science data-science resources software-engineering web-development
Last synced: 02 May 2025
https://github.com/apecloud/myduckserver
Unified MySQL, Postgres & FlightSQL Server, Powered by DuckDB.
analytics arrow business-analytics business-intelligence columnar-storage data-engineering data-science database duckdb htap mariadb mysql olap pandas parquet polars postgres replication sql zero-etl
Last synced: 15 May 2025
https://github.com/terrytangyuan/distributed-ml-patterns
Distributed Machine Learning Patterns from Manning Publications by Yuan Tang https://bit.ly/2RKv8Zo
argo argo-workflows book cloud-computing cloud-native data-science devops distributed-machine-learning distributed-systems kubeflow kubernetes large-scale-machine-learning machine-learning machine-learning-pipelines manning-publications mlops python tensorflow
Last synced: 16 May 2025
https://github.com/DataScienceUB/introduction-datascience-python-book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
analytics data data-science datascience machine-learning python sentiment-analysis
Last synced: 26 Nov 2024
https://github.com/moabukar/Everything-Tech
A collection of online resources to help you on your Tech journey.
ansible aws azure backend data-engineering data-science devops docker frontend gcp kubernetes machine-learning networking python serverless software-engineering tech terraform
Last synced: 10 Apr 2025
https://github.com/plotly/dash-table
OBSOLETE: now part of https://github.com/plotly/dash
dash data-science data-visualization plotly plotly-dash python react table
Last synced: 04 Apr 2025
https://github.com/firmai/pandasvault
Advanced Pandas Vault — Utilities, Functions and Snippets (by @firmai).
data-science data-structures dataframe functions pandas python snippets table tips
Last synced: 06 May 2025
https://github.com/ashishpatel26/resourcebank_cv_nlp_mlops_2022
This repository offers a goldmine of materials for students of computer vision, natural language processing, and machine learning operations.
computer-vision data-science deep-learning mlops natural-language-processing
Last synced: 05 Apr 2025
https://github.com/rohan-paul/machinelearning-deeplearning-code-for-my-youtube-channel
The full collection of code for my YouTube channel, organized by topic.
computer-vision data-science data-science-portfolio datascience deep-learning deep-neural-networks machine-learning machine-learning-algorithms math neural-network python pytorch pytorch-implementation pytorch-tutorial statistics tensorflow tensorflow-examples tensorflow-tutorials tensorflow2 youtube
Last synced: 04 Apr 2025
https://github.com/epistasislab/scikit-rebate
A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
data-science feature-selection python
Last synced: 16 May 2025
https://github.com/predict-idlab/tsflex
Flexible time series feature extraction & processing
data-science feature-engineering feature-extraction multimodal multivariate pandas processing python time-series window-stride
Last synced: 16 May 2025
https://github.com/HHammond/PrettyPandas
A Pandas Styler class for making beautiful tables
data-analysis data-science pandas pandas-dataframe pandas-dataframes pandas-styler python reporting
Last synced: 23 Apr 2025
https://github.com/5agado/data-science-learning
Repository of code and resources related to different data science and machine learning topics. For learning, practice and teaching purposes.
data-science deep-learning jupyter-notebook learning-by-doing machine-learning statistics
Last synced: 17 Apr 2025
https://github.com/publicdomaincompany/scroll
Scroll is a language for scientists of all ages. Scroll includes a command line app that builds static blogs, websites, CSVs, text files, and more.
blog cms csv data-science knowledge-base knowledge-graph markdown markup markup-language note-taking scroll static-site-generator tree-notation
Last synced: 23 Nov 2024
https://github.com/stackloklabs/promptwright
Generate large synthetic data using an LLM
ai data-science dataset huggingface huggingface-datasets machine-learning synthetic-data synthetic-dataset-generation
Last synced: 16 May 2025
https://github.com/ankonzoid/artificio
Deep Learning Computer Vision Algorithms for Real-World Use
ai applications artificial-intelligence auto-encoders computer-vision convolutional-neural-networks data-science deep-learning image-classification image-finder image-processing image-recognition image-retrieval machine-learning neural-networks object-recognition python recommender-system recommender-systems transfer-learning
Last synced: 08 May 2025
https://github.com/rebecca-vickery/data-science-learning-resources
A comprehensive list of free resources for learning data science
artificial-intelligence data data-science machine-learning python
Last synced: 26 Apr 2025
https://github.com/Niketkumardheeryan/ML-CaPsule
ML-capsule is a project for beginner and experienced data science enthusiasts who lack a mentor or guidance and want to learn machine learning. Using our repo, they can learn ML, DL, and many related technologies through real-world projects and become interview-ready.
analytics data-analysis data-science data-visualization datascience deep-learning deep-neural-networks deployment flask heroku-deployment machine-learning python r statistics streamlit-webapp
Last synced: 05 May 2025
https://github.com/okfn-brasil/rosie
🤖 Python application responsible for Serenata de Amor's intelligence
artificial-intelligence data-science machine-learning
Last synced: 28 Mar 2025
https://github.com/tobgu/qframe
Immutable data frame for Go
data-frame data-science dataframe go golang immutable
Last synced: 04 Apr 2025
https://github.com/xoolive/traffic
A toolbox for processing and analysing air traffic data
adsb air-traffic-data data-analytics data-science data-visualisation declarative-pipeline mode-s trajectory
Last synced: 14 May 2025
https://github.com/Chicago/food-inspections-evaluation
This repository contains the code to generate predictions of critical violations at food establishments in Chicago. It also contains the results of an evaluation of the effectiveness of those predictions.
cdph chicago data-science food-poisoning open-data open-science public-health
Last synced: 27 Mar 2025
https://github.com/ShawhinT/YouTube-Blog
Code to complement YouTube videos and blog posts on Medium.
data-science example-code machine-learning medium-articles youtube
Last synced: 25 Nov 2024
https://github.com/kunalj101/Data-Science-Hacks
Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and so on.
computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks
Last synced: 05 May 2025
https://github.com/stacklok/promptwright
Generate large synthetic data using an LLM
ai data-science dataset huggingface huggingface-datasets machine-learning synthetic-data synthetic-dataset-generation
Last synced: 08 Apr 2025
https://github.com/weavefox/libro
A Notebook with Flexible Customization and Easy Integration.
agent ai artificial-intelligence data-science jupyter jupyter-notebook jupyter-notebooks libro machine-learning notebook python sql
Last synced: 16 May 2025
https://github.com/wiseaidev/rust-data-analysis
Rust for data analysis encyclopedia (WIP).
calculas data-analysis data-science eda evcxr hacktoberfest jupyter jupyter-notebook ndarray notebook plotters plotters-rs polars probability probability-distribution probability-theory rust statrs
Last synced: 05 Apr 2025
https://github.com/SforAiDl/genrl
A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
algorithm-implementations benchmarking data-science deep-learning gym hacktoberfest machine-learning neural-network openai python pytorch reinforcement-learning reinforcement-learning-algorithms
Last synced: 01 May 2025
https://github.com/climbsrocks/machinejs
[UNMAINTAINED] Automated machine learning: just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
auto-ml automated-machine-learning automl data-science data-scientists javascript javascript-library kaggle machine-learning machine-learning-algorithms machine-learning-library ml numerai scikit-learn
Last synced: 05 Apr 2025
https://github.com/aiguofer/gspread-pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
data data-analytics data-engineering data-science dataframes google google-sheets google-spreadsheets gspread pandas python sheets
Last synced: 15 May 2025
https://github.com/airbnb/artificial-adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
adversarial-examples black-box-attacks black-box-benchmarking classification data-mining data-science machine-learning metrics python python2 python3 spam spam-classification spam-detection spam-filtering text text-analysis text-classification text-mining text-processing
Last synced: 04 Apr 2025
https://github.com/basedosdados/sdk
⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/sdk/
bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia
Last synced: 14 May 2025
https://github.com/ledell/user-machine-learning-tutorial
useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html
data-science deep-learning ensemble-learning gradient-boosting-machine machine-learning r random-forest tutorial
Last synced: 05 Apr 2025
https://github.com/carefree0910/carefree-learn
Deep Learning ❤️ PyTorch
algorithm automl computer-vision data-science deep-learning ensemble machine-learning numpy python pytorch tabular-data tabular-datasets
Last synced: 13 Apr 2025
https://github.com/rio-labs/rio
WebApps in pure Python. No JavaScript, HTML and CSS needed
data-analysis data-science data-visualization deep-learning machine-learning python ui webapp
Last synced: 09 Apr 2025
https://github.com/InfuseAI/primehub
open-source MLOps platform
data-science distributed-systems docker jupyter jupyterhub keycloak kubernetes machine-learning primehub primehub-ce
Last synced: 18 Apr 2025
https://github.com/plotly/plotly_matlab
Plotly Graphing Library for MATLAB®
d3 d3js data-science data-visualization matlab plotly technical-computing webgl
Last synced: 15 May 2025
https://github.com/kevinliao159/mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
api data-science data-visualization machine-learning neural-networks nlp recommendation-system spark
Last synced: 05 Apr 2025
https://github.com/airalcorn2/Michael-s-Data-Science-Curriculum
This is the companion curriculum to my guide to becoming a data scientist.
curriculum data-science machine-learning statistics
Last synced: 22 Nov 2024
https://github.com/olavolav/uniplot
Lightweight plotting to the terminal. 4x resolution via Unicode.
data-analysis data-science plot python
Last synced: 29 Mar 2025
https://github.com/weijie-chen/econometrics-with-python
Econometrics tutorials featuring Python programming. This is a crash course reviewing the most important concepts and techniques of basic econometrics; the theory is presented lightly, without the hassle of derivations, and the Python code is straightforward.
data-analysis data-science econometrics economics python statistics time-series
Last synced: 05 Apr 2025
https://github.com/datalayer/jupyter-ui
🪐 ⚛️ React.js components 💯% compatible with 🪐 Jupyter https://jupyter-ui-storybook.datalayer.tech
data data-product data-science data-visualisation datalayer ipywidgets jupyter jupyterlab lumino notebook reactjs ui
Last synced: 15 May 2025
https://github.com/thoughtworks/mlops-platforms
Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...
azureml data-science databricks dataiku datarobot google-ai-platform h2oai iguazio knime kubeflow machine-learning mlflow mlops pachyderm sagemaker seldon
Last synced: 07 May 2025
https://github.com/Desbordante/desbordante-core
Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It also lets you run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.
anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data
Last synced: 03 Apr 2025
https://github.com/yzkang/My-Data-Competition-Experience
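A summary of my experience from multiple top-5 finishes in machine learning and big data competitions. Packed with practical tips; help yourself.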
automl catboost data-science deep-learning feature-engineering feature-selection gan hyperparameter-optimization kaggle-competition lightgbm machine-learning model-fusion model-selection python sql tianchi-competition xgboost
Last synced: 27 Apr 2025
https://github.com/solegalli/feature-engineering-for-machine-learning
Code repository for the online course Feature Engineering for Machine Learning
data-science feature-engineering feature-extraction machine-learning python
Last synced: 04 Apr 2025
https://github.com/operatorai/modelstore
🏬 modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud storage provider.
data-science keras machine-learning mlops modelstore python-library pytorch s3-storage scikit-learn tensorflow transformer
Last synced: 14 May 2025
https://github.com/DagsHub/fds
Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc
Last synced: 08 May 2025
https://github.com/neptune-ai/open-solution-mapping-challenge
Open solution to the Mapping Challenge 🌎
competition crowdai data-science data-science-learning deep-learning kaggle lightgbm machine-learning machine-learning-lab mapping-challenge neptune pipeline pipeline-framework python satellite-imagery unet unet-image-segmentation unet-pytorch
Last synced: 05 Apr 2025
https://github.com/azkadev/isar_inspector
Isar inspector local
dart data-science database flutter isar nosql sql
Last synced: 06 Apr 2025
https://github.com/plotly/dashR
Create data science and AI web apps in R
dash data-science data-visualization plotly plotly-dash python r react web-application
Last synced: 15 Mar 2025
https://github.com/matrix-profile-foundation/matrixprofile
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
algorithms anomaly-detection clustering data-mining data-science hacktoberfest matrixprofile motif-discovery python python2 python3 segmentation time-series time-series-analysis
Last synced: 16 May 2025
https://github.com/okld/streamlit-pandas-profiling
Pandas profiling component for Streamlit.
data-science demo pandas pandas-profiling python streamlit streamlit-component streamlit-pandas-profiling
Last synced: 16 May 2025
https://github.com/jpmorganchase/jupyterlab_templates
Support for jupyter notebook templates in jupyterlab
data-science dataviz jupyter jupyterlab jupyterlab-extension machine-learning notebook
Last synced: 10 Feb 2025
https://github.com/finos/jupyterlab_templates
Support for jupyter notebook templates in jupyterlab
data-science dataviz jupyter jupyterlab jupyterlab-extension machine-learning notebook
Last synced: 12 Apr 2025
https://github.com/liyangbit/PyDataLab
open source for wechat-official-account (ID: PyDataLab)
data-analysis data-mining data-science data-visualization machine-learning python wechat-official-account
Last synced: 01 Dec 2024
https://github.com/ozlerhakan/datacamp
🍧 DataCamp data-science and machine learning courses
data-analysis data-science datacamp datacamp-course deep-learning machine-learning python statistics visualization
Last synced: 05 Apr 2025
https://github.com/jkrumbiegel/Chain.jl
A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.
data-analysis data-science julia julia-language julia-package macro pipeline
Last synced: 14 May 2025
https://github.com/adicherlavenkatasai/ml-workspace
Machine Learning (Beginners Hub): information (courses, books, cheat sheets, live sessions) related to machine learning, data science, and Python.
cheat-sheets convolutional-networks data-science deep-learning deep-neural-networks gans harvard-edx interview-questions machine-learning python
Last synced: 11 Apr 2025
https://github.com/theislab/cellrank
CellRank: dynamics from multi-view single-cell data
bioinformatics cell-fate-determination cell-fate-transitions data-science fuzzy-clustering-analyses genetics machine-learning manifold-learning markov-chains rna-velocity single-cell-genomics single-cell-rna-seq trajectory-generation
Last synced: 01 May 2025
https://github.com/anothersamwilson/miceforest
Multiple Imputation with LightGBM in Python
data-science imputed-values mice-algorithm python random-forest
Last synced: 08 Apr 2025
https://github.com/aunum/goro
A High-level Machine Learning Library for Go
data-science go golang machine-learning machinelearning
Last synced: 20 Mar 2025
https://github.com/astronomer/astro-sdk
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows
Last synced: 13 Apr 2025
https://github.com/aaronpenne/data_visualization
A collection of my data visualizations, mostly in Python.
data-science data-visualization python3 visualization
Last synced: 14 Mar 2025
https://github.com/MaxHalford/xam
🎯 Personal data science and machine learning toolbox
data-science machine-learning preprocessing python stacking
Last synced: 08 May 2025
https://github.com/triestpa/cryptocurrency-analysis-python
Open-Source Tutorial For Analyzing and Visualizing Cryptocurrency Data
bitcoin cryptocurrency data-analysis data-science data-visualization ethereum jupyter-notebook plotly python tutorial
Last synced: 07 Apr 2025
https://github.com/zhiningliu1998/imbalanced-ensemble
🛠️ Class-imbalanced Ensemble Learning Toolbox. | A library for class-imbalanced / long-tail machine learning
class-imbalance classification data-mining data-science ensemble ensemble-imbalanced-learning ensemble-learning ensemble-model imbalanced-classification imbalanced-data imbalanced-learning long-tail machine-learning multi-class-classification python python3 scikit-learn sklearn
Last synced: 15 May 2025
https://github.com/paddymul/buckaroo
Buckaroo - the data wrangling assistant for pandas. Quickly explore dataframes, and run pandas commands via a GUI. Works inside the jupyter notebook.
buckaroo data-science jupyter paddy pandas
Last synced: 15 May 2025
https://github.com/ibm/automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
automl chaining classification data-mining data-mining-algorithms data-science ensemble-learning julia machine-learning machine-learning-models pipeline pipeline-optimization pipeline-structure scikitlearn-wrapper stacking symbolic-expressions symbolic-pipeline
Last synced: 15 May 2025
https://github.com/souzatharsis/open-quant-live-book
An open source, hands-on and fully reproducible book in quantitative finance, data science and econophysics. Join us and help Make Wall Street Great Again!
algo-trading altdata data-science econophysics financial-analysis financial-markets machine-learning open-source quantitative-finance
Last synced: 23 Feb 2025
https://github.com/XORbit01/webpalm
🕸️ Crawl in the web network
crawler crawling data data-science datamining go golang hack mining osint redteam spider tool
Last synced: 14 Apr 2025
https://github.com/joaquinamatrodrigo/estadistica-con-r
Personal notes on statistics, machine learning, and the R programming language
bioestadistica data-mining data-science estadistica machine-learning mineria-de-datos r
Last synced: 05 Apr 2025
https://github.com/tellery/tellery
Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.
analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql
Last synced: 16 May 2025
https://github.com/kennethleungty/mlops-specialization-notes
Notes for Machine Learning Engineering for Production (MLOps) Specialization course by DeepLearning.AI & Andrew Ng
andrew-ng course coursera data-science deep-learning deeplearningai machine-learning machine-learning-engineering machine-learning-ops ml-engineering ml-engineering-for-production mlops notes
Last synced: 16 Mar 2025
https://github.com/meteostat/meteostat-python
Access and analyze historical weather and climate data with Python.
climate climate-change climate-data data-science meteostat open-data statistics weather weather-data weather-station
Last synced: 27 Nov 2024
https://github.com/kaskada-ai/kaskada
Modern, open-source event-processing
cep complex-event-processing data-science event-processing olap-engine streaming
Last synced: 22 Apr 2025
https://github.com/timkpaine/lantern
Data exploration glue
bokeh data-science ipysheet jupyter jupyter-widgets jupyterlab jupyterlab-extension matplotlib pandas perspective plotly python python3 qgrid visualization
Last synced: 04 Apr 2025
https://github.com/InseeFrLab/onyxia
🔬 Data science environment for k8s
bluehats data-science datalab helm insee kubernetes onyxia
Last synced: 27 Dec 2024