Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-12-25 00:06:44 UTC
https://github.com/epistasislab/scikit-rebate
A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
data-science feature-selection python
Last synced: 22 Dec 2024
https://github.com/EpistasisLab/scikit-rebate
A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
data-science feature-selection python
Last synced: 30 Oct 2024
https://github.com/basedosdados/sdk
⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/
bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia
Last synced: 22 Dec 2024
https://github.com/terrytangyuan/distributed-ml-patterns
Distributed Machine Learning Patterns from Manning Publications by Yuan Tang https://bit.ly/2RKv8Zo
argo argo-workflows book cloud-computing cloud-native data-science devops distributed-machine-learning distributed-systems kubeflow kubernetes large-scale-machine-learning machine-learning machine-learning-pipelines manning-publications mlops python tensorflow
Last synced: 22 Dec 2024
https://github.com/basedosdados/mais
⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/
bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia
Last synced: 13 Oct 2024
https://github.com/plotly/plotly_matlab
Plotly Graphing Library for MATLAB®
d3 d3js data-science data-visualization matlab plotly technical-computing webgl
Last synced: 20 Dec 2024
https://github.com/dagshub/fds
Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc
Last synced: 27 Dec 2024
https://github.com/DagsHub/fds
Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc
Last synced: 15 Nov 2024
https://github.com/neptune-ai/open-solution-mapping-challenge
Open solution to the Mapping Challenge :earth_americas:
competition crowdai data-science data-science-learning deep-learning kaggle lightgbm machine-learning machine-learning-lab mapping-challenge neptune pipeline pipeline-framework python satellite-imagery unet unet-image-segmentation unet-pytorch
Last synced: 22 Dec 2024
https://github.com/plotly/dashR
Create data science and AI web apps in R
dash data-science data-visualization plotly plotly-dash python r react web-application
Last synced: 27 Oct 2024
https://github.com/yzkang/My-Data-Competition-Experience
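A summary of my experience from multiple Top-5 finishes in machine learning and big data competitions; packed with practical material, help yourself.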
automl catboost data-science deep-learning feature-engineering feature-selection gan hyperparameter-optimization kaggle-competition lightgbm machine-learning model-fusion model-selection python sql tianchi-competition xgboost
Last synced: 11 Nov 2024
https://github.com/finos/jupyterlab_templates
Support for jupyter notebook templates in jupyterlab
data-science dataviz jupyter jupyterlab jupyterlab-extension machine-learning notebook
Last synced: 07 Nov 2024
https://github.com/liyangbit/PyDataLab
open source for wechat-official-account (ID: PyDataLab)
data-analysis data-mining data-science data-visualization machine-learning python wechat-official-account
Last synced: 01 Dec 2024
https://github.com/InfuseAI/primehub
open-source MLOps platform
data-science distributed-systems docker jupyter jupyterhub keycloak kubernetes machine-learning primehub primehub-ce
Last synced: 09 Nov 2024
https://github.com/wilsonrljr/sysidentpy
A Python Package For System Identification Using NARMAX Models
data-science dynamical-systems machine-learning narmax narx system-identification time-series
Last synced: 12 Nov 2024
https://github.com/operatorai/modelstore
🏬 modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud storage provider.
data-science keras machine-learning mlops modelstore python-library pytorch s3-storage scikit-learn tensorflow transformer
Last synced: 22 Dec 2024
https://github.com/thoughtworks/mlops-platforms
Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...
azureml data-science databricks dataiku datarobot google-ai-platform h2oai iguazio knime kubeflow machine-learning mlflow mlops pachyderm sagemaker seldon
Last synced: 12 Nov 2024
https://github.com/solegalli/feature-engineering-for-machine-learning
Code repository for the online course Feature Engineering for Machine Learning
data-science feature-engineering feature-extraction machine-learning python
Last synced: 20 Dec 2024
https://github.com/jkrumbiegel/chain.jl
A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.
data-analysis data-science julia julia-language julia-package macro pipeline
Last synced: 21 Dec 2024
https://github.com/ptyadana/data-science-and-machine-learning-projects-dojo
Collections of data science, machine learning, and data visualization projects with pandas, sklearn, matplotlib, tensorflow2, Keras, and various ML algorithms such as random forest classifiers, boosting, etc.
boosting-algorithms data-analysis data-science data-visualization deep-learning keras machine-learning machine-learning-algorithms natural-language-processing pandas probability-statistics scikit-learn seaborn tensorflow
Last synced: 23 Dec 2024
https://github.com/wiseaidev/rust-data-analysis
Rust for data analysis encyclopedia (WIP).
calculas data-analysis data-science eda evcxr hacktoberfest jupyter jupyter-notebook ndarray notebook plotters plotters-rs polars probability probability-distribution probability-theory rust statrs
Last synced: 22 Dec 2024
https://github.com/aunum/goro
A High-level Machine Learning Library for Go
data-science go golang machine-learning machinelearning
Last synced: 28 Oct 2024
https://github.com/adicherlavenkatasai/ml-workspace
Machine Learning (Beginners Hub): information (courses, books, cheat sheets, live sessions) related to machine learning, data science, and Python.
cheat-sheets convolutional-networks data-science deep-learning deep-neural-networks gans harvard-edx interview-questions machine-learning python
Last synced: 31 Oct 2024
https://github.com/jkrumbiegel/Chain.jl
A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.
data-analysis data-science julia julia-language julia-package macro pipeline
Last synced: 19 Nov 2024
https://github.com/aaronpenne/data_visualization
A collection of my data visualizations, mostly in Python.
data-science data-visualization python3 visualization
Last synced: 25 Oct 2024
https://github.com/xoolive/traffic
A toolbox for processing and analysing air traffic data
adsb air-traffic-data data-analytics data-science data-visualisation declarative-pipeline mode-s trajectory
Last synced: 27 Dec 2024
https://github.com/weijie-chen/econometrics-with-python
Econometrics tutorials featuring Python programming. This is a crash course for reviewing the most important concepts and techniques of basic econometrics; the theory is presented lightly, without the hassle of derivations, and the Python code is straightforward.
data-analysis data-science econometrics economics python statistics time-series
Last synced: 22 Dec 2024
https://github.com/triestpa/cryptocurrency-analysis-python
Open-Source Tutorial For Analyzing and Visualizing Cryptocurrency Data
bitcoin cryptocurrency data-analysis data-science data-visualization ethereum jupyter-notebook plotly python tutorial
Last synced: 24 Dec 2024
https://github.com/okld/streamlit-pandas-profiling
Pandas profiling component for Streamlit.
data-science demo pandas pandas-profiling python streamlit streamlit-component streamlit-pandas-profiling
Last synced: 23 Dec 2024
https://github.com/matrix-profile-foundation/matrixprofile
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
algorithms anomaly-detection clustering data-mining data-science hacktoberfest matrixprofile motif-discovery python python2 python3 segmentation time-series time-series-analysis
Last synced: 22 Dec 2024
https://github.com/maxhalford/xam
:dart: Personal data science and machine learning toolbox
data-science machine-learning preprocessing python stacking
Last synced: 24 Dec 2024
https://github.com/triestpa/Cryptocurrency-Analysis-Python
Open-Source Tutorial For Analyzing and Visualizing Cryptocurrency Data
bitcoin cryptocurrency data-analysis data-science data-visualization ethereum jupyter-notebook plotly python tutorial
Last synced: 27 Nov 2024
https://github.com/MaxHalford/xam
:dart: Personal data science and machine learning toolbox
data-science machine-learning preprocessing python stacking
Last synced: 15 Nov 2024
https://github.com/anothersamwilson/miceforest
Multiple Imputation with LightGBM in Python
data-science imputed-values mice-algorithm python random-forest
Last synced: 23 Dec 2024
https://github.com/predict-idlab/tsflex
Flexible time series feature extraction & processing
data-science feature-engineering feature-extraction multimodal multivariate pandas processing python time-series window-stride
Last synced: 02 Nov 2024
https://github.com/IBM/AutoMLPipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
automl chaining classification data-mining data-mining-algorithms data-science ensemble-learning julia machine-learning machine-learning-models pipeline pipeline-optimization pipeline-structure scikitlearn-wrapper stacking symbolic-expressions symbolic-pipeline
Last synced: 13 Nov 2024
https://github.com/ibm/automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
automl chaining classification data-mining data-mining-algorithms data-science ensemble-learning julia machine-learning machine-learning-models pipeline pipeline-optimization pipeline-structure scikitlearn-wrapper stacking symbolic-expressions symbolic-pipeline
Last synced: 23 Dec 2024
https://github.com/aeturrell/skimpy
skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console.
data-science eda exploratory-data-analysis pandas statistics summary-statistics
Last synced: 14 Nov 2024
https://github.com/olavolav/uniplot
Lightweight plotting to the terminal. 4x resolution via Unicode.
data-analysis data-science plot python
Last synced: 31 Oct 2024
https://github.com/joaquinamatrodrigo/estadistica-con-r
Personal notes on statistics, machine learning, and the R programming language
bioestadistica data-mining data-science estadistica machine-learning mineria-de-datos r
Last synced: 22 Dec 2024
https://github.com/tellery/tellery
Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.
analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql
Last synced: 22 Dec 2024
https://github.com/azkadev/isar_inspector
Isar inspector local
dart data-science database flutter isar nosql sql
Last synced: 23 Dec 2024
https://github.com/AnotherSamWilson/miceforest
Multiple Imputation with LightGBM in Python
data-science imputed-values mice-algorithm python random-forest
Last synced: 22 Nov 2024
https://github.com/meteostat/meteostat-python
Access and analyze historical weather and climate data with Python.
climate climate-change climate-data data-science meteostat open-data statistics weather weather-data weather-station
Last synced: 27 Nov 2024
https://github.com/timkpaine/lantern
Data exploration glue
bokeh data-science ipysheet jupyter jupyter-widgets jupyterlab jupyterlab-extension matplotlib pandas perspective plotly python python3 qgrid visualization
Last synced: 20 Dec 2024
https://github.com/souzatharsis/open-quant-live-book
An open source, hands-on and fully reproducible book in quantitative finance, data science and econophysics. Join us and help Make Wall Street Great Again!
algo-trading altdata data-science econophysics financial-analysis financial-markets machine-learning open-source quantitative-finance
Last synced: 11 Nov 2024
https://github.com/InseeFrLab/onyxia
🔬 Data science environment for k8s
bluehats data-science datalab helm insee kubernetes onyxia
Last synced: 27 Dec 2024
https://github.com/finlay-liu/kaggle_public
阿水 (A Shui)'s open-source branch for data competitions
data-science kaggle-competition
Last synced: 24 Dec 2024
https://github.com/kaskada-ai/kaskada
Modern, open-source event-processing
cep complex-event-processing data-science event-processing olap-engine streaming
Last synced: 09 Nov 2024
https://github.com/astronomer/astro-sdk
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows
Last synced: 27 Dec 2024
https://github.com/RemoteML/bestofml
The best resources around Machine Learning
data-science deep-learning machine-learning machine-learning-algorithms machine-learning-tutorials paper papers
Last synced: 13 Nov 2024
https://github.com/KiranGershenfeld/VisualizingTwitchCommunities
Graphing communities on Twitch.tv in a visually intuitive way
community data-science python twitch visualization
Last synced: 25 Oct 2024
https://github.com/wagamamaz/tensorlayer-tricks
How to use TensorLayer
computer-vision data-science deep-learning keras lasagne machine-learning natural-language-processing neural-network neural-networks nlp reinforcement-learning tensorboard tensorflow tensorflow-experiments tensorflow-framework tensorflow-library tensorflow-models tensorflow-tutorials tensorlayer tflearn
Last synced: 24 Dec 2024
https://github.com/datmo/datmo
Open source production model management tool for data scientists
artificial-intelligence data-science deep-learning machine-learning reproducibility version-control
Last synced: 12 Nov 2024
https://github.com/ozlerhakan/datacamp
🍧 DataCamp data-science and machine learning courses
data-analysis data-science datacamp datacamp-course deep-learning machine-learning python statistics visualization
Last synced: 21 Dec 2024
https://github.com/scilab/scilab
Read only copy of https://gitlab.com/scilab/scilab
data-science data-structures graphical-functions mathematical-functions scientific-computing system-modeling
Last synced: 15 Nov 2024
https://github.com/theislab/cellrank
CellRank: dynamics from multi-view single-cell data
bioinformatics cell-fate-determination cell-fate-transitions data-science fuzzy-clustering-analyses genetics machine-learning manifold-learning markov-chains rna-velocity single-cell-genomics single-cell-rna-seq trajectory-generation
Last synced: 12 Nov 2024
https://github.com/primaryobjects/voice-gender
Gender recognition by voice and speech analysis
acoustic-properties ai artificial-intelligence data-science gender gender-recognition logistic-regression machine-learning neural-network signal speech vocal voice
Last synced: 23 Dec 2024
https://github.com/Malwarize/webpalm
🕸️ Crawl in the web network
crawler crawling data data-science datamining go golang hack mining osint redteam spider tool
Last synced: 08 Nov 2024
https://github.com/zhiningliu1998/imbalanced-ensemble
🛠️ Class-imbalanced Ensemble Learning Toolbox. | A library for class-imbalanced/long-tail machine learning
class-imbalance classification data-mining data-science ensemble ensemble-imbalanced-learning ensemble-learning ensemble-model imbalanced-classification imbalanced-data imbalanced-learning long-tail machine-learning multi-class-classification python python3 scikit-learn sklearn
Last synced: 20 Dec 2024
https://github.com/darwinex/DarwinexLabs
Datasets, tools and more from Darwinex Labs - Prop Investing Arm & Quant Team @ Darwinex
algorithmic-trading artificial-intelligence data-science deep-learning investing machine-learning neural-networks quantitative-finance sentiment-analysis systematic-trading-strategies trading-strategies
Last synced: 12 Nov 2024
https://github.com/datalayer/jupyter-ui
⚛️ React.js components 💯% compatible with 🪐 Jupyter - Storybook on https://jupyter-ui-storybook.datalayer.tech
data data-product data-science data-visualisation datalayer ipywidgets jupyter jupyterlab lumino notebook reactjs ui
Last synced: 27 Dec 2024
https://github.com/apecloud/myduckserver
MySQL & Postgres Analytics, Reimagined
analytics arrow business-analytics business-intelligence columnar-storage data-engineering data-science database duckdb htap mariadb mysql olap pandas parquet polars postgres replication sql zero-etl
Last synced: 20 Dec 2024
https://github.com/larswaechter/voici.js
A Node.js library for pretty printing your data on the terminal🎨
console data-science javascript shell terminal tty typescript
Last synced: 31 Oct 2024
https://github.com/ibm/lale
Library for Semi-Automated Data Science
artificial-intelligence automated-machine-learning automl data-science dataquality hyperparameter-optimization hyperparameter-search hyperparameter-tuning ibm-research ibm-research-ai interoperability machine-learning pipeline-testing pipeline-tests python scikit-learn
Last synced: 20 Dec 2024
https://github.com/mikekeith52/scalecast
The practitioner's forecasting library
auto-ml data-science deep-learning easy-to-use forecasting keras lstm machine-learning mase msis pandas python recurrent-neural-networks scikit-learn scikit-learn-python smape time-series vecm
Last synced: 20 Dec 2024
https://github.com/IBM/lale
Library for Semi-Automated Data Science
artificial-intelligence automated-machine-learning automl data-science dataquality hyperparameter-optimization hyperparameter-search hyperparameter-tuning ibm-research ibm-research-ai interoperability machine-learning pipeline-testing pipeline-tests python scikit-learn
Last synced: 15 Nov 2024
https://github.com/yzhao062/data-mining-conferences
Ranking, acceptance rate, deadline, and publication tips
data-mining data-science research
Last synced: 27 Dec 2024
https://github.com/ZhiningLiu1998/imbalanced-ensemble
🛠️ Class-imbalanced Ensemble Learning Toolbox. | A library for class-imbalanced/long-tail machine learning
class-imbalance classification data-mining data-science ensemble ensemble-imbalanced-learning ensemble-learning ensemble-model imbalanced-classification imbalanced-data imbalanced-learning long-tail machine-learning multi-class-classification python python3 scikit-learn sklearn
Last synced: 07 Nov 2024
https://github.com/jovianhq/opendatasets
A Python library for downloading datasets from Kaggle, Google Drive, and other online sources.
data-science datasets machine-learning python
Last synced: 21 Dec 2024
https://github.com/anonyfox/elixir-scrape
Scrape any website, article or RSS/Atom Feed with ease!
data-science elixir feed html information-retrieval readability rss scrape scraping
Last synced: 25 Dec 2024
https://github.com/Anonyfox/elixir-scrape
Scrape any website, article or RSS/Atom Feed with ease!
data-science elixir feed html information-retrieval readability rss scrape scraping
Last synced: 01 Nov 2024
https://github.com/machine-learning-apps/Issue-Label-Bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"
bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow
Last synced: 25 Oct 2024
https://github.com/machine-learning-apps/issue-label-bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"
bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow
Last synced: 29 Sep 2024
https://github.com/profjsb/python-seminar
Python for Data Science (Seminar Course at UC Berkeley; AY 250)
data-science distributed-computing machine-learning python visualization
Last synced: 27 Nov 2024
https://github.com/hugoblox/theme-research-group
👥 Easily create a stunning Research Group, Team, or Business Website with no-code
academia academic blogdown college-website data-science hugo hugo-theme landing-page landing-page-theme r research research-group research-lab research-lab-website research-tool team-website university university-website wowchemy
Last synced: 21 Dec 2024
https://github.com/maxhumber/redframes
General Purpose Data Manipulation Library
Last synced: 26 Dec 2024
https://github.com/upgini/upgini
Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs
automated-feature-engineering automl automl-pipeline chatgpt data-enrichment data-science feature-engineering feature-extraction feature-selection features kaggle kaggle-solution large-language-models llm machine-learning open-data open-datasets public-data python-library scikit-learn
Last synced: 27 Dec 2024
https://github.com/tommyod/efficient-apriori
An efficient Python implementation of the Apriori algorithm.
apriori-algorithm association-rules data-mining data-science machinelearning
Last synced: 26 Dec 2024
https://github.com/leonvanbokhorst/notebooks-statistics-and-machinelearning
Jupyter Notebooks from the old UnsupervisedLearning.com (RIP) machine learning and statistics blog
data-science datascience ipynb ipynb-jupyter-notebook ipynb-notebook ipython-notebook jupiter-notebook jupyter-notebook machine-learning machine-learning-algorithms machinelearning python statistics
Last synced: 22 Dec 2024
https://github.com/leonvanbokhorst/NoteBooks-Statistics-and-MachineLearning
Jupyter Notebooks from the old UnsupervisedLearning.com (RIP) machine learning and statistics blog
data-science datascience ipynb ipynb-jupyter-notebook ipynb-notebook ipython-notebook jupiter-notebook jupyter-notebook machine-learning machine-learning-algorithms machinelearning python statistics
Last synced: 27 Nov 2024
https://github.com/gdsbook/book
This book serves as an introduction to a whole new way of thinking systematically about geographic data, using geographical analysis and computation to unlock new insights hidden within data.
data-analysis-python data-science geographic-data geographical-information-system spatial-analysis spatial-data-analysis spatial-statistics statistics
Last synced: 27 Oct 2024
https://github.com/khuangaf/pytorch-geometric-yoochoose
This is a tutorial for PyTorch Geometric on the YooChoose dataset
artificial-intelligence data-science deep-learning deep-neural-networks graph-neural-networks machine-learning
Last synced: 25 Dec 2024
https://github.com/jrnold/r4ds-exercise-solutions
Exercise solutions to "R for Data Science"
bookdown data-science dplyr exercise-solutions ggplot2 r r4ds rmarkdown tidyr tidyverse
Last synced: 20 Dec 2024
https://github.com/autonlab/auton-survival
Auton Survival - an open source package for Regression, Counterfactual Estimation, Evaluation and Phenotyping with Censored Time-to-Events
causal-inference counterfactual-inference data-science deep-learning graphical-models machine-learning python regression reliability-analysis survival-analysis time-to-event
Last synced: 12 Nov 2024
https://github.com/microsoft/genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
data-generation data-science machine-learning ner ocr-recognition python synthetic-data synthetic-data-generation synthetic-images text-alignment
Last synced: 21 Dec 2024
https://github.com/databrickslabs/tempo
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation
data-analysis data-science pandas python scala time-series timeseries timeseries-analysis timeseries-data
Last synced: 11 Nov 2024
https://github.com/weecology/retriever
Quickly download, clean up, and install public datasets into a database management system
data data-retrieval data-science dataset datasets hacktobefest python
Last synced: 04 Nov 2024
https://github.com/noahgift/functional_intro_to_python
[tutorial] A functional, data-science-focused introduction to Python
commandline data-science functional-programming ipynb jupyter-notebook learning-by-doing machine-learning optimization pandas python python3 screencast spot-price tutorial
Last synced: 21 Dec 2024
https://github.com/Technion-Kishony-lab/quibbler
Your data - interactive!
data-analysis data-science data-visualization declarative graphics gui interactive jupyter matplotlib python widgets
Last synced: 30 Oct 2024
https://github.com/kamu-data/kamu-cli
Next-generation decentralized data lakehouse and a multi-party stream processing network
blockchain data-as-code data-management data-science datafusion flink jupyter kamu open-data open-data-fabric spark sql
Last synced: 20 Dec 2024
https://github.com/sn3fru/datascience_course
Data Science course in Portuguese
artificial-intelligence brasil curso dados data data-analysis data-science data-science-learning dataset deep-learning machine-learning python
Last synced: 11 Nov 2024
https://github.com/ml-tooling/ml-hub
🧰 Multi-user development platform for machine learning teams. Simple to set up within minutes.
data-science docker jupyter jupyterhub machine-learning python
Last synced: 27 Dec 2024
https://github.com/solegalli/feature-selection-for-machine-learning
Code repository for the online course Feature Selection for Machine Learning
data-science feature-selection machine-learning python
Last synced: 21 Dec 2024
https://github.com/tirthajyoti/pydbgen
Random dataframe and database table generator
data-generation data-science database fake-data generator pandas-dataframe python random-generation sqlite sqlite3 synthetic-data synthetic-dataset-generation
Last synced: 21 Dec 2024
https://github.com/mljar/plotai
PlotAI - Your Ultimate Plotting Assistant! 📊🤖 Use ChatGPT-3.5 to create plots in Python and Matplotlib directly in your Python script or notebook.
charts chatgpt data-science llm matplotlib plots python visualization
Last synced: 09 Nov 2024
https://github.com/CJWorkbench/cjworkbench
The data journalism platform with built in training
data-analysis data-journalism data-science data-visualization journalism notebook
Last synced: 24 Nov 2024
https://github.com/kennethleungty/mlops-specialization-notes
Notes for Machine Learning Engineering for Production (MLOps) Specialization course by DeepLearning.AI & Andrew Ng
andrew-ng course coursera data-science deep-learning deeplearningai machine-learning machine-learning-engineering machine-learning-ops ml-engineering ml-engineering-for-production mlops notes
Last synced: 22 Nov 2024
https://github.com/alibaba/feathub
FeatHub - A stream-batch unified feature store for real-time machine learning
apache-flink data data-engineering data-quality data-science feature-engineering feature-store machine-learning mlops streaming
Last synced: 05 Nov 2024
https://github.com/tommyod/Efficient-Apriori
An efficient Python implementation of the Apriori algorithm.
apriori-algorithm association-rules data-mining data-science machinelearning
Last synced: 30 Oct 2024